Signals User Guide

Signals is a powerful, yet easy-to-use augmented intelligence platform.

  • Signals’ Analytics Engine leverages unsupervised deep machine learning to identify statistically significant topics within any unstructured data.
  • Signals is completely web-based. The only software required is a modern web browser such as Google Chrome, Mozilla Firefox, or Microsoft Edge (Microsoft Internet Explorer is not recommended).
  • Signals’ web interface is point-and-click. No programming knowledge is required.

This guide covers every feature and function within Signals. If you’re looking for information about a specific topic, jump to a chapter in the left nav bar.

For more information about Signals or Stratifyd, please visit our website.

The chapters below go into detail about everything from getting data, to analyzing it, to sharing your newly gleaned insights.

Quick Start Guide

By the end of this section, you’ll have access to the data you’re interested in and a dashboard that lets you analyze and navigate through structured and unstructured data.

Create an account

Begin by Creating an Account. Once you’ve logged in, check out some of the pre-built public dashboards in the folders on the homepage.

_images/examplefolders.png

Create your first dashboard

To create your own dashboard, click “Create a new Dashboard”.

_images/newdashboard.png

Connect to data

Connect to your database, upload a CSV or Excel file, or use one of our Data Connectors for easy access to a wealth of publicly available data. Add as many data sources to your dashboard as needed.

_images/dataconnectors.png

If you’re bringing your own data, specify which fields contain Textual, Temporal, or Geographical information for analysis and normalization. All other fields will be brought in as pivot points and additional visualizations.

_images/csvorexcel.png

Note

See the Mapping Data guide for more details on uploading CSV or Excel files.

Create visualizations

Your dashboard will contain visualizations based on the type of data in your analysis by default. See the guide on Processing Data for information on how to interpret these visualizations. Every visualization is clickable, and clicking will apply a filter on your data based on your selection.

_images/signalsgif1.gif

Now you’re ready to Create Your Own Widgets using your other datapoints to home in on interesting segments within your data and discover topics and themes within your textual data.

Find meaningful insights

Let the visualizations guide your analysis. Click on areas of interest to drill down into subsets of data.

Click on the “Data” tab to view the original documents. If you’ve filtered down to a subset of your data in your dashboard, you’ll see only those documents.

For more information see the chapter on Analyzing Data.

Share your results

Make your dashboard public or share with colleagues.

Connecting to Data

There are three categories of Data Connectors in Signals:

Enterprise Data Connectors are used to connect to internal databases or other 1st party data platforms in your organization.

3rd Party Data Connectors provide access to data owned by your organization but stored in 3rd party systems. Examples include Salesforce, Google Analytics, and Zendesk, among many others.

Public Data Connectors allow you to access data that anyone can view on the internet, but in a uniform format. We collect data from Amazon, Facebook, the Google Play store, consumeraffairs.com, and many other sites.

You can also upload a CSV or Excel file from your local computer.

_images/csvorexcel.png

Note

Enterprise Data Connectors and some 3rd Party Data Sources require you to map fields in your data prior to uploading. See Mapping Data to learn about this process.

Enterprise Data Connectors

The Enterprise Data Connector application allows users to connect with a variety of internal data sources.

Examples include:

  • Hive
  • MySQL
  • Microsoft SQL Server
  • PostgreSQL
  • Oracle SQL

3rd Party Data Connectors

3rd Party Data Connectors usually have a pop-up window that allows you to enter your credentials with the 3rd party in order to authenticate with Signals.

Note

Make sure you have pop-up blocking disabled when connecting with a 3rd Party Data Connector.

Salesforce

_images/salesforce.png

Foresee

_images/foresee.png

SurveyMonkey

_images/surveymonkey.png

Gmail

Gmail allows you to connect with a Gmail account and enter a query that will pull data matching the query.

_images/gmail.png

LiveChat

The LiveChat connector allows you to select a date range and the type of user you want to analyze chats for: Agent, Visitor, or both.

_images/livechat.png

Intercom

JIRA

_images/jira.png

With the JIRA Data Connector, you can select which projects you want to analyze. You can also specify a ticket type if desired. If nothing is specified, Signals will pull all of your JIRA data.

UserVoice

The UserVoice data connector requires a login and API key to connect to your UserVoice data. The permissions associated with the login determine which data is available through the connector.

_images/uservoice.png

You can specify which objects you want to analyze.

Zendesk

Google Analytics

Trello

Public Data Connectors

iOS Store Reviews

_images/iosstore.png

Google Play Store Reviews

Home Depot Product Reviews

_images/homedepot.png

Lowes Product Reviews

_images/lowes.png

Best Buy Product Reviews

_images/bestbuy.png

Wal Mart Product Reviews

_images/walmart.png

Etsy Shop Reviews

Etsy Product Reviews

Amazon Product Reviews

_images/amazon.png

Twitter

_images/twittersearch.png _images/twitteruser.png

Facebook

_images/facebook.png

Youku

YouTube Comments

_images/youtube.png

Consumer Financial Protection Bureau

_images/cfpb.png

Indeed

_images/indeed.png

Consumer Affairs

_images/consumeraffairs.png

Signals SDK

The Signals SDK should be used to upload files larger than 500 MB.

It can also be used to develop custom connections and/or schedule uploads.

We offer a Node.js wrapper for our SDK: https://www.npmjs.com/package/signals-api

To generate an API key:

  1. Go to Settings in the left-hand menu from the Signals homepage.
  2. On the Settings page, click the button near the bottom to generate an API key.
  3. The API key is generated and downloaded in JSON format so that it can be included in your application.

After the API key has been generated, you can always download it again from the Settings page.

_images/apikey.png

The “Revoke” button will immediately invalidate the API key from further use.

Note

API keys can only be generated by authenticated users. All API keys are tied exclusively to a single user account.
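As a minimal sketch of how an application might read the downloaded key file, the snippet below loads it with the Python standard library. The file name and the assumption that the file holds a single JSON object are illustrative only; this guide does not describe the fields inside the file, so the sketch does not rely on any of them.

```python
import json
from pathlib import Path


def load_api_key(path):
    """Read the downloaded key file and return its parsed JSON content.

    Assumption: the file contains one JSON object. The field names inside
    it are not documented here, so none are accessed.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected the key file to contain a JSON object")
    return data


# Hypothetical usage; "signals-api-key.json" is a placeholder file name:
# credentials = load_api_key("signals-api-key.json")
```

Keeping the key in its own file, outside of source control, makes it easier to replace after the key has been revoked and regenerated.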

Mapping Data

Signals can be used to analyze any type of data, structured or unstructured.

However, there are four types of data that Signals treats differently to give users more information. Those data types are:

  1. Textual - e.g. product review, customer feedback, news article text, etc.
  2. Temporal - a date, time, timestamp etc.
  3. Geographical - location, lat/lon, IP address, etc.
  4. Contributor - a 1:1 identifier for the creator of the textual data e.g. name, email, GUID, etc.

Data mapping occurs when a new data source is introduced to the platform. CSV, Excel files, Enterprise Data Connectors and certain 3rd Party Data Connectors require this before analysis.
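The mapping step can be pictured as attaching one of the four data types to some of the columns in your file. The sketch below is only an illustration of that idea: the dictionary layout, the column names, and the `validate_mapping` helper are hypothetical and are not part of Signals.

```python
import csv
import io

# The four specially treated data types named in this guide.
DATA_TYPES = {"textual", "temporal", "geographical", "contributor"}


def validate_mapping(header, mapping):
    """Check a column-to-type mapping and return the unmapped columns.

    `mapping` maps a column name to one of DATA_TYPES. This structure is
    hypothetical and only mirrors the choices made in the Map Fields menu.
    """
    for field, data_type in mapping.items():
        if field not in header:
            raise ValueError(f"unknown field: {field}")
        if data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {data_type}")
    # Unmapped fields remain ordinary structured fields (pivot points).
    return [field for field in header if field not in mapping]


sample = io.StringIO(
    "review,posted_at,zip,email,rating\n"
    "Great drill,2017-01-05,28202,a@example.com,5\n"
)
header = next(csv.reader(sample))
mapping = {
    "review": "textual",
    "posted_at": "temporal",
    "zip": "geographical",
    "email": "contributor",
}
print(validate_mapping(header, mapping))  # ['rating']
```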

On the left side of the Map Fields menu, you’ll see every field from your data source. Click on the + icon to identify any of the 4 data types mentioned earlier.

_images/mapdata2.png

When you’ve identified the fields, click next and this configuration will be used during processing in the Signals Analytics Engine.

_images/mapdata3.png

Textual

Textual data is the data on which you want to run unsupervised machine learning to generate buzzwords, a topic model, and sentiment analysis. The next section, on Processing Data, discusses the outputs of this process.

Temporal

Identifying where your temporal data is within your dataset allows the Signals Analytics Engine to normalize dates, giving you the flexibility to roll up data at different granularities such as yearly, monthly, daily, or hourly.

Temporal data can be represented in many different formats:

  • If your data has a field that contains Day, Month, Year, and/or time in any order, you can simply select the “Date” option in the Map Fields menu.
  • If your Day, Month and Year are all contained in separate columns, use the corresponding Map Fields menu items for each to indicate this.
  • If your date is represented in some other format, select “Date” and the Signals Analytics Engine will do a best-guess to normalize the data.
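The three cases above can be sketched in a few lines of Python. This is not how the Signals Analytics Engine normalizes dates internally; the list of candidate formats and the roll-up labels are assumptions chosen for illustration.

```python
from datetime import datetime

# Candidate formats tried in order; an illustrative list, not the engine's.
CANDIDATE_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%m/%d/%Y"]

# Output patterns for each roll-up granularity (hypothetical labels).
ROLLUPS = {
    "yearly": "%Y",
    "monthly": "%Y-%m",
    "daily": "%Y-%m-%d",
    "hourly": "%Y-%m-%d %H:00",
}


def normalize_date(value):
    """Best-guess parse of a single date field; returns None on failure."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    return None


def combine_parts(year, month, day):
    """Build a date when day, month, and year live in separate columns."""
    return datetime(int(year), int(month), int(day))


def roll_up(moment, granularity):
    """Return the bucket label for a normalized date."""
    return moment.strftime(ROLLUPS[granularity])


print(roll_up(normalize_date("01/05/2017"), "monthly"))       # 2017-01
print(roll_up(combine_parts("2017", "1", "5"), "daily"))      # 2017-01-05
print(roll_up(normalize_date("2017-01-05 14:30:00"), "hourly"))  # 2017-01-05 14:00
```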

Geographical

The Signals Analytics Engine will create a geographical hierarchy based on the input you provide, giving you access to compare your data at any level, whether it’s by country, region, or zip code.

If your data contains multiple geographical or spatial data points, you can indicate where that data resides using the Map Fields menu.

Some of the Geographical options are absolute and therefore don’t need additional information:

  • For example, if your data has a Lat/Long field, a Long/Lat field, or one Latitude and one Longitude field, there is no need to specify country, city, or any other geographical information.
  • The same goes for IP address and phone number.
  • Street, City, State, and Country, however, are not unambiguous on their own, so map as much information as possible if your geographical data is not absolute.
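One way to think about the distinction above is as a simple check on which geographical fields have been mapped. The field identifiers below are hypothetical names invented for this sketch; they are not the labels used in the Map Fields menu.

```python
# Each set lists fields that are sufficient on their own (hypothetical names).
ABSOLUTE_FIELD_SETS = [
    {"lat_long"},
    {"long_lat"},
    {"latitude", "longitude"},
    {"ip_address"},
    {"phone_number"},
]


def is_absolute(mapped_fields):
    """Return True if the mapped fields include a self-sufficient option."""
    mapped = set(mapped_fields)
    return any(required <= mapped for required in ABSOLUTE_FIELD_SETS)


print(is_absolute({"latitude", "longitude"}))  # True
print(is_absolute({"city", "state"}))          # False
```

When the check returns False, mapping additional fields such as State and Country alongside City reduces ambiguity.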

Contributor

Contributor data is used to tie all the insights back to an individual to help organizations close the loop with their customers.

This field should represent the person or entity that generated the record or document in your data.

Processing Data

When a dashboard is created with new data, the data goes through the Signals Analytics Engine for processing. This chapter describes what occurs as the engine is processing data. The output is described in the Analyzing Data chapter.

_images/journey.png

The sections below discuss how the output of the Signals Analytics Engine can be accessed and used.

Note

To learn how to tune the analytics engine, see the Advanced Options page.

Signals Analytics Engine

The Signals Analytics Engine leverages Machine Learning and Deep Learning algorithms to help navigate and pivot a large set of textual data.

Built on top of our proprietary Bayesian Neural Network and Generative Model, Signals dynamically identifies semantic topic groups based on the context in your input data.

This is all done in a three-step process:

  1. The engine starts by performing NLP in over 24 languages.
In this step, your input documents are tokenized into corresponding N-Grams (N>=2), lemmatized (words with the same root are grouped together, e.g. run & ran), and stemmed. Spam/junk and stop words are filtered out, and part-of-speech tagging and named entity extraction are performed. A large N-Gram-based content network is then created based on your input data files.
  2. The engine runs a Multi-Model approach on top of the N-Gram-based content network.

This includes using our proprietary text analytics algorithms extended from Bayesian Neural Network, Generative Model, LSTM (Long Short Term Memory), and Seq2Seq NLU. In this step, data input is clustered into semantically meaningful groups.

The groups are generated and visualized by statistical significance (i.e. the percentage attributed to each topic category in the Semantic Topic Visualization). Each topic is tagged with top representative terms in Buzzwords.

  3. Signals automatically processes all geographical (Where), temporal (When), contributor (Who), as well as any other structured data.
It joins the data with the N-Gram-based content network for you to pivot and construct analytics questions against your dataset.
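To make the first step more concrete, here is a toy sketch of tokenizing text, dropping stop words, and forming N-Grams with N = 2. It is not the engine's proprietary pipeline: lemmatization, stemming, part-of-speech tagging, and entity extraction are omitted, and the stop word list is a tiny stand-in.

```python
import re

# A tiny illustrative stop word list, not the one shipped with Signals.
STOP_WORDS = {"the", "a", "an", "i", "is", "and", "to", "of"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def ngrams(tokens, n=2):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("The battery life is great and the screen is bright")
print(ngrams(tokens))
# [('battery', 'life'), ('life', 'great'), ('great', 'screen'), ('screen', 'bright')]
```

Because stop words are removed before the N-Grams are formed in this sketch, words that were separated only by a stop word become neighbors.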

Reprocessing

There are a number of Advanced Options that can be applied to customize the output of the Analytics Engine. When applying advanced options your data will need to be reprocessed. This can either be done directly through Edit Mode, or through clicking the reprocess button on your dataset under Manage Data.

Creating a Dashboard

_images/signalsgif1.gif

The dashboard interface is extremely flexible, allowing dashboards to be used for both passive monitoring as well as for active deep-dive or root-cause analysis.

Dashboards are interactive and allow you to visualize and pivot your data on multiple facets, analyzing them holistically or granularly. A Dashboard can hold multiple datasets for disparate data sources.

Dashboards are made up of widgets, or custom visualizations of your data. A dashboard can be split into multiple tabs to create focus areas within your dashboard. The level of connectedness between tabs is customizable. Interactions can affect the entire dashboard, only certain tabs, or just the tab you’re on. See the Dashboard Tabs page to learn more about this.

To create a new dashboard, from the homepage, click on the New Dashboard icon:

_images/newdashboard.png

Create Your Own Widgets

The widget editor is where you will create the building blocks of your dashboard.

To begin creating a new widget, click on the New Widget button.

There are two entry points to begin creating a widget:

  1. You can start by choosing the Dimension or Metric you wish to visualize, or
  2. You can start by choosing the Visualization type you want to use.
_images/blankeditor.png

Visualizations

_images/vizzes.png

Each visualization supports certain combinations of data types. When you select a visualization first, the dimensions and metrics available for it are highlighted.

Options

The options tab will change based on your current visualization.

This is where you can do aggregations, sorting, filtering, and other operations on the data you’ve selected.

Widget-specific options will also be here, such as the style of map on a geographical widget.

Color

The color tab lets you choose what conditions to color the visualization on.

Select the dimension or metric you want to separate by color, and then choose a method.

Options are:

  • Single Color
  • Gradient Color (diverging colors based on a range of metrics)
  • Palette (pick a color for each dimension)
  • Range Palette (pick a color for a range of metrics)
_images/rangepalette.png
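A range palette can be thought of as a lookup from a metric value to the color of the range it falls into. The sketch below shows that idea only; the boundary values and color names are made up, and the widget editor handles this for you without any code.

```python
from bisect import bisect_right


def range_color(value, boundaries, colors):
    """Pick the color for the range that `value` falls into.

    `boundaries` must be sorted, and `colors` needs one more entry than
    `boundaries` (one color per range). Both are illustrative inputs.
    """
    if len(colors) != len(boundaries) + 1:
        raise ValueError("need exactly one color per range")
    return colors[bisect_right(boundaries, value)]


# Example: color a sentiment score as negative, neutral, or positive.
boundaries = [-0.2, 0.2]
colors = ["red", "gray", "green"]
print(range_color(-0.5, boundaries, colors))  # red
print(range_color(0.0, boundaries, colors))   # gray
print(range_color(0.7, boundaries, colors))   # green
```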

Font

If there is text in your widget, you can specify a custom font in the font tab.

Style

The style of an individual widget can be changed here. This will override any customizations on style you’ve made at the dashboard level.

To see how to change style at the dashboard level, see the Dashboard Styles page.

Dashboard Tabs

_images/tabs.png

A dashboard can contain as many or as few tabs as desired. Tabs are there to help the designer organize visualizations into logical or intuitive groups based on the analysis context.

To add a tab, simply click the New Tab button in the tab menu.

Options can be set at a tab level by hovering over the menu icon on your tab:

_images/taboptions.png

Clicking on the edit button will bring up the Tab Detail menu:

_images/tabdetail.png

By default, the filters from every tab are tied together. Users have the option to set a local query to merge with global queries or to override global queries with a local query. This is discussed in detail in the next chapter, Analyzing Data.

Dashboard Styles

The aesthetics of any dashboard can be customized. To see options, click on the Customize Style button in the Edit menu:

_images/editmenu.png

All of the options in the Dashboard Style menu are there to make the dashboard feel like your own:

_images/dashboardstyle.png

Dashboard Options

Once you’ve created a dashboard, you can set permission levels, delete the dashboard, or modify its properties by clicking on the dashboard menu:

_images/dashboardmenupermission.png

Dashboard Properties

This is where you can add tags, an image, or even a video, as well as see who else has access.

_images/dashboardproperties.png

Note

If a dashboard has been shared with you, you might see a dashboard menu that looks like this:

_images/dashboardmenunopermission.png

This means you do not have permission to share with others. For more on permissions, see Administration.

Analyzing Data

This chapter covers how to interpret the data generated by the Signals Analytics Engine as well as how to interact with your data to find meaningful insights.

Analytics Engine Output

The Signals Analytics Engine generates new datapoints based on your unstructured data. This information is represented in the following way.

Buzzwords

The Signals Analytics Engine reads through every piece of text in a dataset and compares every possible combination of words.

The Buzzwords visualization is made up of every statistically significant N-Gram found in the textual data. To decide whether an N-Gram is statistically significant, we compare how often its words occur together in the dataset with how often they occur individually.
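The comparison of joint versus individual frequency can be illustrated with pointwise mutual information (PMI), a standard association measure. PMI is used here purely as a stand-in; this guide does not state which statistic the Signals Analytics Engine applies.

```python
import math
from collections import Counter


def bigram_pmi(documents):
    """Score each bigram by pointwise mutual information.

    `documents` is a list of token lists. A higher score means the two
    words appear together more often than their separate frequencies
    would suggest.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in documents:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for (first, second), count in bigrams.items():
        p_pair = count / total_bigrams
        p_first = unigrams[first] / total_unigrams
        p_second = unigrams[second] / total_unigrams
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


docs = [
    ["battery", "life", "good"],
    ["battery", "life", "short"],
    ["good", "screen"],
    ["short", "cable"],
]
scores = bigram_pmi(docs)
# "battery life" recurs as a pair, so it outscores the one-off "life good".
print(scores[("battery", "life")] > scores[("life", "good")])  # True
```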

_images/cloud.png

A word-cloud view of N-Grams

_images/buzzwordlist.png

N-Grams in a detail list visualization

Topics

Topics are generated by performing unsupervised machine learning on top of the N-Grams to determine hidden themes and group the documents accordingly.

Any document can occur in more than one topic, so the percentages of documents contained in each topic can add up to more than 100%.
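A small worked example shows why the percentages can exceed 100%. The document IDs and topic names below are invented for illustration.

```python
def topic_percentages(doc_topics):
    """Return the share of documents that belong to each topic, in percent."""
    total = len(doc_topics)
    counts = {}
    for topics in doc_topics.values():
        for topic in set(topics):
            counts[topic] = counts.get(topic, 0) + 1
    return {topic: 100 * count / total for topic, count in counts.items()}


# Two of the four documents belong to two topics each.
doc_topics = {
    "doc1": ["shipping", "price"],
    "doc2": ["price"],
    "doc3": ["shipping", "quality"],
    "doc4": ["quality"],
}
shares = topic_percentages(doc_topics)
print(shares)                # each topic covers 50.0 percent of documents
print(sum(shares.values()))  # 150.0
```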

The Semantic Topics visualization can be represented in a donut chart or a network graph.

_images/topics.png

Each topic is represented by a slice of the donut chart. The words around each slice are the top one or two N-Grams included in that topic. The order (starting at 12:00 and moving clockwise) and size of the slices are determined by the statistical relevance or “tightness” of each topic. The color of the slice represents the sentiment.

_images/topicnetwork.png

Semantic Topics represented in a network graph of related N-Grams. Each bubble represents an N-Gram, the size indicating the count, and the color indicating the sentiment. The lines show the level of connection, or co-occurrence, between two N-Grams.

Contributors

Contributors help to identify an individual’s influence in a data set.

  • Name: a list of unique identifiers mapped as “name”.
  • Count: the number of feedback documents from an individual or unique ID.
  • Sentiment: the aggregated sentiment score across all the documents contributed by a contributor.
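The Count and Sentiment columns can be pictured as a simple group-by over the documents. In the sketch below, the aggregation is assumed to be a plain average of per-document scores; this guide does not specify how Signals aggregates sentiment, so treat that choice as an assumption.

```python
from collections import defaultdict


def summarize_contributors(records):
    """Group (name, sentiment) pairs into a per-contributor summary.

    Assumption: "aggregated sentiment" is modeled as the mean score.
    """
    grouped = defaultdict(list)
    for name, sentiment in records:
        grouped[name].append(sentiment)
    return {
        name: {"count": len(scores), "sentiment": sum(scores) / len(scores)}
        for name, scores in grouped.items()
    }


records = [("alice", 0.5), ("alice", -0.5), ("bob", 1.0)]
print(summarize_contributors(records))
# {'alice': {'count': 2, 'sentiment': 0.0}, 'bob': {'count': 1, 'sentiment': 1.0}}
```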

_images/contributors.png

Interactions and Filters

Every interaction in the dashboard essentially applies a filter on your data. This method is designed to help you access insights learned from your unstructured or structured data, and tie it to other dimensions.

When interacting, the filters you’ve applied will appear at the bottom of the page next to the name of the dataset the filter is applied to.

_images/filterbreadcrumb.png

Clicking on the name of the filter will remove it.

Filters can also be saved to a dashboard and can be set on one of three levels:

  1. Dashboard
  2. Tab
  3. Widget

Dashboard-Level Filters

To set a filter on the entire dashboard, click the dashboard filter icon in the bottom left corner inside your dashboard. This is the “global” filter panel for your dashboard.

Filters set from here will not persist when you leave the dashboard.

Tab-Level Filters

Tab-level filters can be applied on a per-tab basis. To do this, click the tab options icon and select one of the two methods:

  1. Merge with the current query will allow you to set a permanent filter, but still interact with the data. Interactions with other tabs can still affect the data.
  2. Override the current query will apply the filter selections you make and freeze the visualizations so that the tab is no longer interactive. Selections on other tabs will not affect the data.
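The difference between the two methods can be sketched as follows. The representation of a query as a dictionary of allowed values per field, and the rule that merged conditions must all hold, are assumptions made for this illustration rather than a description of the Signals query model.

```python
def effective_filter(global_query, tab_query, mode):
    """Combine the dashboard-wide query with a tab's saved query.

    Each query maps a field name to the set of values that pass the filter.
    """
    if mode == "override":
        # The tab ignores interactions made elsewhere in the dashboard.
        return {field: set(values) for field, values in tab_query.items()}
    if mode == "merge":
        merged = {field: set(values) for field, values in global_query.items()}
        for field, values in tab_query.items():
            # Both conditions must hold, so shared fields are intersected.
            merged[field] = merged[field] & set(values) if field in merged else set(values)
        return merged
    raise ValueError(f"unknown mode: {mode}")


global_query = {"country": {"US", "CA"}}
tab_query = {"country": {"US"}, "rating": {1, 2}}
print(effective_filter(global_query, tab_query, "merge"))
print(effective_filter(global_query, tab_query, "override"))
```

In merge mode a later interaction on another tab changes `global_query` and therefore the result; in override mode it has no effect on the tab.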

The main datapoints are available for selection from the Tab Detail page. Other datapoints can be accessed by clicking on the “+” icon next to Extra Structured Vis Parameters.

_images/fulltabdetail.png

Widget-Level Filters

Widget-level filters will override all other filters for that widget.

To apply a widget-level filter, you can either click on the funnel icon in the widget settings menu from your dashboard view, or apply a filter from the Options tab of the widget editor (see Create Your Own Widgets).

_images/filterwidget.png

Widget Settings Menu

Advanced Options

Advanced options allow users to customize the way data is processed. These options are not always needed, but there are some scenarios in which they are very useful. For example:

Stopwords

Signals provides an out-of-the-box list of stopwords that includes commonly used non-informative words such as “the”, “an”, and “I”.

You can create additional lists that can be applied as needed to cut through noise in your textual data on a per-data-source basis.

Typically, a brand-new data source is first run with the default stopword list. Noisy signals can then be easily identified in the dashboard, and our in-dashboard editor (below) allows you to select N-Grams, topics, contributors, or other features to suppress.

_images/stopwordeditor.gif

For example, RSS news feeds typically contain the same few sentences at the end of every article:

Reporting By Laila Bassam in Aleppo and Tom Perry, John Davison and Lisa Barrington in Beirut;

Writing by Angus McDowall in Beirut, editing by Peter Millership

Text like this appears at the end of every news article in some publications. “Reporting By” and “Writing By” will likely be identified as Buzzwords. Because they aren’t related to the article topics, these terms could link otherwise unrelated documents.

To suppress the noise caused by these terms, in the edit menu, click “Tune up data” and select the terms that are non-informative.

_images/tunedata.png

When finished, click submit and your data will begin reprocessing with the feedback you’ve provided.
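Conceptually, the effect of the stopword feedback is that the selected terms no longer appear among the N-Grams. The following sketch shows that effect on a small set of counts; the data and the `suppress` helper are hypothetical, and the real reprocessing happens inside the Signals Analytics Engine.

```python
def suppress(ngram_counts, stop_terms):
    """Return the N-Gram counts with the non-informative terms removed."""
    stop = {term.lower() for term in stop_terms}
    return {
        ngram: count
        for ngram, count in ngram_counts.items()
        if ngram.lower() not in stop
    }


# Invented counts for a news feed dataset.
ngram_counts = {"reporting by": 120, "writing by": 118, "aleppo ceasefire": 42}
custom_stopwords = ["Reporting By", "Writing By"]
print(suppress(ngram_counts, custom_stopwords))  # {'aleppo ceasefire': 42}
```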

Junk/Spam

Creating a Junk/Spam list can be useful in many different datasets.

For example, if there is any spam in Twitter data, it will quickly be identified through the visualization. Users can choose to filter out the spam and reprocess the data, indicating to the Analytics Engine that the matching documents should be ignored when processing (generating buzzwords and categories).

Note

Spam is typically identified in two ways:

  1. A specific user in your dataset is posting junk content - in this case, you can simply select the user to ignore.
  2. A specific message has gotten picked up and re-posted by many different users - in this case, you can select the text that is unique to that message and it will be filtered out based on the content.

This is just one example, but as you can see it is generalizable to many other data-types.
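The two ways of identifying spam described in the note above can be sketched as a filter over documents. The document fields (`user`, `text`) and the matching rule (case-insensitive substring) are assumptions for this example, not a description of how Signals matches junk content.

```python
def filter_spam(documents, blocked_users=(), blocked_phrases=()):
    """Drop documents from blocked users or containing a blocked phrase."""
    users = set(blocked_users)
    phrases = [phrase.lower() for phrase in blocked_phrases]
    kept = []
    for doc in documents:
        if doc["user"] in users:
            continue  # case 1: a specific user posts junk content
        text = doc["text"].lower()
        if any(phrase in text for phrase in phrases):
            continue  # case 2: the same message re-posted by many users
        kept.append(doc)
    return kept


documents = [
    {"user": "promo_bot", "text": "Follow me for deals"},
    {"user": "dana", "text": "WIN A FREE PHONE click here"},
    {"user": "lee", "text": "Battery life is shorter after the update"},
]
clean = filter_spam(documents, blocked_users=["promo_bot"], blocked_phrases=["win a free phone"])
print([doc["user"] for doc in clean])  # ['lee']
```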

Sentiment

Signals provides an out-of-the-box sentiment package that is trained on a breadth of data sources. However, you can customize sentiment to tailor results to your specific use case.

Certain terms are generally neutral but, when used in the context of a specific dataset or industry, always have a sentiment associated with them. Or vice versa: some terms are generally very negative but are commonly used industry lingo in certain datasets.
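The idea of tailoring sentiment can be illustrated with a toy word-level lexicon in which domain-specific scores replace the defaults. This is a deliberately simplified stand-in: the default scores, the override mechanism, and the averaging rule are all assumptions, and the actual Signals sentiment package is not described in this guide.

```python
# Toy default lexicon; the scores are invented for illustration.
DEFAULT_LEXICON = {"great": 1.0, "broken": -1.0, "killer": -1.0}


def score(text, overrides=None):
    """Average the scores of known words, letting overrides replace defaults."""
    lexicon = {**DEFAULT_LEXICON, **(overrides or {})}
    hits = [lexicon[word] for word in text.lower().split() if word in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


review = "this is a killer feature"
print(score(review))                   # -1.0 with the general-purpose lexicon
print(score(review, {"killer": 1.0}))  # 1.0 once the term is tuned for this dataset
```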

Taxonomy

Signals supports importing existing taxonomies or you can easily build your own from the ground up. The in-dashboard taxonomy editor allows you to tweak your taxonomy as you analyze your data. Drag and drop buzzwords into your label logic to increase your coverage of existing categories, or as you find new categories.

_images/taxonomyeditor.png

Creating an Account

To register for a trial account on Signals, go to https://signals.stratifyd.com and chat with a support specialist who will help you start your trial period.

_images/loginchat.png

Existing Customers: If your organization has already purchased Signals, you can simply go to

https://[your-organization-name].stratifyd.com

and create an account with your company e-mail and password.

Should you have any questions, please contact our Client Support Team at clientsupport@stratifyd.com.

Administration

Deploying Signals

A Cloud Deployment is our most common method of deployment. Among other services, Amazon Web Services S3 and EC2 are used to provide an optimal experience in computing and visualization.

Contact us for information on alternative deployment methods.

Account Info

Your account info and management console are both available on the Settings page.

_images/settings.png

Permission Levels

Permissions on any asset (dashboard, stopword list, taxonomy, etc.) are set by the creator of that object.

Can View users can view and use the asset but cannot permanently modify it.

Can Edit users can view and modify the asset but cannot share it with others.

Can Share users can edit and share the asset with others but cannot remove others’ access.

Owner users can edit, share, and add or remove users with less privilege.
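The four levels above form a ladder in which each level adds one capability. The sketch below encodes that reading as a lookup table; the action names (`view`, `edit`, `share`, `manage_users`) are labels invented for this example.

```python
# Capabilities per permission level, as read from the descriptions above.
CAPABILITIES = {
    "Can View": {"view"},
    "Can Edit": {"view", "edit"},
    "Can Share": {"view", "edit", "share"},
    "Owner": {"view", "edit", "share", "manage_users"},
}


def can(level, action):
    """Return True if a user at `level` may perform `action` on an asset."""
    return action in CAPABILITIES[level]


print(can("Can Edit", "share"))      # False
print(can("Can Share", "share"))     # True
print(can("Owner", "manage_users"))  # True
```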

User Groups

Group management can be done from the Groups page on the main nav bar.

There are four levels of access at a group level:
  1. Group Admin
  2. Admin
  3. Can Edit
  4. Can View

The creator of a group is automatically a Group Admin for the group.

Group Admin users:

  • have a maximum* access level of Owner to any asset shared with the group.
  • can add users to the group and give them Group Admin rights or lower to the group.
  • can remove users with less permission (Admin, Can Edit, and Can View users).

Note

A group can only have one Group Admin. Group Admins can transfer ownership to another user, but this action demotes the transferer to Admin.

Admin users:

  • have a maximum* access level of Owner to any asset shared with the group.
  • can add users to the group and give them Admin rights or lower to the group.
  • can remove users with less permission (Can Edit and Can View users).

Can Edit users:

  • have a maximum* access level of Can Edit to any asset shared with the group.
  • cannot add or remove other users from the group.

Can View users:

  • have a maximum* access level of Can View on any asset shared with the group.

*When sharing an asset with a group, the sharer dictates the level of permission granted to the group. The lesser privilege (between the permission granted on the asset and the permission within the group) then takes precedence for each user. For example, if a user grants Admin permissions on a dashboard to a group, group members with Can View in the group will only receive Can View on that object.
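The "lesser privilege" rule can be expressed as taking the minimum of two levels. The sketch below uses the asset permission levels and the per-role maximums listed in this chapter; the function name and data structures are illustrative only.

```python
# Asset permission levels, ordered from least to most privileged.
ASSET_LEVELS = ["Can View", "Can Edit", "Can Share", "Owner"]

# Maximum asset access for each group role, as listed in this chapter.
GROUP_MAX = {
    "Group Admin": "Owner",
    "Admin": "Owner",
    "Can Edit": "Can Edit",
    "Can View": "Can View",
}


def effective_permission(granted_to_group, group_role):
    """Return the lesser of the level shared with the group and the role's cap."""
    cap = GROUP_MAX[group_role]
    return min(granted_to_group, cap, key=ASSET_LEVELS.index)


print(effective_permission("Owner", "Can View"))        # Can View
print(effective_permission("Can Edit", "Group Admin"))  # Can Edit
print(effective_permission("Can Share", "Can Edit"))    # Can Edit
```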

Release Notes

Version 2.5 coming early February

FAQ

Where does the name “Stratifyd” come from?

Stratifyd is a unique way of spelling “stratified”, as in stratified sampling, a statistical method that illustrates peeling back layer after layer of data to derive a proper statistical result. We took the name Stratifyd to emphasize next-gen data analytics that is mathematically sound and interpretable by humans. This reflects our company mission of providing an Augmented Intelligence environment that leverages Artificial Intelligence to augment the human decision-making process.

For more reading on Stratified: In statistics, stratified sampling is a method of sampling from a population. In statistical surveys, when subpopulations within an overall population vary, it is advantageous to sample each subpopulation (stratum) independently.

Is there a limit to how much data I can upload or analyze?

Trial accounts are limited to 30MB uploads.

While there is no limit on Enterprise accounts, the web interface supports up to 500MB of data. Larger files should be uploaded using a Data Connector or via the Signals SDK.
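The limits above translate into a simple decision about how to upload a file. In the sketch below, the account labels and the use of binary megabytes (1 MB = 1024 × 1024 bytes) are assumptions; this guide does not say how the limits are measured.

```python
MB = 1024 * 1024  # assumption: limits are measured in binary megabytes

TRIAL_LIMIT = 30 * MB
WEB_LIMIT = 500 * MB


def upload_route(size_bytes, account):
    """Suggest an upload route based on file size and account type."""
    if account == "trial":
        return "web interface" if size_bytes <= TRIAL_LIMIT else "exceeds trial limit"
    if account == "enterprise":
        return "web interface" if size_bytes <= WEB_LIMIT else "Data Connector or Signals SDK"
    raise ValueError(f"unknown account type: {account}")


print(upload_route(20 * MB, "trial"))        # web interface
print(upload_route(45 * MB, "trial"))        # exceeds trial limit
print(upload_route(750 * MB, "enterprise"))  # Data Connector or Signals SDK
```

For a real file, `os.path.getsize(path)` from the Python standard library returns the size in bytes to pass in.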

How many languages does Signals support?

Signals supports all NLP and Text Analytics functions natively in 25 languages including English, Chinese, Japanese, Spanish, Italian, German, and French.

How do I interpret the word cloud and topic wheel?

Refer to the sections on Buzzwords and Topics in Analyzing Data.

A comprehensive FAQ page is available at http://help.stratifyd.com

Contact

United States

Silicon Valley

China

London

Misc.

Signals for Students

We are a group of data scientists and PhDs from universities, so we support students who want to expand their analytics horizons by providing free access to advanced technologies.

Feel free to drop us a line with your EDU account at clientsupport@stratifyd.com. Our specialists will be more than happy to create an account for you.

If you are a lecturer or professor who wants to use Signals in your classroom, please contact us and we will be happy to arrange licenses for your students. Feel free to drop us a line at clientsupport@stratifyd.com.