Auf der Tableau Conference 2015 wurde eines besonders deutlich: Datenanalyse ist unverzichtbar und relevant für alle Unternehmensbereiche. Daten für alle Mitarbeiter im Unternehmen nutzbar machen, nennt man auch Daten-Demokratisierung. Damit können dann Entscheidungen getroffen werden, die auf konkreten, leicht verständlichen und auf das Unternehmen bezogenen Daten basieren.
Daten werden somit zum Rohstoff der Zukunft eines jeden Unternehmens!
Daten-Demokratisierung funktioniert nur dann, wenn Daten mit Self-Service-Tools, wie Tableau, zugänglich gemacht werden, sodass die Daten auch ohne tiefes technisches Wissen analysiert werden können. Neben klassischen Datenquellen wie ERP- und CRM-Systemen entstehen auch in anderen Bereichen relevante Daten, z.B. dateibasierte Formate wie Excel, oder Cloud-Dienste wie Microsoft Azure.
Today I’d like to follow up on this and show how to implement sentiment analysis in Tableau using Tableau’s R integration. Some of the many uses of social media analytics is sentiment analysis where we evaluate whether posts on a specific issue are positive, neutral, or negative (polarity), and which emotion in predominant.
What do customers like or dislike about your products? How do people perceive your brand compared to last year?
The sentiment package requires the tm and Rstem packages, so make sure that they are installed properly. Execute these commands in your R console to install sentiment from GitHub (see alternative way to install at the end of this blog post):
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
The sentiment package offers two functions, which can be easily called from calculated fields in Tableau:
The function get_polarity returns „positive“, „neutral“, or „negative“:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
The function get_emotion returns „anger“, „disgust“, „fear“, „joy“, „sadness“, „surprise“, or „NA“:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
The sentiment package follows a lexicon based approach and comes with two files emotions_english.csv.gz (source and structure) and subjectivity_english.csv.gz (source and structure). Both files contain word lists in English and are stored in the R package library under /sentiment/data directory.
If text is incorrectly classified, you could easily fix this issue by extending these two files. If your aim is to analyze text other than English, you need to create word lists for the target language. Kindly share them in the comments!
Feel free to download the Packaged Workbook (twbx) here.
Update 11 Aug 2016: If you are having trouble with install_github, try to install directly form this website:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
Tableau is an incredibly versatile tool, commonly known for its ability to create stunning visualizations. But did you know that with Tableau, you can also perform real-time, interactive text mining? Let’s delve into how we can harness this function to gain rapid insights from our textual data.
Previously, during text mining tasks, you might have found yourself reaching for a scripting language like R, Python, or Ruby, only to feed the results back into Tableau for visualization. This approach has Tableau serving merely as a communications tool to represent insights.
However, wouldn’t it be more convenient and efficient to perform text mining and further analysis directly in Tableau?
While Tableau has some relatively basic text processing functions that can be used for calculated fields, these often fall short when it comes to performing tasks like sentiment analysis, where text needs to be split into tokens. Even Tableau’s beloved R integration does not lend a hand in these scenarios.
The Power of Postgres for Text Mining in Tableau
Faced with these challenges, I decided to harness the power of Postgres‘ built-in string functions for text mining tasks. These functions perform much faster than most scripting languages. For example, I used the function regexp_split_to_table for word count, which takes a piece of text (like a blog post), splits it by a pattern, and returns the tokens as rows:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Incorporating Custom SQL into Tableau Visualization
I joined this code snippet as a Custom SQL Query to my Tableau data source, which is connected to the database that is powering my blog:
And here we go, I was able to create an interactive word count visualization right in Tableau:
This example can be easily enhanced with data from Google Analytics, or adapted to analyze user comments, survey results, or social media feeds. The possibilities for Custom SQL in Tableau are vast and versatile. Do you have some more fancy ideas for real-time text mining with Tableau? Leave me a comment!
Update (TC Pro Tip): Identifying Twitter Hashtags in Tableau
A simple calculated field in Tableau can help identify words within tweets as hashtags or user references, eliminating the need for another regular expression via a Custom SQL Query:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Any more feedback, ideas, or questions? I hope this post provides you with valuable insights into how to master text mining in Tableau, and I look forward to hearing about your experiences and creative applications. You can find more tutorials like this in my new book Visual Analytics with Tableau (Amazon).
Transparency: This blog contains affiliate links. If you click on them, you will be redirected to the merchant. If you decide to make a purchase, we will receive a small commission. The price does not change for you. Affiliate links have no influence on our writing.
In the recent months, 800 automotive executives from 38 countries gave their insights to KPMG. You can discover the key highlights of the KPMG Global Automotive Executive Survey in this eye-catching interactive Tableau story.
This is a fabulous example how you can use stories to present a narrative to an audience. Just as dashboards provide spatial arrangements of analysis that work together, stories present sequential arrangements of analysis that create a narrative flow for your audience.
This YouTube tutorial shows you a handy way to load your Excel data to Cloudera Hadoop with Alteryx, and how to see and understand your data even faster with Tableau connected to Impala.
The same tool chain to load and access data can be used with Hive (eg. on Hortonworks) or Spark SQL (eg. on MapR). A overview on common data process technologies can be found in the Big Data jungle guide.
We use cookies to optimize our website and our service.
Functional
Immer aktiv
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes.The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.