How to perform Text Mining at the Speed of Thought directly in Tableau?

Interactive real-time text mining with Tableau 9.2

When doing text mining, I was often tempted to reach for a scripting language like R, Python, or Ruby and then feed the results into Tableau. Tableau served merely as a communication tool to present the insights in a pleasant way.

Wouldn’t it be handy to perform text mining and further analysis at the speed of thought directly in Tableau?

Tableau has some relatively basic text processing functions that can be used in calculated fields. These are, however, not enough for text mining tasks such as sentiment analysis, which require splitting text into tokens. Tableau’s beloved R integration will not help in this case either.

As a workaround, I decided to use Postgres’ built-in string functions for such text mining tasks, which perform much faster than most scripting languages. For the following word count example, I applied the function regexp_split_to_table that takes a piece of text (such as a blog post), splits it by a pattern, and returns the tokens as rows:
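The SQL snippet itself is not reproduced here. As a rough illustration of the same split-and-count idea, here is a minimal Python sketch (not the Postgres query from the post). It assumes a whitespace pattern and case-insensitive counting; the `split_to_rows` and `word_counts` helpers and the sample posts are hypothetical. The sketch also drops empty tokens, which the Postgres function may return.

```python
import re
from collections import Counter


def split_to_rows(text, pattern=r"\s+"):
    """Loosely mimic regexp_split_to_table: split text by a regex pattern
    and return one token per 'row' (empty tokens are dropped here)."""
    return [token for token in re.split(pattern, text) if token]


def word_counts(posts):
    """Count how often each lowercased token occurs across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(token.lower() for token in split_to_rows(post))
    return counts


# Hypothetical sample data standing in for blog post texts.
posts = [
    "Tableau makes text mining fun",
    "Text mining with Postgres and Tableau",
]
counts = word_counts(posts)
```

In the Postgres version, each token comes back as its own row, so Tableau can count and filter the words like any other dimension.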

I joined the SQL snippet as a Custom SQL Query to my Tableau data source, which is connected to the database that powers my blog:

[Screenshot: the Custom SQL Query joined to the Tableau data source]

And here we go, an interactive word count visualization:

[Embedded interactive word count visualization]

This example could easily be enhanced with data from Google Analytics, or altered to analyse user comments, survey results, or social media feeds. Do you have more ideas for real-time text mining with Tableau? Leave me a comment!

[Update 19 Jan 2016]: How to identify Twitter hashtags? Do I need another RegEx?

Another regular expression via a Custom SQL Query is not required for identifying words within tweets as hashtags. A simple calculated field in Tableau will do the job:
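The calculated field itself is not reproduced here. One plausible form is a first-character check such as `LEFT([Word], 1) = "#"`, where `[Word]` is a hypothetical name for the token field. The same logic can be sketched in Python as follows; the `is_hashtag` helper and the sample tweet are assumptions for illustration, not the author's code.

```python
def is_hashtag(word):
    """Treat a token as a hashtag if it starts with '#' and has more
    characters after it. No regular expression is needed."""
    return len(word) > 1 and word.startswith("#")


# Hypothetical sample tweet, already split into whitespace-separated tokens.
tweet = "Loving the new #Tableau release at #data16"
hashtags = [word.lower() for word in tweet.split() if is_hashtag(word)]
```

Because the tokens are already split into rows, a plain prefix test per row is enough to flag hashtags.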

Looking for an example? Feel free to check out the Tweets featuring #tableau Dashboard on Tableau Public and download the Packaged Workbook (twbx):

Tweets featuring #tableau Dashboard

Any more feedback, ideas, or questions?

  • Karen Hall

    Thanks for sharing, that’s exactly what I’m looking for! I am a newbie to Tableau and just installed Tableau Desktop on my local Mac.

  • Jennifer Schmidt Pritchard

    Do you have thoughts on how you could take it a step further and do the sentiment analysis with this minimal setup? Seems like if Tableau could offer an optional plugin or something for this, that’d be a game changer.

  • Sure. There are R packages that you could use directly with Tableau’s R integration: qdap (https://cran.r-project.org/web/packages/qdap/index.html) or sentiment (https://cran.r-project.org/src/contrib/Archive/sentiment/). Using one of these in a calculated field should work well. Here is a sample of how your calc should look: https://gist.github.com/aloth/eb4e59d1b45826db5e8a

  • NZH

    Thanks for sharing, an elegant solution, I definitely will try it! I also played with something similar but with scrapy and python clustering algorithms and Tableau. It detects opinion clusters based on tf-idf: https://public.tableau.com/profile/nezach#!/vizhome/OpinionClusters/Dashboard2

  • Nice dashboards! It would be very interesting to embed the algorithm calls directly in the database. This would allow even more interactivity (users could modify parameters) and would always reflect the latest data. I’ll try this with MADlib (http://madlib.net/) on Postgres or Pivotal once I find some time.

  • Jennifer Schmidt Pritchard

    Perfect, thanks!

  • NZH

    Thanks! 🙂 And also let us know what you find, it sounds really interesting…! Extending Tableau’s capabilities in such ways would be a great thing to see.

  • Bora Beran

    Hi Jennifer,
    I am the product manager at Tableau whose team is responsible for the R integration feature, among many other things. This was one of the most in-demand topics when we launched the feature in Tableau 8.1, so I wrote about it back in 2013. You can find the details here:

    https://boraberan.wordpress.com/2013/12/24/sentiment-analysis-in-tableau-with-r/

    I hope this helps.

    The sentiment package is very convenient since it takes care of a lot of preprocessing steps automatically. However, it is no longer on CRAN (R’s repository for packages), so I used an alternative download link in this blog post. The package is also no longer maintained, so you may run into version compatibility issues during installation. I have been meaning to write another blog post that relies on a different, actively maintained package but haven’t gotten around to it.

    Thanks,

    Bora

  • spha s.

    Hello! Is the function used in the Custom SQL only available in Postgres?

  • This SQL statement should also work well on Greenplum, Vertica, and other databases derived from Postgres: https://wiki.postgresql.org/wiki/PostgreSQL_derived_databases

    I’m sure you will find equivalent functions with similar syntax in many other database systems. Feel free to share them in the comments!

  • Daniël Mulder

    Nice! I don’t even know what Tableau is, but still! I see it makes nice tag clouds and implements sentiment analysis? I have built a tool that uses the Twitter public streaming API to scan for keywords for the accounts in the tool. Users can enter keywords, and the tool scans for them and downloads all tweets, user profiles, hashtags, and all users who comment on the matched tweets. It works remarkably well: the database has held over 1,000,000 user profiles and as many tweets at one time, and it scans for 300 keywords at a time without a glitch. The thing is that it’s a lot of work to maintain, more than it pays, so to speak. If we could make this into some sort of cool open source, open data stream to use with this tutorial, or some project to monitor tweets etc., then I might consider placing it into the public domain to give it a second life. I also have web crawlers that look up data from the websites listed in these profiles, but I think it would not be wise to let those loose in the wild.

    Some screenshots:
    https://goo.gl/photos/wEF3ghB4ULEbUxio6

    Darn shame to go to waste?

    Gr