How to perform Text Mining at the Speed of Thought directly in Tableau

Interactive real-time text mining with Tableau Desktop 9.2
Interactive real-time text mining with Tableau Desktop

Tableau is an incredibly versatile tool, commonly known for its ability to create stunning visualizations. But did you know that with Tableau, you can also perform real-time, interactive text mining? Let’s delve into how we can harness this function to gain rapid insights from our textual data.

Previously, during text mining tasks, you might have found yourself reaching for a scripting language like R, Python, or Ruby, only to feed the results back into Tableau for visualization. This approach has Tableau serving merely as a communications tool to represent insights.

However, wouldn’t it be more convenient and efficient to perform text mining and further analysis directly in Tableau?

While Tableau has some relatively basic text processing functions that can be used for calculated fields, these often fall short when it comes to performing tasks like sentiment analysis, where text needs to be split into tokens. Even Tableau’s beloved R integration does not lend a hand in these scenarios.

The Power of Postgres for Text Mining in Tableau

Faced with these challenges, I decided to harness the power of Postgres‘ built-in string functions for text mining tasks. These functions perform much faster than most scripting languages. For example, I used the function regexp_split_to_table for word count, which takes a piece of text (like a blog post), splits it by a pattern, and returns the tokens as rows:

select
guid
, regexp_split_to_table(lower(post_content), '\s+') as word
, count(1) as word_count
from
alexblog_posts
group by
guid, word

Incorporating Custom SQL into Tableau Visualization

I joined this code snippet as a Custom SQL Query to my Tableau data source, which is connected to the database that is powering my blog:

Join with Custom SQL Query in Tableau applying the Postgres function regexp_split_to_table
Join with Custom SQL Query in Tableau applying the Postgres function regexp_split_to_table

And here we go, I was able to create an interactive word count visualization right in Tableau:

This example can be easily enhanced with data from Google Analytics, or adapted to analyze user comments, survey results, or social media feeds. The possibilities for Custom SQL in Tableau are vast and versatile. Do you have some more fancy ideas for real-time text mining with Tableau? Leave me a comment!

Update (TC Pro Tip): Identifying Twitter Hashtags in Tableau

A simple calculated field in Tableau can help identify words within tweets as hashtags or user references, eliminating the need for another regular expression via a Custom SQL Query:

CASE LEFT([Word], 1)
WHEN "#" THEN "Hash Tag"
WHEN "@" THEN "User Reference"
ELSE "Regular Content"
END

Looking for an example? Feel free to check out the Tweets featuring #tableau Dashboard on Tableau Public and download the Packaged Workbook (twbx):

Tableau dashboard that shows tweets featuring the hashtag #tableau (presented at Tableau Conference)
Tableau dashboard that shows tweets featuring the hashtag #tableau (presented at Tableau Conference)

Any more feedback, ideas, or questions? I hope this post provides you with valuable insights into how to master text mining in Tableau, and I look forward to hearing about your experiences and creative applications. You can find more tutorials like this in my new book Visual Analytics with Tableau (Amazon).

Transparency: This blog contains affiliate links. If you click on them, you will be redirected to the merchant. If you decide to make a purchase, we will receive a small commission. The price does not change for you. Affiliate links have no influence on our writing.

Quantitative Finance Applications in R

Do you want to do some quick, in depth technical analysis of stock prices?

After I left CERN to work as consultant and to earn an MBA, I was engaged in many exciting projects in the finance sector, analyzing financial data, such as stock prices, exchange rates and so on. Obviously there are a lot of available models to fit, analyze and predict these types of data. For instance, basic time series model arima(p,d,q), Garch model, and multivariate time series model such as VARX model, state space models.

Although it is a little hard to propose a new and effective model in a short time, I believe that it is also meaningful to apply the existing models and methods to play the financial data. Probably some valuable conclusions will be found. For those of you who wish to have data to experiment with financial models, I put together a web application written in R:

TSLA
Quantitative Finance Analysis in R (click image to open application)

Data Science Toolbox: How to use R with Tableau

Recently, Tableau released an exciting feature that enhances the capabilities of data analytics: R integration via RServe. By bringing together Tableau and R, data scientists and analysts can now enjoy a more comprehensive and powerful data science toolbox. Whether you’re an experienced data scientist or just starting your journey in data analytics, this tutorial will guide you through the process of integrating R with Tableau.

Step by Step: Integrating R in Tableau

1. Install and start R and RServe

You can download base R from r-project.org. Next, invoke R from the terminal to install and run the RServe package:

> install.packages("Rserve")
> library(Rserve)
> Rserve()

To ensure RServe is running, you can try Telnet to connect to it:

Telnet

Protip: If you prefer an IDE for R, I can highly recommend you to install RStudio.

2. Connecting Tableau to RServe

Now let’s open Tableau and set up the connection:

Tableau 10 Help menu
Tableau 10 External Service Connection

3. Adding R code to a Calculated Field

You can invoke R scripts in Tableau’s Calculated Fields, such as k-means clustering controlled by an interactive parameter slider:

SCRIPT_INT('
kmeans(data.frame(.arg1,.arg2,.arg3),' + STR([Cluster Amount]) + ')$cluster;
',
SUM([Sales]), SUM([Profit]), SUM([Quantity]))
Calculated Field in Tableau 10

4. Use Calculated Field in Tableau

You can now use your R calculation as an alternate Calculated Field in your Tableau worksheet:

Tableau 10 showing k-means clustering

Feel free to download the Tableau Packaged Workbook (twbx) here.

Connect and Stay Updated

Stay on top of the latest in data science and analytics by following me on Twitter and LinkedIn. I frequently share tips, tricks, and insights into the world of data analytics, machine learning, and beyond. Join the conversation, and let’s explore the possibilities together!

Blog post updates: