Tableau: How to find the most important variables for determining Sales

Random Forest Animation
Interactive dashboard displaying the most important variables for determining the Sales measure in Tableau 10.0 (click screenshot to enlarge)

During the Q&A session of a recent talk on Data Strategy, I was challenged with a rather technical question: I was asked how to identify the variables that are heavily influencing a certain measure – with an interactive solution that matches a modern data strategy as suggested in my presentation.

Of course, this could be done by executing a script. The result however would be static and it would be not convenient for a Business Analyst to run it over and over again. Instead of applying a script every time the data changes, it would be much more innovative to get the answer immediately with every data update or interactivity such as a changed filter.

So why not solve this with Tableau? The magic underneath this easy-to-use Tableau dashboard is a nifty R script, embedded in a calculated field. This script calls a statistical method known as Random Forest, a sophisticated machine learning technique used to rank the importance of variables as described in Leo Breiman’s original paper.

The Tableau Packaged Workbook (twbx) is available here. Do you have more ideas or use cases? Feel free to leave a comment or send me an email: aloth@tableau.com