During the Q&A session of a recent talk on Data Strategy, I was challenged with a rather technical question: I was asked how to identify the variables that are heavily influencing a certain measure – with an interactive solution that matches a modern data strategy as suggested in my presentation.
Of course, this could be done by executing a script. The result however would be static and it would be not convenient for a Business Analyst to run it over and over again. Instead of applying a script every time the data changes, it would be much more innovative to get the answer immediately with every data update or interactivity such as a changed filter.
So why not solve this with Tableau? The magic underneath this easy-to-use Tableau dashboard is a nifty R script, embedded in a calculated field. This script calls a statistical method known as Random Forest, a sophisticated machine learning technique used to rank the importance of variables as described in Leo Breiman’s original paper.