Dilyana’s session: Chart Choice – many ways to visualize data
Recently we had the 19th edition of our Data & AI Meetup. This meetup focused on Chart Choice & Anomaly Detection for Warranty Cases. Let’s have a quick recap!
Agenda:
Meetup discussion: Sven, Alexander, and Shubham
Intro & announcements: our 5th anniversary
Chart Choice by Dilyana Bossenz, Business Analytics and Enablement Manager at M2.
New Book: Decisively Digital – From Creating a Culture to Designing Strategy by Alexander Loth, author & executive advisor at Microsoft
Anomaly Detection for warranty cases, with an example from the automotive industry, by Shubham Agarwal, Lead Data Scientist at ATCS; Frank Schlemmbach, Sr. Consultant at ATCS; and Sven Sommerfeld, Managing Director at ATCS
A few weeks after the fantastic Tableau Conference in New Orleans, I received an email from a data scientist who attended my TC18 social media session and who is using Azure+Tableau. She had quite an interesting question: How can a Tableau dashboard that displays contacts (name & company) automatically look up LinkedIn profile URLs?
Of course, researching LinkedIn profiles for a huge list of people is a very repetitive task. So let’s find a solution to improve this workflow…
1. Python and TabPy
We use Python to build API requests, communicate with Azure Cognitive Services, and verify the returned search results. In order to use Python within Tableau, we need to set up TabPy. If you haven’t done this yet, check out my TabPy tutorial.
2. Microsoft Azure Cognitive Services
One of many APIs provided by Azure Cognitive Services is the Web Search API. We use this API to search for name + company + “linkedin”. The first three results are then validated by our Python script. One of the results should contain the corresponding LinkedIn profile.
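The validation step described above can be sketched in Python as follows. This is a minimal sketch, not the script used in the dashboard: the function names `build_query` and `pick_linkedin_url` are hypothetical, the search results are assumed to arrive as a list of dicts with a "url" key (a simplification of the real API response), and a profile is assumed to be recognizable by a linkedin.com host with a path starting with /in/. No network call is made here.

```python
from urllib.parse import urlparse


def build_query(name, company):
    """Combine name, company, and the keyword "linkedin" into one search query."""
    return f"{name} {company} linkedin"


def pick_linkedin_url(results, max_results=3):
    """Return the first LinkedIn profile URL among the top results, else None.

    `results` is assumed to be a list of dicts with a "url" key.
    """
    for item in results[:max_results]:
        url = item.get("url", "")
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        # Accept linkedin.com and its subdomains (e.g. de.linkedin.com)
        is_linkedin = host == "linkedin.com" or host.endswith(".linkedin.com")
        # Assumption: personal profiles live under the /in/ path
        if is_linkedin and parsed.path.startswith("/in/"):
            return url
    return None


# Usage with made-up search results
sample = [
    {"url": "https://example.com/team/jane-doe"},
    {"url": "https://www.linkedin.com/in/jane-doe"},
    {"url": "https://www.linkedin.com/company/example"},
]
print(build_query("Jane Doe", "Example GmbH"))
print(pick_linkedin_url(sample))
```

Checking the host and path separately, rather than searching for the substring "linkedin" anywhere in the URL, avoids accepting pages that merely mention LinkedIn in their address.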
3. Calculated Field in Tableau
Let’s wrap our Python script together and create a Calculated Field in Tableau:
Adding a URL action with our new Calculated Field will do the trick. Now you can click on the LinkedIn icon and a new browser tab (or the LinkedIn app if installed) opens.
Michael, a data scientist, who is working for a German railway and logistics company, recently told me during a FATUG Meetup that he loves Tableau’s R and Python integration. As he continued, he raised the question of using functions they have written in Julia. Julia, a high-level dynamic programming language for high-performance numerical analysis, is an integral part of the newly developed data strategy in Michael’s organization.
Tableau, however, does not come with native support for Julia. I didn’t want to let Michael’s team down, so I looked for an alternative way to integrate Julia with Tableau.
This solution has been working flawlessly in a production environment for several months. In this tutorial, I’m going to walk you through the installation and show you how to connect Tableau with R and Julia. I will also give you an example of calling a Julia statement from Tableau to calculate the sphere volume.
XRJulia provides an interface from R to Julia. Rserve is a TCP/IP server that allows Tableau to use R’s facilities.
3. Load libraries and start Rserve
After the packages are successfully installed, we load them and run Rserve:
> library(XRJulia)
> library(Rserve)
> Rserve()
Make sure to repeat this step every time you restart your R session.
4. Connecting Tableau to Rserve
Now let’s open the Help menu in Tableau Desktop and choose Settings and Performance > Manage External Service Connection to open the External Service Connection dialog box:
Enter a server name using a domain or an IP address and specify a port. Port 6311 is the default port used by Rserve. Take a look at my R tutorial to learn more about Tableau’s R integration.
5. Adding Julia code to a Calculated Field
You can invoke Calculated Field functions called SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT to embed your Julia code in Tableau, such as this simple snippet that calculates sphere volume:
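For readers who want to check the numbers Tableau returns, the same sphere-volume calculation can be sketched in Python. This is a stand-in for illustration, not the Julia code used in the Calculated Field: the function name `sphere_volume` is hypothetical, and the list-in, list-out shape is an assumption that mirrors how a column of values is handed to an external script.

```python
import math


def sphere_volume(radii):
    """Return the volume 4/3 * pi * r^3 for each radius in the input list."""
    return [4.0 / 3.0 * math.pi * r ** 3 for r in radii]


# Usage: one volume per input radius
print(sphere_volume([1.0, 2.0]))
```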
Demo: using Julia calculations within Tableau
We have already seen some love from Tableau for R and Python, boosting Tableau’s Advanced Analytics capabilities.
So what is the next big thing for our Data Science Rockstars? Julia!
Who is Julia?
Julia is a high-level dynamic programming language introduced in 2012. It was designed to address the needs of high-performance numerical analysis, and its syntax is very similar to MATLAB. If you are used to MATLAB, you will get up to speed with Julia very quickly.
Compared to R and Python, Julia is significantly faster (close to C and FORTRAN, see benchmark). Based on Tableau’s R integration, Julia is a fantastic addition to Tableau’s Advanced Analytics stack and to your data science toolbox.
Where can I learn more?
Do you want to learn more about Advanced Analytics and how to leverage Tableau with R, Python, and Julia? Meet me at the 2017 Tableau Conferences in London, Berlin, or Las Vegas and join my Advanced Analytics sessions:
Yes, of course! I published tutorials for R and Python on this blog. And I will also publish a Julia tutorial soon. Feel free to follow me on Twitter @xlth, and leave me your feedback/suggestions in the comment section below.
TabPy allows Tableau to execute Python code on the fly
In 2013, Tableau introduced R Integration, the ability to call R scripts in calculated fields. This opened up possibilities such as K-means clustering, Random Forest models, and sentiment analysis. With the release of Tableau 10.2, we can enjoy a new, fancy addition to this feature: the Python Integration through TabPy, the Tableau Python Server.
Python is a widely used general-purpose programming language, popular among academia and industry alike. It provides a wide variety of statistical and machine learning techniques and is highly extensible. Together, Python and Tableau are the data science dream team to cover any organization’s data analysis needs.
In this tutorial, I’m going to walk you through installing TabPy and connecting it to Tableau. I will also give you an example of calling a Python function from Tableau to calculate correlation coefficients for a trellis chart.
1. Install and start Python and TabPy
Start by clicking on the Clone or download button in the upper right corner of the TabPy repository page, downloading the zip file, and extracting it.
TabPy download via GitHub web page
Protip: If you are familiar with Git, you can download TabPy directly from the repository:
> git clone https://github.com/tableau/TabPy
TabPy download via Git command line interface
Within the TabPy directory, execute setup.sh (or setup.bat if you are on Windows). This script downloads and installs Python, TabPy, and all necessary dependencies. After completion, TabPy starts up and listens on port 9004.
2. Connecting Tableau to TabPy
In Tableau 10.2 (and later versions), a connection to TabPy can be added in Help > Settings and Performance > Manage External Service Connection:
Tableau main menu
Set port to 9004:
External Service Connection dialogue
3. Adding Python code to a Calculated Field
You can invoke Calculated Field functions called SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT to embed your Python script in Tableau:
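As a rough illustration of what such a script can compute, here is a minimal Python sketch of the Pearson correlation coefficient, the measure mentioned earlier for the trellis chart example. The function name `pearson` is hypothetical, and the hand-written formula is just one way to do it. Inside a Calculated Field, the two input lists would be the arguments Tableau passes to the script (TabPy exposes them as _arg1 and _arg2).

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation of two equally long lists of numbers.

    Returns None when either list is constant, because the coefficient
    is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Usage: perfectly correlated and perfectly anti-correlated toy data
print(pearson([1, 2, 3], [2, 4, 6]))
print(pearson([1, 2, 3], [3, 2, 1]))
```

Returning None for constant input, instead of raising a division error, is a design choice for this sketch, so that a single flat pane does not fail the whole calculation.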