It’s My 10 Year Blogging Anniversary!

Photo from an early blog post: 2007 Hampi, a temple town in South India recognized as a UNESCO World Heritage Site (Flickr)

Woohoo, it’s already been ten years since I started this blog. I can’t believe it! Thanks to everyone who reads my posts, and who has encouraged and inspired me. Without you, blogging would be only half the fun! Now, let’s have a little recap…

2007-2009 SAP and India:

It all started in 2007. I was studying Computer Science and decided to go for an internship abroad. China and India were on my shortlist. I decided on India, applied for a scholarship, and asked some companies for interesting project work. Before starting the adventure, I published my very first blog post to keep family and friends in the loop.

For the next seven months, I lived in Bangalore and worked for SAP Labs India, developing prototypes for mobile BI apps. I spent plenty of weekends exploring India and the surrounding countries. After returning from India, I continued to work for SAP at their headquarters while finishing my degree in Karlsruhe.

2009-2012 CERN:

CERN, surrounded by snow-capped mountains and Lake Geneva, grabbed my attention towards the end of my studies. CERN has tons of data: petabytes of it! Challenge accepted. CERN is known for its particle accelerator, the Large Hadron Collider (LHC). We applied machine learning to identify new correlations between variables (LHC data and external data) that had not previously been connected.

2012-2015 Capgemini and MBA:

Back in Germany, I wanted to bring Big Data Analytics to companies. To one company? No, to many companies! So instead of getting hired as Head of BI for an SME, I started to work for Capgemini. I had fantastic projects, designed data-driven use cases for the financial sector, and advised on digital transformation initiatives.

To keep a balance with all the project work, I dedicated many of my weekends to studying and enrolled in Frankfurt School’s Executive MBA programme. During my studies, I focused on Emerging Markets and attended a module at CEIBS in Shanghai.

2015-201? Tableau and Futura:

I knew Tableau from my time as a consultant. It is an awesome company with a great product and a mission: help people see and understand their data. That’s me! I joined Tableau to help organizations transition from classic BI factories to modern self-service analytics by developing data strategies, so that data can be treated as a corporate asset. This includes education, evangelism, and establishing a data-driven culture.

In the evenings I work for Futura Analytics, a fintech startup I co-founded in 2017. Futura Analytics offers real-time information discovery, transforming data from social media and other public sources into actionable signals.

What’s next?

Currently I’m looking forward to giving my Data Strategy talk at TC17, accompanied by a TensorFlow demo scenario. I have also been learning Mandarin, the predominant language of business, politics, and media in China and Taiwan, for quite a while. Let’s see if that is going to influence my next steps… 🙂

Monkey Business: Always be Ready to Demo

The famous Tableau Superstore demo data set

Usually, I really don’t like looking at the screens of other passengers. On this early morning train from Frankfurt to Cologne, however, the screen of my seatmate caught my attention. Where had I seen the logo on his slide deck before? Two coffee sips later, it came to me: it was the logo of Monkey 47, a very delicious gin distilled in the heart of the Black Forest. So I asked my neighbor: “Is that the Monkey 47 logo?”

He was happy that I recognized his brand, and we had a small chat about gin and the Black Forest. It turned out his name is Thomas, and he is the Head of Sales and Marketing for Monkey 47. Thomas mentioned that his team was planning a tour to promote Monkey 47 in a number of cities. That sounded similar to what we are doing with the Tableau Cinema Tour, so I showed him our Cinema Tour landing page and explained briefly who we are and what our mission is.

I asked him how he organizes his data. Thomas revealed that he lives in Excel hell: “spreadsheets with thousands of rows and way too many columns”. This also sounded familiar. I opened up our Superstore.xlsx in Excel and asked: “Do your Excel sheets look like this?” Thomas replied: “Yes!”

Here we go! I dragged and dropped the file onto my Tableau desktop icon and raced through a seven-minute demo ending with an interactive dashboard. Thomas was flabbergasted. To top things off, I showed him the interactive Twitter Sentiment Dashboard embedded in my blog. Thomas grabbed his jacket and gave me a business card, saying: “We need Tableau!”

Monkey 47 business card (back side)

This story was originally written for Tableau’s EMEA Sales Newsletter. I think it’s a good read for the holidays, and I wish you all a Merry Christmas!

TabPy Tutorial: Integrating Python with Tableau for Advanced Analytics

TabPy allows Tableau to execute Python code on the fly

In 2013 Tableau introduced the R integration: the ability to call R scripts in calculated fields. This opened up possibilities such as K-means clustering, Random Forest models, and sentiment analysis. With the release of Tableau 10.2, we can enjoy a fancy new addition to this feature: the Python integration through TabPy, the Tableau Python Server.

Python is a widely used general-purpose programming language, popular in academia and industry alike. It provides a wide variety of statistical and machine learning techniques and is highly extensible. Together, Python and Tableau are the data science dream team to cover any organization’s data analysis needs.

In this tutorial I’m going to walk you through the installation and show you how to connect Tableau to TabPy. I will also give you an example of calling a Python function from Tableau to calculate correlation coefficients for a trellis chart.
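Before diving in, it may help to recall what the Python function will compute: Pearson’s correlation coefficient. A minimal pure-Python version (a sketch that avoids assuming numpy is installed) could look like this:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # covariance and variances around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly correlated → 1.0
```

This is exactly the value a library call such as numpy’s corrcoef would return for the same two series.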

1. Install and start Python and TabPy

Start by clicking the Clone or download button in the upper right corner of the TabPy repository page, then download the zip file and extract it.

TabPy download via GitHub web page

Pro tip: If you are familiar with Git, you can clone TabPy directly from the repository:

> git clone git://

TabPy download via Git command line interface

Within the TabPy directory, execute setup.sh (or setup.bat if you are on Windows). This script downloads and installs Python, TabPy, and all necessary dependencies. After completion, TabPy starts up and listens on port 9004.
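A sketch of those steps in a terminal (the directory name depends on how you extracted the download):

```shell
cd TabPy-master   # the extracted TabPy directory
./setup.sh        # on Windows: setup.bat
# when the script finishes, TabPy starts and listens on port 9004
```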

2. Connecting Tableau to TabPy

In Tableau 10.2, a connection to TabPy can be added in Help > Settings and Performance > Manage External Service Connection:

Tableau main menu

Set the port to 9004:

External Service Connection dialogue

3. Adding Python code to a Calculated Field

You can invoke the Calculated Field functions SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT to embed your Python script in Tableau:

Python script within Tableau
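As a sketch, a calculated field computing the correlation between two measures might look like the following. The field names [Sales] and [Profit] are placeholders from the Superstore data set, and numpy is assumed to be available in the TabPy environment (the default installer ships an Anaconda-based Python that includes it):

```
SCRIPT_REAL("
import numpy as np
return float(np.corrcoef(_arg1, _arg2)[0, 1])
",
SUM([Sales]), SUM([Profit]))
```

Keep in mind that the SCRIPT_* functions are table calculations: the aggregated values arrive in TabPy as lists (_arg1, _arg2), one per partition, so set Compute Using to match your trellis layout.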

4. Use Calculated Field in Tableau

Now you can use your Python calculation as a Calculated Field in your Tableau worksheet:

Tableau workbook featuring a Python calculation

Feel free to download the Tableau Packaged Workbook (twbx) here.

[Update 3 Jan 2017]: Translated to Japanese by Tomohiro Iwahashi: Tableau + Python 連携 (Tabpy) を使ってみよう!

[Update 30 Mar 2017]: A German translation of this post is published on the official Tableau blog: TabPy Tutorial: Integration von Python mit Tableau für Advanced Analytics

7 Questions That Help Companies Boost Their Bottom Line with Social Media

Twitter sentiment analysis: click to open the interactive dashboard

Is the use of social networks in your company limited to marketing, leaving opportunities untapped?

Many companies in Germany still fail to exploit the full potential of social media. Most firms use social media merely as a marketing instrument, for example by sending out the same content at regular intervals. Far fewer companies use social media for external communication, in research and development, for sales purposes, or in customer service.

In the following, we take a closer look at the Twitter communication of four social-media-savvy companies and use seven questions to show what they do differently and where the others need to catch up.

1. When and how are tweets sent?

A look at the histogram suggests plenty of interaction (tweets and replies), while the forwarding of tweets (retweets) occurs rather sporadically:


2. How long are the tweets?

Most tweets appear to max out Twitter’s 140-character limit, or at least come close:


3. On which days of the week are tweets sent?

On weekends, communication via Twitter slows down. The distribution of emotions remains constant over the week but differs from company to company:


4. At what time of day are tweets sent?

Fewer tweets are written at night as well. At Lufthansa, commuter tweets cause an early-morning spike; at Deutsche Bahn, this effect sets in somewhat later:


5. Which type of communication prevails?

The high share of replies at Telekom, Deutsche Bahn, and Lufthansa implies that these companies use Twitter intensively for dialogue. Among Deutsche Bank’s tweets, by contrast, the share of retweets, especially those with hashtags, is significantly higher, which suggests a higher news value:


6. Which users are particularly active?

Now let’s look at the Twitter users who engage most intensively with the companies’ Twitter handles:


7. Which tweets generate attention?

This question is best explored interactively in the dashboard (see also the screenshot above). The key to this analysis is determining emotion through sentiment analysis.

Depending on emotion and context, it is above all in the addressed company’s interest to react promptly and appropriately. Negative sentiment can thus be defused early, averting damage to the brand. Positive messages, on the other hand, can serve as multipliers when passed along.

How to Implement Sentiment Analysis in Tableau Using R

Interactive sentiment analysis with Tableau 9.2

In my previous post I highlighted Tableau’s text mining capabilities, resulting in fancy visuals such as word clouds:

Today I’d like to follow up on this and show how to implement sentiment analysis in Tableau using Tableau’s R integration. One of the many uses of social media analytics is sentiment analysis, where we evaluate whether posts on a specific issue are positive, neutral, or negative (polarity), and which emotion is predominant.

What do customers like or dislike about your products? How do people perceive your brand compared to last year?

In order to answer such questions in Tableau, we need to install an R package capable of performing the sentiment analysis. In the following example we use an extended version of the sentiment package, which was originally created by Timothy P. Jurka.

The sentiment package requires the tm and Rstem packages, so make sure they are installed properly. Execute these commands in your R console to install sentiment from GitHub (see an alternative way to install at the end of this blog post):
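A typical devtools-based installation might look like the sketch below. The GitHub path is a placeholder, so substitute the repository of the extended sentiment fork you are actually using; note that Rstem may no longer be on CRAN and might have to be installed from an archive:

```r
install.packages(c("tm", "Rstem"))   # dependencies
install.packages("devtools")         # provides install_github
library(devtools)
install_github("user/sentiment")     # placeholder repository path
```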

The sentiment package offers two functions, which can be easily called from calculated fields in Tableau:


The function get_polarity returns “positive”, “neutral”, or “negative”:

The function get_emotion returns “anger”, “disgust”, “fear”, “joy”, “sadness”, “surprise”, or “NA”:
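As a sketch of how these functions can be called from a Tableau calculated field via the R integration ([Text] is a placeholder for the field holding your post or tweet text; R receives it as .arg1):

```
SCRIPT_STR("
library(sentiment)
get_polarity(.arg1)
",
ATTR([Text]))
```

A get_emotion field works the same way; just swap the function name.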

The sentiment package follows a lexicon-based approach and comes with two files, emotions_english.csv.gz (source and structure) and subjectivity_english.csv.gz (source and structure). Both files contain English word lists and are stored in the R package library under the /sentiment/data directory.

If text is incorrectly classified, you can easily fix this by extending these two files. If you aim to analyze text in a language other than English, you need to create word lists for the target language. Kindly share them in the comments!

Feel free to download the Packaged Workbook (twbx) here.

[Update 11 Aug 2016]: If you are having trouble with install_github, try to install directly from this website: