Hyper Kickoff Event: 5th Frankfurt Analytics + Tableau User Group Meetup

Tableau Hyperfest: Hyper Kickoff Event at Tableau's Frankfurt office

We’d like to invite you to the 5th Frankfurt Analytics + Tableau User Group Meetup.

Join us for the global launch of Tableau’s super fast data engine, Hyper! Hyper brings faster data refreshes and query performance to Tableau extracts, plus increased scalability in a platform-wide update.

This is your opportunity to get to know the Hyper dev team, hear from Tableau beta customers about their hands-on Hyper experience, and participate in live Q&A. Best of all, learn more about Hyper’s patent-pending technology as well as some of the other features headed your way in 10.5. (Viz in Tooltip, anyone?)

Tableau is hosting the Hyperfest meetup. Come and celebrate the upcoming release of Hyper with the community and the world. In addition to the Hyper presentation, there will be food, drinks and Tableau swag, so don’t miss it!

-> Sign Up <-

Tableau Hyperfest meetup event page
Sign up for free at the Hyperfest meetup event page

Agenda

9:00pm: Doors Open

9:30pm: Presentations:

10:30pm: Drinks & Networking

11:00pm: Live Hyperfest Viewing Party

Midnight: Event Concludes

Livestream: Follow us on Twitter @FraAnalytics and check for the livestream and additional content!

Feedback and ideas: Let us know by email or on Twitter if you’d like to discuss a particular topic or want to become one of our future speakers.

5 Takeaways from Tableau’s Hybrid Transactional/Analytical Processing

What makes Hyper so fast?
The Future of Enterprise Analytics: Hyper can handle both OLTP and OLAP simultaneously. In the future it will address NoSQL and graph workloads.

1. What is Hyper’s key benefit?

Hyper is a Hybrid transactional/analytical processing (HTAP) database system and replaces Tableau Data Extracts (TDE). The change will be mostly transparent for end users, other than everything being faster. Hyper significantly improves extract refresh times, query times and overall performance.

2. What is Hybrid transactional/analytical processing?

As defined by Gartner:

Hybrid transaction/analytical processing (HTAP) is an emerging application architecture that “breaks the wall” between transaction processing and analytics. It enables more informed and “in business real time” decision making.

Online transaction processing (OLTP) and online analytical processing (OLAP) place different demands on a database architecture. Customers with high rates of mission-critical transactions have therefore typically split their data across two systems: one database for OLTP and a separate data warehouse for OLAP. This separation allows decent transaction rates, but it has serious disadvantages. The warehouse is only as fresh as the last periodic Extract, Transform, Load (ETL) run, and maintaining two separate information systems consumes excessive resources.
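The freshness problem can be made concrete with a small sketch. It uses two in-memory SQLite databases as stand-ins for the OLTP system and the warehouse; the table name `orders` and the helpers `run_etl` and `total` are made up for illustration and have nothing to do with Hyper itself.

```python
import sqlite3

# Two separate in-memory databases stand in for the split architecture.
oltp = sqlite3.connect(":memory:")       # transactional system
warehouse = sqlite3.connect(":memory:")  # analytical copy

for db in (oltp, warehouse):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")


def run_etl():
    """Periodic staging job: replace the warehouse copy with the OLTP state."""
    rows = oltp.execute("SELECT id, amount FROM orders").fetchall()
    warehouse.execute("DELETE FROM orders")
    warehouse.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    warehouse.commit()


def total(db):
    """Analytical query: total order amount."""
    return db.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()[0]


oltp.execute("INSERT INTO orders (amount) VALUES (100.0)")
oltp.commit()
run_etl()

# A new transaction arrives after the last ETL run.
oltp.execute("INSERT INTO orders (amount) VALUES (50.0)")
oltp.commit()

stale_total = total(warehouse)   # still reflects the last ETL run
fresh_total = total(oltp)        # reflects every committed transaction
run_etl()
synced_total = total(warehouse)  # catches up only after the next ETL run
```

Between two ETL runs the analytical query on the warehouse misses the newest order. An HTAP system avoids this gap by answering both kinds of query from one store.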

3. Does Hyper satisfy the ACID properties?

Hyper, initially developed at the Technical University of Munich and acquired by Tableau in 2016, can handle both OLTP and OLAP simultaneously. Hyper possesses the rare quality of being able to handle data updates and insertions at the same time as queries by using hardware-assisted replication mechanisms to maintain consistent snapshots of the transactional data. Hyper is an in-memory database that guarantees the ACID properties (Atomicity, Consistency, Isolation, Durability) of OLTP transactions and executes OLAP query sessions (multiple queries) on the same, arbitrarily current and consistent snapshot.

4. What makes Hyper so fast?

Hyper uses the processor’s built-in support for virtual memory management (address translation, caching, copy on update). On a single system executing both workloads in parallel, this yields transaction rates as high as 100,000 per second together with very fast OLAP query response times. This could support real-time streaming of data in future releases of Tableau. The performance gains come from the design of Hyper’s data structures and from smart use of contemporary hardware, particularly non-volatile RAM (NVRAM). Additional cores increase performance linearly.
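The copy-on-update idea can be illustrated in a few lines. This is a toy model in plain Python, not Hyper’s implementation: in the real system the processor and operating system copy memory pages, whereas here the classes `PagedStore` and `Snapshot` (both invented for this sketch) copy small Python lists.

```python
class Snapshot:
    """Read-only view over the pages that existed when it was taken."""

    def __init__(self, pages, page_size):
        self.pages = pages
        self.page_size = page_size

    def read(self, index):
        page_no, offset = divmod(index, self.page_size)
        return self.pages[page_no][offset]


class PagedStore:
    """Toy store whose values live in fixed-size pages (lists)."""

    def __init__(self, values, page_size=4):
        self.page_size = page_size
        self.pages = [list(values[i:i + page_size])
                      for i in range(0, len(values), page_size)]
        self._shared = set()  # page numbers still shared with a snapshot

    def snapshot(self):
        """Share every current page with a new snapshot; nothing is copied."""
        self._shared = set(range(len(self.pages)))
        return Snapshot(list(self.pages), self.page_size)

    def read(self, index):
        page_no, offset = divmod(index, self.page_size)
        return self.pages[page_no][offset]

    def write(self, index, value):
        """Update one value, copying the page first if a snapshot shares it."""
        page_no, offset = divmod(index, self.page_size)
        if page_no in self._shared:
            self.pages[page_no] = list(self.pages[page_no])  # copy on update
            self._shared.discard(page_no)
        self.pages[page_no][offset] = value


store = PagedStore(list(range(8)))
snap = store.snapshot()   # OLAP session starts on a consistent snapshot
store.write(0, 100)       # OLTP update arrives while the session is running
```

After the write, `snap.read(0)` still returns the old value while `store.read(0)` returns the new one, and only the page that was touched has been duplicated. Taking the snapshot is cheap because unchanged pages stay shared.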

5. What does this mean for Tableau?

With Hyper now powering the Tableau platform, your organization will see faster extract creation and better query performance for large data sets. Since Hyper is designed to handle exceptionally large data sets, you can choose to extract your data based on what you need, not data volume limitations. Hyper improves performance for common computationally intensive queries, like count distinct, calculated fields, and text field manipulations. This performance boost will improve your entire Enterprise Analytics workflow.

Join our “The Future of Enterprise Analytics” events and get a sneak peek at upcoming features and the Tableau Roadmap: 14th of November in Düsseldorf and 6th of December in Frankfurt.

[Update 20 Dec 2017] Hyper Kickoff Event: Join us for the Hyper Kickoff Event on the 18th of January 2018 at Tableau’s Frankfurt office.

Data Science Toolbox: How to use Julia with Tableau

R allows Tableau to execute Julia code on the fly

Michael, a data scientist at a German railway and logistics company, recently told me during a FATUG Meetup that he loves Tableau’s R and Python integration. He then asked whether he could also use functions his team has written in Julia. Julia, a high-level dynamic programming language for high-performance numerical analysis, is an integral part of the newly developed data strategy in Michael’s organization.

Tableau, however, does not come with native support for Julia. I didn’t want to let Michael’s team down, so I looked for an alternative way to integrate Julia with Tableau.

The solution, which routes Julia calls through R, has been working flawlessly in a production environment for several months. In this tutorial I’m going to walk you through installing Julia and R and connecting them to Tableau. I will also give you an example of calling a Julia statement from Tableau to calculate the volume of a sphere.

1. Install Julia and add PATH variable

You can download Julia from julialang.org. Add Julia’s installation path to the PATH environment variable.

2. Install R, XRJulia and Rserve

You can download base R from r-project.org. Next, start R from the terminal and install the XRJulia and Rserve packages:

> install.packages("XRJulia")
> install.packages("Rserve")

XRJulia provides an interface from R to Julia. Rserve is a TCP/IP server that lets Tableau use the facilities of R.

3. Load libraries and start Rserve

Once the packages are installed, load them and start Rserve:

> library(XRJulia)
> library(Rserve)
> Rserve()

Make sure to repeat this step every time you restart your R session.

4. Connecting Tableau to Rserve

Now open the Help menu in Tableau Desktop and choose Settings and Performance > Manage External Service Connection to open the External Service Connection dialog box:

TC17 External Service Connection

Enter a server name using a domain or an IP address and specify a port. Port 6311 is the default port used by Rserve. Take a look at my R tutorial to learn more about Tableau’s R integration.

5. Adding Julia code to a Calculated Field

You can invoke Calculated Field functions called SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT to embed your Julia code in Tableau, such as this simple snippet that calculates sphere volume:
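The Julia snippet itself is not reproduced in this text. As a cross-check for whatever the Julia calculation returns, the underlying formula V = 4/3 · π · r³ can be written in plain Python. This is a reference sketch only; it is not Julia and not the code from the workbook, and the function name `sphere_volume` is made up here.

```python
import math


def sphere_volume(radius):
    """Volume of a sphere with the given radius: V = 4/3 * pi * r^3."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    return 4.0 / 3.0 * math.pi * radius ** 3


# Reference values to compare against the numbers shown in Tableau.
reference = {radius: sphere_volume(radius) for radius in (1, 2, 3)}
```

Because the volume grows with the cube of the radius, doubling the radius should multiply the result by eight, which is a quick sanity check for the calculated field.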

6. Use Calculated Field in Tableau

You can now use your Julia calculation as an alternate Calculated Field in your Tableau worksheet:

Using Julia within calculations in Tableau (click to enlarge)

Feel free to download the Tableau Packaged Workbook (twbx) here.

Further reading: Mastering Julia

TabPy Tutorial: Integrating Python with Tableau for Advanced Analytics

TabPy allows Tableau to execute Python code on the fly

In 2013 Tableau introduced the R Integration, the ability to call R scripts in calculated fields. This opened up possibilities such as K-means clustering, Random Forest models and sentiment analysis. With the release of Tableau 10.2, we can enjoy a new, fancy addition to this feature: the Python Integration through TabPy, the Tableau Python Server.

Python is a widely used general-purpose programming language, popular in academia and industry alike. It provides a wide variety of statistical and machine learning techniques, and is highly extensible. Together, Python and Tableau are the data science dream team to cover any organization’s data analysis needs.

In this tutorial I’m going to walk you through the installation and connecting Tableau with TabPy. I will also give you an example of calling a Python function from Tableau to calculate correlation coefficients for a trellis chart.

1. Install and start Python and TabPy

Start by clicking on the Clone or download button in the upper right corner of the TabPy repository page, downloading the zip file and extracting it.

TabPy download via GitHub web page

Protip: If you are familiar with Git, you can download TabPy directly from the repository:

> git clone https://github.com/tableau/TabPy.git

TabPy download via Git command line interface

Within the TabPy directory, execute setup.sh (or setup.bat if you are on Windows). This script downloads and installs Python, TabPy and all necessary dependencies. After completion, TabPy starts up and listens on port 9004.
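Before configuring Tableau, it can help to confirm that something is actually listening on that port. The sketch below is a generic TCP check using only the Python standard library; the helper name `is_port_open` is made up, and a successful connection only shows that the port accepts connections, not that the service behind it is TabPy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


# Example call for a TabPy instance on the same machine:
#   is_port_open("localhost", 9004)
```

The same check works for Rserve by passing its default port 6311 instead of 9004.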

2. Connecting Tableau to TabPy

In Tableau 10.2, a connection to TabPy can be added in Help > Settings and Performance > Manage External Service Connection:

Tableau main menu

Set port to 9004:

External Service Connection dialogue

3. Adding Python code to a Calculated Field

You can invoke Calculated Field functions called SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT to embed your Python script in Tableau:

Python script within Tableau
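The script is only shown as a screenshot, so here is a minimal sketch of the kind of Python that could sit inside a SCRIPT_REAL call to compute a correlation coefficient. TabPy hands the expressions passed to the SCRIPT_ function to Python as the lists `_arg1`, `_arg2`, and so on. The function name `pearson_correlation` and the sample numbers are assumptions for illustration, not the code from the workbook.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long number sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return covariance / math.sqrt(var_x * var_y)


# Stand-ins for the lists TabPy would pass in from two Tableau measures.
_arg1 = [10.0, 20.0, 30.0, 40.0]
_arg2 = [1.0, 2.1, 2.9, 4.2]
r = pearson_correlation(_arg1, _arg2)
```

In a trellis chart the table calculation is evaluated once per partition, so each pane gets its own coefficient. The sketch avoids third-party packages on purpose; with NumPy installed on the TabPy server, a library routine could replace the hand-written formula.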

4. Use Calculated Field in Tableau

Now you can use your Python calculation as a Calculated Field in your Tableau worksheet:

Tableau workbook featuring a Python calculation

Feel free to download the Tableau Packaged Workbook (twbx) here.

[Update 3 Jan 2017]: Translated to Japanese by Tomohiro Iwahashi: Tableau + Python 連携 (Tabpy) を使ってみよう!

[Update 30 Mar 2017]: A German translation of this post is published on the official Tableau blog: TabPy Tutorial: Integration von Python mit Tableau für Advanced Analytics

Tableau: How to find the most important variables for determining Sales

Random Forest Animation
Interactive dashboard displaying the most important variables for determining the Sales measure in Tableau 10.0 (click screenshot to enlarge)

During the Q&A session of a recent talk on Data Strategy, I was challenged with a rather technical question: I was asked how to identify the variables that are heavily influencing a certain measure – with an interactive solution that matches a modern data strategy as suggested in my presentation.

Of course, this could be done by executing a script. The result, however, would be static, and it would not be convenient for a Business Analyst to run it over and over again. Instead of applying a script every time the data changes, it would be much more innovative to get the answer immediately with every data update or interaction, such as a changed filter.

So why not solve this with Tableau? The magic underneath this easy-to-use Tableau dashboard is a nifty R script, embedded in a calculated field. This script calls a statistical method known as Random Forest, a sophisticated machine learning technique used to rank the importance of variables as described in Leo Breiman’s original paper.
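To give a feel for how a forest can rank variables, here is a deliberately simplified sketch in plain Python. It is not the R script behind the dashboard and not Breiman’s full algorithm: every tree is a single split (a stump), and importance is the share of variance reduction credited to each variable. The column names, the synthetic data and all function names are hypothetical.

```python
import random


def variance(values):
    """Population variance of a list (0.0 if the list is empty)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split_gain(X, y, sample, feature):
    """Largest variance reduction achievable with one threshold on `feature`."""
    parent = variance([y[i] for i in sample])
    best = 0.0
    for threshold in {X[i][feature] for i in sample}:
        left = [y[i] for i in sample if X[i][feature] <= threshold]
        right = [y[i] for i in sample if X[i][feature] > threshold]
        if not right:  # threshold is the maximum: nothing to split off
            continue
        weighted = (len(left) * variance(left)
                    + len(right) * variance(right)) / len(sample)
        best = max(best, parent - weighted)
    return best


def stump_forest_importance(X, y, n_trees=50, seed=0):
    """Share of total variance reduction credited to each feature."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    totals = [0.0] * n_features
    for _ in range(n_trees):
        # Bootstrap sample of rows, drawn with replacement.
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Random subset of features considered by this tree.
        candidates = rng.sample(range(n_features), max(1, n_features // 2))
        gains = {f: best_split_gain(X, y, sample, f) for f in candidates}
        winner = max(gains, key=gains.get)
        totals[winner] += gains[winner]
    overall = sum(totals)
    return [t / overall if overall else 0.0 for t in totals]


# Synthetic data: Sales depends strongly on Quantity, weakly on Discount.
data_rng = random.Random(42)
names = ["Quantity", "Discount", "Noise"]
X = [[data_rng.uniform(1, 10), data_rng.uniform(0, 0.5), data_rng.uniform(0, 1)]
     for _ in range(60)]
y = [100 * q - 50 * d + data_rng.gauss(0, 5) for q, d, _ in X]

importance = stump_forest_importance(X, y)
ranking = sorted(zip(names, importance), key=lambda pair: pair[1], reverse=True)
```

On this synthetic data the variable that drives Sales most strongly ends up at the top of `ranking`. A production setup would rely on an established Random Forest implementation with fully grown trees rather than this toy.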

The Tableau Packaged Workbook (twbx) is available here. Do you have more ideas or use cases? Feel free to leave a comment or send me an email: aloth@tableau.com