This YouTube tutorial shows you a handy way to load your Excel data to Cloudera Hadoop with Alteryx, and how to see and understand your data even faster with Tableau connected to Impala.
The same toolchain for loading and accessing data can be used with Hive (e.g. on Hortonworks) or Spark SQL (e.g. on MapR). An overview of common data processing technologies can be found in the Big Data jungle guide.
Getting your dashboards up to speed can be quite difficult if you don’t know where the latency comes from. The first and most important rule about making workbooks more efficient is this: if a workbook loads slowly in Desktop on your computer, it will also be slow on the server once it is published. Tableau Desktop and Tableau Server each have their own way to enable, record, and analyze performance.
Performance recording is a must-have for tuning your workbooks. All you have to do is start the Tableau Performance Recording, perform the workbook actions you want to measure, and stop the Performance Recording. A few seconds later, Tableau opens a new workbook with the Performance Summary dashboard in it.
Create a performance recording in Tableau Desktop
To start recording performance, follow this step: Help > Settings and Performance > Start Performance Recording
Make some dashboard operations and/or refresh your data source(s).
To stop recording, and then view a temporary workbook containing results from the recording session, follow this step: Help > Settings and Performance > Stop Performance Recording
You can now view the Performance Summary dashboard and begin your analysis.
Create a performance recording on Tableau Server
Administrators must enable the feature first. The option is located under Settings and is set separately for each site.
Check the Workbook Performance Metrics box and save.
Navigate to a view on the server.
Remove the iid=xx from the URL.
Enter in its place record_performance=yes. Your full URL should now look something like this: https://data.alexloth.com/#/site/AA/views/Superstore/Summary?:record_performance=yes
After the page reloads, you’ll notice the ID is added automatically back to the URL and that a performance button appears within the View’s toolbar. Don’t click on the performance button yet.
Interact with the workbook: apply filters, select marks or rows, and click elements that trigger actions on other parts of the visualization.
Now click the Performance button, which launches a new window with the Performance Summary dashboard.
Don’t forget to disable the performance recording in the admin settings when you are finished.
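The URL edit from the steps above can also be scripted. The following is a minimal sketch, not part of Tableau: the helper name add_record_performance is made up for illustration, and it assumes the view URL carries its parameters after a single "?" and that the session ID shows up as iid=<n> or :iid=<n>.

```python
def add_record_performance(view_url: str) -> str:
    """Drop any iid parameter from a view URL and append :record_performance=yes.

    Hypothetical helper: plain string handling, no Tableau API involved.
    """
    base, _, query = view_url.partition("?")
    kept = [
        p for p in query.split("&")
        # Skip empty pieces, the session ID, and any earlier recording flag.
        if p and not p.lstrip(":").startswith(("iid=", "record_performance="))
    ]
    kept.append(":record_performance=yes")
    return base + "?" + "&".join(kept)


if __name__ == "__main__":
    url = "https://data.alexloth.com/#/site/AA/views/Superstore/Summary?:iid=1"
    print(add_record_performance(url))
```

Any other parameters already present in the URL are kept in their original order, so the function can be applied to a view link copied straight from the browser.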
Understand the Performance Summary dashboard
The Performance Summary dashboard contains three views:
Timeline: a Gantt chart displaying event start time and duration.
Events sorted by time: a bar chart showing event duration by type.
Query text: appears when you click an Executing Query event in the bar chart.
Timeline Gantt chart
The uppermost view in a performance recording dashboard shows the events that occurred during the recording, arranged chronologically from left to right. The bottom axis shows elapsed time since Tableau started, in seconds.
In the Timeline view, the Workbook, Dashboard, and Worksheet columns identify the context for the events. The Event column identifies the nature of the event, and the final column shows each event’s duration and how it compares chronologically to other recorded events.
The events sorted by time
This section of the workbook shows the duration of recorded events in descending order, so you can see the execution time of each event that occurred during the performance recording. Events with longer durations are the most likely cause of performance problems and show you where to look first if you want to speed up your workbook.
Different colors indicate different types of events. The range of events that can be recorded is:
Computing layouts: If layouts are taking too long, consider simplifying your workbook.
Connecting to a data source: Slow connections could be due to network issues or issues with the database server.
Executing query: If queries are taking too long, consult your database server’s documentation.
Generating extract: To speed up extract generation, consider only importing some data from the original data source. For example, you can filter on specific data fields, or create a sample based on a specified number of rows or percentage of the data.
Geocoding: To speed up geocoding performance, try using less data or filtering out data.
Blending data: To speed up data blending, try using less data or filtering out data.
Server rendering: You can speed up server rendering by running additional VizQL Server processes on additional machines.
The workbook also displays the query text for any specific event that you want to examine in detail. You can access the detail by clicking on any of the green Executing Query events in the bar chart. This handy feature lets you review any query text of interest without leaving the Tableau Performance Summary dashboard.
If you click on an Executing Query event in either the Timeline or Events section of a performance recording dashboard, the text for that query is displayed in the Query section.
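The triage that the Events view supports (longest events first, then the matching hint from the list of event types) can be sketched in a few lines. This is only an illustration: the durations are invented sample values, not output from a real recording, and the function names are hypothetical.

```python
# Hints paraphrased from the list of event types above.
ADVICE = {
    "Computing layouts": "consider simplifying the workbook",
    "Connecting to a data source": "check the network and the database server",
    "Executing query": "consult the database server's documentation",
    "Generating extract": "import less data, e.g. filter fields or sample rows",
    "Geocoding": "use less data or filter data out",
    "Blending data": "use less data or filter data out",
    "Server rendering": "run additional VizQL Server processes",
}


def slowest_events(events, top=3):
    """Return the `top` longest (event type, seconds) pairs, longest first."""
    return sorted(events, key=lambda event: event[1], reverse=True)[:top]


def triage(events, top=3):
    """Format the slowest events together with the matching hint."""
    return [
        f"{seconds:.2f}s  {name}: {ADVICE.get(name, 'no specific hint')}"
        for name, seconds in slowest_events(events, top)
    ]


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    sample = [
        ("Computing layouts", 0.42),
        ("Executing query", 3.10),
        ("Connecting to a data source", 1.25),
        ("Geocoding", 0.08),
    ]
    for line in triage(sample):
        print(line)
```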
How about some visual takeaways from the IMF’s World Economic Outlook? Recently I prepared two nifty data visualizations with Tableau that I’d like to share with you.
These visualizations allow you to explore plenty of economic data, including IMF staff estimates until 2020. Don’t forget to choose “Units” after switching “Subject” in the right-hand sidebar. A detailed description of each subject is displayed below.
Recently Tableau released an exciting new feature: R integration via Rserve. Tableau with R seems to bring my data science toolbox to the next level! In this tutorial I’m going to walk you through installing Rserve and connecting Tableau to it. I will also give you an example of calling an R function with a parameter from Tableau and visualizing the results in Tableau.
1. Install and start R and Rserve
You can download base R from r-project.org. Next, start R from the terminal, install the Rserve package with install.packages("Rserve"), load it with library(Rserve), and start the server with Rserve().
[Update 26 Jun 2016]: Tableau 8.1 screenshots were updated with Tableau 10.0 (Beta) screenshots due to my upcoming Advanced Analytics session at TC16, which is going to reference back to this blog post.
About a year ago, I had a first try with Tableau and some survey data for a university project. Last week, I finally found time to test Tableau with High Energy Physics (HEP) data from CERN’s Proton Synchrotron (PS). Tableau enjoys a stellar reputation among the data visualization community, while the HEP community heavily uses Gnuplot and Python.
I used an ordinary CSV file as the data source for this quick visualization. Tableau can also connect to other file types such as Excel, as well as to databases like Microsoft SQL Server, Oracle, and Postgres.
I’m also quite impressed by the ease and speed with which insightful analysis seems to appear out of bland data. Even if your analysis toolchain is script-based (as is usual at CERN, where batch processing is mandatory), I highly recommend using Tableau for prototyping and for ad-hoc data exploration.