#datamustread: Visualize This: The FlowingData Guide to Design, Visualization, and Statistics (2nd Edition) by Nathan Yau

A bookshelf neatly arranged with several books on data visualization and analytics: Displayed in the center is the 2nd edition of "Visualize This: The FlowingData Guide to Design, Visualization, and Statistics" by Nathan Yau. Surrounding this book are various other titles, including those by Alexander Loth: "Decisively Digital", "Teach Yourself VISUALLY Power BI", "Visual Analytics with Tableau", "Datenvisualisierung mit Tableau", "Datenvisualisierung mit Power BI", and "KI für Content Creation." Other visible titles include "Rewired" and "Self-Service BI & Analytics." The arrangement highlights a strong focus on data visualization, analytics, and AI.
The 2nd edition of Visualize This by Nathan Yau, surrounded by several influential data and AI books, including my own works like Decisively Digital and Teach Yourself Visually Power BI.

While my latest book, KI für Content Creation, has just been reviewed by the renowned c’t magazine, I’m happy to continue reviewing books myself. Today, I’m reviewing the just-released second edition of a cornerstone of the data visualization community, Visualize This: The FlowingData Guide to Design, Visualization, and Statistics by Nathan Yau.

Visualize This: A Deep Dive into Data Visualization

Nathan Yau’s Visualize This has long been a staple for data enthusiasts, and the updated second edition brings fresh techniques, technologies, and examples that reflect the rapidly evolving landscape of data visualization.

Core Highlights of This Book

Data-First Approach: Yau emphasizes that effective visualizations start with a deep understanding of the data. This foundational principle ensures that the resulting graphics are not just visually appealing but also accurately convey the underlying information.

Diverse Toolkit: The book introduces a wide range of tools, including the latest R packages, Python libraries, JavaScript libraries, and illustration software. Yau’s pragmatic approach helps readers choose the right tool for the job without feeling overwhelmed by options.

Real-World Applications: With practical, hands-on examples using real-world datasets, readers learn to create meaningful visualizations. This experiential learning approach is particularly valuable for grasping the subtleties of data representation.

Comprehensive Tutorials: The step-by-step guides are a standout feature, covering statistical graphics, geographical maps, and information design. These tutorials provide clear, actionable instructions that make complex visualizations accessible.

Web and Print Design: Yau details how to create visuals suitable for various mediums, ensuring versatility in application whether for digital platforms or printed materials.

Personal Insights on Visualize This

Having taught data strategy and visualization for seven years, I find Visualize This to be an exceptional resource for a broad audience. Yau skillfully integrates scientific data visualization techniques with graphic design principles, providing practical advice along the way. The book’s toolkit is extensive, featuring R, Illustrator, XML, Python (with BeautifulSoup), JSON, and more, each with working code examples to demonstrate real-world applications.

The image shows an open page from the second edition of "Visualize This: The FlowingData Guide to Design, Visualization, and Statistics" by Nathan Yau. The page, from Chapter 5 titled "Visualizing Categories," features a colorful visualization titled "Cycle of Many," which depicts a 24-hour snapshot of daily activities based on data from the American Time Use Survey. This visual highlights how categories change over time and demonstrates the book's practical approach to data visualization.
Visualize This featuring a colorful 24-hour activity visualization based on data from the American Time Use Survey.

Even though I read the first edition years ago, I couldn’t put the second edition down all weekend. This book is a must-read for anyone who handles data or prepares data-based reports. Its beautiful presentation and careful consideration of every aspect—from typeface to page layout—make it a pleasure to read.

The book is user-friendly, offering a massive set of references and free tools for obtaining interesting datasets across various fields, from sports to politics to health. This breadth of resources is crucial for anyone looking to create impactful visualizations across different domains.

While the focus on Adobe Illustrator might be daunting due to its cost and learning curve, Yau’s examples show how Illustrator can enhance graphics created in other tools like SAS and R. I personally prefer the open-source Inkscape, but Yau’s insights helped me overcome my initial reluctance to use Illustrator, leading to more polished and professional visuals.

Yau uses R, Python, and Adobe Illustrator to demonstrate what can be achieved with imagination and creativity. Although some readers might desire more complex walkthroughs from raw data to final graphics, such material would require substantial foundational knowledge in R and Python. Including this would make the book significantly thicker and veer off from its focus on creating visually appealing graphics.

Conclusion: Visualize This is Essential Reading for Data Professionals

Visualize This (Amazon) is an indispensable guide for anyone serious about data visualization. Its methodical, data-first approach, combined with practical tutorials and a comprehensive toolkit, makes it a must-read for information designers, analysts, journalists, statisticians, and data scientists.

For those looking to refine their data visualization skills and create compelling, accurate graphics, this book offers invaluable insights and techniques.

Connect with me on LinkedIn and Twitter for more reviews and insights on the latest in data and AI! #datamustread

The Top 5 Books for Successful Data Science: Essential Reading for Aspiring Data Scientists

Which data science books should you read to succeed as a data scientist? | Photo Credit: via Sebastian Sikora

Would you like to pursue a career in data science and wonder which books can help you along the way? In this blog post, I present the five key books that are essential for your training and career in data science. 📚

1. Getting Started and Overview: "Data Science in der Praxis"

Data Science in der Praxis by Tom Alby offers a comprehensive introduction to the world of data. The book not only teaches the fundamentals of data science but also provides practical examples and case studies that help you apply and deepen what you have learned.

2. Python Crash Course: "Data Science mit Python"

Python is a general-purpose programming language that is well suited to solving data science problems. With Data Science mit Python by Jake VanderPlas, you learn Python efficiently and prepare for more complex data science tasks.

3. The Statistics Toolbox: "Statistik I und II für Dummies"

Statistics is the backbone of data science. Statistik für Dummies and Statistik II für Dummies by Deborah J. Rumsey offer a comprehensive and easy-to-understand introduction to statistics. The books cover a wide range of topics, including regression, analysis of variance, chi-square tests, and nonparametric methods.

4. For Large Volumes of Data: "Hadoop: The Definitive Guide"

Hadoop: The Definitive Guide by Tom White is essential if you work with large volumes of data. The book guides you through the complexities of Hadoop and helps you unlock the full potential of your data.

5. Added Value Through Data Science: "Data Science für Unternehmen"

This #datamustread book by Foster Provost and Tom Fawcett shows how data science creates value for businesses. It explains the fundamentals of data science and shows how to apply its principles to real business situations.

Visualizing Your Insights: Books on Data Visualization

Would you like to visualize the insights you have gained with data science, and are you interested in books on data visualization? Then feel free to take a look at my books!


Do you have other recommendations for data science books? Share your thoughts and comments on this tweet:

Meetup #19 – Chart Choice & Anomaly Detection for Warranty Cases

Dilyana’s session: Chart Choice – many ways to visualize data

Recently we had the 19th edition of our Data & AI Meetup. This meetup focused on Chart Choice & Anomaly Detection for Warranty Cases. Let’s have a quick recap!

Agenda:

Meetup discussion: Sven, Alexander, and Shubham

  1. Intro & announcements: our 5th anniversary
  2. Chart Choice
    by Dilyana Bossenz, Business Analytics and Enablement Manager at M2.
  3. New Book: Decisively Digital – From Creating a Culture to Designing Strategy
    by Alexander Loth, author & executive advisor at Microsoft
  4. Anomaly Detection for warranty cases with an example of the automotive industry
    by Shubham Agarwal, Lead Data Scientist at ATCS
    and Frank Schlemmbach, Sr. Consultant at ATCS
    and Sven Sommerfeld, Managing Director at ATCS
  5. Wrap-up

Session recording:

Further information:

The next Data & AI Meetup?

The next Data & AI Meetup will be announced on the Data & AI LinkedIn group and on the Data & AI Meetup page. Feel free to join!

If you’ve dreamed of sharing your Data & AI story with many like-minded Data & AI enthusiasts, please submit your session proposal.

How to Research LinkedIn Profiles in Tableau with Python and Azure Cognitive Services

Azure Cognitive Services in Tableau: using Python to access the Web Services API provided by Microsoft Azure Cognitive Services

A few weeks after the fantastic Tableau Conference in New Orleans, I received an email from a data scientist who attended my TC18 social media session, and who is using Azure+Tableau. She had quite an interesting question:

How can a Tableau dashboard that displays contacts (name & company) automatically look up LinkedIn profile URLs?

Of course, researching LinkedIn profiles for a huge list of people is a very repetitive task. So let’s find a solution to improve this workflow…

Step by Step: Integrating Azure Cognitive Services in Tableau

1. Python and TabPy

We use Python to build API requests, communicate with Azure Cognitive Services, and verify the returned search results. In order to use Python within Tableau, we need to set up TabPy. If you haven’t done this yet, check out my TabPy tutorial.

2. Microsoft Azure Cognitive Services

One of the many APIs provided by Azure Cognitive Services is the Web Search API. We use this API to search for name + company + "linkedin". The first three results are then validated by our Python script. One of the results should contain the corresponding LinkedIn profile.
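As a minimal sketch of the request-building part, the search path can be assembled with Python's standard library. The helper name build_search_path is hypothetical; the endpoint path and the q and count parameters are the ones used in the Calculated Field in step 3.

```python
from urllib.parse import urlencode


def build_search_path(name: str, company: str, count: int = 3) -> str:
    """Build the request path for a web search on name + company + 'linkedin'."""
    query = f"{name} {company} linkedin"
    # urlencode escapes spaces and special characters in the query string
    return "/bing/v7.0/search?" + urlencode({"q": query, "count": str(count)})


# Hypothetical example values
print(build_search_path("Jane Doe", "Contoso"))
```

Keeping the path construction in a small function makes it easy to check the encoded query before sending any request.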

3. Calculated Field in Tableau

Let’s wrap our Python script together and create a Calculated Field in Tableau:

SCRIPT_STR("
import http.client, urllib.parse, json
YOUR_API_KEY = 'xxx'
name = _arg1[0]
company = _arg2[0]
try:
    headers = {'Ocp-Apim-Subscription-Key': YOUR_API_KEY}
    # Search for name + company + 'linkedin' and request the first three results
    params = urllib.parse.urlencode({'q': name + ' ' + company + ' linkedin', 'count': '3'})
    connection = http.client.HTTPSConnection('api.cognitive.microsoft.com')
    connection.request('GET', '/bing/v7.0/search?%s' % params, headers=headers)
    json_response = json.loads(connection.getresponse().read().decode('utf-8'))
    connection.close()
    # Return the first result whose title contains the name and whose URL is a profile page
    for result in json_response['webPages']['value']:
        if name.lower() in result['name'].lower():
            if 'linkedin.com/in/' in result['displayUrl']:
                return result['displayUrl']
except Exception as e:
    return ''
return ''
", ATTR([Name]), ATTR([Company]))

4. Tableau dashboard with URL action

Adding a URL action with our new Calculated Field will do the trick. Now you can click on the LinkedIn icon and a new browser tab (or the LinkedIn app if installed) opens.

LinkedIn demo on Tableau Public

Is this useful for you? Feel free to download the Tableau workbook – don’t forget to add your API key!
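The validation logic from the Calculated Field can also be tried outside Tableau, without an API key. The following is a sketch under stated assumptions: find_linkedin_url is a hypothetical helper name, and the sample entries are made-up data shaped like the items in json_response['webPages']['value'].

```python
def find_linkedin_url(name, results):
    """Return the first URL that looks like the person's LinkedIn profile, else ''."""
    for result in results:
        title = result.get("name", "")
        url = result.get("displayUrl", "")
        # Same two checks as in the Calculated Field: name in title, profile path in URL
        if name.lower() in title.lower() and "linkedin.com/in/" in url:
            return url
    return ""


# Made-up sample data, not real search results
sample = [
    {"name": "Contoso | LinkedIn",
     "displayUrl": "https://www.linkedin.com/company/contoso"},
    {"name": "Jane Doe - Analyst - Contoso | LinkedIn",
     "displayUrl": "https://www.linkedin.com/in/janedoe"},
]

print(find_linkedin_url("Jane Doe", sample))
```

Returning an empty string when nothing matches mirrors the fallback in the Calculated Field, so the URL action simply has nothing to open for that contact.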

Get More Insights

This tutorial is just the tip of the iceberg. If you want to dive deeper into the world of data visualization and analytics, don’t forget to order your copy of my new book, Visual Analytics with Tableau (Amazon). This comprehensive guide offers an in-depth exploration of data visualization techniques and best practices.

I’d love to hear your thoughts. Feel free to leave a comment, share this tweet, and follow me on Twitter and LinkedIn for more tips, tricks, and tutorials on Azure Cognitive Services in Tableau and other data analytics topics.

Also, feel free to comment and share my Azure Cognitive Services in Tableau tweet:

Data Science Toolbox: How to use Julia with Tableau

Julia in Tableau: R allows Tableau to execute Julia code on the fly, enhancing your data analytics experience.

Michael, a data scientist working for a German railway and logistics company, recently told me during a FATUG Meetup that he loves Tableau’s R integration and Tableau’s Python integration. He then asked whether his team could also use functions they have written in Julia. Julia, a high-level dynamic programming language for high-performance numerical analysis, is an integral part of the newly developed data strategy in Michael’s organization.

Tableau, however, does not come with native support for Julia. I didn’t want to let Michael’s team down, so I looked for an alternative way to integrate Julia with Tableau.

The solution described here, which uses R as a bridge to Julia, has been working flawlessly in a production environment for several months. In this tutorial, I’ll walk you through installing the components and connecting Tableau with R and Julia. I’ll also show an example of calling a Julia statement from Tableau to calculate the volume of a sphere.

Step by Step: Integrating Julia in Tableau

1. Install Julia and add PATH variable

You can download Julia from julialang.org. Add Julia’s installation path to the PATH environment variable.

2. Install R, XRJulia, and Rserve

You can download base R from r-project.org. Next, start R from the terminal and install the XRJulia and Rserve packages:

> install.packages("XRJulia")
> install.packages("Rserve")

XRJulia provides an interface from R to Julia. Rserve is a TCP/IP server that allows Tableau to use the facilities of R.

3. Load libraries and start Rserve

After the packages are successfully installed, we load them and run Rserve:

> library(XRJulia)
> library(Rserve)
> Rserve()

Make sure to repeat this step every time you restart your R session.

4. Connecting Tableau to Rserve

Now let’s open the Help menu in Tableau Desktop and choose Settings and Performance > Manage External Service Connection to open the External Service Connection dialog box:

TC17 External Service Connection

Enter a server name using a domain or an IP address and specify a port. Port 6311 is the default port used by Rserve. Take a look at my R tutorial to learn more about Tableau’s R integration.
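Before pointing Tableau at the server, it can help to confirm that something is listening on the Rserve port. The following is a small sketch using only Python's standard library; is_port_open is a hypothetical helper name, and it only checks that a TCP connection can be opened, not that the listener is actually Rserve.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names
        return False


# Example call (assumes Rserve runs locally on its default port):
# is_port_open("localhost", 6311)
```

If the check fails, the usual suspects are that Rserve() was not started in the current R session or that a firewall blocks the port.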

5. Adding Julia code to a Calculated Field

You can invoke Calculated Field functions called SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT to embed your Julia code in Tableau, such as this simple snippet that calculates sphere volume:


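SCRIPT_REAL('
library(XRJulia)
# Start a Julia evaluator once and reuse it across calls
if (!exists("ev")) ev <- RJulia()
# %s is replaced by .arg1 (the radius); [Factor] scales the result
y <- juliaEval("
4 / 3 * pi * %s ^ 3 * ' + STR([Factor]) + '
", .arg1)
',
[Radius])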
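For cross-checking the numbers Tableau displays, the standard sphere volume formula V = 4/3 · π · r³ can be evaluated independently. This sketch uses Python rather than Julia and leaves out the [Factor] value from the Calculated Field; sphere_volume is a hypothetical helper name.

```python
import math


def sphere_volume(radius: float) -> float:
    """Volume of a sphere with the given radius: 4/3 * pi * r^3."""
    return 4.0 / 3.0 * math.pi * radius ** 3


# Print reference values for a few radii to compare against the worksheet
for r in (1.0, 2.0, 3.0):
    print(r, round(sphere_volume(r), 4))
```

Because the volume grows with the cube of the radius, doubling the radius should multiply the result by eight, which is a quick plausibility check for the Tableau output.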

6. Use Calculated Field in Tableau

You can now use your Julia calculation as an alternate Calculated Field in your Tableau worksheet:

Using Julia calculations within Tableau (click to enlarge)

Feel free to download the Tableau Packaged Workbook (twbx) here.

Further Reading: Mastering Julia

If you want to go beyond this tutorial and explore more about Julia in the context of data science, I recommend the book Mastering Julia. You can find it here.

Further Reading: Visual Analytics with Tableau

Join the data science conversation and follow me on Twitter and LinkedIn for more tips, tricks, and tutorials on Julia in Tableau and other data analytics topics. If you’re looking to master Tableau, don’t forget to preorder your copy of my upcoming book, Visual Analytics with Tableau (Amazon). It offers an in-depth exploration of data visualization techniques and best practices.

Also, feel free to comment and share my Tableau Julia Tutorial tweet: