Farewell, Summer Fairy Tale: What Sports Data Reveals About Toni Kroos and German Football. New Episode of "Die Digitalisierung und Wir"

Sports data: "Die Digitalisierung und Wir", episode 23. A conversation with Dr. Karsten Görsdorf about data analytics in sport; the podcast cover shows a young footballer at a camera-based data analysis training session.

Farewell, Summer Fairy Tale

In minute 120+6, the German national team made one last attempt to stave off elimination from the European Championship. Toni Kroos curled a free kick into the penalty area, but Spain's goalkeeper Unai Simón had no trouble claiming the ball. Shortly afterwards the referee blew the final whistle, and Germany's summer fairy tale was over.

The free kick was also Toni Kroos's last touch of the ball, as it brought his football career to an end.

Besides his always dangerous set pieces, Toni Kroos will of course go down in the history books as the German "pass machine": one of the best defensive midfielders of his time, if not the best, who directed and steered the game through his patient distribution from deep. With his brief return to the pink national jersey, Kroos played a considerable part in letting us dream of the European title, at least for a while.

Podcast Episode on Sports Data

One metric often associated with Kroos is the pass completion rate: the share of passes that reach the intended recipient. For Toni Kroos this figure was usually well above 90%. In the opening match of the home European Championship it was a remarkable 99%: Kroos completed 101 passes and misplaced only one.
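
To make the arithmetic behind that figure concrete, here is a minimal Python sketch. The function name `pass_completion_rate` and its parameters are illustrative assumptions, not part of any official statistics feed; only the two input numbers come from the text above.

```python
def pass_completion_rate(completed: int, misplaced: int) -> float:
    """Return the share of attempted passes that reached a teammate, in percent."""
    attempted = completed + misplaced
    if attempted == 0:
        raise ValueError("no passes attempted")
    return 100.0 * completed / attempted


# Figures quoted above for the opening match: 101 completed, 1 misplaced.
kroos_rate = pass_completion_rate(completed=101, misplaced=1)
print(f"{kroos_rate:.1f}%")  # 101 / 102, i.e. about 99.0%
```

Note that the rate says nothing about where a pass went or how much it achieved, which is one reason the metric is criticized later in this post.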

Sport is becoming ever more digital, so more and more sports data, such as the pass completion rate, is being recorded and in some cases evaluated in real time during football matches.

We wanted an expert to explain which data this is, exactly how it is collected, and what conclusions can be drawn from it. So we invited Dr. Karsten Görsdorf, managing director of 4talents analytics GmbH, who has worked in sports analytics for many years, as a guest on our podcast "Die Digitalisierung und Wir".

Sports Data: The Problem with Box-Score Numbers

As you can hear in the podcast episode, Dr. Görsdorf thinks little of so-called box-score statistics, which originated in US sports such as basketball, American football, and baseball. Box scores are numbers that describe the observable events of a football match, for example the number of shots on goal, the kilometers covered, or indeed the pass completion rate. According to Görsdorf, football, with its large element of chance, cannot be squeezed into this box-score logic.

Instead, he sees a future in much finer-grained analyses of positional data, which can reveal "energy mismatches", for example when a freshly substituted striker meets a defender who is already tired. That was presumably the case in the Spain match, when Antonio Rüdiger, who had played a world-class game up to that point, could no longer stop Mikel Merino from heading in Spain's winning goal in the 119th minute.

AI and Data Literacy in Sport

Our discussion with Dr. Görsdorf also covered the role of artificial intelligence (AI) in sport. AI can analyze enormous volumes of player data and recognize patterns that remain invisible to human analysts. With machine learning and deep learning, coaches and teams can gain deeper insights into match strategy and tactics, which ultimately improves performance on the pitch.

Just as important is the data literacy of those who work with these technologies. Sound decisions require a deep understanding of the data and its correct interpretation. Dr. Görsdorf emphasizes that data literacy is becoming ever more important, not only in the analysis of sports data but in all areas of the modern working world.

Another Podcast Recommendation

Incidentally, Toni Kroos himself also thinks very little of the pass completion rate, as he explains in the latest episode of "Lanz und Precht". In the podcast interview, recorded a few days before the Spain match, he says he finds the packing statistic, which counts the opponents bypassed by a pass, far more meaningful.
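
The idea of counting bypassed opponents can be sketched in a few lines of Python. This is a deliberately simplified, one-dimensional illustration and not the official packing methodology: the function name `opponents_bypassed`, the coordinate convention (meters along the pitch, attacking towards larger x values), and the sample positions are all assumptions made for this example.

```python
from typing import Sequence


def opponents_bypassed(passer_x: float, receiver_x: float,
                       opponent_xs: Sequence[float]) -> int:
    """Count opponents positioned between passer and receiver along the pitch.

    Simplifying assumption: only the position along the pitch length counts,
    and the passing team attacks towards larger x values.
    """
    return sum(1 for x in opponent_xs if passer_x < x < receiver_x)


# Hypothetical positions in meters from the passing team's own goal line.
opponents = [20.0, 35.0, 40.0, 50.0, 60.0, 70.0]
bypassed = opponents_bypassed(passer_x=30.0, receiver_x=55.0, opponent_xs=opponents)
print(bypassed)  # opponents at 35, 40 and 50 are taken out of the game
```

Under this sketch a safe sideways or backward pass scores zero, whereas it would still count fully towards the pass completion rate.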

The episode is also worth a listen because he offers a good look behind the scenes of the team's base camp and at the methods of national coach Julian Nagelsmann.

Book Recommendations

Speaking of recommendations: in our conversation Dr. Görsdorf also mentioned three fascinating books that may interest all sports fans.

We warmly recommend them to readers as summer reading. Even though it did not turn out to be a summer fairy tale, the summer of sport is far from over: the Olympic Summer Games in Paris begin in less than three weeks.

#datamustread: Visualize This: The FlowingData Guide to Design, Visualization, and Statistics (2nd Edition) by Nathan Yau

The 2nd edition of Visualize This by Nathan Yau, surrounded by other data and AI books, including my own works such as Decisively Digital and Teach Yourself VISUALLY Power BI.

While my latest book, KI für Content Creation, has just been reviewed by the renowned c’t magazine, I’m happy to continue reviewing books myself. Today, I’m reviewing the just-released second edition of a cornerstone of the data visualization community, Visualize This: The FlowingData Guide to Design, Visualization, and Statistics by Nathan Yau.

Visualize This: A Deep Dive into Data Visualization

Nathan Yau’s Visualize This has long been a staple for data enthusiasts, and the updated second edition brings fresh techniques, technologies, and examples that reflect the rapidly evolving landscape of data visualization.

Core Highlights of This Book

Data-First Approach: Yau emphasizes that effective visualizations start with a deep understanding of the data. This foundational principle ensures that the resulting graphics are not just visually appealing but also accurately convey the underlying information.

Diverse Toolkit: The book introduces a wide range of tools, including the latest R packages, Python libraries, JavaScript libraries, and illustration software. Yau’s pragmatic approach helps readers choose the right tool for the job without feeling overwhelmed by options.

Real-World Applications: With practical, hands-on examples using real-world datasets, readers learn to create meaningful visualizations. This experiential learning approach is particularly valuable for grasping the subtleties of data representation.

Comprehensive Tutorials: The step-by-step guides are a standout feature, covering statistical graphics, geographical maps, and information design. These tutorials provide clear, actionable instructions that make complex visualizations accessible.

Web and Print Design: Yau details how to create visuals suitable for various mediums, ensuring versatility in application whether for digital platforms or printed materials.

Personal Insights on Visualize This

Having taught data strategy and visualization for seven years, I find Visualize This to be an exceptional resource for a broad audience. Yau skillfully integrates scientific data visualization techniques with graphic design principles, providing practical advice along the way. The book’s toolkit is extensive, featuring R, Illustrator, XML, Python (with BeautifulSoup), JSON, and more, each with working code examples to demonstrate real-world applications.

Visualize This, Chapter 5 ("Visualizing Categories"): the colorful "Cycle of Many" visualization, a 24-hour snapshot of daily activities based on data from the American Time Use Survey.

Even though I read the first edition years ago, I couldn’t put the second edition down all weekend. This book is a must-read for anyone who handles data or prepares data-based reports. Its beautiful presentation and careful consideration of every aspect—from typeface to page layout—make it a pleasure to read.

The book is user-friendly, offering a massive set of references and free tools for obtaining interesting datasets across various fields, from sports to politics to health. This breadth of resources is crucial for anyone looking to create impactful visualizations across different domains.

While the focus on Adobe Illustrator might be daunting due to its cost and learning curve, Yau’s examples show how Illustrator can enhance graphics created in other tools like SAS and R. I personally prefer the open-source Inkscape, but Yau’s insights helped me overcome my initial reluctance to use Illustrator, leading to more polished and professional visuals.

Yau uses R, Python, and Adobe Illustrator to demonstrate what can be achieved with imagination and creativity. Although some readers might desire more complex walkthroughs from raw data to final graphics, such material would require substantial foundational knowledge in R and Python. Including this would make the book significantly thicker and veer off from its focus on creating visually appealing graphics.

Conclusion: Visualize This is Essential Reading for Data Professionals

Visualize This (Amazon) is an indispensable guide for anyone serious about data visualization. Its methodical, data-first approach, combined with practical tutorials and a comprehensive toolkit, makes it a must-read for information designers, analysts, journalists, statisticians, and data scientists.

For those looking to refine their data visualization skills and create compelling, accurate graphics, this book offers invaluable insights and techniques.

Connect with me on LinkedIn and Twitter for more reviews and insights on the latest in data & AI, and follow #datamustread.


Unlocking the Power of Data Science with Excel: Discover the Book "Data Smart"

Exploring the depths of Data Science with Excel: A glimpse into "Data Smart" by Jordan Goldmeier, a must-read for data enthusiasts.

Data Smart (Amazon) is an exceptional guide that creatively uses Microsoft Excel to teach data science, making complex concepts accessible to business professionals. This 2nd edition, masterfully updated by Jordan Goldmeier, arrives a decade after John Foreman’s highly acclaimed original version, bringing fresh perspectives and contemporary insights to the renowned first edition.

Whether you’re a novice or a seasoned analyst, this book provides valuable insight and skill enhancement without requiring extensive programming knowledge. The practical, problem-solving approach ensures that you not only understand the theory, but also how to apply it in real-world scenarios. That’s why I’ve chosen Data Smart as our latest pick for the #datamustread book club.

Why "Data Smart" is a #datamustread

Data Smart stands out in the realm of data science literature. Its approachable and practical methodology is a breath of fresh air for business professionals and data enthusiasts alike. Here’s why this book is an indispensable resource:

1. Excel as Your Data Science Laboratory:
The use of Excel, a tool many of us are familiar with, to unravel data science concepts is nothing short of brilliant. This approach significantly flattens the learning curve, making complex techniques more digestible.

2. Practical Learning through Real Business Problems:
Each chapter of the book introduces a different data science technique via a relatable business scenario. This context-driven approach makes the learning experience tangible and immediately applicable.

3. No Programming, No Problem:
The author’s method of teaching data science without delving into programming languages makes the content accessible to a broader audience.

4. Excel Skills Elevated:
In addition to data science concepts, readers will enhance their Excel prowess with advanced tools like Power Query and Excel Tables.

5. A Spectrum of Techniques:
From cluster analysis to forecasting, the book covers a wide array of methods, making it a comprehensive toolkit for any aspiring data scientist.

6. Fresh Perspectives in the Second Edition:
Goldmeier’s updates are not just cosmetic; they incorporate the latest Excel features, ensuring the content remains relevant in today’s fast-paced tech landscape.

Bridging the Gap with "Teach Yourself VISUALLY Power BI"

While exploring Data Smart, you’ll find parallels with the insights shared in my own book, Teach Yourself VISUALLY Power BI. Both texts aim to make data analytics accessible and actionable, providing a solid foundation for anyone looking to make informed decisions based on data.

Your Journey into Data Science Awaits

Data Smart is a gateway to understanding data science through a familiar and powerful tool: Excel. Whether you’re a beginner or a seasoned analyst, this book will enhance your analytical skills and expand your understanding of data in the business world.

Order Data Smart today and support both the authors and my endeavors in bringing such valuable resources to our community. Let’s dive into this journey of discovery together, transforming data into actionable insights.

Join the Conversation

After delving into Data Smart, I’d love to hear your thoughts and takeaways. Share your insights and join the discussion in our vibrant #datamustread community on LinkedIn and Twitter.


Power BI Tricks: 20 Essential DAX Tricks for Your Power BI Reports – A Comprehensive Guide to Power BI DAX

Even more Power BI DAX tricks in these books: "Datenvisualisierung mit Power BI" and "Teach Yourself Visually Power BI"

Power BI DAX (Data Analysis Expressions) is at the core of Microsoft’s Power BI and offers incredible capabilities for data manipulation and insights. In this post, we’ll explore 20 ultimate DAX tricks to elevate your Power BI reports. Whether you’re a beginner or an expert, these tips will help you unlock the full potential of Power BI and Microsoft Fabric.

20 Ultimate DAX Tricks – Simply Explained

  1. Use CALCULATE for Context Modification 🛠️
    CALCULATE is a powerful function that changes the context in which data is analyzed.
    Example: CALCULATE(SUM('Sales'[Sales Amount]), 'Sales'[Region] = "West")
    This calculates the sum of sales in the West region.
  2. Use RELATED for Accessing Data from Related Tables 🔄
    RELATED function allows you to access data from a table related to the current table.
    Example: RELATED('Product'[Product Name])
    This fetches the product name related to the current row.
  3. Use EARLIER for Row Context 🕰️
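    EARLIER returns a column value from an outer row context, not from a previous row; it is typically used in calculated columns.
    Example: CALCULATE(SUM('Sales'[Sales Amount]), FILTER('Sales', 'Sales'[Sales ID] = EARLIER('Sales'[Sales ID])))
    Here EARLIER returns the Sales ID of the row being evaluated, so FILTER keeps only the rows with that same ID.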
  4. Use RANKX for Ranking 🏅
    RANKX function allows you to rank values in a column.
    Example: RANKX(ALL('Sales'), 'Sales'[Sales Amount], , DESC)
    This ranks sales amounts in descending order.
  5. Use DIVIDE for Safe Division 🧮
    DIVIDE function performs division and handles division by zero.
    Example: DIVIDE([Total Sales], [Total Units])
    This divides total sales by total units and returns BLANK() for division by zero.
  6. Use SWITCH for Multiple Conditions 🔄
    SWITCH function is a better alternative to nested IFs.
    Example: SWITCH([Rating], 1, "Poor", 2, "Average", 3, "Good", "Unknown")
    This assigns a label based on the rating.
  7. Use ALL for Removing Filters 🚫
    ALL function removes filters from a column or table.
    Example: CALCULATE(SUM('Sales'[Sales Amount]), ALL('Sales'))
    This calculates the total sales, ignoring any filters.
  8. Use CONCATENATEX for String Aggregation 🧵
    CONCATENATEX function concatenates a column of strings.
    Example: CONCATENATEX('Sales', 'Sales'[Product], ", ")
    This concatenates product names with a comma separator.
  9. Use USERELATIONSHIP for Inactive Relationships 🔄
    USERELATIONSHIP function allows you to use inactive relationships.
    Example: CALCULATE(SUM('Sales'[Sales Amount]), USERELATIONSHIP('Sales'[Date], 'Calendar'[Date]))
    This calculates sales using an inactive relationship.
  10. Use SAMEPERIODLASTYEAR for Year-Over-Year Comparisons 📆
    SAMEPERIODLASTYEAR function returns the dates of the current context shifted back by one year.
    Example: CALCULATE(SUM('Sales'[Sales Amount]), SAMEPERIODLASTYEAR('Calendar'[Date]))
    This calculates sales for the same period last year.
  11. Use BLANK for Missing Data 🕳️
    BLANK function returns a blank.
    Example: IF('Sales'[Sales Amount] = 0, BLANK(), 'Sales'[Sales Amount])
    This returns a blank if the sales amount is zero.
  12. Use FORMAT for Custom Formatting 🎨
    FORMAT function formats a value based on a custom format string.
    Example: FORMAT('Sales'[Sales Date], "MMM-YYYY")
    This formats the sales date as an abbreviated month name and a four-digit year, for example "Jan-2024".
  13. Use HASONEVALUE for Single Value Validation 🎯
    HASONEVALUE function checks if a column has only one distinct value in the current filter context.
    Example: IF(HASONEVALUE('Sales'[Region]), VALUES('Sales'[Region]), "Multiple Regions")
    This returns the region if exactly one is in context, and "Multiple Regions" otherwise.
  14. Use ISFILTERED for Filter Detection 🕵️‍♀️
    ISFILTERED function checks if a column is filtered.
    Example: IF(ISFILTERED('Sales'[Region]), "Filtered", "Not Filtered")
    This checks if the region column is filtered.
  15. Use MAXX for Maximum Values in a Table 📈
    MAXX function evaluates an expression for each row of a table and returns the largest result.
    Example: MAXX('Sales', 'Sales'[Sales Amount])
    This returns the maximum sales amount.
  16. Use MINX for Minimum Values in a Table 📉
    MINX function evaluates an expression for each row of a table and returns the smallest result.
    Example: MINX('Sales', 'Sales'[Sales Amount])
    This returns the minimum sales amount.
  17. Use COUNTROWS for Counting Rows in a Table 🧮
    COUNTROWS function counts the number of rows in a table.
    Example: COUNTROWS('Sales')
    This counts the number of rows in the Sales table.
  18. Use DISTINCTCOUNT for Counting Unique Values 🎲
    DISTINCTCOUNT function counts the number of distinct values in a column.
    Example: DISTINCTCOUNT('Sales'[Product])
    This counts the number of distinct products.
  19. Use CONTAINS for Lookup Scenarios 🔍
    CONTAINS function checks if a table contains a row with certain values.
    Example: CONTAINS('Sales', 'Sales'[Product], "Product A")
    This checks if "Product A" exists in the Product column of the Sales table.
  20. Use GENERATESERIES for Creating a Series of Numbers 📊
    GENERATESERIES function generates a series of numbers.
    Example: GENERATESERIES(1, 10, 1)
    This generates a series of numbers from 1 to 10 with a step of 1.
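
For readers who think in a general-purpose language, the behavior of a few of these functions can be mimicked in plain Python. This is only a rough sketch to illustrate the semantics described above, not how the DAX engine works: the helper names `divide`, `switch`, and `rank_desc` are made up for this example, `None` stands in for BLANK(), and filter context is ignored entirely.

```python
from typing import Optional, Sequence


def divide(numerator: float, denominator: float,
           alternate: Optional[float] = None) -> Optional[float]:
    """Like DIVIDE (trick 5): return the alternate result instead of failing on zero."""
    if denominator == 0:
        return alternate
    return numerator / denominator


def switch(value, *pairs_and_default):
    """Like SWITCH (trick 6): switch(value, v1, r1, v2, r2, ..., [default])."""
    args = list(pairs_and_default)
    # An odd number of trailing arguments means the last one is the default.
    default = args.pop() if len(args) % 2 == 1 else None
    for candidate, result in zip(args[0::2], args[1::2]):
        if value == candidate:
            return result
    return default


def rank_desc(values: Sequence[float], value: float) -> int:
    """Like a descending RANKX (trick 4): ties share a rank, the next rank is skipped."""
    return 1 + sum(1 for v in values if v > value)


print(divide(10, 0))                                             # None
print(switch(2, 1, "Poor", 2, "Average", 3, "Good", "Unknown"))  # Average
print(rank_desc([100, 250, 250, 80], 100))                       # 3
```

The nested-IF alternative to `switch` would need one branch per rating, which is why SWITCH tends to read better once there are more than two cases.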

Even more Power BI DAX Tricks

📚 If you want to dive even deeper into the world of Power BI, check out my Power BI books 🔗 Teach Yourself Visually Power BI (Amazon) and 🔗 Datenvisualisierung mit Power BI (Amazon)! These books are packed with even more tips, tricks, and tutorials to help you master Power BI. Don’t miss out on these invaluable resources!

Want to stay updated with the latest Power BI insights? Follow me on Twitter and LinkedIn. Share your thoughts, ask questions, and engage with a community of Power BI enthusiasts like yourself.

Feel free to leave a comment, ask questions, or share my Power BI DAX tweets.


#datamustread Data Viz Essentials: The Must-Read Books to Master Data Visualization

#DataVizEssentials 2023: The Must-Read Books to Master Data Visualization

Building on the previous #datamustread recommendations, I’m excited to present the data viz edition of #datamustread. In this post, we’re focusing on the indispensable skill of data visualization. Whether you’re a beginner or a seasoned pro, these five books will guide you to mastery:

  1. 📖 The Big Book of Dashboards
  2. 📖 Storytelling with Data
  3. 📖 The Truthful Art
  4. 📖 Show Me the Numbers
  5. 📖 Teach Yourself VISUALLY Power BI

The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios

A comprehensive guide filled with real-world solutions for building effective business dashboards across various industries and platforms. It’s a go-to resource for matching great dashboards with real-world scenarios.

Storytelling with Data: A Data Visualization Guide for Business Professionals

Cole Nussbaumer Knaflic shares practical guidance on creating compelling data stories. Learn how to make your data visually appealing, engaging, and resonant with your audience.
