How to unleash Data Science with an MBA?

Servers record a copy of LHC data and distribute it around the world for Analytics

My Data Science journey started at CERN, where I finished my master's thesis in 2009. CERN, the European Organization for Nuclear Research, is the home of the Large Hadron Collider (LHC) and has some big questions to answer, such as how the universe works and what it is made of. CERN collects nearly unbelievable amounts of data – 35 petabytes per year, all of which needs analysis. After submitting my thesis, I continued my Data Science research at CERN.

I began to wonder: Which insights are waiting to be discovered beyond Particle Physics? How can traditional companies benefit from Data Science? After almost four exciting years at CERN with plenty of Hadoop and Map/Reduce, I decided to join Capgemini to develop business in Big Data Analytics and to boost their engagements in Business Intelligence. To build on my data-driven background, I enrolled in the Executive MBA program at Frankfurt School of Finance & Management, including an Emerging Markets module at CEIBS in Shanghai.

Today, companies have realized that Business Analytics needs to be an essential part of their competitive strategy, and the demand for Data Scientists is growing rapidly. To me, Data Science is more about asking the right questions than about the actual data. The MBA enabled me to understand that data does not provide insights unless it is appropriately questioned. Delivering excellent Big Data projects requires a full understanding of the business, developing the questions, distilling the right amount of data to answer those questions, and communicating the proposed solution to the target audience.

"The task of leaders is to simplify. You should be able to explain where you have to go in two minutes." – Jeroen van der Veer, former CEO of Royal Dutch Shell

Transition from Academia to Capgemini: A New Chapter in Data and Analytics

CERN Main Auditorium: my transition from academia to Capgemini

After enjoying research for the last four years, especially during my time at CERN, I have made a significant decision. I have decided to resign from my postgraduate position and make a transition from academia to the exciting world of Capgemini. My passion for Data and Analytics remains strong and will be the core focus of my new role.

Capgemini: A New Adventure After Academia

Capgemini, one of the world’s largest consulting corporations, has caught my attention. Unlike many other consulting companies, Capgemini does not yet have a dedicated team to offer effective strategies and solutions employing Big Data, Analytics, and Machine Learning. This presents an exciting opportunity for me to contribute and innovate.

My Vision: Building a Data-Driven Future at Capgemini

I love these technologies and am confident in my ability to develop a business development plan that drives growth. Starting from a definition of customers and markets, my plan includes new services such as:

  • Data Science Strategy: Enabling organizations to solve problems with insights from analytics.
  • Consulting: Answering questions using data.
  • Development: Building custom tools like interactive dashboards, pipelines, customized Hadoop setups, and data prep scripts.
  • Training: Offering training at various skill levels, from basic dashboard design to deep dives in R, Python, and D3.js.

This plan also includes a go-to-market strategy, which I’ll keep under wraps for now. Stay tuned for a retrospective reveal in the future!

Reflecting on My Transition from Academia

Making this transition from academia to a corporate role has been a considered decision. As I previously shared in my reflection on my software engineering internship at SAP, the blend of technological challenges and team collaboration has always intrigued me. Joining Capgemini allows me to continue pursuing my passion for data in a dynamic business environment.

Conclusion: Exciting Times Ahead

This transition from academia to Capgemini marks a thrilling new chapter in my career. I look forward to leveraging my expertise in Data and Analytics to contribute to Capgemini’s growth and innovation.

Follow my journey as I explore the intersection of data, technology, and business. Connect with me on Twitter and LinkedIn.

CERN Photographs featured in Gallery curated by Yahoo

John Ellis at CERN

Two of my photos taken at CERN are featured in the CERN Gallery curated by the Yahoo Editorial:

Physics projects don’t get any bigger than this. The active European Organization for Nuclear Research, aka CERN, formed in 1954 and is headquartered in Geneva, Switzerland, employs thousands of world-class scientists on the forefront of breakthrough research. Its claim to fame is unmatched as the origin of the World Wide Web and creator of underground 17-mile-long particle accelerator called the Large Hadron Collider. Here, see photos of the many aspects of an international institution that may discover a way to move faster than the speed of light and how our universe was pieced together.


Challenges of Big Data Analytics in High-Energy Physics

Challenges of Big Data Analytics: volume, variety, velocity and veracity
Screenshot of CERN Big Data Analytics presentation

There are four key issues to overcome if you want to tame Big Data: volume (quantity of data), variety (different forms of data), velocity (how fast the data is generated and processed) and veracity (variation in quality of data). You have to be able to deal with lots and lots of data, of all kinds and of varying quality, moving really quickly.

That is why Big Data Analytics has a huge impact on how we plan CERN’s overall technology strategy as well as specific strategies for High-Energy Physics analysis. We want to profit from our data investment and extract the knowledge. This has to be done in a proactive, predictive and intelligent way.

The following presentation shows you how we use Big Data Analytics to improve the operation of the Large Hadron Collider.

Displaying Dimuon Events from the CMS Detector using D3.js

Physicists working on the CMS Detector

I have been a Python geek and GnuPlot maniac ever since I joined CERN around three years ago. I have to admit, however, that I really enjoy the flexibility of D3.js and its capability to render histograms directly in the web browser.

D3 is a JavaScript library for manipulating documents based on data. It helps you bring data to life using HTML, CSS and SVG, and embed the result in your website.

The following example loads a CSV file, which includes 10,000 dimuon events (i.e. events containing two muons) from the CMS detector, and displays the distribution of the invariant mass M (in GeV, in bins of size 0.1 GeV):

Feel free to download the sample CSV dataset here.
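
For readers who want to see the binning step on its own, here is a minimal sketch in plain JavaScript, without D3. It is not the code behind the chart above. It assumes the CSV has a header row with a column named M holding the invariant mass in GeV (the column name is my assumption for illustration), and the sample values are made up. The function names parseCsv, binMasses and toHistogram are hypothetical helpers.

```javascript
// Minimal sketch: count events in fixed-width invariant-mass bins.
// ASSUMPTION: the CSV has a header row and a column named "M" (mass in GeV).
// The real dataset's column names may differ.

const BIN_WIDTH = 0.1; // GeV, the bin size mentioned in the text

// Parse simple comma-separated text (no quoted fields) into row objects.
function parseCsv(text) {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",").map((name) => name.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row = {};
    header.forEach((name, i) => {
      row[name] = (cells[i] ?? "").trim();
    });
    return row;
  });
}

// Count events per bin; returns a Map from bin index to event count.
// Note: values lying exactly on a bin edge may fall into the neighbouring
// bin because of floating-point rounding.
function binMasses(rows, column = "M", binWidth = BIN_WIDTH) {
  const counts = new Map();
  for (const row of rows) {
    const mass = Number.parseFloat(row[column]);
    if (!Number.isFinite(mass) || mass < 0) continue; // skip malformed rows
    const index = Math.floor(mass / binWidth);
    counts.set(index, (counts.get(index) || 0) + 1);
  }
  return counts;
}

// Turn the counts into a sorted array of { lower, upper, count } objects,
// a shape that a charting library can draw as bars.
function toHistogram(counts, binWidth = BIN_WIDTH) {
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([index, count]) => ({
      lower: index * binWidth,
      upper: (index + 1) * binWidth,
      count,
    }));
}

// Hypothetical sample input, not taken from the real dataset.
const sampleCsv = "M\n3.05\n3.12\n3.17\n9.46\n";
const histogram = toHistogram(binMasses(parseCsv(sampleCsv)));
console.log(histogram);
```

Empty bins are simply absent from the result, so a renderer would have to treat missing bins as zero. The resulting array could then be handed to D3, or any other charting library, to draw one bar per bin.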

Further reading: D3 Cookbook