Volume 53 Issue 06 July/August 2020
Careers

From Academia to Major League Baseball: The Journey of a Data Scientist

My love for mathematics blossomed in a linear algebra course during my sophomore year at Pomona College. I felt truly challenged in the subject for the first time, and I enjoyed the sense of accomplishment that came with grasping complex topics. A certain beauty exists within mathematics, inherent in the way that one can prove something given only a few base assumptions and a series of logical statements.

One day, my professor suggested that I apply to the Research Experiences for Undergraduates (REU) program to gain experience in mathematical research. REUs expose undergraduate students to research in their respective disciplines, provide opportunities for networking, and offer a taste of the graduate school experience. REU projects receive funding from the National Science Foundation, which helps support participating undergraduates as they work on research projects at host institutions. During an REU, faculty members or researchers in the student’s field serve as mentors and teachers. I would strongly urge any undergraduate SIAM News reader who is interested in graduate school to apply for an REU. Thanks to my professor’s encouragement, I ended up participating in two REU programs during my remaining collegiate summers. The mathematics with which I engaged during the REUs was well beyond the scope of classroom instruction, and both of my REU research groups published their results; I was hooked! My participation in these programs altered my career path and ultimately inspired me to pursue a doctoral degree in applied mathematics at Iowa State University.

Mike Dairyko, Senior Manager of Data Science for the Milwaukee Brewers Baseball Club, at Miller Park on opening day in 2019. Photo courtesy of Danny Henken.

In my early years of graduate school, I was convinced that I was going to be a mathematics professor at a small liberal arts college. My vision changed after I took “Introduction to Machine Learning” to complete a cognate course requirement for my degree. Machine learning piqued my interest because it was a combination of mathematics, statistics, and computer science. As the course progressed, I found myself studying machine learning during the time I had set aside for research. Then I discovered data science and knew it was the area in which I wanted to pursue a career.

After earning my Ph.D., I began to look for jobs in the data science community. In my opinion, networking is an essential skill that is worth developing before the job search begins. Throughout my job search, I utilized the professional network that I had built over the course of my undergraduate and graduate years. As a result, I received an invitation to interview for a data science position with the Milwaukee Brewers Baseball Club.

I am currently the Senior Manager of Data Science for the Milwaukee Brewers. I lead the data science portion of the Strategy and Analytics Department for Business Operations and manage another data scientist. My group acts as an internal consultant to support various departments within the Brewers, including Ticket Sales, Stadium Operations, and Marketing. My job scope is broad, but at the core I use machine learning to provide mathematical insights in relation to ticket sales and revenue. I have helped develop models to project game-by-game ticket sales, turnstile, and revenue; likelihood of ticket purchase; marketing impact on ticket sales; and much more. I employ a combination of the programming language Python, the database query language SQL, and the dashboard tool Tableau to build my models, access and manipulate data, and create visualizations of my outputs.

During the season, one of my main priorities is to produce game-by-game ticket and revenue projections. To do so, my group incorporates historical data (such as team performance, weather, and schedules) into multiple regression-based models and then consolidates the outputs in an easily digestible format. A large codebase both automates and maintains this process; the codebase is regularly tweaked to ensure that it is agile enough to handle the constant usage and flow of new information. While I take a quantitative approach to creating these projections, the Ticket Sales Department relies on a more qualitative approach that draws on institutional knowledge. A few days before each game, we meet to align the game forecast before distributing it throughout the organization. Most of the time, the two projections are relatively close. Whenever major discrepancies are present in the numbers, we find either minor bugs in the code or a need to update institutional knowledge. Our projections are most accurate when we utilize both qualitative and quantitative forecasts. These projections are then used for a variety of internal purposes, like concession and usher staffing, season-wide budgeting, and marketing.
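For readers curious about what a regression-based projection might look like in practice, the following is a minimal sketch in Python and not the Brewers' actual models or codebase. Everything in it is an assumption made for illustration: the three features (win percentage, game-time temperature, and a weekend flag), the synthetic ticket counts, the function names, and the 10 percent tolerance used to flag a gap between the model's forecast and a qualitative one.

```python
from typing import List, Sequence


def fit_linear_regression(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0, *x] for x in X]  # prepend a column of ones for the intercept
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit the model")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs: Sequence[float], x: Sequence[float]) -> float:
    """Projected ticket sales for one game with feature vector x."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))


def flag_discrepancy(model_forecast: float, sales_forecast: float,
                     tolerance: float = 0.10) -> bool:
    """True when the two forecasts differ by more than `tolerance`,
    measured relative to their average (hypothetical threshold)."""
    midpoint = (model_forecast + sales_forecast) / 2
    return abs(model_forecast - sales_forecast) / midpoint > tolerance


# Synthetic history: (win percentage, temperature in F, weekend flag).
features = [
    (0.50, 60, 0), (0.55, 72, 1), (0.48, 55, 0),
    (0.60, 80, 1), (0.52, 68, 0), (0.58, 75, 1),
]
# Synthetic ticket counts generated from a made-up linear relationship.
tickets = [15000 + 20000 * w + 100 * t + 5000 * d for w, t, d in features]

coefs = fit_linear_regression(features, tickets)
upcoming_game = (0.54, 70, 1)
model_forecast = predict(coefs, upcoming_game)
print(f"Model projection: {model_forecast:,.0f} tickets")
print("Needs review:", flag_discrepancy(model_forecast, sales_forecast=30000))
```

In a real setting, one would more likely reach for an established statistics or machine learning library and validate on held-out games; the hand-rolled solver here only keeps the example self-contained.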

A typical day at Miller Park, home of the Milwaukee Brewers, tends to involve a balance of individual and group work. I usually begin with a team meeting to provide status updates on various projects and offer assistance for any problems that arise within the group. I then spend most of my time developing SQL queries and Python scripts to assist with larger projects or answer various questions for upper management. I also handle administrative tasks that aid in the distribution of various model outputs to individuals within the organization. Sometimes I meet with personnel from other departments to discuss and interpret model projections. And whenever there is a game during work hours, I take a break and watch an inning or two in the ballpark!

Mathematicians are ultimately trained to develop problem-solving skills and apply them with persistence and creativity. For example, they will likely face many failed attempts when completing a problem set or conducting research. Carefully reviewing the work—and perhaps redoing it a different way or approaching the issue from another angle—eventually leads to success. I liken my position’s level of difficulty to that of conducting research for my dissertation. With that said, I do not apply the same high-level proof techniques from graduate school to my current work. However, I do use the problem-solving strategies, persistence, and creativity that I have honed throughout my mathematical journey every single day.

Although my path to becoming a data scientist was not necessarily linear, I have learned a great deal on the way and can share a few recommendations for those interested in a career in data science. I would encourage students to become comfortable with navigating a programming language such as R or Python. These languages are extremely powerful and indispensable for advanced modeling. A lot of free online content is available to help broaden programming skills. Briefly stepping outside of mathematics and establishing computer science and statistics expertise is also useful. In retrospect, doing so would have greatly benefited me. Finally, participating in conferences with data science content is an excellent way to gain exposure to more advanced topics in the field and build a network within the community.

About the Author