The rise of artificial intelligence (AI) has affected every industry, but the exploitation of data in Major League Baseball (MLB) is the definition of game-changing.
“New data sources are coming online all the time,” said Oliver Dykstra, a data engineer at the Texas Rangers, who told ZDNET that his job is to turn the information the organization collects into a competitive advantage.
Dykstra has been with the Rangers since October 2022 and was part of the behind-the-scenes squad that supported the players in their 2023 World Series win.
“It’s a great team to work with,” he said. “It’s amazing to see the impact straightaway in real-life situations. I’ve never had a job where you can celebrate your wins quite like you can in a sports team.”
Dykstra has learned some important lessons during his two years with the Rangers. Here are five ways AI and data are helping to change baseball.
1. Providing better predictions
Dykstra said the key thing he’s learned from using AI is the importance of data-powered predictive matchups.
“We can run those scenarios a lot faster and get a better sense of what’s out there,” he said. “It’s about being able to toy with these matchups and run simulations to see how a game could go if we put in this guy or another or do particular pitch sequencing.”
Dykstra said his department runs hundreds of models, across many areas of the game, that constantly churn out fresh information.
“From the top level, we do full-season predictions — how many wins we think we’ll get, and the other teams in our division. We were very accurate in 2023.”
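The kind of matchup simulation Dykstra describes can be illustrated with a small Monte Carlo sketch. This is not the Rangers' model: the pitcher names, the on-base probabilities, and the simplification that every plate appearance is an independent coin flip are all assumptions made for illustration.

```python
import random


def simulate_half_inning(on_base_prob: float, rng: random.Random) -> int:
    """Return how many batters reach base before three outs are recorded."""
    outs = 0
    baserunners = 0
    while outs < 3:
        if rng.random() < on_base_prob:
            baserunners += 1
        else:
            outs += 1
    return baserunners


def average_baserunners(on_base_prob: float, trials: int = 20_000, seed: int = 0) -> float:
    """Estimate the mean number of baserunners per half-inning by simulation."""
    rng = random.Random(seed)
    total = sum(simulate_half_inning(on_base_prob, rng) for _ in range(trials))
    return total / trials


# Hypothetical on-base probabilities allowed by two candidate pitchers
# against the same stretch of the lineup (made-up numbers).
candidates = {"pitcher_a": 0.30, "pitcher_b": 0.34}
estimates = {name: average_baserunners(p) for name, p in candidates.items()}

# Pick the option that allows the fewest simulated baserunners.
best = min(estimates, key=estimates.get)
print(estimates, best)
```

A real model would condition on pitch type, count, and batter tendencies, but the loop has the same shape: change an input, rerun many trials, and compare the outcomes.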
Batter tendencies are another important area for predictions.
“Creating that matchup, you can get a pretty clear picture of where batters are more likely to swing and miss,” he said.
That kind of insight can be crucial to pitchers. However, as with insight from any AI-powered project, the cultural impact of using data must be considered.
“You don’t get to be a pitcher by doing whatever someone tells you,” he said. “They have a strong sense of where they’re at. So, our job is to empower them as much as possible.”
2. Creating new partnerships
Internal data talent isn’t the only important resource. Successful MLB teams’ working relationships stretch beyond the enterprise.
Dykstra said the Rangers collect data from disparate sources and use a combination of Apache Airflow and Astronomer’s orchestration and observability platform to ensure staff and players receive timely insights.
“We wanted something that could be dynamic and more manageable and give us a lot of insight,” he said.
Dykstra’s department works with Astronomer to help manage the Airflow implementation and the huge amount of data being processed.
“It’s not just the pro level we’re working with. Think about the dynamic nature of the game. At any point in time, you could have one game going on in a day or 1,000 across the country and the world,” he said.
“The flow of data is not that consistent, and if information in one of those pieces starts taking longer, it could throw off the whole chain. Managing the supporting infrastructure would require a lot of upkeep and mean we couldn’t look to the future as much as we would like to.”
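Dykstra's point that one slow piece "could throw off the whole chain" can be shown with a small dependency sketch. It uses Python's standard-library `graphlib` rather than Airflow's own API, and the task names and durations are hypothetical.

```python
from graphlib import TopologicalSorter


def finish_times(dependencies, durations):
    """Earliest finish time (in minutes) of each task, given its upstream tasks."""
    finished = {}
    # static_order() yields every task after all of its upstream tasks.
    for task in TopologicalSorter(dependencies).static_order():
        start = max((finished[up] for up in dependencies.get(task, ())), default=0)
        finished[task] = start + durations[task]
    return finished


# Hypothetical post-game pipeline: each task maps to the tasks it waits on.
dependencies = {
    "ingest_pitch_tracking": set(),
    "ingest_weather": set(),
    "build_features": {"ingest_pitch_tracking", "ingest_weather"},
    "postgame_report": {"build_features"},
}
durations = {
    "ingest_pitch_tracking": 10,
    "ingest_weather": 2,
    "build_features": 5,
    "postgame_report": 3,
}

normal = finish_times(dependencies, durations)
# The same pipeline when one upstream feed is unusually slow.
slow = finish_times(dependencies, {**durations, "ingest_weather": 30})

print(normal["postgame_report"], slow["postgame_report"])
```

In this sketch the report is ready after 18 minutes on a normal night and after 38 when the weather feed lags, even though the report task itself did not change. That is the kind of knock-on delay an orchestrator is meant to surface.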
3. Removing manual tasks
Dykstra described baseball as a text-heavy industry. The Rangers rely on scouts around the globe. Turning their written reports into useful data can be hard work — and that’s where generative AI (Gen AI) can help.
“There are a lot of secret terms and codes that scouts use. It’s too much for one person to read through all that information, and it’s sometimes hard to understand,” he said. “Extracting the value can be difficult. But with LLMs and generative AI, we can sort through these summaries, provide a great dictionary to translate key phrases, and summarize.”
Dykstra said much of the team’s work on Gen AI is exploratory, including the project to help turn scout information into useful insights.
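The "dictionary to translate key phrases" idea can be sketched without an LLM as a plain glossary lookup. This is a deliberately simple stand-in, not the team's approach: the glossary holds a few common pitch abbreviations chosen for illustration, and the sample report line is invented.

```python
import re

# Illustrative glossary of common pitch abbreviations (an assumption, not
# the Rangers' actual scouting dictionary).
GLOSSARY = {"FB": "fastball", "CH": "changeup", "SL": "slider"}

# Match glossary terms only as whole words, case-sensitively.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, GLOSSARY)) + r")\b")


def expand_shorthand(report: str) -> str:
    """Append the plain-language meaning after each known abbreviation."""
    return _PATTERN.sub(lambda m: f"{m.group(1)} ({GLOSSARY[m.group(1)]})", report)


example = "FB sits mid-90s; CH flashes late fade."  # invented report line
expanded = expand_shorthand(example)
print(expanded)
```

A fixed lookup only covers terms someone has already catalogued, which is one reason a language model is attractive for shorthand that varies from scout to scout.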
He said the organization had used the Llama LLM. The franchise’s other technology partners, including Databricks and Amazon, support investigations into additional models.
The Rangers are also exploring how they might use retrieval-augmented generation to ingest the baseball rule book and produce useful information for staff and spectators.
“That information changes a lot. One example might be healthcare and providing a chat interface for our people to explore the rules,” he said.
“There are also rules for people who visit the stadium. They have questions, such as ‘Can I bring a water bottle? Do I need to have a see-through backpack?’”
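The retrieval step of retrieval-augmented generation can be sketched in a few lines. The example below is a toy: it ranks passages by word-overlap cosine similarity instead of learned embeddings, the passages are placeholders rather than real stadium policy, and the prompt it builds would still need to be sent to a language model, which is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(question, passages))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Placeholder passages standing in for an indexed policy document.
PASSAGES = [
    "Placeholder text about bringing a water bottle into the stadium.",
    "Placeholder text about backpack and bag requirements.",
    "Placeholder text about parking and gate opening times.",
]

print(build_prompt("Can I bring a water bottle?", PASSAGES))
```

Because the answer is grounded in retrieved text, updating the indexed passages is enough to keep responses current when the rules change.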
4. Monitoring other factors
Player data isn’t the only potential source of competitive advantage. Dykstra said the team also feeds its models with external information, including weather data.
“This is a hot new source. Every five minutes, we’re getting data from all the different fields,” he said. “The weather dynamics in a stadium are not quite what you would think they would be. You can’t just lift your finger. It’s not something you can necessarily intuitively get.”
The Rangers’ home stadium, Globe Life Field, has a retractable roof, and conditions can vary considerably from open stadiums in other locations around the US.
“It’s crucial to give the players feedback and say, ‘The wind gotcha. Back at home, that would have been a home run, so just keep doing what you’re doing. That was great.’ They want that feedback immediately — they want it right after the game,” he said.
“Next day, they want to wake up and focus on the next game. Astronomer’s ability to meet those data windows and deliver insights to our people as quickly as possible after the game helps with everything.”
5. Building new cultures
Industry experts say organizations must democratize data access to make the most of the insight created by emerging technologies.
Dykstra said that’s exactly what’s happened at the Rangers, particularly in the manager’s willingness to embrace data-powered opportunities.
“I’ve been incredibly impressed with Bruce Bochy. He brings the two worlds together and uses his gut to challenge whatever assumptions we’re making,” he said.
Dykstra explained how the Rangers have a data analyst embedded within the team to help ensure coaches and players make the most of data: “It’s always a conversation.”
Of course, the widespread use of data can bring risks. He said the Rangers must abide by MLB’s strict rules and regulations.
“The MLB heavily restricts what kind of feedback we can give our players and coaches during the game,” he said.
“Success is all about understanding how your data is moving, where it’s coming from, where it’s going, and being able to communicate that journey effectively. It’s a clear path.”