Richard Clark

Complex Analysis and Statistics for Sports Data on the NFL, NBA, and MLB

January 24, 2013

Football, basketball, and baseball have two things in common. First, each sport is the “best,” depending on which one I’m watching at a given moment. Second, each sport’s raw data can now be computed and juxtaposed in Wolfram|Alpha, which means arguments over statistics, histories, and comparisons will be better than ever before.

We discussed NFL data in Wolfram|Alpha about a year ago, but as a refresher, while you can ask Wolfram|Alpha all sorts of stuff about teams and players, the real joy comes when you ask questions that are both timely and specific. In that previous blog post, we showed the—unfortunate, in my view—example of how the Rams were, at the time, the NFL team with the worst 3rd down conversion percentage. As a Rams supporter (I make no apologies), I am proud to say that their defense has dramatically improved. This isn’t merely an opinion of mine, either—the evidence is below, in the form of gorgeous graphs.

st louis rams sacks

Honestly, I just wanted to let the world know that the Rams are pretty awesome, and have improved their ability to make sure ball-ish objects go only where they want them to go.

Speaking of ball-ish objects, Wolfram|Alpha can now answer queries regarding professional basketball and baseball. We support queries that are thematically similar to our functionality on the NFL. So, for example, complex analysis of your favorite Miami Heat players is a cinch. Has one of your friends ever asserted that LeBron James has historically recorded more double-doubles than, say, Chris Bosh? That friend would be wrong.

double doubles by lebron james, dwayne wade, chris bosh

If you’d prefer, you can also use Wolfram|Alpha’s natural-language processing to ask it questions that I cannot imagine most announcer-people discussing. The two teams with more than four coach technical fouls are the Phoenix Suns and the New York Knicks, which sit in last and first place in their respective divisions, but the reason, like happiness or true love, escapes me.

NBA teams with more than 4 coach technical fouls

Something I love about basketball is how it includes both baskets and balls. Similarly, this is why I like baseball: its name does not deceive me regarding its content. Wolfram|Alpha can help you analyze America’s favorite pastime, but before you explore on your own, I have every intention of showing you more graphs.

As with the other sports, you can get more than just an overview: you can ask questions that, if put to a fellow human being, would be frustratingly specific.

mlb games with more than 30 hits and less than 5 runs

In the above query, for example, it’s not just that we can determine that there is a grand total of one game (in the past 25 years, as that is how much data Wolfram|Alpha has at the moment) with more than 30 hits and fewer than five runs. We can also tell you everything about that particular game, and how the teams in question compared to each other both in that game and in the season in general.
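For the curious, the filtering idea behind a query like this can be sketched in a few lines of Python. This is only an illustration of the "more than X and fewer than Y" logic, not how Wolfram|Alpha works internally. The `Game` record, the `matching_games` helper, and the sample rows are all invented for this example, and treating hits and runs as combined totals for both teams is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Game:
    label: str  # made-up identifier, not a real game
    hits: int   # assumed: combined hits by both teams
    runs: int   # assumed: combined runs by both teams


def matching_games(games, min_hits=30, max_runs=5):
    """Return games with more than min_hits hits and fewer than max_runs runs."""
    return [g for g in games if g.hits > min_hits and g.runs < max_runs]


# Invented sample rows, for illustration only.
SAMPLE_GAMES = [
    Game("game-1", hits=31, runs=4),
    Game("game-2", hits=30, runs=3),  # exactly 30 hits, so not "more than 30"
    Game("game-3", hits=35, runs=5),  # exactly 5 runs, so not "fewer than 5"
    Game("game-4", hits=12, runs=2),
]

print([g.label for g in matching_games(SAMPLE_GAMES)])
```

Both comparisons are strict, so a game sitting exactly on either threshold is excluded, which matches the wording of the query.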

By applying computation to players’ Wikipedia pages, Wolfram|Alpha can show how a player’s popularity changes over time. For example, Bryce Harper, the 2012 Rookie of the Year, was known in high school as a prospective player—but Wade Miley was essentially unknown until he made the major leagues.

wikipedia popularity for bryce harper, mike trout, starlin castro, wade miley

Now that you know about this new functionality, I should inform you that there’s a nonzero chance we’ll have blog posts on the Super Bowl and March Madness in the future, analyzing sports in extreme detail. But until then, I urge you to show your friends why their opinions are wrong by using Wolfram|Alpha to expose them to the truth.

2 Comments

how the teams in question compared to each other both in that game and the season in general

Posted by jocuri mario June 28, 2013 at 2:48 am

I’ve been trying to find the distribution of number of runs scored per home run hit.
Any suggestions as to how to structure that query?

Posted by Greg Tharp July 7, 2014 at 9:44 pm