As with many things in sports analytics, it started with Bill James.
James was looking for a way to translate a baseball team’s runs scored and runs allowed into wins and losses. He noticed that the ratio of wins ($W$) to losses ($L$) tracks the squared ratio of runs scored ($RS$) to runs allowed ($RA$):

$$\frac{W}{L} = \frac{RS^2}{RA^2}$$
The above could then be used to express a team’s winning percentage ($\mathrm{WPct}$) as a function of runs scored ($RS$) and runs allowed ($RA$):

$$\mathrm{WPct} = \frac{W}{W + L} = \frac{RS^2}{RS^2 + RA^2}$$
Because of the squared terms in the equation above, James chose to dub this the Pythagorean formula. James found that when he used this formula to predict a team’s win total given its runs scored and runs allowed, he would usually get within plus or minus four wins of the team’s actual total.
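To make the calculation concrete, here is a minimal Python sketch of the Pythagorean formula: winning percentage is runs scored squared, divided by the sum of runs scored squared and runs allowed squared. The function names, the 162-game default, and the 800/700 run totals are illustrative assumptions, not figures from James.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float) -> float:
    """Expected winning percentage: RS^2 / (RS^2 + RA^2)."""
    rs2 = runs_scored ** 2
    ra2 = runs_allowed ** 2
    if rs2 + ra2 == 0:
        raise ValueError("runs scored and runs allowed cannot both be zero")
    return rs2 / (rs2 + ra2)


def expected_wins(runs_scored: float, runs_allowed: float, games: int = 162) -> float:
    """Scale the expected winning percentage to a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical team (made-up numbers): 800 runs scored, 700 runs allowed.
print(round(expected_wins(800, 700), 1))  # about 91.8 expected wins
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.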
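As it turns out, James’ formula is actually a logit model in disguise. The log odds of winning are linear in the log of the ratio of runs scored ($RS$) to runs allowed ($RA$):

$$\ln\left(\frac{\mathrm{WPct}}{1 - \mathrm{WPct}}\right) = \beta \ln\left(\frac{RS}{RA}\right)$$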
Solving the expression above for winning percentage yields:

$$\mathrm{WPct} = \frac{RS^{\beta}}{RS^{\beta} + RA^{\beta}}$$

where, in the case of baseball, the coefficient $\beta$ has a value of approximately 2.
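The logit view can be checked numerically. The sketch below generalizes the formula to an arbitrary exponent, confirms that the log odds of the predicted winning percentage equal beta times the log of the run ratio, and estimates beta from team records. The function names are hypothetical, and `fit_beta` is an assumption on my part: a simple least-squares fit through the origin on log odds, not a full maximum-likelihood logistic regression. It needs positive runs, wins, and losses for every team.

```python
import math


def win_pct(runs_scored: float, runs_allowed: float, beta: float = 2.0) -> float:
    """Generalized formula: RS^beta / (RS^beta + RA^beta)."""
    return 1.0 / (1.0 + (runs_allowed / runs_scored) ** beta)


def logit(p: float) -> float:
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_beta(teams) -> float:
    """Estimate beta from (runs_scored, runs_allowed, wins, losses) tuples.

    Regresses ln(W/L) on ln(RS/RA) with no intercept (least squares),
    which is a simplification of a proper logistic fit.
    """
    sxy = 0.0
    sxx = 0.0
    for rs, ra, wins, losses in teams:
        x = math.log(rs / ra)
        y = math.log(wins / losses)
        sxy += x * y
        sxx += x * x
    return sxy / sxx


# With beta = 2 the log odds equal 2 * ln(RS / RA), as the logit form implies.
p = win_pct(800, 700, beta=2.0)
print(math.isclose(logit(p), 2.0 * math.log(800 / 700)))  # True
```

Feeding `fit_beta` real season totals is how one would check the claim that beta sits near 2 for baseball.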