Statitudes

Hall of Fame Probability

Building a model to predict a player's Basketball Hall of Fame status.

Justin Kubatko
Aug 27, 2024

Back when I was working on Basketball-Reference.com, I built a Hall of Fame probability model. I was a little surprised how popular it became, as the results of that model were cited quite often (for better and for worse). The model has been slightly tweaked since I left Sports Reference, but the framework remains the same.

I decided to look into this topic again, basically starting from scratch. My player pool consisted of players who met the following criteria:

  • Played at least 10 seasons in the NBA.

  • Had the entirety of their NBA career fall between the 1968-69 and 2019-20 seasons. I chose to start with the 1968-69 season because that’s the first time the NBA named All-Defensive teams. The 2019-20 season was chosen as the stopping point because players who retired after that are not yet eligible.

That gave me a pool of 691 players, 77 of whom are Hall of Famers. The response variable in my model was simply Hall of Fame status, with electees assigned a value of one and all others assigned a value of zero.
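
To make the selection rules concrete, here is a minimal Python sketch of the pool filter and the response variable. The `Career` record, its field names, and the convention of labeling a season by its starting year (1968 for 1968-69) are assumptions made for illustration, not the format of the data actually used.

```python
from dataclasses import dataclass

# Assumption for this sketch: a season is labeled by the year it starts in,
# so 1968 means 1968-69 and 2019 means 2019-20.
FIRST_SEASON = 1968
LAST_SEASON = 2019
MIN_SEASONS = 10


@dataclass
class Career:
    """Hypothetical record for one player."""
    name: str
    first_season: int
    last_season: int
    seasons_played: int
    hall_of_famer: bool


def in_player_pool(career: Career) -> bool:
    """True if the career meets both criteria listed above."""
    return (
        career.seasons_played >= MIN_SEASONS
        and career.first_season >= FIRST_SEASON
        and career.last_season <= LAST_SEASON
    )


def response(career: Career) -> int:
    """Response variable: 1 for electees, 0 for everyone else."""
    return 1 if career.hall_of_famer else 0


# Made-up placeholder careers, not real players.
inside = Career("Player A", 1972, 1985, 14, True)
too_early = Career("Player B", 1965, 1978, 14, False)
print(in_player_pool(inside), response(inside))
print(in_player_pool(too_early), response(too_early))
```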

After much experimentation, I settled on five predictor variables:

  • Career value, which is a weighted sum of the player’s individualized wins for each season. The player’s best season received a weight of 1.00, their second-best season received a weight of 0.95, their third-best season received a weight of 0.90, and so on.

  • Number of All-Star Game selections.

  • Number of All-NBA points, where a First Team selection earns the player five points, a Second Team selection is worth three points, and a Third Team selection is worth one point.

  • Number of All-Defensive points, where a First Team selection earns the player two points and a Second Team selection is worth one point.

  • Number of championships won. The player must have appeared in at least one postseason game for the league champion to receive credit in a given season.
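
The career value and award-point definitions above can be sketched in a few lines of Python. The function and constant names are hypothetical. The text only gives the first three weights, so the 0.05 step continuing indefinitely and the floor at zero for very long careers are assumptions of this sketch.

```python
def career_value(season_wins, step=0.05):
    """Weighted sum of per-season win totals, best season first.

    Weights run 1.00, 0.95, 0.90, ... as described in the text.
    Assumption: the weight keeps dropping by `step` and never goes below zero.
    """
    ranked = sorted(season_wins, reverse=True)
    return sum(
        max(0.0, 1.0 - step * rank) * wins
        for rank, wins in enumerate(ranked)
    )


# Points per selection, keyed by team number (1 = First Team, and so on).
ALL_NBA_POINTS = {1: 5, 2: 3, 3: 1}
ALL_DEFENSIVE_POINTS = {1: 2, 2: 1}


def award_points(selections, points_by_team):
    """Total points for a list of team numbers, one entry per selection."""
    return sum(points_by_team[team] for team in selections)


# Made-up inputs: three seasons of wins, and a handful of selections.
print(career_value([10.0, 8.0, 12.0]))
print(award_points([1, 2, 3], ALL_NBA_POINTS))
print(award_points([1, 2, 2], ALL_DEFENSIVE_POINTS))
```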

Those factors produced a logistic regression model with the following parameters:

  • Intercept = –9.63926

  • Career Value = 0.04991

  • All-Star Games = 0.85433

  • All-NBA Points = 0.18568

  • All-Defensive Points = 0.28016

  • Championships = 1.03844

Let’s use one of Marc Stein’s favorites, former Buffalo Braves star Bob McAdoo, as an example. Here are McAdoo’s values for each predictor:

  • Career Value = 85.3

  • All-Star Games = 5

  • All-NBA Points = 8

  • All-Defensive Points = 0

  • Championships = 2

These are used to calculate McAdoo’s logit:

L = –9.63926 + 0.04991 * 85.3
             + 0.85433 * 5
             + 0.18568 * 8
             + 0.28016 * 0
             + 1.03844 * 2
  = 2.452

Which is then converted into a probability:

P = 1 / (1 + e^-L)
  = 1 / (1 + e^-2.452)
  = 0.921
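
For anyone who wants to check the arithmetic, here is a small Python sketch that plugs the published coefficients and McAdoo's values into the logistic formula. The coefficients and inputs come straight from the lists above; the function and dictionary names are hypothetical.

```python
import math

# Coefficients as listed in the text.
COEFFICIENTS = {
    "intercept": -9.63926,
    "career_value": 0.04991,
    "all_star_games": 0.85433,
    "all_nba_points": 0.18568,
    "all_defensive_points": 0.28016,
    "championships": 1.03844,
}


def hof_logit(player):
    """Linear predictor: intercept plus coefficient times value per predictor."""
    return COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in player.items()
    )


def hof_probability(player):
    """Convert the logit into a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-hof_logit(player)))


# Bob McAdoo's predictor values from the text.
mcadoo = {
    "career_value": 85.3,
    "all_star_games": 5,
    "all_nba_points": 8,
    "all_defensive_points": 0,
    "championships": 2,
}

print(f"L = {hof_logit(mcadoo):.3f}")
print(f"P = {hof_probability(mcadoo):.3f}")
```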

If the model produced a probability of 0.5 or higher for a player, I predicted they were a Hall of Famer; otherwise, I predicted they were not. Here is a summary of the results:

  • There were 614 non-Hall of Famers in the player pool. The model correctly classified 610 of them, or 99.3%.

  • There were 77 Hall of Famers in the player pool. The model correctly classified 72 of them, or 93.5%.

  • Overall, the model correctly classified 682 of the 691 players, or 98.7%.
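
The percentages above follow directly from the counts. As a rough Python sketch, with hypothetical variable names and the 0.5 cutoff from the text, the three rates can be recomputed like this (the Hall of Famer rate is often called sensitivity and the non-Hall of Famer rate specificity):

```python
def classify(probability, threshold=0.5):
    """Predict Hall of Famer (True) when the probability is at or above the cutoff."""
    return probability >= threshold


# Counts reported in the text.
non_hof_total, non_hof_correct = 614, 610
hof_total, hof_correct = 77, 72

specificity = non_hof_correct / non_hof_total
sensitivity = hof_correct / hof_total
accuracy = (non_hof_correct + hof_correct) / (non_hof_total + hof_total)
misses = (non_hof_total - non_hof_correct) + (hof_total - hof_correct)

print(f"Non-Hall of Famers classified correctly: {specificity:.1%}")
print(f"Hall of Famers classified correctly: {sensitivity:.1%}")
print(f"Overall accuracy: {accuracy:.1%}")
print(f"Total misses: {misses}")
```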

Let’s take a closer look at the misses, starting with the five Hall of Famers who were not pegged as such by the model:

© 2025 Justin Kubatko