Back when I was working on Basketball-Reference.com, I built a Hall of Fame probability model. I was a little surprised how popular it became, as the results of that model were cited quite often (for better and for worse). The model has been slightly tweaked since I left Sports Reference, but the framework remains the same.

I decided to look into this topic again, basically starting from scratch. My player pool consisted of players who met the following criteria:

Played at least 10 seasons in the NBA.

Had the entirety of their NBA career fall between the 1968-69 and 2019-20 seasons. I chose to start with the 1968-69 season because that’s the first time the NBA named All-Defensive teams. The 2019-20 season was chosen as the stopping point because players who retired after that are not yet eligible.

That gave me a pool of 691 players, 77 of whom are Hall of Famers. The response variable in my model was simply Hall of Fame status, with electees assigned a value of one and all others assigned a value of zero.

After much experimentation, I settled on five predictor variables:

Career value, which is a weighted sum of the player’s individualized wins for each season. The player’s best season received a weight of 1.00, their second-best season received a weight of 0.95, their third-best season received a weight of 0.90, and so on.

Number of All-Star Game selections.

Number of All-NBA points, where a First Team selection earns the player five points, a Second Team selection is worth three points, and a Third Team selection is worth one point.

Number of All-Defensive points, where a First Team selection earns the player two points and a Second Team selection is worth one point.

Number of championships won. The player must have appeared in at least one postseason game for the league champion to receive credit in a given season.
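As a rough illustration, the career value weighting and the two point-based predictors above could be computed along these lines. This is a minimal sketch, not the code behind the model: the function names and the sample season values are hypothetical, and flooring the weights at zero past the 20th-best season is an assumption, since the description only gives the 1.00, 0.95, 0.90 pattern.

```python
def career_value(season_wins):
    """Weighted sum of per-season win values, best season first."""
    ranked = sorted(season_wins, reverse=True)
    total = 0.0
    for rank, wins in enumerate(ranked):
        # Weights run 1.00, 0.95, 0.90, ...; the floor at zero is an assumption.
        weight = max(0.0, 1.0 - 0.05 * rank)
        total += weight * wins
    return total


def all_nba_points(first, second, third):
    """Five points per First Team, three per Second Team, one per Third Team."""
    return 5 * first + 3 * second + 1 * third


def all_defensive_points(first, second):
    """Two points per First Team, one per Second Team."""
    return 2 * first + 1 * second


# Hypothetical three-season career, not real player data.
example_value = career_value([10.0, 12.0, 8.0])
print(round(example_value, 2))
```

Sorting before weighting means the order in which seasons are supplied does not matter; only their ranking by value does.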

Those factors produced a logistic regression model with the following parameters:

Intercept = –9.63926

Career Value = 0.04991

All-Star Games = 0.85433

All-NBA Points = 0.18568

All-Defensive Points = 0.28016

Championships = 1.03844

Let’s use former Buffalo Braves star Bob McAdoo as an example. Here are McAdoo’s values for each predictor:

Career Value = 85.3

All-Star Games = 5

All-NBA Points = 8

All-Defensive Points = 0

Championships = 2

These are used to calculate McAdoo’s logit:

```
L = -9.63926 + 0.04991 * 85.3
             + 0.85433 * 5
             + 0.18568 * 8
             + 0.28016 * 0
             + 1.03844 * 2
  = 2.452
```

Which is then converted into a probability:

```
P = 1 / (1 + e^-L)
= 1 / (1 + e^-2.452)
= 0.921
```
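The same arithmetic can be reproduced in a few lines of Python. This is a sketch for checking the numbers: the coefficients and McAdoo's inputs come from the figures above, while the function and variable names are hypothetical.

```python
import math

# Coefficients as reported above.
COEFFICIENTS = {
    "intercept": -9.63926,
    "career_value": 0.04991,
    "all_star_games": 0.85433,
    "all_nba_points": 0.18568,
    "all_defensive_points": 0.28016,
    "championships": 1.03844,
}


def hof_logit(career_value, all_star_games, all_nba_points,
              all_defensive_points, championships):
    """Linear combination of the five predictors plus the intercept."""
    c = COEFFICIENTS
    return (c["intercept"]
            + c["career_value"] * career_value
            + c["all_star_games"] * all_star_games
            + c["all_nba_points"] * all_nba_points
            + c["all_defensive_points"] * all_defensive_points
            + c["championships"] * championships)


def hof_probability(logit):
    """Logistic function: maps a logit to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-logit))


mcadoo_logit = hof_logit(85.3, 5, 8, 0, 2)
mcadoo_prob = hof_probability(mcadoo_logit)
print(f"logit = {mcadoo_logit:.3f}, probability = {mcadoo_prob:.3f}")
# logit = 2.452, probability = 0.921
```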

If the model produced a probability of 0.5 or higher for a player, I predicted that player to be a Hall of Famer; otherwise, I predicted they were not. Here is a summary of the results:

There were 614 non-Hall of Famers in the player pool. The model correctly classified 610 of them, or 99.3%.

There were 77 Hall of Famers in the player pool. The model correctly classified 72 of them, or 93.5%.

Overall, the model correctly classified 682 of the 691 players, or 98.7%.
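For anyone who wants to verify those percentages, here is a short sketch of the 0.5 cutoff and the three rates computed from the counts above. The names `classify`, `specificity`, and `sensitivity` are illustrative choices (the latter two are the standard terms for the hit rates on negatives and positives), not labels used by the model itself.

```python
def classify(probability, threshold=0.5):
    """Predict Hall of Famer when the probability meets the threshold."""
    return probability >= threshold


# Counts taken from the summary above.
non_hof_total, non_hof_correct = 614, 610
hof_total, hof_correct = 77, 72

specificity = non_hof_correct / non_hof_total   # non-Hall of Famers called correctly
sensitivity = hof_correct / hof_total           # Hall of Famers called correctly
accuracy = (non_hof_correct + hof_correct) / (non_hof_total + hof_total)

print(f"{specificity:.1%} {sensitivity:.1%} {accuracy:.1%}")
# 99.3% 93.5% 98.7%
```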

Let’s take a closer look at the misses, starting with the five Hall of Famers who were not pegged as such by the model:
