Predicting NBA Three-Point Shooting Part 2: Random Forests

Featured Image By Eli Lucero / Herald Journal

By Andrew Lawlor

A few weeks ago, I published a model projecting three-point shooting numbers in the NBA for this year’s draft class. I used a linear model based on college statistics, and it was pretty good, but it did have some problems, particularly around upperclassmen and players who did not take a lot of free throws (which is somewhat common among three-point specialists).

To improve my projections, I switched from a linear model to a tree-based one. In a linear model, you create an equation like y = mx + b (think back to algebra). Here, y is what you are predicting (in this case, NBA three-point percentage), and x is the value (or values; there can be more than one) you are using to predict. The x values (or predictor variables) in the first model were college numbers for three-point percentage, three-point rate (the percentage of field goal attempts that are threes), two-point jump shot percentage, free-throw percentage, and position. Then, you pick the m and b values that best fit the data.
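To make the "pick m and b" step concrete, here is a minimal sketch of a least-squares fit with a single predictor. The data points are made up for illustration, and `fit_line` is a hypothetical helper, not the code behind the original model (which used several predictors).

```python
# Hypothetical toy data: college free-throw percentage (x) and NBA
# three-point percentage (y). These numbers are invented for illustration.
college_ft = [0.70, 0.75, 0.80, 0.85, 0.90]
nba_3p = [0.32, 0.34, 0.35, 0.37, 0.39]


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (m, b) in y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


m, b = fit_line(college_ft, nba_3p)
predicted = m * 0.82 + b  # projection for a hypothetical 82 percent free-throw shooter
print(f"m = {m:.3f}, b = {b:.3f}, prediction = {predicted:.3f}")
```

With more than one predictor the idea is the same, but each predictor gets its own m value.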

A tree-based model uses decision trees to create predictions. Decision trees are a lot like flow charts. Think of this classic one by Twitter god Shea Serrano, but using statistics instead. In a shooting decision tree, we would divide players into groups using the decisions on the chart, and then calculate the average NBA three-point percentage of all the shooters in each group. This average becomes the prediction for shooters whose college statistics place them in that group. Here’s a visual representation of a simple tree for shooting:
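In code form, a tree like this might look like the sketch below. The players, thresholds, and group names are all invented for illustration. A real decision tree learns its split points from the data rather than having them written by hand.

```python
# Hypothetical shooters: (college FT%, college 3P%, NBA 3P%). Invented values.
shooters = [
    (0.88, 0.41, 0.40),
    (0.85, 0.36, 0.38),
    (0.78, 0.39, 0.36),
    (0.76, 0.31, 0.34),
    (0.68, 0.35, 0.33),
    (0.62, 0.28, 0.30),
]


def leaf_for(ft_pct, three_pct):
    """Walk a hand-written two-level tree; the thresholds are illustrative only."""
    if ft_pct >= 0.80:
        return "high FT"
    if three_pct >= 0.35:
        return "lower FT, good 3P"
    return "lower FT, poor 3P"


# Group the known shooters by leaf, then average NBA 3P% within each leaf.
buckets = {}
for ft, three, nba in shooters:
    buckets.setdefault(leaf_for(ft, three), []).append(nba)
leaf_means = {leaf: sum(vals) / len(vals) for leaf, vals in buckets.items()}


def predict(ft_pct, three_pct):
    """A new prospect's prediction is the average of the leaf he falls into."""
    return leaf_means[leaf_for(ft_pct, three_pct)]


print(leaf_means)
print(predict(0.83, 0.33))  # lands in the "high FT" leaf
```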

For this model, I used a random forest, which builds a large number of these trees and combines their results to get a more accurate prediction. Random forests follow the wisdom of the crowd: the idea that the opinion of the masses is better than the opinion of one expert.
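The "wisdom of the crowd" idea can be sketched in a few lines. This is a deliberately stripped-down version, assuming made-up data: each "tree" here is a single split on one randomly chosen statistic, trained on a random resample of the players, and the forest averages their predictions. A full random forest grows much deeper trees, and the names `fit_stump`, `fit_forest`, and `predict` are hypothetical.

```python
import random

# Hypothetical training rows: ([college FT%, college 3P%], NBA 3P%). Invented values.
rows = [
    ([0.88, 0.41], 0.40), ([0.85, 0.36], 0.38), ([0.78, 0.39], 0.36),
    ([0.76, 0.31], 0.34), ([0.68, 0.35], 0.33), ([0.62, 0.28], 0.30),
    ([0.90, 0.38], 0.39), ([0.72, 0.33], 0.34),
]


def mean(values):
    return sum(values) / len(values)


def fit_stump(sample, rng):
    """Fit a one-split tree on one randomly chosen feature."""
    feature = rng.randrange(len(sample[0][0]))
    best = None
    for threshold in sorted({x[feature] for x, _ in sample}):
        left = [y for x, y in sample if x[feature] < threshold]
        right = [y for x, y in sample if x[feature] >= threshold]
        if not left or not right:
            continue
        # Squared error if each side of the split predicts its own average.
        sse = sum((y - mean(left)) ** 2 for y in left)
        sse += sum((y - mean(right)) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, mean(left), mean(right))
    if best is None:
        # No usable split in this resample: predict the overall average.
        overall = mean([y for _, y in sample])
        return feature, float("inf"), overall, overall
    _, threshold, left_mean, right_mean = best
    return feature, threshold, left_mean, right_mean


def fit_forest(data, n_trees=200, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw rows with replacement, same size as the data.
        sample = [rng.choice(data) for _ in data]
        forest.append(fit_stump(sample, rng))
    return forest


def predict(forest, x):
    """Average the predictions of every tree in the forest."""
    votes = [left if x[f] < t else right for f, t, left, right in forest]
    return mean(votes)


forest = fit_forest(rows)
print(round(predict(forest, [0.86, 0.37]), 3))
```

No single tree sees all the data or all the statistics, so each one is a weak predictor on its own. Averaging many of them smooths out their individual mistakes.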

The random forest model is more accurate than the linear model. As predictors, I used career three-point percentage, career free-throw percentage, career two-point jump shot percentage, and career three-point rate, plus the same four statistics from the player’s last season in college (for freshmen, these are the same as the career numbers). On the test set, I ended up with a .26 R-squared value (the percentage of the variance in the data that can be explained by the model), compared to .15 before, with a .026 mean absolute error (previously .028) and a .035 root-mean-squared error (previously .039). All data again came from Bart Torvik’s site.
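For readers who want to see how these three accuracy numbers are defined, here is a short sketch that computes them from a list of actual and predicted values. The inputs are invented, and `regression_metrics` is a hypothetical helper. R-squared is computed here as one minus the ratio of residual to total squared error, which is a common definition but an assumption about what was used for this post.

```python
import math


def regression_metrics(actual, predicted):
    """Return (R-squared, mean absolute error, root-mean-squared error)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # error left after the model
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # error of always guessing the average
    r2 = 1 - ss_res / ss_tot
    return r2, mae, rmse


# Hypothetical test-set values, invented for illustration.
actual = [0.40, 0.36, 0.33, 0.31]
predicted = [0.38, 0.35, 0.34, 0.33]
r2, mae, rmse = regression_metrics(actual, predicted)
print(f"R-squared = {r2:.3f}, MAE = {mae:.3f}, RMSE = {rmse:.4f}")
```

Because RMSE squares each error before averaging, it punishes big misses more than MAE does, which is why it comes out as the larger of the two numbers.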

I found that career college free-throw percentage is the most predictive statistic for NBA three-point percentage, with a relative feature importance of .19. The second-most predictive is free-throw percentage in a player’s last college season (.17). This is consistent with most other models out there predicting NBA shooting success; after all, a free throw is an isolated indicator of shooting ability, with no defense or anything else to complicate the shot.

This model did a pretty good job of predicting past players. Duncan Robinson had the best prediction in the set (40.3 percent) and ended up as the second-best shooter in the set (43.7 percent, above his prediction, but within his 95 percent prediction interval). Stephen Curry (40.1 percent predicted, 43.5 percent actual), Gary Trent Jr. (39.7 percent predicted, 40.5 percent actual), and Buddy Hield (39.3 percent predicted, 39.0 percent actual) also did well in both predictions and actual shooting.

Like the first model, this model missed on Michael Porter Jr. due to his limited sample size of three games in college (35.6 percent predicted, 42.2 percent actual). It also missed pretty heavily on Khris Middleton, predicting him to be only a 33.5 percent shooter in the NBA. His poor college three-point shooting numbers (32.8 percent for his career, only 25 percent in his last season) dragged the projection down, and his 75 percent career free-throw percentage was not enough to offset them.

Utah State senior Sam Merrill projects to shoot 38.1 percent from three in the NBA, the best value in this draft class. His 95 percent prediction interval is 32.2 percent–44.0 percent, meaning there is a 95 percent chance his career NBA three-point percentage falls in that range.
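The post does not spell out how the interval was calculated. One common approximation, shown here purely as an assumption, is to take the prediction and add or subtract about 1.96 times the typical size of the model's errors. The value of .03 for that error size is a guess chosen for illustration, and `normal_interval` is a hypothetical helper.

```python
def normal_interval(prediction, residual_sd, z=1.96):
    """Approximate 95 percent interval, assuming roughly normal errors."""
    half_width = z * residual_sd
    return prediction - half_width, prediction + half_width


# 0.381 is Merrill's projection from the text.
# residual_sd = 0.03 is an assumed value, not a number reported in the post.
low, high = normal_interval(0.381, 0.03)
print(f"{low:.1%} to {high:.1%}")
```

An interval built this way has the same width for every player, because it depends only on how far off the model tends to be overall, not on anything specific to the individual prospect.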

Tyrell Terry (37.8 percent), Aaron Nesmith (37.8 percent), Markus Howard (37.8 percent), and Nate Darling (37.6 percent) also project well. So does the top shooter in the old model, Immanuel Quickley (37.5 percent).

Anthony Edwards projects decently (34.5 percent), with his strong 76.9 percent free-throw percentage outweighing his poor 29.1 percent three-point percentage.

Interestingly, Devin Vassell (34.2 percent) and Obi Toppin (33.1 percent) do not project well, despite the fact that both carry some regard as outside shooters. In both cases, a mediocre free-throw percentage is the culprit (72.0 percent for Vassell, 70.6 percent for Toppin). Neither relied on sharpshooting to build a draft case, but these poor projections should give teams pause.

Here are the numbers for everyone in the class, with a 95 percent prediction interval:

Name | Predicted NBA Three-Point Percentage | Prediction Interval
Sam Merrill | 38.1 percent | 32.3 percent – 44.0 percent
Tyrell Terry | 37.8 percent | 32.0 percent – 43.7 percent
Aaron Nesmith | 37.8 percent | 32.0 percent – 43.7 percent
Markus Howard | 37.8 percent | 31.9 percent – 43.6 percent
Nate Darling | 37.6 percent | 31.7 percent – 43.5 percent
Immanuel Quickley | 37.5 percent | 31.6 percent – 43.3 percent
Isaiah Joe | 37.2 percent | 31.3 percent – 43.0 percent
Cassius Winston | 36.8 percent | 31.0 percent – 42.7 percent
Tyrese Haliburton | 36.4 percent | 30.6 percent – 42.3 percent
Ty-Shon Alexander | 36.3 percent | 30.5 percent – 42.2 percent
Desmond Bane | 35.7 percent | 29.9 percent – 41.6 percent
Jordan Nwora | 35.7 percent | 29.9 percent – 41.6 percent
Payton Pritchard | 35.5 percent | 29.6 percent – 41.3 percent
Tyrese Maxey | 35.3 percent | 29.4 percent – 41.2 percent
Saddiq Bey | 35.2 percent | 29.3 percent – 41.0 percent
Killian Tillie | 34.9 percent | 29.0 percent – 40.8 percent
Skylar Mays | 34.9 percent | 29.0 percent – 40.7 percent
Mason Jones | 34.8 percent | 28.9 percent – 40.6 percent
Devon Dotson | 34.7 percent | 28.9 percent – 40.6 percent
Jalen Harris | 34.6 percent | 28.8 percent – 40.5 percent
Malachi Flynn | 34.6 percent | 28.7 percent – 40.4 percent
Anthony Edwards | 34.5 percent | 28.7 percent – 40.4 percent
Patrick Williams | 34.5 percent | 28.7 percent – 40.4 percent
Nico Mannion | 34.5 percent | 28.7 percent – 40.4 percent
Jahmi’us Ramsey | 34.5 percent | 28.6 percent – 40.3 percent
Zeke Nnaji | 34.3 percent | 28.4 percent – 40.1 percent
Devin Vassell | 34.3 percent | 28.4 percent – 40.1 percent
Jalen Harris | 34.2 percent | 28.4 percent – 40.1 percent
Grant Riller | 34.2 percent | 28.4 percent – 40.1 percent
Malik Fitts | 34.2 percent | 28.3 percent – 40.1 percent
Kira Lewis Jr. | 34.1 percent | 28.3 percent – 40.0 percent
Nathan Knight | 33.8 percent | 28.0 percent – 39.7 percent
Elijah Hughes | 33.8 percent | 27.9 percent – 39.6 percent
Josh Green | 33.7 percent | 27.9 percent – 39.6 percent
Jaden McDaniels | 33.7 percent | 27.8 percent – 39.6 percent
Tre Jones | 33.6 percent | 27.8 percent – 39.5 percent
Isaiah Stewart | 33.6 percent | 27.8 percent – 39.5 percent
Cassius Stanley | 33.6 percent | 27.7 percent – 39.4 percent
Saben Lee | 33.5 percent | 27.7 percent – 39.4 percent
CJ Elleby | 33.5 percent | 27.6 percent – 39.3 percent
Reggie Perry | 33.4 percent | 27.6 percent – 39.3 percent
Cole Anthony | 33.4 percent | 27.6 percent – 39.3 percent
Ashton Hagans | 33.4 percent | 27.5 percent – 39.2 percent
Isaac Okoro | 33.4 percent | 27.5 percent – 39.2 percent
Rayshaun Hammonds | 33.4 percent | 27.5 percent – 39.2 percent
Mamadi Diakite | 33.3 percent | 27.5 percent – 39.2 percent
Robert Woodard II | 33.3 percent | 27.4 percent – 39.2 percent
Paul Reed | 33.3 percent | 27.4 percent – 39.1 percent
Kaleb Wesson | 33.2 percent | 27.4 percent – 39.1 percent
James Wiseman | 33.2 percent | 27.4 percent – 39.1 percent
Jalen Smith | 33.2 percent | 27.4 percent – 39.1 percent
Obi Toppin | 33.2 percent | 27.3 percent – 39.0 percent
Omer Yurtseven | 33.0 percent | 27.2 percent – 38.9 percent
Tyler Bey | 32.8 percent | 27.0 percent – 38.7 percent
Sha’markus Kennedy | 32.7 percent | 26.8 percent – 38.5 percent
Vernon Carey Jr. | 32.4 percent | 26.5 percent – 38.2 percent
Onyeka Okongwu | 32.3 percent | 26.5 percent – 38.2 percent
Filip Petrusev | 32.3 percent | 26.4 percent – 38.1 percent
Xavier Tillman | 32.1 percent | 26.2 percent – 38.0 percent
Daniel Oturu | 32.1 percent | 26.2 percent – 37.9 percent
Lamine Diane | 31.8 percent | 25.9 percent – 37.6 percent
Jon Teske | 31.6 percent | 25.7 percent – 37.4 percent
Yoeli Childs | 31.6 percent | 25.7 percent – 37.4 percent
Freddie Gillespie | 31.5 percent | 25.6 percent – 37.4 percent
Nick Richards | 31.3 percent | 25.4 percent – 37.1 percent
Precious Achiuwa | 31.0 percent | 25.2 percent – 36.9 percent
Austin Wiley | 29.0 percent | 23.1 percent – 34.8 percent
Tyrique Jones | 26.8 percent | 20.9 percent – 32.7 percent
