2014 Draft Projection Models

J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2014 Draft Projection Models

Post by J.E. »

Here's my 'Minutes played per NBA Season'-projection
http://pastebin.com/x2Hj4T7U

Please note that this isn't making any statements about future impact of these players, just how many minutes they're most likely to play. The projections for impact will be somewhat similar, but certainly not identical

It's not a fan of Embiid, projecting him to play only the 25th most minutes, without actually having the information that he just broke his foot. Wiggins is #7, Parker #5, Smart #11, Gordon #12, Randle #4

1, 2, 3 are Kyle Anderson, Elfrid Payton and Jordan Adams, currently mocked at #25, #12 and #23, respectively. Adams is young and good at stealing the ball, while Anderson and Payton racked up lots of total AST (Anderson is 6'8", too). From watching scouting videos, both Anderson and Adams appear to be unathletic. That might not have been a problem for them in college but could very well be one in the NBA, and it probably explains some of the difference between minute-projection rank and mock-draft rank

Doing this analysis with Ridge Regression leads to better OOS (out-of-sample) prediction, but the coefficients are hard to interpret. The sparsest model I can create that still performs decently comes from Least-Angle Regression with the Bayesian Information Criterion. Its coefficients, in order of importance, are as follows

- Age          241
STL_tot        147
MOV*SOS         91
DRB_tot*SOS     80
TS*SOS          72
PTS_tot*SOS     61
TS              33
AST_tot         33
- PF/AGE        30

If I can find more time in the next few days I'll try to post some OOS predictions from earlier years, and maybe I'll try to predict RAPM, too
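
As a rough illustration of how a sparse linear model like this turns into a ranking, here is a minimal sketch. It is not the actual model code. It assumes that a leading "-" in the list above means a negative coefficient, and that each listed term (including products like MOV*SOS) is fed in as one already-standardized number where 0 means average. The intercept isn't given, so the result is only a relative score, and both prospects are made up.

```python
# Coefficients copied from the list above.
# ASSUMPTION: a leading "-" in the list is read as a negative sign.
COEFFS = {
    "Age": -241, "STL_tot": 147, "MOV*SOS": 91, "DRB_tot*SOS": 80,
    "TS*SOS": 72, "PTS_tot*SOS": 61, "TS": 33, "AST_tot": 33, "PF/AGE": -30,
}

def relative_score(terms):
    """Sum coefficient * term value; terms that are not supplied count as 0."""
    return sum(coef * terms.get(name, 0.0) for name, coef in COEFFS.items())

# Hypothetical standardized inputs, for illustration only.
young_thief = {"Age": -1.0, "STL_tot": 1.5}    # young, lots of steals
old_scorer = {"Age": 1.0, "PTS_tot*SOS": 1.0}  # old, scores against good teams

print(relative_score(young_thief))
print(relative_score(old_scorer))
```

With weights like these, being one unit younger than average is worth more than any single box-score term, which is consistent with Age sitting at the top of the list.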
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: 2014 Draft Projection Models

Post by Crow »

is TS here true shots?
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2014 Draft Projection Models

Post by J.E. »

Crow wrote:is TS here true shots?
True Shooting Percentage. Sorry, I forgot to add the '%' sign to make it clearer
Statman
Posts: 548
Joined: Fri Apr 15, 2011 5:29 pm
Location: Arlington, Texas

Re: 2014 Draft Projection Models

Post by Statman »

JE - my model is FINALLY done - check it out, I think you'll see some similarities in my projected minutes to yours. I project player progression across 14 skillsets (after adjusting for pace/SoS/etc) - and I use playing time limiters off a bunch of skillsets & skillset combos based off historic precedent in the NBA. This, to me, seems to tie in more of a RAPM vibe, guys that do well across the skillset spectrum tend to project more minutes even if their "rating" is somewhat lacking.

Anyway - I tie the projected career playing time to the projected career rating (both done year by year) to create a career "WAR" in order to rank players.

http://hoopsnerd.com/?p=600

Doug McDermott's defensive projections were so bad that his career minutes were SEVERELY limited. I obviously don't expect his minutes to be that low - BUT I do expect the poor skillsets that limited his playing time in the projection will manifest themselves in other ways in his game (limited athleticism making it harder for him to get shots, play D, etc) - and it all somewhat evens out in the end (he ends up ranked 37th - at least higher than Hood or LaVine, who never even projected above replacement level in their careers).

Anyway - Wiggins ends up 17th in projected career WAR, but 6th in projected career playing time (after Parker, Adams, Smart, Payton, & Stokes), which is a little more like what you are doing here - as maybe a bit of a nod to RAPM.

BTW - Kyle Anderson was limited more in my projections because his lack of scoring hurts his later projected career minutes (as that already mediocrish skillset falls off the cliff a bit). Plus, I also used '13 ratings (weighted half of '14), which hurt his projection since he was more of a breakout guy in '14. My ratings love his teammate Adams though - who was great in both '13 & '14 with no statistical skillset red flags.
Dr Positivity
Posts: 331
Joined: Thu Sep 20, 2012 6:44 pm

Re: 2014 Draft Projection Models

Post by Dr Positivity »

Nice work Statman! I think your list will end up being pretty good compared to conventional wisdom

Jordan Adams is underrated for the same reasons Andrew Wiggins is overrated 8-)
Statman
Posts: 548
Joined: Fri Apr 15, 2011 5:29 pm
Location: Arlington, Texas

Re: 2014 Draft Projection Models

Post by Statman »

Dr Positivity wrote:Nice work Statman! I think your list will end up being pretty good compared to conventional wisdom

Jordan Adams is underrated for the same reasons Andrew Wiggins is overrated 8-)
Thanks. I'm trying to do something different - not just have some rank with an ambiguous number next to each guy like I used to do years ago. I think it's more researched than many others - having to compile player ratings for 790 college players over 19 seasons (completely adjusting for pace, SOS, etc), as well as the entire careers of 2620 NBA players over 35 seasons to come up with my constants. I'm trying to show players more in their projected totality - across statistical spectrums spanning entire careers. Gives people a chance to see which guys have the statistical makeup to make a more immediate impact (Napier), and which guys may have more long-term upside (Parker, Smart).

I will re-draft all the past drafts and compare to actual draft results. I am certain that on average it will outperform the real GM picks - and that's without any added benefit of scouting, combine results, draft savvy (not drafting a guy until late 2nd who rates out great because you know he'll fall to you), etc. to even better differentiate the players from the pretenders.
Last edited by Statman on Thu Jun 26, 2014 3:09 am, edited 1 time in total.
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: 2014 Draft Projection Models

Post by Crow »

Statman, I don't know if my comment at your site got lost in the ether or it didn't resonate with you; but in case it got lost, I asked about the total weight for defensive actions only being about 1/3rd the total for an average NBA player. Are you sticking with that due to lack of solid individual shot defense data? Still this seems like a major issue in trying to project total impact fully.
Statman
Posts: 548
Joined: Fri Apr 15, 2011 5:29 pm
Location: Arlington, Texas

Re: 2014 Draft Projection Models

Post by Statman »

Crow wrote:Statman, I don't know if my comment at your site got lost in the ether or it didn't resonate with you; but in case it got lost, I asked about the total weight for defensive actions only being about 1/3rd the total for an average NBA player. Are you sticking with that due to lack of solid individual shot defense data? Still this seems like a major issue in trying to project total impact fully.
I better check my spam comments - I never saw a comment from you.

I am sticking with it due to very little advanced college data available past the last couple years. Defense is factored into player ratings tied to playing time/production and team defense - as well as the typical box score stats (blks, steals, pf, etc).

I don't see it as a "major" issue - predicting future NBA performance without that data is viable. Now, I would assume that including non-box-score stuff (RAPM & such) would improve predictive capabilities. Some day.
Dr Positivity
Posts: 331
Joined: Thu Sep 20, 2012 6:44 pm

Re: 2014 Draft Projection Models

Post by Dr Positivity »

My final draft rankings: http://asubstituteforwar.wordpress.com/ ... ikes-back/ My strategy leans more toward skillset/talent evaluation/scouting than numbers. However, in addition to my traditional model I included three others to test whether they improve it: one weighted against a conventional draft board (ESPN), one using college PER + age, and one weighted with VJL's EWP
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: 2014 Draft Projection Models

Post by Nathan »

I finally have some preliminary results from my model. I'll give a brief rundown of my methodology first.

I used all pace adjusted per-40 box score stats from DX, as well as SOS and age from Sports Reference, from each prospect's most recent season. I did not use height or any team stats (other than SOS). I hope to incorporate multiple years for each player as well as height and team stats, which I expect will significantly improve my ratings.

I inverted age, turnovers, two pointers missed, and three pointers missed, stats which from prior experience I found to be negatively predictive of success.

I calculated all second-order terms (stuff like assists*rebounds, for instance).
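
The inversion and second-order steps above can be sketched roughly as follows. This is only an illustration, not the actual program: it assumes "inverted" means taking the reciprocal, that squares count as second-order terms, and that no inverted stat is exactly zero. The function name and the per-40 numbers are made up.

```python
from itertools import combinations_with_replacement

def second_order_terms(stats, invert=()):
    """Return all pairwise products (squares included) after inverting chosen stats.

    ASSUMPTION: "inverting" a stat means replacing it with 1/stat,
    so an inverted stat must not be zero.
    """
    adjusted = {k: (1.0 / v if k in invert else v) for k, v in stats.items()}
    return {
        f"{a}*{b}": adjusted[a] * adjusted[b]
        for a, b in combinations_with_replacement(sorted(adjusted), 2)
    }

player = {"ast": 6.0, "reb": 8.0, "tov": 3.0}  # hypothetical per-40 stats
terms = second_order_terms(player, invert={"tov"})

# Because tov was inverted, the "ast*tov" term is really assists / turnovers.
print(terms["ast*tov"])
```

Three stats give six terms here; with a full box score the count grows quadratically, which is part of why collinearity and outliers become a concern later on.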

I subtracted the mean off of each stat, divided it by its standard deviation, and took the arctangent of it, an operation that basically serves to squeeze the entire distribution in between -pi/2 and pi/2. I removed all major collinearity from the dataset, requiring that no two stats have correlation >0.2. I checked and confirmed that each resulting "stat" still had some correlation with the stat it "used to be" before I did all those operations.
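
A minimal sketch of the standardize-then-arctan step, to show how it pulls in outliers. The function name, the `width` parameter, the use of the population standard deviation, and the data are all assumptions for illustration. `width=1.0` matches the description above; `width=2.0` corresponds to the gentler "divide by 2x standard deviation" idea in the to-do list.

```python
import math
import statistics

def squash(values, width=1.0):
    """Z-score the values, then apply arctan so every result lies in (-pi/2, pi/2).

    ASSUMPTION: population standard deviation; `width` scales the divisor.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [math.atan((v - mean) / (width * sd)) for v in values]

raw = [1.0, 2.0, 3.0, 4.0, 40.0]  # made-up stat with one large outlier

print([round(x, 2) for x in squash(raw)])             # divide by 1 sd
print([round(x, 2) for x in squash(raw, width=2.0)])  # divide by 2 sd
```

The ordering of players is preserved either way; only the spacing changes, with extreme values compressed toward the bounds.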

My model predicts peak NBA plus/minus; in particular I used GotBuckets APM, as it is free of any possible box score bias.

It is a Markov Chain Monte Carlo model which assumes each player's APM is drawn from a normal distribution with mean given as a function of his college stats and standard deviation given by the sum in quadrature of the error in his APM (given on the GotBuckets website) and the error in his projection. Currently, the calculated mean error in my projections is about 1.53, meaning that about two thirds of players should have peak APM within +/- 1.53 units of where I project them. In the future I plan to calculate uncertainties for each player individually as functions of their college stats (and perhaps more importantly, their age).
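
For readers unfamiliar with "sum in quadrature", here is a small sketch of just the likelihood piece described above; it does not implement the MCMC sampler itself. The function names and arguments are hypothetical, and the "two thirds" figure is checked under the assumption that the 1.53 is read as one standard deviation of a normal distribution.

```python
import math
from statistics import NormalDist

def total_sd(apm_error, projection_error):
    """Combine two independent error sources in quadrature: sqrt(a^2 + b^2)."""
    return math.hypot(apm_error, projection_error)

def log_likelihood(observed_apm, projected_apm, apm_error, projection_error):
    """Log density of the observed APM under Normal(projected_apm, total_sd)."""
    sd = total_sd(apm_error, projection_error)
    return math.log(NormalDist(projected_apm, sd).pdf(observed_apm))

# Share of a normal distribution that falls within one standard deviation
# of its mean, i.e. the "about two thirds" quoted above.
within_one_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)
print(round(within_one_sd, 3))  # 0.683
```

A sampler would evaluate something like `log_likelihood` for every historical player at each step and sum the results, so players whose APM is itself noisy pull less on the fitted weights.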

I'll give a list of calculated weights tomorrow.

And finally, here's a list of the 73 players in DX's top 100 (or, DX's top 100 from a month ago) who played in college last year:

Elfrid Payton ______6.55746121
Andrew Wiggins ______3.9968845
DeAndre Kane ______3.79697828
Marcus Smart ______3.22014762
Shabazz Napier ______3.07128301
James Young ______2.80215328
Spencer Dinwiddie ______2.60563913
Aaron Gordon ______2.49549356
Joel Embiid ______1.8405201
Jabari Parker ______1.80809221
Tyler Ennis ______1.73942355
K.J. McDaniels ______1.58488521
Lamar Patterson ______1.21118819
Jordan Adams ______1.00606821
Gary Harris ______0.56465323
Aaron Craft ______0.54533677
Dwight Powell ______0.50225556
Khem Birch ______0.45320336
Jordan Bachynski ______0.35492847
Nick Johnson ______0.32062275
Kyle Anderson ______0.26925508
Alex Kirk ______0.2605019
Markel Brown ______0.189723
C.J. Fair ______0.12876617
Julius Randle ______0.01900779
James McAdoo ______-0.02680059
Melvin Ejim ______-0.04257099
Devyn Marble ______-0.10058178
Cory Jefferson ______-0.2211347
DeAndre Daniels ______-0.27800155
Nik Stauskas ______-0.36611203
Jahii Carson ______-0.65687057
Mitch McGary ______-0.78120747
Kendall Williams ______-0.78766665
Patric Young ______-0.89126099
Semaj Christon ______-0.95177293
Keith Appling ______-1.02522103
Jerami Grant ______-1.09016594
Casey Prather ______-1.2955002
Bryce Cotton ______-1.3255489
Glenn Robinson ______-1.43691374
Jabari Brown ______-1.50679031
Joe Harris ______-1.58101938
Isaiah Austin ______-1.73284208
Juvonte Reddic ______-1.81206552
Sim Bhullar ______-1.82825098
Jarnell Stokes ______-1.8768745
Niels Giffey ______-1.97057779
Johnny O'Bryant ______-1.98293587
Josh Huestis ______-2.0443171
Cameron Bairstow ______-2.15385291
Noah Vonleh ______-2.17496624
Alec Brown ______-2.26402615
Zach LaVine ______-2.42352312
Russ Smith ______-2.49649784
LaQuinton Ross ______-2.59323857
Rodney Hood ______-2.61264741
Deonte Burton ______-2.6783112
Jordan McRae ______-2.74104274
C.J. Wilcox ______-2.85747673
Sean Kilpatrick ______-3.0470004
Eric Moreland ______-3.12551713
Markel Starks ______-3.13496685
Jordan Clarkson ______-3.47183575
Akil Mitchell ______-3.70630473
T.J. Warren ______-3.83836868
Xavier Thames ______-3.85287383
Jakarr Sampson ______-3.88915413
Fuquan Edwin ______-3.8951919
Adreian Payne ______-3.97236446
Doug McDermott ______-4.84557967
Shayne Whittington ______-5.68788702
Cleanthony Early ______-7.24058276

A lot of the guys expected to come out near the top based on various mock drafts and VJL's projections did indeed come out near the top, which was very encouraging.

Probably the biggest mystery is DeAndre Kane, who ranks 3rd. He'll probably fall some when I include height, as he's relatively short for a guy pulling in 8 boards per 40. The method I use to create my dataset is quite harsh in squeezing in outliers (see To-do), and Kane benefited from this as he was a major outlier, in a bad way, in terms of age (already 25). For this reason, my current model seems in general to overrate guys who are average or above-average across the board and underrate guys who have a few elite skills. It also tends to overrate smalls and underrate bigs, and it presumably overrates guys playing for bad teams and underrates guys playing for good teams. I do think Kane could be decent though...he's unlikely to improve, but he may not have to. He appears to be just the kind of big, athletic point guard that's taking the league by storm these days. I doubt he'll ever be a starter, let alone a star, but he seems more than deserving of a roster spot.

To-do:

-add heights to dataset
-add team stats to dataset
-adjust model to account for multiple years
-instead of dividing by standard deviation before taking arctan, try dividing by 2x standard deviation. This would be less harsh in squeezing in outliers.
-work on finding player-specific uncertainties

I welcome any advice you guys have and I am happy to answer any questions. Special thanks to James Brocato for inspiring the overall structure of this model with his excellent work at Shutupandjam.net.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: 2014 Draft Projection Models

Post by J.E. »

A bunch of comments, meant as constructive criticism
Nathan wrote:I inverted age, turnovers, two pointers missed, and three pointers missed, stats which from prior experience I found to be negatively predictive of success.
Why? If it's negatively predictive of success it should simply get a negative coefficient; there's no real need to invert it. There are cases where inverting makes sense, but that's definitely not a good reason to do so
Nathan wrote:I subtracted the mean off of each stat, divided it by its standard deviation, and took the arctangent of it, an operation that basically serves to squeeze the entire distribution in between -pi/2 and pi/2.
I'd be surprised if the arctan helped with anything. Do you have OOS results with and without it? If it makes projections only marginally better I'd probably lean towards removing it, because you get a simpler model
Nathan wrote:I removed all major collinearity from the dataset, requiring that no two stats have correlation >0.2. I checked and confirmed that each resulting "stat" still had some correlation with the stat it "used to be" before I did all those operations.
I'm generally not a fan of hand-picking variables, plus the 0.2 cutoff is arbitrary. The more subjective influence you exert on the variables, the bigger the gap will later be between "past prediction" (retrodiction) performance and real-world performance. You can avoid it by switching to regression techniques that deal with collinearity better through regularization (Ridge Regression, LASSO, ElasticNet) or other techniques that deal with it through more sound variable elimination methods (AIC, BIC)
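
To illustrate the regularization point, here is a toy ridge regression on two nearly collinear inputs, written out by hand for the two-feature case. It is a sketch only: the data are invented, the inputs are assumed to be centered (so there is no intercept), and `ridge_2d` is a hypothetical helper rather than any library's API.

```python
def ridge_2d(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for two features, no intercept.

    ASSUMPTION: both columns of X and y are already centered.
    lam = 0 gives ordinary least squares.
    """
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    p = sum(r[0] * t for r, t in zip(X, y))
    q = sum(r[1] * t for r, t in zip(X, y))
    det = a * d - b * b
    return ((d * p - b * q) / det, (a * q - b * p) / det)

# Made-up data: the two columns are almost copies of each other.
X = [(-2.0, -2.1), (-1.0, -0.9), (0.0, 0.1), (1.0, 0.9), (2.0, 2.0)]
y = [-2.2, -0.8, 0.1, 1.1, 1.8]

ols = ridge_2d(X, y, lam=0.0)
ridge = ridge_2d(X, y, lam=1.0)
print("OLS:  ", [round(c, 3) for c in ols])
print("Ridge:", [round(c, 3) for c in ridge])
```

On this data least squares puts essentially all the weight on one of the two near-duplicate columns, while the ridge fit splits it almost evenly between them, so no correlation cutoff is needed to keep the fit stable.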
Nathan
Posts: 137
Joined: Sat Jun 22, 2013 4:30 pm

Re: 2014 Draft Projection Models

Post by Nathan »

-I should have gone into greater detail on why exactly I inverted those stats. I did so in order to generate more relevant cross terms, as my intuition is that a cross between two terms that are both positive indicators (or both negative indicators) is more informative than a cross between a positive indicator and a negative indicator. In effect, I get cross terms like assists/turnovers instead of assists*turnovers.

-The arctan helped in particular with the cross terms, where huge outliers were particularly common. Such huge outliers made the model vulnerable to overfitting if the outliers were in the past data, and they caused the model to give nonsensical predictions if the outliers were in this year's data.

-The 0.2 cutoff was indeed arbitrary, but there was no hand picking involved. For each stat, my program ran through all the other stats in my database. When it ran into a stat with which it had >0.2 correlation, it eliminated that correlation with a linear model. I am hesitant to switch away from MCMC as I consider the calculation of uncertainty as a function of college stats to be of significant importance and I do not want to lose that capability.
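
One way to read "eliminated that correlation with a linear model" is residualization: fit a straight line of one stat on the other and keep only what the line doesn't explain. The sketch below works under that assumption; it is not the actual program, and the helper names and numbers are invented.

```python
from statistics import fmean

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def residualize(target, other):
    """Subtract from `target` the part a simple linear fit on `other` explains."""
    mt, mo = fmean(target), fmean(other)
    slope = (sum((o - mo) * (t - mt) for o, t in zip(other, target))
             / sum((o - mo) ** 2 for o in other))
    return [t - (mt + slope * (o - mo)) for t, o in zip(target, other)]

def decorrelate(target, other, cutoff=0.2):
    """Residualize only when |correlation| exceeds the cutoff (0.2 in the post)."""
    if abs(pearson(target, other)) > cutoff:
        return residualize(target, other)
    return list(target)

points = [10.0, 14.0, 18.0, 22.0, 30.0]   # hypothetical stat
minutes = [20.0, 25.0, 28.0, 33.0, 36.0]  # hypothetical stat correlated with it

cleaned = decorrelate(points, minutes)
print(round(pearson(cleaned, minutes), 6))  # essentially zero
print(round(pearson(cleaned, points), 3))   # still positively related to the original
```

The second printout mirrors the sanity check mentioned earlier, that each processed "stat" still correlates with the stat it used to be.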
nbacouchside
Posts: 151
Joined: Sun Jul 14, 2013 4:58 am

Re: 2014 Draft Projection Models

Post by nbacouchside »

I ran a stepwise (backward and forward) regression on per-40 pace-adjusted stats for the 02-12 draft classes to predict rookie-year xRAPM, and for the 02-11 classes to predict 3rd-year xRAPM. [Edit: results in later posts]
Last edited by nbacouchside on Fri Jul 04, 2014 5:35 pm, edited 1 time in total.
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: 2014 Draft Projection Models

Post by Crow »

Lots to look at. Thanks guys.

Nathan, no biggie but one or two decimal places might be preferable.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: 2014 Draft Projection Models

Post by DSMok1 »

nbacouchside wrote:I ran a stepwise back and fwd regression on per 40 pace adjusted stats for draft classes from 02-12 to predict rookie and 3rd year xRAPM (02-11 classes for this one).
Hmmm... look where the OKC Thunder draft picks in these 2 drafts come up in the 1 year xRAPM prediction!! You might be on to their drafting basis. Very interesting work.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1