Re: Early check on rookie projections

Posted: Fri Dec 19, 2014 8:07 pm
by Statman
colts18 wrote:I don't get why you aren't including height in your model. Your model is supposed to be predictive, so adding height should add prediction.
My model is predictive based solely on actual on-court performance, on purpose. I'll run every college guy (that way I KNOW it is working like it should - I'm not limiting my data set to make the results "look" better by keeping guys who won't be drafted from ranking high) - & for many of those guys I can't trust the height numbers anyway. I feel that as soon as I try to add other non-performance factors (height, standing reach, consensus draft position, rank coming out of high school, etc.), I'm getting away from my point: that it is possible to project future performance from college to pro based solely on college production. Before I started really delving into this, I was told that college production was pretty much worthless (one guy who told me that was an actual NBA owner - simple to guess who that is). I want to show that it may well be just the opposite - it might be the most important factor to consider, relative to age of course.
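To give a flavor of what I mean by projecting from production alone, here's a toy sketch (NOT my actual ratings - the stat weights & the age adjustment below are made up purely for illustration):

[code]
# Toy sketch of a production-only projection (illustrative weights, not the real model).
# Idea: rate a prospect from box-score production per 40 minutes, then adjust for age,
# since younger players with the same production tend to improve more.

def per40(stat_total, minutes):
    """Scale a season total to a per-40-minute rate."""
    return 40.0 * stat_total / minutes if minutes else 0.0

def production_rating(season):
    # season: dict of season totals; the weights here are placeholders.
    mp = season["minutes"]
    return (1.0 * per40(season["pts"], mp)
            + 1.2 * per40(season["reb"], mp)
            + 1.5 * per40(season["ast"], mp)
            + 2.0 * per40(season["stl"], mp)
            + 2.0 * per40(season["blk"], mp)
            - 1.5 * per40(season["tov"], mp))

def age_adjusted_rating(season, age):
    # Hypothetical age adjustment: credit younger players, centered at age 22.
    return production_rating(season) * (1.0 + 0.05 * (22 - age))
[/code]

The point isn't the particular weights - it's that nothing but production & age goes in.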

I expect I can eventually make the model better with added complexity beyond performance (something I would obviously do if I had 40 hours a week to work solely on this project) - but there is a TON of testing I'd have to do before I ever got comfortable delving into non-performance factors & how much they'd affect results. But yes, if I were in a position & had the time to test every factor I could come up with, I'd do whatever I could to smooth out the outliers (the largest differences between projection & actual) that pop up - without creating new ones.

Heck - I haven't even come close to finishing the past 19 seasons of college ratings to FULLY test the past results of what I do now, let alone trying to work non-performance factors in. I feel I need to try to perfect (the best I can) one step at a time. Performance-based projection based on historical precedent (across all statistical rating subsets) is step 1. I believe similarity scores built on those same statistical subsets, with the massive data set I have, might be step 2 to improve step 1 (not certain) - but that'd be performance-based also. Step 3 would then be introducing other non-performance factors that seem to address where the model misses.
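For anyone wondering what I mean by similarity scores, here's the rough idea (again just a sketch - the stat list & the plain Euclidean distance are stand-ins, not my actual subsets): z-score each stat across the historical pool, then find the historical players closest to the prospect.

[code]
# Rough sketch of similarity scores (stat list & distance choice are placeholders).
import math

STATS = ["pts40", "reb40", "ast40", "stl40", "blk40", "tov40"]

def zscore_params(pool):
    """Mean & stdev for each stat across the historical pool of player dicts."""
    params = {}
    for s in STATS:
        vals = [p[s] for p in pool]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
        params[s] = (mean, sd)
    return params

def distance(a, b, params):
    """Euclidean distance in z-scored stat space; smaller = more similar."""
    return math.sqrt(sum(((a[s] - b[s]) / params[s][1]) ** 2 for s in STATS))

def most_similar(prospect, pool, n=10):
    """The n historical players statistically closest to the prospect."""
    params = zscore_params(pool)
    return sorted(pool, key=lambda p: distance(prospect, p, params))[:n]
[/code]

The natural use would be letting those comparables' pro outcomes inform the projection - but that's exactly the part I'd have to test.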

If step 1 is "better" than actual draft history, steps 2 & possibly 3 would be icing. I THINK step 1 will outperform actual draft history on its own when I get to test it.

But, all that being said - my draft model & everyone else's are never going to be close to perfect. BUT, in combination with good scouting & due diligence in learning about possible draftees, a model can really help steer a team away from guys projected too high & toward the guys who will be bargains. I think it could actually help land PRODUCTIVE players later in a draft, & especially help stock a D-League affiliate. Scouts can't look at (let alone properly evaluate) 1000s of prospects every year, but properly put-together models can, & they can narrow that list greatly for the scouts.
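That narrowing step is the simple part - something like this (reusing the toy rating from the sketch above; the cutoff of 100 is arbitrary):

[code]
# Toy narrowing step: rate every prospect, hand scouts only the top slice.
def shortlist(prospects, rate, n=100):
    """prospects: list of (name, season_totals, age) tuples.
    rate: a rating function like age_adjusted_rating above.
    Returns the top-n names for scouts to actually go evaluate."""
    ranked = sorted(prospects, key=lambda p: rate(p[1], p[2]), reverse=True)
    return [name for name, _, _ in ranked[:n]]
[/code]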

All the non-performance data in the world was never going to get my model to spit out that Andrew Wiggins was a worthy #1 pick, let alone a future star. MAYBE it would have moved him up from #17 to, say, #10 - which means if a team was ever listening to me, they wouldn't have drafted him at #1 anyway.