APBRmetrics

Posted: **Tue Aug 19, 2014 7:07 pm**

http://tothemean.com/2014/06/17/machine ... draft.html

I have read it once. Would appreciate hearing feedback on it, especially from those with higher analytic training & experience.

Posted: **Tue Aug 19, 2014 7:29 pm**

http://tothemean.com/2014/06/25/finding ... e-nba.html

It would be interesting to extend this from finding draft inefficiencies to finding performance/pay inefficiencies in the free-agent market.

Posted: **Tue Aug 19, 2014 11:00 pm**

I have a few minor quibbles, but overall it seems solid, and it's pretty cool to see a large number of techniques tested against each other.

In general I think (it's obvious) that if we want to improve on existing methods, trying different things (a lot of which will come from ML) is the way to go.

Here, and probably in many cases in the future, it doesn't seem like we could gain much from using other ML techniques over linear regression. That doesn't mean we should stop trying, though.

Posted: **Tue Aug 19, 2014 11:45 pm**

Thanks for sharing your perspective.

Posted: **Tue Aug 19, 2014 11:58 pm**

imo, the big flaw here is that he picked a baddish dv, and no matter what machine learning technique you use, it cant overcome a bad choice of dv.

and i dont get why he tried all the things that he did but he didnt try gradient boosting. it wouldnt have made much of a difference tho, probably.
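for anyone who hasnt seen gradient boosting spelled out, here is a rough sketch of the idea: start from the mean, then keep fitting tiny models to whatever error is left over. this is a toy version with made-up data and made-up function names (`fit_stump`, `boost`), squared error only, and one-feature decision stumps as the weak learner. it is not what the article did, and a real library implementation does a lot more.

```python
# Toy gradient boosting for squared error with one-feature decision stumps.
# Data and function names are invented for illustration only.

def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(xs, ys, rounds, rate=0.1):
    """Fit `rounds` stumps, each one to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)  # start from the mean
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        # For squared error the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, rate, stumps

def predict(model, x):
    base, rate, stumps = model
    return base + sum(rate * (lm if x <= t else rm) for t, lm, rm in stumps)

def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [x * x for x in xs]  # a curve that a straight line cannot fit exactly

for rounds in (0, 5, 50):
    print(rounds, round(mse(boost(xs, ys, rounds), xs, ys), 2))
```

the printed numbers are training error only, so they say nothing about out-of-sample performance, which is the thing that matters for a draft model.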

Posted: **Wed Aug 20, 2014 8:42 am**

Yeah, the choice of the dependent variable was obviously a strange one, and I agree that's tough/impossible to overcome. I saw the article as more of an exercise to test ML techniques against each other, although I may be wrong. With that dependent variable, though, all the statements regarding 'comparison with actual GM performance' obviously go out the window.

Another aspect that might improve this analysis is to try more than one regression technique (AIC, BIC, LASSO, ELASTICNET, RIDGE..), or at least mention which kind was used. If he used, let's say, stepwise regression with t-tests then that would (very likely) lead to worse results than with the methods mentioned above, and thus make "regression" look worse in comparison to other techniques than it actually is
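To make the difference between plain least squares and a penalized fit concrete, here is a small sketch of ridge regression next to OLS on two nearly collinear predictors. The data, the penalty value and the helper names (`solve`, `ridge`) are all made up for illustration, the columns are already centered so there is no intercept, and none of this is taken from the article.

```python
# Toy comparison of OLS and ridge on two highly correlated predictors.
# All data and names are hypothetical.

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ridge(X, y, lam):
    """Minimize ||y - X b||^2 + lam * ||b||^2; lam = 0 gives plain OLS."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Two centered, nearly identical predictors; y is built as 3*x1 - 1*x2.
X = [[-2.0, -1.9], [-1.0, -1.1], [0.0, 0.1], [1.0, 0.9], [2.0, 2.0]]
y = [3 * x1 - x2 for x1, x2 in X]

ols = ridge(X, y, lam=0.0)
shrunk = ridge(X, y, lam=1.0)
print("OLS:  ", ols)
print("ridge:", shrunk)
```

The penalty pulls the two coefficients toward each other and toward zero. That is usually a disadvantage on clean data like this, but it tends to help when the predictors are noisy and correlated, which is the typical situation with box-score stats.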

Posted: **Thu Aug 21, 2014 2:31 am**

J.E. wrote: Another aspect that might improve this analysis is to try more than one regression technique (AIC, BIC, LASSO, ELASTICNET, RIDGE..), or at least mention which kind was used. If he used, let's say, stepwise regression with t-tests then that would (very likely) lead to worse results than with the methods mentioned above, and thus make "regression" look worse in comparison to other techniques than it actually is.

hey, good catch.

when i read it, i just assumed that when he said "linear regression", he meant vanilla OLS. it didnt occur to me that he may have used a 'better' regression technique, and was just unclear/opaque in his description. i was sorta surprised that what he called linear regression did as well as it did compared to some of the ML stuff. im now thinking that he may have used elastic net or something and just not been clear about his choice of regression model...

Posted: **Thu Aug 21, 2014 2:58 am**

I sent him a brief notification of the post. Perhaps he will stop by.

Posted: **Mon Aug 25, 2014 4:26 am**

Thanks for posting my work here and providing some discussion. It's great to hear some feedback on my work, so keep it coming! I'll respond to some of the comments above.

-In regards to dv choice - As my article described I struggled on what to use here. I agree that what I ended up using is a bit strange but the values it gave for historical players seemed to have a good balance of longevity and stardom which are things I would think most teams believe are valuable (but obviously every team has their own mindset/strategy so maybe not?). I am by no means set on that dv and it would be very easy to swap it out for something else and see how the models change.
-In regards to "gradient boosting" - it's probably worth trying to see how it changes performance. However, at the time I was more interested in trying out different general machine learning algorithms than in trying to enhance them individually too much. Now that I know which ones stand out the most without much tweaking, the next step would be to optimize the models individually with things like boosting/bagging/blending/tuning parameters/variable selection/etc. Tuning these kinds of models is a never-ending game, so I just wanted to put out something which shows potential and leave tuning/enhancing for down the road.
-In regards to type of LR - I used AIC and didn't really explore other types. I personally haven't seen a huge difference in results between various types, and AIC has always performed the best for me, but I agree that it wouldn't hurt (and wouldn't be difficult) to try them out and see. It would also be interesting to see how much the various types vary in performance. This is along the same lines as my comment about boosting - I had to draw the line somewhere and couldn't explore everything. I was also interested in seeing how models that are not based on linear regression compare, since most people out there doing this stuff are using linear regression AFAIK.
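
For readers wondering what "using AIC" can look like in practice, here is a minimal sketch of one forward-selection step. It assumes "AIC" means choosing predictors by AIC, which is only one reading of the comment above. The data and the names (`rss_simple`, `aic`) are made up, and the formula is the usual Gaussian form n*ln(RSS/n) + 2k with constants dropped.

```python
import math

# One step of AIC-based forward selection on toy data (all values hypothetical).

def rss_simple(x, y):
    """Residual sum of squares for y ~ intercept + slope * x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant; k counts fitted coefficients."""
    return n * math.log(rss / n) + 2 * k

x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 1, 4, 2, 8, 3, 7, 6]  # only weakly related to y
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.3, -0.1, -0.2]
y = [2 * a + e for a, e in zip(x1, noise)]

n = len(y)
mean_y = sum(y) / n
scores = {
    "intercept": aic(sum((yi - mean_y) ** 2 for yi in y), n, k=1),
    "x1": aic(rss_simple(x1, y), n, k=2),
    "x2": aic(rss_simple(x2, y), n, k=2),
}
best = min(scores, key=scores.get)
print(scores, "->", best)
```

A full stepwise procedure would repeat this, adding or dropping one predictor at a time until the AIC stops improving.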

This is a great forum, I'm stoked to add this to my daily reading. Thanks for finding me.

Posted: **Mon Aug 25, 2014 9:53 am**

Welcome to the forum

Given that you did use AIC (and not simple OLS) and have a large sample size I doubt you'll see much improvement with other linear regression methods, but yes, trying them wouldn't hurt

Posted: **Mon Aug 25, 2014 5:09 pm**

hey, thanks for coming by and replying.

perhaps for dv, youd want to try something RAPM-based? you could still include something for longevity by using MP as you did originally. PER doesnt handle defense well at all, and may overstate the benefit of usage (prolly not as much as a lot of people think, tho). the primary issue would be the defense issue.

i do get that the choice of dv may not be extremely relevant if your purpose is more to compare ML models, as opposed to making the best possible draft projection model.

im sure youll quite enjoy this forum.

machine learning draft prediction model
