Appr. 5.x year reg. adj. +/- (J.E., 2010)

J.E. · Post by **J.E.** » Fri May 20, 2011 10:44 am

Euroleague update!

http://stats-for-the-nba.appspot.com/euroleague-ranking

Now with the full 10/11 Euroleague season. Still includes 09/10.

Important for the NBA:

-Rubio comes out as the best offensive player, #4 overall; he's scoring 6.5 pts/g on 31% FG, though. For some reason Barca's offense seems to suffer when he's not playing.
-Valanciunas looks positive on defense. I guess that's a pretty good thing for a 19-year-old
-Vesely looks even better on defense, a little worse on offense
-Bogdanovic still dead last

EvanZ · Post by **EvanZ** » Fri May 20, 2011 11:50 am

Interesting to find Bojan Bogdanovic at the bottom. I thought he was a potential late first round pick.

J.E. · Post by **J.E.** » Sat May 21, 2011 1:00 pm

EvanZ wrote:Interesting to find Bojan Bogdanovic at the bottom. I thought he was a potential late first round pick.

Well, if you look at http://www.in-the-game.org/?page_id=9945 you see opponents scoring at a crazy rate when he's in the game. He has played a good number of minutes, too, so the algorithm is not afraid to shift a lot of the blame onto him.

What I found interesting is that most of the top players, by this metric, are not exactly young. Rubio really sticks out

bbstats · Post by **bbstats** » Sun May 22, 2011 1:04 pm

Sorry if this is slightly/very OT, but I keep wondering:

Would there be any benefit to running a brute-force solver (like Excel's new evolutionary algorithm) that specifically tends towards low Standard Errors for player values? Is this essentially what RAPM does?

That is, to set up the equation to improve (lower) each player's individual standard error rather than the sum of all the squared errors (which is what least-squares does, in my understanding).

To me this seems extremely similar to the RAPM system, but I have no idea to what degree. I also have no idea where the Lambda values would fit in this situation, mostly because I still don't understand what the Lambdas do, exactly.

I tried to learn regularization on my own but the library at my school didn't seem to have much to say about it.

EvanZ · Post by **EvanZ** » Sun May 22, 2011 1:36 pm

The way I understand it is that the lambda is just a Lagrange multiplier that is used to enforce some additional constraint (i.e. in addition to the least squares minimization). I don't know exactly what that constraint is for RAPM, but for example, in mechanics it would be used to enforce incompressibility of a body. It is also sometimes called a penalty method.
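For readers who want the "constraint plus multiplier" picture written out, here is the textbook ridge regression formulation. It is a sketch under the assumption that RAPM is plain ridge regression on a stint-by-player design matrix; the symbols $X$ (who is on the floor), $y$ (observed point margin), $\beta$ (player ratings) and $t$ (a size budget for the ratings) are illustrative names, not taken from this thread.

```latex
% Constrained form: least squares with a budget on the size of the ratings
\hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2
\quad \text{subject to} \quad \lVert \beta \rVert_2^2 \le t

% Equivalent penalized (Lagrangian) form: lambda is the multiplier on the constraint
\hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2

% Closed-form solution of the penalized problem
\hat{\beta} = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y
```

Read this way, a larger $\lambda$ corresponds to a tighter budget $t$, so every rating is pulled harder toward zero; $\lambda = 0$ recovers ordinary least squares.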

DSMok1 · Post by **DSMok1** » Mon May 23, 2011 11:49 am

bbstats wrote:Sorry if this is slightly/very OT, but I keep wondering:

Would there be any benefit to running a brute-force solver (like Excel's new evolutionary algorithm) that specifically tends towards low Standard Errors for player values? Is this essentially what RAPM does?

That is, to set up the equation to improve (lower) each player's individual standard error rather than the sum of all the squared errors (which is what least-squares does, in my understanding).

To me this seems extremely similar to the RAPM system, but I have no idea to what degree. I also have no idea where the Lambda values would fit in this situation, mostly because I still don't understand what the Lambdas do, exactly.

I tried to learn regularization on my own but the library at my school didn't seem to have much to say about it.

EvanZ wrote:The way I understand it is that the lambda is just a Lagrange multiplier that is used to enforce some additional constraint (i.e. in addition to the least squares minimization). I don't know exactly what that constraint is for RAPM, but for example, in mechanics it would be used to enforce incompressibility of a body. It is also sometimes called a penalty method.

Basically, RAPM does what EvanZ said, the way I understand it as well. The requirement is to minimize out-of-sample error from a k-fold cross-validation routine by adjusting how strongly the regression is pulled toward the mean. Ridge regression differs from a normal "regression to the mean" because the shrinkage is applied within the regression itself and because all variables are normalized (by the algorithm) before the regression is run. The latter point doesn't mean much to us since we're not dealing with multiple scales for our variables.
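To make the cross-validation step concrete, here is a minimal NumPy sketch of picking lambda by k-fold out-of-sample error. It is not the code used for the ratings in this thread: the helper names (`ridge_fit`, `cv_error`, `choose_lambda`), the synthetic +1/-1/0 design matrix and the candidate lambda grid are all assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^-1 X'y."""
    n_players = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y)

def cv_error(X, y, lam, k=5, seed=0):
    """Mean squared out-of-sample error averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False          # hold this fold out
        beta = ridge_fit(X[train_mask], y[train_mask], lam)
        errors.append(np.mean((y[test_idx] - X[test_idx] @ beta) ** 2))
    return float(np.mean(errors))

def choose_lambda(X, y, candidates, k=5):
    """Return the candidate lambda with the lowest cross-validated error."""
    scores = {lam: cv_error(X, y, lam, k) for lam in candidates}
    return min(scores, key=scores.get), scores

# Synthetic toy data (hypothetical): +1 = on court for the home side,
# -1 = on court for the away side, 0 = on the bench.
rng = np.random.default_rng(42)
n_stints, n_players = 400, 30
X = rng.choice([-1.0, 0.0, 1.0], size=(n_stints, n_players), p=[0.15, 0.7, 0.15])
true_beta = rng.normal(0, 2, n_players)
y = X @ true_beta + rng.normal(0, 12, n_stints)

best_lam, scores = choose_lambda(X, y, [0.1, 1.0, 10.0, 100.0, 1000.0])
print("chosen lambda:", best_lam)
```

The same folds are reused for every candidate (fixed `seed`), so the comparison between lambdas is not distorted by different random splits.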

J.E. · Post by **J.E.** » Sun May 29, 2011 12:27 pm

Thanks to http://www.in-the-game.org

Spanish ACB! 2011 only

http://stats-for-the-nba.appspot.com/acb-ranking

Rubio looks good once again

Crow · Post by **Crow** » Sun May 29, 2011 5:12 pm

Biyombo estimated as slightly negative, but with a confidence interval between moderately good and moderately negative and a league transition to make, this metric is not a decisive bit of information about him.

Prestes in the top 10 (edit: make that bottom 10) of the ACB on estimated impact would seem to be more concerning and call for careful work with all the information you could gather. But maybe he got "unlucky" in some way(s) (coach, teammates, role, timing in the rotation or timing of opponent and teammate performance) or maybe it is just too early to judge him.

hoopthinker · Post by **hoopthinker** » Mon May 30, 2011 4:50 am

J.E. wrote:Thanks to http://www.in-the-game.org

Spanish ACB! 2011 only

http://stats-for-the-nba.appspot.com/acb-ranking

Rubio looks good once again

Where did you find the play-by-play data for ACB at in-the-game.org? I couldn't find it anywhere on their site.
The results seem to agree with common perception, though, except maybe for Rubio. Victor Sada, the backup PG, seems to be more trusted by his coach.
Keep up the very interesting research.

J.E. · Post by **J.E.** » Mon May 30, 2011 9:10 am

Crow wrote: Prestes in the top 10 of the ACB on estimated impact

You mean bottom 10, I suppose?

hoopthinker wrote: Where did you find the play-by-play data for ACB at in-the-game.org? I couldn't find it anywhere on their site.

There's no play-by-play data, but there's http://www.acbdata.net/autoserviciong/ficheros/cp/
They don't list possessions, just minutes, so I had to estimate possessions. That's why you don't see an offense/defense split for those numbers: with estimated possessions, such a split could have been quite misleading because of team pace.
in-the-game.org converted all those files into matchup files. Once he wants to make them publicly available, you can find them at http://www.in-the-game.org/?page_id=11733

hoopthinker wrote: Victor Sada, the backup PG, seems to be more trusted by his coach.

I'm sorry, I don't follow that league. What does that mean? Rubio does play more minutes, doesn't he? Does Sada play more at the end of games?

Crow · Post by **Crow** » Mon May 30, 2011 6:02 pm

"You mean bottom 10, I suppose?"

Yes I meant Prestes in bottom 10 on RAPM estimate. Thanks for the correction.

J.E. · Post by **J.E.** » Thu Jun 09, 2011 5:25 pm

I get mightily improved out-of-sample prediction when I
use 0 as a prior for 2006
use 2006 ratings as prior for 2007
and so on

Don't have time to post results right now but the results look encouraging
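One common way to feed last year's ratings into a ridge fit as a prior is to shrink toward the prior vector instead of toward zero. The sketch below shows that idea only; the function name `ridge_with_prior` and the toy data are assumptions, and the thread does not show the actual implementation.

```python
import numpy as np

def ridge_with_prior(X, y, lam, prior):
    """Minimize ||y - X b||^2 + lam * ||b - prior||^2.

    Substituting d = b - prior turns this into ordinary ridge on the
    residual y - X @ prior, so the answer is prior + ridge(residual).
    """
    n_players = X.shape[1]
    residual = y - X @ prior
    delta = np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ residual)
    return prior + delta

# Toy usage with made-up numbers: four players, last season's ratings as prior.
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(200, 4))
y = X @ np.array([3.0, -1.0, 0.5, 0.0]) + rng.normal(0, 10, 200)
last_year = np.array([2.0, -2.0, 0.0, 1.0])   # hypothetical prior ratings

print(ridge_with_prior(X, y, lam=50.0, prior=last_year))
```

With a prior of all zeros this reduces to standard ridge regression, and as `lam` grows the ratings converge to the prior rather than to zero.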

DSMok1 · Post by **DSMok1** » Thu Jun 09, 2011 6:00 pm

Sounds cool! How do you choose the Lambda for each year?

J.E. · Post by **J.E.** » Thu Jun 09, 2011 10:47 pm

DSMok1 wrote:Sounds cool! How do you choose the Lambda for each year?

Cross-validation on the data of the year I'm currently working with. Lambda was between 3000 and 5000 for every year

J.E. · Post by **J.E.** » Fri Jun 10, 2011 3:20 pm

J.E. wrote:I get mightily improved out-of-sample prediction when I
use 0 as a prior for 2006
use 2006 ratings as prior for 2007
and so on

Don't have time to post results right now but the results look encouraging

Here are the player ratings.

http://stats-for-the-nba.appspot.com/ranking_rec

They look very similar to the 4 year ratings. The correlation (R) between the two is 0.91

Rookies were given a prior of -1/-1, but that part only helped a tiny bit

Biggest drop in rating from regular 4 year to this version (rookies not listed):
Stevenson, DeShawn
Harden, James
Williams, Mo
Hill, Grant

Biggest gain:
Collison, Nick
Redd, Michael
Gay, Rudy
Deng, Luol
Cook, Brian
Durant, Kevin

This method produces better out-of-sample error (on a per-possession basis) than the one year and the 4 year versions, so I recommend using it in favor of the other 2.

The next test will use rating/2 as the prior, i.e. assuming a regression to the mean for everyone
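The season-by-season scheme described in this post (zero prior in the first season, last season's rating as the next prior, a fixed prior for rookies, optionally rating/2 instead of the full rating) can be sketched as a loop. This is only an illustration under stated assumptions: one combined rating per player rather than the offense/defense pair, a single lambda for all seasons, and hypothetical names (`chain_seasons`, `carry`, `rookie_prior`) and data layout.

```python
import numpy as np

def ridge_with_prior(X, y, lam, prior):
    """Ridge fit that shrinks toward `prior` instead of toward zero."""
    delta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]),
                            X.T @ (y - X @ prior))
    return prior + delta

def chain_seasons(seasons, lam, rookie_prior=-1.0, carry=1.0):
    """Fit seasons in order, feeding each season's ratings into the next.

    seasons: list of (player_names, X, y), oldest first; column j of X
             belongs to player_names[j].
    carry:   fraction of the previous rating used as the prior
             (1.0 = full rating, 0.5 = the rating/2 variant).
    """
    ratings = {}   # most recent rating seen for each player
    history = []   # one {player: rating} dict per season
    for i, (players, X, y) in enumerate(seasons):
        if i == 0:
            prior = np.zeros(len(players))   # first season: prior of 0
        else:
            prior = np.array([carry * ratings[p] if p in ratings else rookie_prior
                              for p in players])
        beta = ridge_with_prior(X, y, lam, prior)
        season_ratings = dict(zip(players, beta))
        ratings.update(season_ratings)
        history.append(season_ratings)
    return history
```

Calling `chain_seasons(seasons, lam, carry=0.5)` would correspond to the rating/2 test; in practice lambda would be re-chosen by cross-validation for each season rather than fixed.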
