Re: Shots at RPM

Posted: Wed Nov 12, 2014 4:23 pm
by permaximum
@mystic

Show me some data and its ridge regression results, then we'll see what's wrong here. Something has to be wrong: the data or the calculation process... I'm pretty sure there's nothing wrong with my calculation process.

Re: Shots at RPM

Posted: Wed Nov 12, 2014 5:01 pm
by mystic
Uh? You said you are using glmnet, which means you can test that yourself. Artificially increase the number of possessions for a low-possession player, either via weights or by adding the same lineup rows to the raw data a couple of times. You can also plug in the lambda manually: use a higher value (like 100) and a lower one (like 10), then compare the standard deviation of the coefficients.
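
The first suggestion can be checked on toy data. The sketch below uses simulated numbers and a hand-written closed-form ridge solver in base R (not glmnet, whose lambda is on a different scale), so `ridge()`, `X`, `y` and `w` are illustrative assumptions. It shows that giving a row an integer weight is the same as repeating that row that many times.

```r
set.seed(1)
n <- 40; p <- 5
X <- matrix(rnorm(n * p), n, p)                 # simulated design matrix
y <- drop(X %*% c(2, -1, 0.5, 0, 1)) + rnorm(n) # simulated response
w <- sample(1:4, n, replace = TRUE)             # made-up integer weights

# Closed-form weighted ridge without intercept: (X'WX + lambda*I)^-1 X'Wy
ridge <- function(X, y, lambda, w = rep(1, nrow(X))) {
  XtW <- t(X * w)                               # row i of X scaled by w[i]
  solve(XtW %*% X + lambda * diag(ncol(X)), XtW %*% y)
}

b_weighted <- ridge(X, y, lambda = 10, w = w)

# Same fit with unit weights, but each row repeated w[i] times
idx <- rep(seq_len(n), times = w)
b_repeated <- ridge(X[idx, ], y[idx], lambda = 10)

all.equal(b_weighted, b_repeated)
```

Raising `w` for the rows of one player therefore has the same effect as pasting those lineup rows into the raw data again.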

Re: Shots at RPM

Posted: Wed Nov 12, 2014 5:06 pm
by permaximum
No significant changes. It's not too hard or time-consuming to calculate 2007-08 RAPM, or later seasons', with basketballvalue.com data. Just do it and let's see what's wrong here.

Re: Shots at RPM

Posted: Wed Nov 12, 2014 5:20 pm
by mystic
https://yourlogicalfallacyis.com/burden-of-proof

I'll leave it at that, because I can't take seriously someone who basically wants to argue that the weights and lambda have no effect on the result of a ridge regression.

Re: Shots at RPM

Posted: Wed Nov 12, 2014 5:37 pm
by permaximum
Calm down pal, I didn't say weights or lambda have no effect on the result of a ridge regression. That's demagoguery. I'm just saying players with few possessions can be at the top of the results. If there's something wrong in the process, I'm saying let's find out.

However, in practice, not in theory. If you don't have the time for that, fine... Still, I suggest you let your work do the talking. Especially in a forum called APBRmetrics.

While we are talking about letting work do the talking, why didn't the keeper of RPM participate in the 2014-15 prediction contest?

Edit: I'm insisting on this because I saw one case of yours here where you calculated the lambda wrong and skewed the ridge regression results.

For this data: viewtopic.php?t=7919#p11777

You calculated it here : viewtopic.php?t=7919#p11782

So there is only one instance showing your ability to calculate a ridge regression, and you did it wrong... I'm an amateur but you're a professional. You're not supposed to make that kind of mistake.

Re: Shots at RPM

Posted: Wed Nov 12, 2014 5:49 pm
by mystic
permaximum wrote:I'm just saying players with few possessions can be at the top of the results. If there's something wrong in the process, I'm saying let's find out.
Nothing is wrong with the process if such a thing happens. Maybe we misunderstood each other here, because I did not argue that such players can't have extreme values.
permaximum wrote: However, in practice, not in theory. If you don't have the time for that, fine... Still, I suggest you let your work do the talking. Especially in a forum called APBRmetrics.
Not quite sure what you are even talking about, but well, let us start with you uploading the raw data you used and the R script, in order to check whether "no significant changes" is a truthful answer. Shall we?
permaximum wrote: While we are talking about letting work do the talking, why didn't the keeper of RPM participate in the 2014-15 prediction contest?
No idea, but I also don't see why that should matter.
permaximum wrote: Edit: I'm insisting on this because I saw one case of yours here where you calculated the lambda wrong and skewed the ridge regression results.

So there is only one instance showing your ability to calculate a ridge regression, and you did it wrong... I'm an amateur but you're a professional. You're not supposed to make that kind of mistake.
:lol:

Sorry, but that is so much bs, seriously, how would you know that the lambda was "calculated wrong"? I didn't use glmnet for that example, just saying ...

Re: Shots at RPM

Posted: Wed Nov 12, 2014 6:06 pm
by permaximum
Because when you use the lambda that yields the minimum CV error, you should find coefficients around these values.

Intercept: 8.668964
Fig1 : 0.1984727
Fig2 : 0.1969857
Fig3 : 0.4239485

Let's look at your findings.

(Intercept) 20.2519966
Fig1 0.1828746
Fig2 0.1930801
Fig3 0.2523306

I didn't use glmnet here either, BTW.
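
What "the lambda that yields the minimum CV error" means can be sketched in base R. Everything here is an assumption for illustration: the data are simulated (only the column names Fig1 to Fig3 echo the example), `ridge()` is a hand-written closed-form solver without intercept, and the lambda grid and the 5 folds are arbitrary. It is not the calculation either poster ran.

```r
set.seed(42)
n <- 60
X <- matrix(rnorm(n * 3), n, 3,
            dimnames = list(NULL, c("Fig1", "Fig2", "Fig3")))
y <- drop(X %*% c(0.2, 0.2, 0.4)) + rnorm(n, sd = 0.5)   # simulated response

# Closed-form ridge without intercept: (X'X + lambda*I)^-1 X'y
ridge <- function(X, y, lambda) {
  solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y))
}

lambdas <- c(0.01, 0.1, 1, 1.7, 10, 17, 100)   # arbitrary candidate grid
folds <- sample(rep(1:5, length.out = n))      # random 5-fold assignment

# Mean squared prediction error on the held-out fold, averaged over folds
cv_error <- sapply(lambdas, function(l) {
  mean(sapply(1:5, function(k) {
    test <- folds == k
    b <- ridge(X[!test, , drop = FALSE], y[!test], l)
    mean((y[test] - X[test, , drop = FALSE] %*% b)^2)
  }))
})

best <- lambdas[which.min(cv_error)]
print(best)
print(drop(ridge(X, y, best)))        # coefficients at the CV-optimal lambda
print(drop(ridge(X, y, 10 * best)))   # a lambda ten times too large shrinks them further
```

Comparing the last two printed vectors shows how a misplaced decimal point in lambda changes the coefficients.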

Re: Shots at RPM

Posted: Wed Nov 12, 2014 6:56 pm
by colts18
J.E., in your statistical plus/minus that is used for RPM, do you split turnovers into live-ball (steals) and dead-ball (all other turnovers)? That should give slightly more credit to players whose turnovers are dead-ball (allowing the defense to recover) rather than stolen (resulting in an easier shot for the opponent).

Re: Shots at RPM

Posted: Wed Nov 12, 2014 7:32 pm
by permaximum
colts18 wrote:J.E., in your statistical plus/minus that is used for RPM, do you split turnovers into live-ball (steals) and dead-ball (all other turnovers)? That should give slightly more credit to players whose turnovers are dead-ball (allowing the defense to recover) rather than stolen (resulting in an easier shot for the opponent).
I once split them too, along with blocked misses, but found that the difference wasn't worth the trouble. However, it's probably not a "trouble" for him since he always uses PBP data.

Re: Shots at RPM

Posted: Wed Nov 12, 2014 10:01 pm
by mystic
permaximum wrote:Because when you use the lambda that yields the minimum CV error, you should find coefficients around these values.
I just ran the regression on those numbers and found the lambda to be 1.8, and subsequently coefficients close to yours. No idea how I arrived at "17" 2.5 years ago; maybe the result of the CV was 1.7 and I misread it and then plugged in the wrong lambda manually? The old regression results are based on a lambda of exactly 17 (which points to that). As you see, even "pros" can make mistakes, and it would be silly to assume that somebody, even well trained in a subject, would be infallible, especially if something like that is done as quickly as I did it back in 2012.

Anyway, that does not address the issue at hand, where you claimed that a different number of possessions or a change of the lambda wouldn't have an effect on the results of the low-possession players. Are you willing to upload the raw data you used (meaning the already prepared matrix for the regression) as well as your R script, so that I or others can check whether the results would indeed be no different?

Re: Shots at RPM

Posted: Wed Nov 12, 2014 10:51 pm
by permaximum
Here's the data. And the R script. It is so simple...

Code: Select all

memory.limit(6051)       # Windows-only: raise R's memory cap (in MB)
library(doParallel)      # parallel backend for the CV folds
library(glmnet)
rapm <- read.table("c:/users/****/yyyyy/rapm2008.csv", sep=";", header=TRUE)
registerDoParallel(3)    # use 3 workers
x <- as.matrix(rapm[c("O1","O2","O3","O4","O5","O6","O7","O8","O10","O11","O12","O14","O15","O16","O17","O19","O20","O21","O22","O23","O24","O25","O26","O28","O29","O30","O31","O32","O33","O34","O35","O36","O37","O39","O40","O41","O42","O44","O45","O46","O47","O48","O49","O50","O51","O52","O54","O55","O56","O59","O60","O61","O62","O63","O64","O65","O66","O67","O68","O69","O70","O71","O72","O73","O74","O75","O76","O77","O78","O79","O80","O83","O85","O86","O87","O88","O89","O90","O91","O92","O93","O94","O95","O96","O97","O98","O100","O101","O103","O109","O110","O111","O112","O113","O114","O115","O116","O117","O118","O119","O121","O122","O123","O124","O125","O126","O127","O128","O130","O131","O132","O133","O136","O137","O138","O139","O140","O141","O143","O144","O145","O146","O147","O148","O149","O150","O151","O152","O154","O155","O156","O158","O159","O161","O163","O164","O165","O166","O167","O168","O169","O170","O171","O172","O173","O174","O176","O180","O181","O185","O186","O187","O188","O189","O190","O191","O193","O194","O196","O197","O198","O199","O200","O201","O202","O203","O204","O205","O206","O209","O210","O211","O212","O214","O215","O218","O220","O221","O222","O223","O224","O226","O231","O235","O236","O237","O238","O239","O240","O241","O243","O244","O245","O246","O247","O249","O250","O251","O252","O253","O254","O255","O257","O258","O260","O261","O262","O263","O264","O265","O266","O269","O270","O271","O272","O273","O274","O275","O276","O277","O278","O279","O280","O281","O283","O284","O285","O286","O287","O288","O289","O292","O293","O294","O295","O296","O297","O298","O300","O301","O302","O304","O305","O306","O307","O308","O309","O311","O312","O313","O315","O316","O317","O318","O319","O320","O321","O322","O323","O324","O326","O327","O328","O329","O330","O331","O332","O337","O344","O348","O349","O353","O354","O355","O356","O357","O362","O366","O367","O372","O373","O377","O382","O383","O389","O390","O392","O401","O403","O408","O409","O410","O411","O412","O413","O416",
"O417","O418","O419","O426","O430","O433","O436","O439","O444","O450","O453","O454","O458","O459","O460","O467","O468","O470","O472","O477","O483","O484","O486","O488","O489","O492","O501","O505","O508","O510","O514","O515","O516","O517","O518","O525","O531","O537","O541","O544","O550","O551","O552","O553","O555","O556","O557","O558","O559","O560","O561","O562","O563","O565","O567","O569","O572","O573","O574","O575","O577","O578","O579","O580","O581","O585","O588","O589","O592","O593","O594","O595","O596","O597","O598","O599","O600","O601","O602","O605","O606","O607","O608","O609","O610","O612","O614","O617","O622","O629","O636","O640","O643","O644","O646","O650","O653","O655","O659","O660","O661","O662","O663","O664","O665","O666","O667","O668","O669","O670","O671","O672","O673","O674","O675","O676","O677","O678","O679","O680","O681","O682","O683","O684","O685","O686","O687","O688","O689","O690","O691","O692","O693","O694","O695","O696","O697","O698","O699","O700","O701","O702","O703","O704","O705","O706","O707","O708","O709","O710","O711","O712","O713","O714","O715","O716","O717","O718","O719","O720","O721","O722","O723","O724","O725","O726","O727","O728","O729","O730","O731","O732","O801","O802","O803","O804","O805","O806","O807","O808","O809","O810","O811","O812","O813","O814","O815","O816","O817","O818","O819","O820","O821","O822","O823","O824","O825","O826","O827","O828","O829","O830","O831","O832","O833","O834","O835","O836","O837","O838","O839","O840","O841","O842","O843","O844","O845","O846","O847","O848","O849",
"D1","D2","D3","D4","D5","D6","D7","D8","D10","D11","D12","D14","D15","D16","D17","D19","D20","D21","D22","D23","D24","D25","D26","D28","D29","D30","D31","D32","D33","D34","D35","D36","D37","D39","D40","D41","D42","D44","D45","D46","D47","D48","D49","D50","D51","D52","D54","D55","D56","D59","D60","D61","D62","D63","D64","D65","D66","D67","D68","D69","D70","D71","D72","D73","D74","D75","D76","D77","D78","D79","D80","D83","D85","D86","D87","D88","D89","D90","D91","D92","D93","D94","D95","D96","D97","D98","D100","D101","D103","D109","D110","D111","D112","D113","D114","D115","D116","D117","D118","D119","D121","D122","D123","D124","D125","D126","D127","D128","D130","D131","D132","D133","D136","D137","D138","D139","D140","D141","D143","D144","D145","D146","D147","D148","D149","D150","D151","D152","D154","D155","D156","D158","D159","D161","D163","D164","D165","D166","D167","D168","D169","D170","D171","D172","D173","D174","D176","D180","D181","D185","D186","D187","D188","D189","D190","D191","D193","D194","D196","D197","D198","D199","D200","D201","D202","D203","D204","D205","D206","D209","D210","D211","D212","D214","D215","D218","D220","D221","D222","D223","D224","D226","D231","D235","D236","D237","D238","D239","D240","D241","D243","D244","D245","D246","D247","D249","D250","D251","D252","D253","D254","D255","D257","D258","D260","D261","D262","D263","D264","D265","D266","D269","D270","D271","D272","D273","D274","D275","D276","D277","D278","D279","D280","D281","D283","D284","D285","D286","D287","D288","D289","D292","D293","D294","D295","D296","D297","D298","D300","D301","D302","D304","D305","D306","D307","D308","D309","D311","D312","D313","D315","D316","D317","D318","D319","D320","D321","D322","D323","D324","D326","D327","D328","D329","D330","D331","D332","D337","D344","D348","D349","D353","D354","D355","D356","D357","D362","D366","D367","D372","D373","D377","D382","D383","D389","D390","D392","D401","D403","D408","D409","D410","D411","D412","D413","D416","D417","D418","D419","
D426","D430","D433","D436","D439","D444","D450","D453","D454","D458","D459","D460","D467","D468","D470","D472","D477","D483","D484","D486","D488","D489","D492","D501","D505","D508","D510","D514","D515","D516","D517","D518","D525","D531","D537","D541","D544","D550","D551","D552","D553","D555","D556","D557","D558","D559","D560","D561","D562","D563","D565","D567","D569","D572","D573","D574","D575","D577","D578","D579","D580","D581","D585","D588","D589","D592","D593","D594","D595","D596","D597","D598","D599","D600","D601","D602","D605","D606","D607","D608","D609","D610","D612","D614","D617","D622","D629","D636","D640","D643","D644","D646","D650","D653","D655","D659","D660","D661","D662","D663","D664","D665","D666","D667","D668","D669","D670","D671","D672","D673","D674","D675","D676","D677","D678","D679","D680","D681","D682","D683","D684","D685","D686","D687","D688","D689","D690","D691","D692","D693","D694","D695","D696","D697","D698","D699","D700","D701","D702","D703","D704","D705","D706","D707","D708","D709","D710","D711","D712","D713","D714","D715","D716","D717","D718","D719","D720","D721","D722","D723","D724","D725","D726","D727","D728","D729","D730","D731","D732","D801","D802","D803","D804","D805","D806","D807","D808","D809","D810","D811","D812","D813","D814","D815","D816","D817","D818","D819","D820","D821","D822","D823","D824","D825","D826","D827","D828","D829","D830","D831","D832","D833","D834","D835","D836","D837","D838","D839","D840","D841","D842","D843","D844","D845","D846","D847","D848","D849")])
y <- as.matrix(rapm[c("Rating")])
fit <- cv.glmnet(x,y,alpha=0,weights=rapm$Poss,nfolds=50,parallel=TRUE)
coef(fit)
Lambda.min = 71.88326

Re: Shots at RPM

Posted: Thu Nov 13, 2014 4:58 am
by Crow
Lots of choices with RAPM. One could emphasize minimizing errors of the top performers, top salaries, next to be free agents, one conference, one team, one position, offense or defense...

Re: Shots at RPM

Posted: Thu Nov 13, 2014 1:45 pm
by mystic
permaximum wrote:Here's the data. And the R script. It is so simple...
The data link tells me in Turkish that the file has been removed or is currently unavailable.

Also, I'm not exactly sure we are talking about the same issue here, because I wanted to see the script which you used to conclude this:
mystic wrote:You can also plug in the lambda manually: use a higher value (like 100) and a lower one (like 10), then compare the standard deviation of the coefficients.
permaximum wrote:No significant changes.
I'm pretty sure that the posted script is not doing that. Btw, the coefficients you get from coef(fit) are based on lambda.1se, not on lambda.min. Another hint, to get rid of that awfully long part that converts the player columns into the data matrix: you can index with brackets if you know how many columns the data contains. And if you want to use the script for different files without knowing the exact number of players included, always structure the raw data so that the last two columns contain Poss and Rating; then you can use something like this:

Code: Select all

n<-length(rapm)-2
x<-rapm[1:n]
x<-data.matrix(x)
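
If players are sometimes left out, selecting the columns by name pattern keeps the script short without editing the data file. A minimal sketch, assuming the player columns are named O<id> and D<id> as in the posted script; the toy data frame and the `excluded` vector are made up for illustration.

```r
# Toy stand-in for the matchup file: player columns first, then Poss and Rating
rapm <- data.frame(O1 = c(1, 0, 1), O2 = c(0, 1, 1),
                   D1 = c(-1, 0, 0), D2 = c(0, -1, -1),
                   Poss = c(12, 8, 20), Rating = c(110, 95, 102))

# Every column whose name is "O" or "D" followed only by digits
player_cols <- grep("^[OD][0-9]+$", names(rapm), value = TRUE)

# Hypothetical exclusion list: drop both columns of player 2
excluded <- c("O2", "D2")

# setdiff() keeps the original column order of player_cols
x <- data.matrix(rapm[setdiff(player_cols, excluded)])
colnames(x)
```
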
In order to check the standard deviation of the coefficient sets calculated with different lambdas, you should have something like this (if you included the n<-length(rapm)-2):

Code: Select all

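# "high"/"low" name the expected SD: the smaller lambda should give the larger spread
highfit<-glmnet(x,y,weights=rapm$Poss,alpha=0,lambda=10)
highsd<-coef(highfit)
# row 1 is the intercept, rows 2..(n+1) hold the n player coefficients
highsd<-sd(highsd[2:(n+1)])
lowfit<-glmnet(x,y,weights=rapm$Poss,alpha=0,lambda=100)
lowsd<-coef(lowfit)
lowsd<-sd(lowsd[2:(n+1)])
print(highsd>lowsd)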
If TRUE shows up on your R console, this
permaximum wrote: In theory, yes. In practice, no.
is proven wrong. And that's what we are actually talking about, not whether I can correctly run a ridge regression using R.
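
One R detail matters when slicing a coefficient vector by position: the `:` operator binds more tightly than `+`, so an expression such as `2:n+1` means `(2:n)+1`, not `2:(n+1)`. A small illustration with made-up values (the names `coefs`, `idx_a` and `idx_b` are hypothetical):

```r
n <- 5   # number of player coefficients
coefs <- c(Intercept = 10, P1 = 1, P2 = 2, P3 = 3, P4 = 4, P5 = 5)

idx_a <- 2:n+1      # parsed as (2:n) + 1, i.e. 3 4 5 6
idx_b <- 2:(n+1)    # 2 3 4 5 6

names(coefs)[idx_a] # "P2" "P3" "P4" "P5": the first player is skipped
names(coefs)[idx_b] # "P1" "P2" "P3" "P4" "P5"
```
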

Re: Shots at RPM

Posted: Thu Nov 13, 2014 2:14 pm
by permaximum
1. The data should be downloadable now.

2. Yes, the script is for calculating RAPM 07-08. However, increasing the lambda is so simple. Try 700, 7000 or 70000 in place of the lambda.min of about 70.

3. Yes, I typed the script from memory except the "x" part. The last line should be coef(fit, s="lambda.min") instead of coef(fit).

4. Thanks for the tip about shortening that line. However, I sometimes remove some players in the code. I don't want to touch the data.

5. Here's your ridge-regression-ready data. Show me in practice, and let me see Kevin Garnett on top with a different lambda instead of those players who have significantly fewer possessions, and I'll admit that you're right... Even if you can do that, with the best lambda you get these results, which have a few low-possession players on top.

Re: Shots at RPM

Posted: Thu Nov 13, 2014 2:46 pm
by mystic
1. I got it.

2. Well, I showed the code in my previous post ...

3. Fine.

4. Well, you could remove particular players by setting the respective columns to NULL.

5. It still seems that you don't understand what I actually said and are bringing up a straw man instead. I wrote that the maths tells us that increasing lambda makes it more difficult for coefficients to get away from 0. We can check that by calculating the SD of the coefficient sets for a smaller and a bigger lambda value. If what I say is true, the SD of the set calculated with the lower lambda should be higher than that of the set calculated with the higher lambda.
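
The shrinkage argument can be illustrated without the real data. A minimal sketch under stated assumptions: simulated lineup indicators, made-up possession weights, and a hand-written closed-form weighted ridge solver (`ridge()`), not glmnet, so the lambda values are not on glmnet's scale. For this closed form the squared length of the coefficient vector is guaranteed to shrink as lambda grows; the SD usually follows.

```r
set.seed(7)
n <- 200; p <- 20
X <- matrix(rbinom(n * p, 1, 0.3), n, p)               # toy on/off-court indicators
y <- drop(X %*% rnorm(p, sd = 3)) + rnorm(n, sd = 10)  # toy ratings
w <- sample(1:5, n, replace = TRUE)                    # toy possession weights

# Closed-form weighted ridge without intercept: (X'WX + lambda*I)^-1 X'Wy
ridge <- function(X, y, lambda, w) {
  XtW <- t(X * w)
  drop(solve(XtW %*% X + lambda * diag(ncol(X)), XtW %*% y))
}

b_low  <- ridge(X, y, lambda = 10,  w = w)   # weaker penalty
b_high <- ridge(X, y, lambda = 100, w = w)   # stronger penalty

print(c(sd_low_lambda = sd(b_low), sd_high_lambda = sd(b_high)))
print(sum(b_low^2) > sum(b_high^2))          # coefficients are pulled toward 0
```
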

Here is the script I just used on your dataset:

Code: Select all

library(glmnet)
matchups<-read.csv("rapm2008.csv",sep=";")
Weights<-matchups$Poss
Weights<-data.matrix(Weights)
y<-matchups$Rating
y<-data.matrix(y)
n<-length(matchups)
# columns 3..n are the player columns, so x has n-2 columns
x<-matchups[3:n]
x<-data.matrix(x)
# "high"/"low" name the expected SD: the smaller lambda should give the larger spread
highfit<-glmnet(x,y,weights=Weights,alpha=0,lambda=10)
highsd<-coef(highfit)
# row 1 is the intercept, rows 2..(n-1) hold the n-2 player coefficients
highsd<-sd(highsd[2:(n-1)],na.rm=TRUE)
lowfit<-glmnet(x,y,weights=Weights,alpha=0,lambda=100)
lowsd<-coef(lowfit)
lowsd<-sd(lowsd[2:(n-1)],na.rm=TRUE)
print(highsd>lowsd)
That is the answer on my console:

Code: Select all

[1] TRUE
and then this:

Code: Select all

> highsd
[1] 5.6968
> lowsd
[1] 2.43546
Let us compare that with your statement:
permaximum wrote:No significant changes.
Well, I would say the difference between 5.7 and 2.4 is significant.

Btw, I have no idea which of those IDs is Kevin Garnett or any other player for that matter.