Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 10:52 am
by mystic
ed küpfer wrote: I would like to see a worked-out example, with R code if possible.
http://www.countthebasket.com/blog/2008 ... lus-minus/

Replace lm with lm.ridge. The lambda can be calculated with ridge.cv from the parcor package.

I would suggest using glmnet instead, both to get lambda and to do the ridge regression: http://www.jstatsoft.org/v33/i01/
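
Untested sketch of the glmnet route (X and y are placeholders: the sparse matchup design matrix with +1/-1/0 player indicators per stint, and the corresponding point margins):

    library(glmnet)

    # alpha = 0 selects pure ridge (no lasso mixing);
    # cv.glmnet picks lambda by k-fold cross-validation
    cv  <- cv.glmnet(X, y, alpha = 0)
    fit <- glmnet(X, y, alpha = 0, lambda = cv$lambda.min)

    # one regularized rating per player column; [-1] drops the intercept
    rapm <- as.vector(coef(fit))[-1]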

Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 11:22 am
by DSMok1
ed küpfer wrote: I would like to see a worked-out example, with R code if possible.
Did Eli not share his code with you? :shock: :P

I, too, would like to see some R code. Would save me the trouble of writing it myself....

Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 11:31 am
by mystic
DSMok1 wrote: I, too, would like to see some R code. Would save me the trouble of writing it myself....
Getting the appropriate matchup file is much more trouble than writing an R script to calculate the values via ridge regression.

As I pointed out in my previous post, just use the example from Eli and then replace lm with lm.ridge. Use the parcor package and its ridge.cv (cv stands for cross-validation) to get the lambda first; something like the sketch below.
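
In code (untested; X and y are the same kind of design matrix and margin vector as in Eli's example):

    library(MASS)    # lm.ridge
    library(parcor)  # ridge.cv

    # cross-validate to find lambda, then refit at that value
    cv  <- ridge.cv(X, y)
    fit <- lm.ridge(y ~ X, lambda = cv$lambda.opt)
    coef(fit)        # the regularized ratings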

Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 11:33 am
by v-zero
Can't help you a great deal with R code since I don't use R, but there are loads of examples out there on performing ridge regression in R. Do you have a specific implementation problem?

Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 3:37 pm
by EvanZ
Beating a dead horse, but...

If people are seriously interested, just sign up for Coursera. There's no excuse not to check it out: it's free, it uses Octave (essentially a free Matlab), and it's very easy to program in. There are a few homework assignments where you actually implement regularized regression (both linear and logistic). Not just calling a stats function, but literally implementing a cost function, multiplying matrices, etc. It seems as good an introduction as you're likely to get with actual hands-on programming experience. There are discussion forums on the site that, I believe, are moderated by students taking the actual course at Stanford, who help answer questions online.
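
To give a taste of what those assignments involve, here is the core computation as a rough sketch in R (the course itself uses Octave):

    # ridge regression "by hand": minimize ||y - Xb||^2 + lambda*||b||^2,
    # whose closed-form solution is (X'X + lambda*I)^{-1} X'y
    ridge_fit <- function(X, y, lambda) {
      solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
    }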

Does Coursera have some kind of weird stigma around here? There are over 100,000 people who take the ML course. I actually have a weekly study group in SF for it that anyone is invited to:

http://www.meetup.com/Coursera/San-Francisco-CA/943222/

Don't talk about learning it. Just go out and do it! There's really no excuse, again, if you're *actually* interested in learning something.

Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 4:13 pm
by talkingpractice
mystic wrote: Getting the appropriate matchup file is much more trouble than writing an R script to calculate the values via ridge regression.
+1. Doing the matchups carefully/perfectly was quite literally ~99% of our man-hours in the RAPM process.

Re: Demystifying Ridge Regression

Posted: Fri May 17, 2013 8:33 pm
by Tsunami
I've been calculating RAPM (I call it TRAPM, since it's Tikhonov regularization of APM and I don't get creepy autocorrect failures), and I have some questions for others working on this.

1.) What is the potential negative implication of leaving empty possessions (those that end in a margin of 0) out of the players x plays matrix? To me, it seems the resulting value would have to be described as "average +/- per scoring possession" or something slightly more complicated - or is there an unintended consequence I am overlooking?

2.) My observation is that a lambda of infinity returns a ranking identical to a sort by raw +/- (see the toy check after this list).

3.) Are there competing forces at work that manifest when tuning the lambda? I had a situation where a very small lambda ranked LeBron James #1, a slightly larger lambda pushed him to #2, and a significantly larger lambda (which really yanked down all the low-minutes players) pushed him back on top. This seemed strange to me.

4.) Having the "average home-court player" in the lineups to receive credit for that advantage is very elegant. Does it make sense to add other "players", like a "garbage-time" player, or would a variable number of active players depending on the situation junk up the underlying theory? Does it even make sense to do that? I'm not sure to whom garbage-time credit should flow, but even if the value of a "garbage-time" player were meaningless, it would remove some of the credit for garbage-time scoring from the actual players on the court.
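
A toy check of point 2 (made-up numbers, not a real matchup file): for huge lambda, (X'X + lambda*I)^{-1} X'y is approximately X'y / lambda, so the ranking collapses to a sort by the raw totals X'y.

    set.seed(1)
    X <- matrix(sample(c(-1, 0, 1), 200 * 10, replace = TRUE), 200, 10)
    y <- rnorm(200)
    ridge <- function(lambda) solve(t(X) %*% X + lambda * diag(10), t(X) %*% y)
    order(ridge(1e6))  # ranking at a huge lambda...
    order(t(X) %*% y)  # ...matches the ranking by raw totals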

I'm looking forward to being part of this community!

Re: Demystifying Ridge Regression

Posted: Sat May 18, 2013 2:35 am
by mtamada
I don't have firsthand experience with R, but given how massively popular it is with statisticians and given its open-source heritage, surely there are multiple implementations of ridge regression floating around the web? E.g., a crude Google search popped this up as the second hit: http://cran.r-project.org/web/packages/ridge/ridge.pdf. Caveat emptor, of course; I cannot vouch for its quality, and it appears to be aimed at genomics or bioinformatics researchers.

Re: Demystifying Ridge Regression

Posted: Sat May 25, 2013 5:23 am
by jbrocato23
mystic wrote:
ed küpfer wrote: I would like to see a worked-out example, with R code if possible.
http://www.countthebasket.com/blog/2008 ... lus-minus/

Replace lm with lm.ridge. The lambda can be calculated with ridge.cv from the parcor package.

I would suggest using glmnet instead, both to get lambda and to do the ridge regression: http://www.jstatsoft.org/v33/i01/

I'm curious as to how you implement priors on the coefficients in R. As far as I can tell, there is no input argument for priors in glmnet, lm.ridge, or any other ridge regression package.
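
The closest thing I can come up with is the algebraic workaround: ridge shrinks coefficients toward zero, so to shrink toward a prior you can fit the residual y - X*prior and add the prior back afterward (untested sketch; X, y, prior, and lam are placeholders):

    library(glmnet)

    # fit the deviation from the prior, which ridge shrinks toward zero
    fit  <- glmnet(X, y - X %*% prior, alpha = 0, lambda = lam)
    rapm <- prior + as.vector(coef(fit))[-1]  # add prior back; drop intercept

Is there a cleaner, built-in way to do it?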