Need help with datamining

blaise
Posts: 1
Joined: Thu May 05, 2011 5:06 pm

Need help with datamining

Post by blaise »

Hello guys!

For my school project in a data mining course, I chose this year's data from your site, but I didn't look at it closely beforehand. Now I'm a bit lost.

1. Where can I get more info about the attributes? I found the glossary, but I still need to know how the ratings and rates are calculated. Also, how is the time interval chosen in mathups.txt?


2. What should I look for when data mining? I thought about classifying the winning team, but first I would rather do something less complex, like looking for relations between attributes in players.txt. If you have any suggestions...
Crow
Posts: 10565
Joined: Thu Apr 14, 2011 11:10 pm

Re: Need help with datamining

Post by Crow »

Hi.

"For my school project in data mining course, I have chosen the data at your site (this years)"

I assume you mean Aaron B.'s basketballvalue.com site?

(Another resource with Regularized Adjusted +/- values split into offensive and defensive ratings is here: http://stats-for-the-nba.appspot.com/ )

"1. Where can I get more info about attributes?"

If you mean attributes about players I'd mainly go to basketball-reference.com and 82games.com.

"I found the glossary, but I still need to know how are the ratings and rates calculated."

Follow:
- the "here" and "here" links listed on the first entry in the glossary, http://basketballvalue.com/glossary.php
- the 3 Rosenbaum links here: http://www.82games.com/articles.htm
- the link here: http://www.countthebasket.com/blog/
- and here: http://www.sloansportsconference.com/re ... e-testing/
- various threads at this forum.

"... how is the time interval chosen in mathups.txt?" [matchups]

See the glossary and the "here" links.


"2. What to look for when data mining? I thought about doing classification for the winner team, but first i would rather do something less complex. Like in the players.txt look for relations between attributes. If you have any suggestions..."

You could look simply at position, age or years of NBA experience, height, and maybe weight, or at combinations of them. You could also look at stats like "usage" (or alternatively, maybe field goal attempts per minute) and "assist %", defined here http://www.basketball-reference.com/about/glossary.html
These stats affect team offense, but their impact is not completely and exactly understood. You could also potentially look at 3-pointers made and free throw attempt rate and, for defense, steals and blocks. These may be more "important" in Adjusted +/- than their apparent, direct statistical value suggests, because players with high 3-pointers made, steals or blocks may have uncounted impacts on other plays.
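
As a rough starting point for "relations between attributes", here is a minimal sketch that ranks a few attributes by their Pearson correlation with a target rating. The column names (usage, assist_pct, adj_pm) and the rows are made up just to exercise the code; the real layout of players.txt is not described in this thread, so loading the file is left out.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(rows, target, attributes):
    """Return (attribute, r) pairs sorted by |r| against the target column."""
    ys = [row[target] for row in rows]
    scored = [(a, pearson([row[a] for row in rows], ys)) for a in attributes]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical toy rows, NOT real player data; column names are assumptions.
toy_rows = [
    {"usage": 1.0, "assist_pct": 1.0, "adj_pm": 2.0},
    {"usage": 2.0, "assist_pct": 3.0, "adj_pm": 4.0},
    {"usage": 3.0, "assist_pct": 2.0, "adj_pm": 6.0},
    {"usage": 4.0, "assist_pct": 4.0, "adj_pm": 8.0},
]

ranked = rank_attributes(toy_rows, "adj_pm", ["assist_pct", "usage"])
for name, r in ranked:
    print(f"{name}: r = {r:.2f}")
```

A strong correlation here only says two columns move together; it says nothing about why, so treat the ranking as a list of things to look into rather than a result.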

If you wanted to data-mine less obvious things, you might consider years in college, furthest team advancement in the NCAA tournament, high school recruitment ranking, whether the player was the leading scorer in college for his team, the draft pick number, the quality of the PG he plays with, the % of time he plays with a 7-footer and what he does with and without one, etc.
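
For the "with and without" idea, one simple way to frame it is a split of a player's stints by a yes/no flag (for example, whether a 7-footer was also on the floor). The sketch below assumes each stint is a (flag, minutes, point_differential) tuple; that layout is hypothetical and not the matchups file format, and the numbers are made up.

```python
def with_without_split(stints):
    """Per-48-minute point differential, split by a boolean flag.

    stints: iterable of (flag, minutes, point_diff) tuples (assumed layout).
    Returns {True: rate_or_None, False: rate_or_None}.
    """
    totals = {True: [0.0, 0.0], False: [0.0, 0.0]}  # flag -> [minutes, diff]
    for flag, minutes, diff in stints:
        bucket = totals[bool(flag)]
        bucket[0] += minutes
        bucket[1] += diff
    result = {}
    for flag, (minutes, diff) in totals.items():
        # None when the player never appeared in that condition.
        result[flag] = diff / minutes * 48 if minutes > 0 else None
    return result


# Hypothetical stints, NOT real data: (with_7_footer, minutes, point_diff).
toy_stints = [(True, 12, 6), (True, 12, 0), (False, 24, -6)]
split = with_without_split(toy_stints)
print("with:", split[True], "without:", split[False])
```

Raw splits like this ignore who else was on the floor, which is the problem Adjusted +/- tries to address, so they are best used as a first look.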

Best wishes with your project, and hopefully you will let us hear what you did and found.