## 2019-20 team win projection contest

### 2019-20 team win projection contest

Start talking here when you want.

The teams, the factors in the off season affecting the projections, the methods past and future, whatever.


### Re: 2019-20 team win projection contest

Anybody want to work collaboratively on a freeshare minutes model? Say in August?

What else would improve predictions?

Anybody want to swap model building techniques, in public or private? Anyone want to try a team entry as a novelty?

Depends how you prioritize winning over learning and fun.


### Re: 2019-20 team win projection contest

Crow wrote (Fri Jun 14, 2019 5:01 am):

> Anybody want to work collaboratively on a freeshare minutes model? Say in August?
>
> What else would improve predictions?
>
> Anybody want to swap model building techniques, in public or private? Anyone want to try a team entry as a novelty?
>
> Depends how you prioritize winning over learning and fun.

Hi, for those who do not fully know the terminology: what is a minutes model? A prediction of how many minutes each player will play this season? Per game?

Are we predicting until the end of the regular season or right up to the end?

### Re: 2019-20 team win projection contest

How many minutes each player will play in regular season.

Many predictors build their win estimates from minutes-weighted sums of player performances / contributions. But not all.

There has been some sharing of minutes models at the end in the past but more collaborative refinement and earlier release might help more predictors.


### Re: 2019-20 team win projection contest

Ok I see.

Well, for my predictions I use the minutes as an additional parameter/feature and do not collapse everything into a minute-weighted dimension.

I also found that a combination of player-based and team-based features works the best.


### Re: 2019-20 team win projection contest

Incidentally, if someone wants to go for a single-dimension predictor (e.g. the PIE score for each player on stats.nba.com), you can try the following:

- Derive a single metric that ranks each player's chance of winning. This metric can be estimated as a function of different player parameters. The function is the same for all players; only the values differ. How to derive this? I'll tell you at the end. For now, call this metric WN.

- Then, for each team, you can combine the WNs of all the players on the roster and get the team's average WN. This could be minute-weighted or not (for example, not weighted if you have already included the expected minutes in the original WN calculation).

- Once you have the two teams' WNs you can determine the winner (the team with the higher WN) and a win probability (based on a simple regression model).

- This idea is simple (since you are collapsing hundreds or thousands of feature dimensions into a single number) but quite effective if done correctly. It should also work better than simplistic Elo-based approaches (e.g. 538).

CAVEAT: The above discussion ignores team-based effects on the win/loss outcome and assumes that each player contributes to the result individually. Or, to put it differently, a team is simply the (weighted) average of its players.
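The aggregation step above can be sketched roughly as follows. This is only an illustration: the function names, the WN values, and the expected minutes are all made up, and the WN scores are assumed to have been computed already.

```python
def team_wn(players):
    """Minutes-weighted average of per-player WN scores.

    `players` is a list of (wn, expected_minutes) pairs. Both values are
    hypothetical inputs; nothing here computes WN itself.
    """
    total_minutes = sum(minutes for _, minutes in players)
    if total_minutes <= 0:
        raise ValueError("expected minutes must sum to a positive number")
    return sum(wn * minutes for wn, minutes in players) / total_minutes


def pick_winner(team1, team2):
    """Return the label of the team with the higher aggregate WN."""
    return "team1" if team_wn(team1) > team_wn(team2) else "team2"


# Made-up rosters: (WN, expected regular-season minutes) per player.
team1 = [(1500.0, 2400), (1200.0, 2000), (900.0, 1000)]
team2 = [(1400.0, 2400), (1100.0, 2000), (1000.0, 1000)]

print(round(team_wn(team1), 1), round(team_wn(team2), 1))
print(pick_winner(team1, team2))
```

Passing equal minutes for every player reduces this to the plain (unweighted) average.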

How to calculate WN metric:

Here is a link (website seems to be down so history link instead):

https://web.archive.org/web/20180626134 ... /pages/WN8

and the new version here:

http://jaj22.org.uk/wn9implement.html

This is from a competitive online game where two teams of random players battle each other for the win. The process of deriving a metric is much the same and can be adapted to NBA games, or to any other team-based game where you are trying to determine an individual's skill as a contribution to team victory. You will need some software like Eureqa (or a similar regression tool) that will determine the important features that predict win/loss chance.


### Re: 2019-20 team win projection contest

I haven't tried to read or follow the links to the method.

I'll just ask: are the "chances to win" simply averaged, or do the weights vary by player? I'd think they'd have to be variably weighted.

"I also found that a combination of player-based and team-based features works the best" I'd agree with that. Though I wonder how you get the team-based features for the coming season, when players and roles may have changed and the coaching and system may have been changed or tweaked?

Do you just use last season's data or do you try to project with age curves, team "optimization efficiency analysis", etc.?


### Re: 2019-20 team win projection contest

So, given that each player on the team has a different win chance (let's call this metric Wn, which can be calculated according to that article), let's look at an example.

Team1 vs Team2

What you can do is one of the following:

Either

**A)** Average all the players' Wn numbers for Team1 and Team2 and determine who is going to win based on whichever team has the higher averaged Wn. So Team1_Wn = sum(Wn)/N1 and Team2_Wn = sum(Wn)/N2, where N1 and N2 are the numbers of players on the two team rosters respectively. If Team1_Wn > Team2_Wn then Team1 has a better chance to win.

How much better? For that you will need something like logistic regression on past games data (or simply some sigmoid function) to map the difference into a probability.

You could, as you say, do a weighted average, but figuring out the weights might be tricky. You could do something empirical like weighting by experience (games played), expected minutes played, and so on.
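As a rough sketch of the "map the difference into a probability" step in option A, here is a one-feature logistic regression fitted with plain gradient descent. The training data below is invented purely for illustration, the Wn differences are assumed to be rescaled to a small range, and `fit_logistic` and `win_probability` are hypothetical helper names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_logistic(diffs, outcomes, lr=0.1, epochs=2000):
    """Fit P(Team1 wins) = sigmoid(a * diff + b) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(diffs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for d, y in zip(diffs, outcomes):
            err = sigmoid(a * d + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * d
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Invented past games: Team1_Wn - Team2_Wn (rescaled) and whether Team1 won.
diffs = [-2.0, -1.5, -1.0, -0.5, -0.3, 0.3, 0.5, 1.0, 1.5, 2.0]
outcomes = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(diffs, outcomes)


def win_probability(diff):
    """Estimated probability that Team1 wins, given the Wn difference."""
    return sigmoid(a * diff + b)


print(round(win_probability(1.0), 3), round(win_probability(-1.0), 3))
```

In practice a library implementation of logistic regression would do the same job; the hand-rolled loop is only there to keep the sketch dependency-free.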

or

**B)** Even better than averaging the players' Wn values into a single team Wn number is to do the regression on the full dimensionality. In other words (and this will only work for teams with an equal number of players, so you need to make some choices here), take 10 players from Team1 and 10 players from Team2. Then do a regression (on past games data) from the 20-dimensional Wn space to the win/lose space. This should give you a probabilistic mapping that can predict the win chance of teams (with equal numbers of players) without having to weight and sum to single numbers.
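One possible way to build the fixed-length input for option B is sketched below. The choice of taking each team's top players by Wn and ordering the slots by descending Wn is an assumption made here so that every game yields a vector of the same shape; the post does not say how the 10 players are picked. `matchup_features` is a hypothetical name and the rosters are invented.

```python
def matchup_features(team1_wn, team2_wn, n_players=10):
    """Concatenate each team's top-`n_players` Wn values into one vector.

    Slots are ordered by descending Wn within each team (an assumption),
    giving a vector of length 2 * n_players.
    """
    if len(team1_wn) < n_players or len(team2_wn) < n_players:
        raise ValueError("each roster needs at least n_players entries")
    top1 = sorted(team1_wn, reverse=True)[:n_players]
    top2 = sorted(team2_wn, reverse=True)[:n_players]
    return top1 + top2


# Invented rosters of 12 players each.
team1_wn = [900, 1500, 1200, 800, 700, 1100, 650, 600, 1000, 550, 500, 450]
team2_wn = [1400, 1000, 1100, 950, 850, 750, 700, 640, 610, 580, 520, 480]

x = matchup_features(team1_wn, team2_wn)
print(len(x), x[:3], x[10:13])
```

Each past game would contribute one such vector plus a win/lose label, and those pairs would be the training data for the regression.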

Crow wrote (Fri Jun 14, 2019 9:06 pm):

> "I also found that a combination of player-based and team-based features works the best" I'd agree to that. Though I wonder how do you get the team based for coming season when players and roles may have changed and coaching and system may have changed or tweaked?

Well, that's the tricky part, isn't it?

Generally, player stats are not difficult to predict from season to season. Team stats are, however, due to composition changes and all the other reasons you mentioned above.

I do not have a solution to this. At least not a good one.

What I am doing at the moment (until I figure out something better) is to do a linear combination of the previous season's team stats with the current season stats, in a moving window approach.

So if last year they played 82 games (regular season) and this season they have played 5 games, then a stat is going to be:

Stat_blended = Stat_last * w1 + Stat_current * w2

where:

w1 = max(0, 77/82) = 0.939

w2 = min(1, 5/82) = 0.061

Stat_blended = Stat_last * 0.939 + Stat_current * 0.061

And as the current season progresses the contribution from last season's stats goes down.
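A minimal sketch of this moving-window blend is below. The function name `blended_stat` and the example stat values are made up; the weights follow the 82-game example above.

```python
def blended_stat(stat_last, stat_current, games_played, season_games=82):
    """Blend last season's team stat with the current season's running stat.

    The weight on the current season grows linearly with games played and
    reaches 1 once a full season's worth of games is in the window.
    """
    w2 = min(1.0, games_played / season_games)  # current-season weight
    w1 = max(0.0, 1.0 - w2)                     # last-season weight
    return w1 * stat_last + w2 * stat_current


# Hypothetical team stat: 110.0 last season, 100.0 over the first 5 games.
print(round(blended_stat(110.0, 100.0, games_played=5), 3))
print(blended_stat(110.0, 100.0, games_played=82))
```

With 0 games played the result is just last season's value, and after 82 games last season no longer contributes.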

No, not projecting anything at the moment. I am using data from 2004/05 up until 2018/19. So almost 15 years of historical data.

This sort of takes care of diminishing player performance with age since changing player stats from every year together with age and experience are mapped to game outcomes. However, this is not the same for teams because there is no temporal continuity of a team between seasons, as it can change a lot.

Anyway, this is really basic and I am currently trying to figure out a better approach. I think this early season uncertainty (until the stats mature) is responsible for this early season dip in prediction accuracy.

If anyone has better ideas please feel free to share.

Maybe some time-series prediction of team stats from player stats using a long short-term memory model (LSTM).

### Re: 2019-20 team win projection contest

Not a stat head, so I can't help too much on the details, but for the 10-player bit I would say that you may be able to get away with a lower number than that (if that simplifies it at all; it's possible there's no real difference between running it with 1 or 100). In my experience you can usually explain a team's play with roughly their top 3 players over a season-long sample, and adding in many more players after that can turn to noise. Not sure how well that'd hold over smaller time periods if you're trying to go game by game.

### Re: 2019-20 team win projection contest

eminence wrote (Fri Jun 14, 2019 11:02 pm):

> Not a stat head, so can't help too much on the details, but for the 10 player bit I would say that you may be able to get away with a lower number than that (if that at all simplifies it, it's possible there's no real difference between running it with 1 or 100), in my experience you can usually explain a teams play with ~ their top 3 players over a season long type sample and it can turn to noise adding in many more players after that. Not sure how well that'd hold over smaller time periods if you're trying to go game by game.

I think you might be right about that. Most of the variance of the win/lose outcome can be captured by the 3 (or so) top players, and the rest of the roster contribute only very little.

I guess it is a question whether you want to capture everything (in order to boost accuracy but maybe add noise) or just go with a 3 dimensional space and work from that.

Another thing one might do is calculate player stats relative to the mean team stats (or the mean over all teams). This assumes that we do not have any outliers (i.e. most NBA-level players are good enough); otherwise we use the median.

In that case we can model the win chance by taking the top 3 and bottom 3 players from each team relative to that mean.

Why the bottom 3? Well maybe top players contribute to the wins, but also maybe bottom players contribute to the loss.

In fact, I am going to re-calculate my player stats relative to their own team or to all teams and see if it makes any difference in prediction accuracy.
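A small sketch of the top-3/bottom-3 idea, under assumptions: a single hypothetical per-player stat, centred on the mean (or median) over all players in the league, with all numbers invented. `relative_extremes` is an illustrative name.

```python
from statistics import mean, median


def relative_extremes(team_stats, league_stats, k=3, use_median=False):
    """Return the k best and k worst players on a team, relative to the league centre.

    Values are expressed as (player stat - league mean), or league median
    when `use_median` is True. Assumes the team has at least 2 * k players.
    """
    center = median(league_stats) if use_median else mean(league_stats)
    relative = sorted((s - center for s in team_stats), reverse=True)
    return relative[:k], relative[-k:]


# Invented values of some per-player stat.
league_stats = [2.0, 6.0, 10.0, 14.0, 18.0]
team_stats = [20.0, 15.0, 12.0, 10.0, 8.0, 5.0, 3.0, 1.0]

top3, bottom3 = relative_extremes(team_stats, league_stats)
print(top3, bottom3)
```

The six numbers per team (twelve per game) could then be used as regression inputs in place of the full roster.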

### Re: 2019-20 team win projection contest

Does the East/West strength gap get bigger? Does the conference-level win differential get bigger?

Seems like schedule will be even more important for predicting team wins.


### Re: 2019-20 team win projection contest

This past season, there was a negative correlation between contestants guessing the East/West differential and individual team performance. FWIW.

Since the Bulls dynasty, the West has dominated, winning on average about 57% of its regular-season games against the East (57-43).

```
year W/E
1998 .421
1999 .548
2002 .552
2003 .603
2004 .628
2005 .568
2006 .560
2007 .575
2008 .573
2009 .512
2010 .548
2011 .582
2012 .578
2013 .582
2014 .631
2015 .584
2016 .516
2017 .547
2018 .529
2019 .560
```

It got close to 50-50 in 2009 and again in 2016. These were abrupt changes from the previous season, and the split went back to "normal" within a couple of years.

The disparity was relatively minor last year but doubled this year.

It would be interesting to see which major players switched conferences in some years of major change.
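For anyone who wants to check the arithmetic against the table, a quick sketch follows. Two assumptions are made here: "since the Bulls dynasty" is read as the 1999 row onward, and "last year" and "this year" are read as the 2018 and 2019 rows.

```python
# West's regular-season win share vs the East, copied from the table above.
west_share = {
    1998: .421, 1999: .548, 2002: .552, 2003: .603, 2004: .628,
    2005: .568, 2006: .560, 2007: .575, 2008: .573, 2009: .512,
    2010: .548, 2011: .582, 2012: .578, 2013: .582, 2014: .631,
    2015: .584, 2016: .516, 2017: .547, 2018: .529, 2019: .560,
}

# Assumption: "since the Bulls dynasty" means the 1999 row onward.
post_dynasty = [share for year, share in west_share.items() if year >= 1999]
avg_share = sum(post_dynasty) / len(post_dynasty)

# Disparity measured as the distance from an even .500 split.
gap_2018 = west_share[2018] - 0.5
gap_2019 = west_share[2019] - 0.5

print(f"average West share since 1999: {avg_share:.3f}")
print(f"2019 gap / 2018 gap: {gap_2019 / gap_2018:.2f}")
```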