A repost from the old forum:
Here is some rough work on WS/48 aging curves.
I followed the procedure outlined here:
http://www.insidethebook.com/ee/index.php/site/article/basic_aging_curve_for_hitters_1957_2006/
Obviously, survivor bias is a huge issue, and one I don't fully know how to compensate for.
I used matched pairs of all players that had 100 MPG in both years. The fact that a player had 100 MPG in each year biases the older sample towards decline, since in order for a player to get 100 MPG in Y+1 he had to perform to a certain level in Y, perhaps even be lucky. The same is true in early years, biasing towards increase, though not so badly.
To try to solve this, I regressed the first-year results using 200 minutes of expected WS/48, and compared that to the unadjusted second-year results.
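One common way to do this kind of regression is to blend the observed rate with a fixed amount of "ballast" at the expected rate. The sketch below assumes that is what is meant here; the function name, the `prior_ws48` argument, and the sample player are hypothetical, and the post does not say which expected WS/48 was used.

```python
def regress_ws48(observed_ws48, minutes, prior_ws48, prior_minutes=200.0):
    """Shrink an observed WS/48 toward prior_ws48.

    Assumption: the regression adds prior_minutes of play at the expected
    rate and takes the minutes-weighted average of the two.
    """
    total = minutes + prior_minutes
    return (observed_ws48 * minutes + prior_ws48 * prior_minutes) / total


# Hypothetical player: 0.150 WS/48 in 400 minutes, expected WS/48 of 0.100.
print(round(regress_ws48(0.150, 400, 0.100), 4))  # 0.1333
```

With this form, a player with few minutes is pulled most of the way to the expected rate, while a player with thousands of minutes barely moves.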
Here is the aging curve, showing raw WS/48 and the regressed WS/48. I'm not sure if I regressed the right way....
I'm showing the peak is at 26, rather than 27. That said, I like the looks of the regressed curve, particularly when compared to the minutes we see at each age. If the number of minutes starts to drop significantly at age 27, the players are probably passing their peak. I do expect the minutes to peak earlier than the players themselves, as players in development are given more minutes because of potential improvement.
Note that I normalized these curves so the peaks would be at 0.100. The curve is meant to be applied as a difference between ages. For example, if a player is at a true talent of 0.175 at age 25, he would be projected to be at 0.161 at age 30.
Here are the actual numbers, both for the Raw WS/48 and the Adjusted WS/48:
Code:
Age  Eff. Min     Raw  Regressed   Curve
 18      2648  -0.024     -0.032  -0.028
 19     34629   0.016      0.008   0.006
 20    126201   0.037      0.032   0.034
 21    277620   0.057      0.057   0.056
 22    624441   0.071      0.072   0.073
 23   1055516   0.085      0.088   0.085
 24   1279689   0.092      0.094   0.093
 25   1334885   0.096      0.098   0.098
 26   1332637   0.099      0.100   0.100
 27   1284283   0.100      0.098   0.099
 28   1186896   0.099      0.095   0.096
 29   1062218   0.096      0.091   0.091
 30    905080   0.092      0.085   0.085
 31    729743   0.086      0.076   0.077
 32    567531   0.078      0.070   0.067
 33    429684   0.074      0.060   0.057
 34    308454   0.063      0.045   0.046
 35    201169   0.053      0.037   0.035
 36    123215   0.042      0.025   0.022
 37     71096   0.029      0.008   0.009
 38     38210   0.011     -0.010  -0.005
 39     20366  -0.002     -0.016  -0.019
 40      8667  -0.010     -0.041  -0.035
 41      3240  -0.015     -0.065  -0.051
 42      1077  -0.048     -0.071  -0.069
 43       296  -0.035     -0.041  -0.088
I recommend using the "Curve" numbers; the curve is a 4th-order polynomial fitted to the Regressed data, weighted by the square root of the Effective Minutes (which aren't actual minutes played).
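For anyone who wants to try a similar fit, here is a rough sketch of a weighted 4th-order polynomial fit to the Regressed column in plain Python. It assumes the weight applies to the squared residual and equals sqrt(Effective Minutes); the original fit may have used a different convention or tool, so the result should land near the Curve column but need not match it exactly. The age rescaling and all function names are my own choices for illustration.

```python
# (age, effective minutes, regressed WS/48) copied from the table above
ROWS = [
    (18, 2648, -0.032), (19, 34629, 0.008), (20, 126201, 0.032),
    (21, 277620, 0.057), (22, 624441, 0.072), (23, 1055516, 0.088),
    (24, 1279689, 0.094), (25, 1334885, 0.098), (26, 1332637, 0.100),
    (27, 1284283, 0.098), (28, 1186896, 0.095), (29, 1062218, 0.091),
    (30, 905080, 0.085), (31, 729743, 0.076), (32, 567531, 0.070),
    (33, 429684, 0.060), (34, 308454, 0.045), (35, 201169, 0.037),
    (36, 123215, 0.025), (37, 71096, 0.008), (38, 38210, -0.010),
    (39, 20366, -0.016), (40, 8667, -0.041), (41, 3240, -0.065),
    (42, 1077, -0.071), (43, 296, -0.041),
]


def weighted_polyfit(xs, ys, ws, degree):
    """Minimise sum(w * (y - p(x))**2); returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations A c = b: A[i][j] = sum(w x^(i+j)), b[i] = sum(w y x^i)
    a = [[sum(w * x ** (i + j) for x, w in zip(xs, ws)) for j in range(n)]
         for i in range(n)]
    b = [sum(w * y * x ** i for x, y, w in zip(xs, ys, ws)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


# Rescale ages 18..43 to [-1, 1] so the normal equations stay well conditioned.
CENTER, SCALE = 30.5, 12.5


def curve_at(coeffs, age):
    x = (age - CENTER) / SCALE
    return sum(c * x ** k for k, c in enumerate(coeffs))


xs = [(age - CENTER) / SCALE for age, _, _ in ROWS]
ys = [ws48 for _, _, ws48 in ROWS]
weights = [minutes ** 0.5 for _, minutes, _ in ROWS]  # assumed weighting
coeffs = weighted_polyfit(xs, ys, weights, 4)

for age in (20, 26, 30, 35):
    print(age, round(curve_at(coeffs, age), 3))
```

Note that these coefficients are in terms of the rescaled age, so they are not directly comparable to coefficients expressed in raw age.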
The equation of the curve:
Code:
-2.7188227611 Intercept
0.3199248445 Age
-0.0132504155 Age^2
0.0002404433 Age^3
-0.0000016798 Age^4
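To make the equation easier to use, here is a small sketch that evaluates it and applies it as a difference between two ages. Treating the curve as an additive shift is an assumption on my part, though it does reproduce the 0.175 to 0.161 example from earlier in the post; the function names are hypothetical.

```python
# Coefficients copied from the equation above, lowest order first.
COEFFS = [-2.7188227611, 0.3199248445, -0.0132504155,
          0.0002404433, -0.0000016798]


def curve(age):
    """Evaluate the 4th-order aging curve at the given age (Horner's method)."""
    result = 0.0
    for c in reversed(COEFFS):
        result = result * age + c
    return result


def project(ws48_now, age_now, age_later):
    """Shift a true-talent WS/48 by the curve's change between two ages.

    Assumption: the adjustment is additive, not multiplicative.
    """
    return ws48_now + curve(age_later) - curve(age_now)


print(round(project(0.175, 25, 30), 3))  # 0.161
```

The coefficients are rounded, so values at the extreme ages (18, 42, 43) can differ slightly from the Curve column.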
Please evaluate whether I have regressed appropriately. This survivor bias issue is difficult.