Page 1 of 2

Weighting Performances

Posted: Fri Mar 23, 2012 5:16 pm
by Italian Stallion
I have some data that measures the recent performances of various athletes.

I have 4 performances for each athlete on a scale of 0 -100 for a little over 100 athletes.

Ex.

75 85 81 Last 83
43 62 41 Last 47

What I'm trying to determine is the appropriate weight to give each of the first 3 performances to get as close as possible to predicting the 4th.

I have already demonstrated that the most recent performance is the most predictive and best of last 2 and best of last 3 also work quite well, but I don't now how to maximize the results.

Is there a function in Excel that can do that?

Re: Weighting Performances

Posted: Fri Mar 23, 2012 5:50 pm
by Mike G
You can select the range, in this case 4 entries in a column.
Then drag that to cover a 5th cell. Just as if you were drag-copying from one cell to another.

In this case, you get 86 and 46.
The 86 is larger than the other 4 entries in example 1, because the opening 75 indicates improvement thereafter.

I believe it's just a straight line fit to the next interval.

You can drag-extend any number of columns (or rows) in this way, all at once.

Re: Weighting Performances

Posted: Fri Mar 23, 2012 6:15 pm
by Italian Stallion
I must not have been clear.

What I want is the weights to use on the first 3 numbers to get as close as possible to the 4th number (which I already know) for a sample of 100 cases.

I'll make it simpler and use only 3 numbers and 2 cases.

100, 50, 75.

To get a 75, the appropriate weight for the 100 and 50 is 50% each.

50, 70, 63.3.

to get a 63.3 the appropriate weight is 33% for the 50 and 67% for the 70

Then I'd want to know what weight is best for solving for both cases to get as close as possible.

I have a much more complex situation in that I have 3 numbers leading to a 4th number and 100 cases that are all a little bit different. I guess it's some kind of regression, but I am clueless.

I know that the number in the sequence of 3 most likely to be closest to the 4th is the most recent one, then the second to last etc... So instead of weighting them 33.3% each like in an average, perhaps they should be weighted something like 25% for the 1st, 33%% for the 2nd and 42% for the 3rd, but that's just a wild guess.

Re: Weighting Performances

Posted: Fri Mar 23, 2012 7:55 pm
by mystic
That is indeed regression analysis, which will solve your problem. A linear regression would give you the coefficients best fitting for the problem.

The equation would be (without intercept):

a*1st+b*2nd=3rd

1st and 2nd are your independent variables, 3rd is your dependent. a and b are the coefficients.

If I run the linear regression (I used SPSS; basically the Excel for statistical analysis) of your last given problem (100,50,75 and 50,70,63.3), I get 0.467 for a and 0.567 for b.

Maybe that page can give you help how to to that in Excel:
http://office.microsoft.com/en-us/excel ... 11963.aspx


A pretty strong free available stats program is R. Maybe you should take a look at that: http://www.r-project.org/

Re: Weighting Performances

Posted: Fri Mar 23, 2012 8:08 pm
by Italian Stallion
Italian Stallion wrote:I must not have been clear.

What I want is the weights to use on the first 3 numbers to get as close as possible to the 4th number (which I already know) for a sample of 100 cases.

I'll make it simpler and use only 3 numbers and 2 cases.

100, 50, 75.

To get a 75, the appropriate weight for the 100 and 50 is 50% each.

50, 70, 63.3.

to get a 63.3 the appropriate weight is 33% for the 50 and 67% for the 70

Then I'd want to know what weight is best for solving for both cases to get as close as possible.

I have a much more complex situation in that I have 3 numbers leading to a 4th number and 100 cases that are all a little bit different. I guess it's some kind of regression, but I am clueless.

I know that the number in the sequence of 3 most likely to be closest to the 4th is the most recent one, then the second to last etc... So instead of weighting them 33.3% each like in an average, perhaps they should be weighted something like 25% for the 1st, 33%% for the 2nd and 42% for the 3rd, but that's just a wild guess.
Thanks. I guess I'm going to have learn something new. :?

Re: Weighting Performances

Posted: Fri Mar 23, 2012 8:14 pm
by Italian Stallion
This is the actual data. How difficult would it be to cut and paste these into the software. I am looking for the weights of f1, f2, and f3 to get me closest to fig4

Fig1 Fig2 Fig3 Fig4

62 76 66 76
71 76 75 68
48 58 50 55
54 43 51 50
58 50 55 55
76 75 68 70
86 87 83 64
58 47 35 46
84 81 82 84
82 83 77 80
55 61 43 71
42 54 48 43
63 65 72 67
69 50 59 67
65 82 54 87
36 40 56 46
43 64 47 38
60 31 46 54
37 58 60 56
11 52 37 45
74 64 79 81
80 83 81 81
82 84 87 73
84 68 82 75
78 84 68 82
80 78 84 68
65 66 57 53
16 42 54 34
86 82 79 87
73 74 65 80
63 59 60 65
49 58 13 35
50 58 50 59
37 48 48 46
67 71 70 55
33 35 15 5
27 39 42 41
58 54 37 49
69 63 49 40
51 53 58 68
44 35 49 43
60 55 62 60
30 49 21 37
58 60 56 55
55 53 53 53
62 64 65 73
70 57 51 12
40 56 46 52
78 62 68 72
42 54 34 36
86 84 84 90
60 51 45 47
57 66 65 69
50 54 44 50
39 42 41 53
52 59 45 50
61 43 71 66
37 50 44 52
58 50 59 63
31 46 54 47
41 35 40 30
53 53 53 59
77 55 77 69
78 78 84 83
92 92 87 73
64 60 65 51
70 60 62 60
54 87 72 67
80 82 56 28
60 56 55 55
35 33 27 23
29 32 32 45
64 69 69 60
75 65 83 73
62 68 72 66
80 79 85 77
81 82 84 62
80 81 79 71
42 41 53 30
46 54 47 23
68 73 67 65
41 53 31 41
67 78 79 56
92 94 74 89
51 34 67 65
50 75 53 62
80 78 74 73
91 88 96 88
96 88 97 82
66 62 46 27
64 63 63 70
36 44 32 47
54 55 66 66
53 51 64 60
83 90 87 67
51 59 61 59
36 70 45 44
84 84 87 76
86 86 74 80
83 83 83 82
78 77 79 78
70 76 80 62
21 43 54 20

Re: Weighting Performances

Posted: Fri Mar 23, 2012 8:41 pm
by mystic
VAR00001 = Fig1
VAR00002 = Fig2
VAR00003 = Fig3
VAR00004 = Fig4

Coefficients: 0.162, 0.247 and 0.546

Code: Select all

			Coefficients(a)(,)(b)
		Unstandardized Coefficients		Standardized Coefficients
Model		B	Std. Error	Beta	t	Sig.
1	VAR00001	,162	,113	,167	1,431	,156
	 VAR00002	,247	,105	,260	2,350	,021
	 VAR00003	,546	,105	,562	5,187	,000
a. Dependent Variable: VAR00004
b. Linear Regression through the Origin

Code: Select all

		Model Summary
Model	R	R Squareb	Adjusted R Square	Std. Error of the Estimate
1	,984a	,968	,967	11,19753
a. Predictors: VAR00003, VAR00002, VAR00001
b. For regression through the origin (the no-intercept model), R Square measures the proportion of the variability in the dependent variable about the origin explained by regression. This CANNOT be compared to R Square for models which include an intercept.

Code: Select all

			ANOVA(c)(,)(d)
Model		Sum of Squares	df	Mean Square	F	Sig.
1	Regression	374938,542	3	124979,514	996,769	,000a
	Residual	12538,458	100	125,385		
	Total	387477,000b	103			
a. Predictors: VAR00003, VAR00002, VAR00001
b. This total sum of squares is not corrected for the constant because the constant is zero for regression through the origin.
c. Dependent Variable: VAR00004
d. Linear Regression through the Origin


Re: Weighting Performances

Posted: Fri Mar 23, 2012 8:51 pm
by Italian Stallion
I'm about to give you a big hug. I just downloaded R, started looking at it, and realized it was going to take me a bit of time to have any idea how to use it. That's a project for another day.

There aren't enough ways to thank someone on a forum. Seriously. :D

I have to learn how to use that software ASAP!

Re: Weighting Performances

Posted: Fri Mar 23, 2012 9:07 pm
by mystic
I used SPSS to get those results.

If you downloaded R and you want to use those results of the regression in order to make a prediction I would recommended ridge regression. I just wrote a quick script for your problem, that is the result (lambda was 17, found via cross validation):

Code: Select all

 (Intercept) 20.2519966
Fig1         0.1828746
Fig2         0.1930801
Fig3         0.2523306 
In terms of making a prediction for out of sample data, those coefficients should give you a better result. The equation would be:

Fig4 = 20.25 + 0.183*Fig1 + 0.193*Fig2 + 0.252*Fig3


Btw, regarding the values in the previous post: when you check the significance level, you can throw out Fig1 and use just Fig2 and Fig3:

Code: Select all

			Coefficients(a)(,)(b)
		Unstandardized Coefficients		Standardized Coefficients
Model		B	Std. Error	Beta	t	Sig.
1	Fig2	,328	,089	,346	3,700	,000
	 Fig3	,623	,091	,642	6,867	,000
a. Dependent Variable: Fig4
b. Linear Regression through the Origin

Code: Select all

		Model Summary
Model	R	R Squareb	Adjusted R Square	Std. Error of the Estimate
1	,983a	,967	,966	11,255
a. Predictors: Fig3, Fig2
b. For regression through the origin (the no-intercept model), R Square measures the proportion of the variability in the dependent variable about the origin explained by regression. This CANNOT be compared to R Square for models which include an intercept.


Re: Weighting Performances

Posted: Sat Mar 24, 2012 1:13 am
by mtamada
Those linear regression examples give you a very good start. You might want to put a constant term into the regression. And in terms of this specific application, we're in the realm of time series analysis where there are a number of specialized models e.g. imposing a functional form on the pattern of coefficients on the lagged values (exponential decay e.g.). Techniques from digital signal processing can be brought in -- DeanO liked to use the Kalman filter.

Re: Weighting Performances

Posted: Sat Mar 24, 2012 2:24 am
by Ryan
One could also consider a longitudinal analysis.

Re: Weighting Performances

Posted: Sat Mar 24, 2012 6:11 pm
by Italian Stallion
Thanks again guys.

I really appreciate the generosity.

Re: Weighting Performances

Posted: Sat Mar 24, 2012 8:50 pm
by gfarkas
mystic wrote:I used SPSS to get those results.

If you downloaded R and you want to use those results of the regression in order to make a prediction I would recommended ridge regression. I just wrote a quick script for your problem, that is the result (lambda was 17, found via cross validation):

Code: Select all

 (Intercept) 20.2519966
Fig1         0.1828746
Fig2         0.1930801
Fig3         0.2523306 
I'm just curious, but why would you recommend Ridge Regression right off the bat? I've always considered it a technique to be used when a problem was ill-posed and/or OLS doesn't yield a unique solution.

Re: Weighting Performances

Posted: Fri Mar 30, 2012 3:32 pm
by mystic
gfarkas wrote: I'm just curious, but why would you recommend Ridge Regression right off the bat? I've always considered it a technique to be used when a problem was ill-posed and/or OLS doesn't yield a unique solution.
Well, I see a ill-posed problem posted, we have 3 independent variables and a lot more equations. And we can always find a lambda for which the RMSE of the ridge regression is smaller than the RMSE for OLS.

Re: Weighting Performances

Posted: Sat Apr 07, 2012 3:36 am
by Crow
If this is offensive data, are you taking into account quality of defense faced?