Standard Error of the Estimate

From Training Material
Revision as of 14:09, 5 June 2014 by Bernard Szlachta (talk | contribs) (→‎The standard error of the estimate)

The standard error of the estimate

The standard error of the estimate

  • is a measure of the accuracy of predictions
  • is closely related to the sum of squared errors of prediction (SSE), the quantity that the regression line minimizes, and is defined below:
σ_est = sqrt[Σ(Y-Y')²/N]

where
σ_est     - the standard error of the estimate
Y         - actual score
Y'        - predicted score
Y-Y'      - difference between the actual score and the predicted score
Σ(Y-Y')²  - sum of the squared errors of prediction (SSE)
N         - number of pairs of scores
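
As a rough illustration of this definition, the Python sketch below computes the population standard error of the estimate as the square root of SSE divided by N. The function name `standard_error_of_estimate` and the scores are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math

def standard_error_of_estimate(actual, predicted):
    """Population standard error of the estimate: sqrt(SSE / N)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    # SSE: sum of squared differences between actual and predicted scores
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sse / len(actual))

# Hypothetical scores, not taken from the training material
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

# Every error is +1 or -1, so SSE = 4, SSE / N = 1 and the result is 1.0
print(standard_error_of_estimate(actual, predicted))
```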

Simple Example

  • The graphs below show two regression examples.
  • In graph A, the points are closer to the line than they are in graph B.
  • Therefore, the predictions in graph A are more accurate than those in graph B.
ClipCapIt-140603-233044.PNG


Example

Assume the data below come from a population of five X-Y pairs.

ClipCapIt-140603-233622.PNG
  • The last column shows that the sum of the squared errors of prediction is 2.791.
  • Therefore, the standard error of the estimate is:
σ_est = sqrt[2.791/5] = 0.747
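
The calculation can be sketched end to end in Python. Because the data table survives only as an image, the five X-Y pairs below are an assumption: they are illustrative values that reproduce the totals quoted on this page (SSE = 2.791, μ_y = 2.06), not a transcription of the table. The regression line is fitted with the usual least-squares formulas for the slope and intercept.

```python
import math

# Assumed data: illustrative pairs that reproduce the quoted totals,
# not a transcription of the original table (which is an image).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept of the regression line Y' = slope*X + intercept
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predicted scores, sum of squared errors, and the population standard error
predicted = [slope * x + intercept for x in xs]
sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))
sigma_est = math.sqrt(sse / n)  # population form: divide by N

print(round(sse, 3), round(sigma_est, 3))  # 2.791 0.747
```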

Formula for the Standard Error

There is a version of the formula for the standard error in terms of Pearson's correlation:

σ_est = sqrt[(1-ρ²)*SSY/N]

where ρ is the population value of Pearson's correlation and SSY is the sum of squared deviations of Y from its mean:

SSY = Σ(Y-μ_y)²
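
One way to see why the correlation form agrees with the definition based on SSE is the standard partition of the sum of squares in linear regression. In the sketch below, SSY' denotes the sum of squares predicted; that symbol is introduced here for illustration and is not used elsewhere on this page.

```latex
\begin{align*}
SSY &= \sum (Y - \mu_Y)^2 = SSY' + SSE, \\
\rho^2 &= \frac{SSY'}{SSY}
  \quad\Longrightarrow\quad SSE = (1 - \rho^2)\, SSY, \\
\sigma_{est} &= \sqrt{\frac{SSE}{N}} = \sqrt{\frac{(1 - \rho^2)\, SSY}{N}}.
\end{align*}
```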

Similar formulas are used when the standard error of the estimate is computed from a sample rather than a population.

  • The only difference is that the denominator is N-2 rather than N, since two parameters (the slope and the intercept) were estimated in order to estimate the sum of squares.
  • Formulas comparable to the ones for the population are shown below.

s_est = sqrt[Σ(Y-Y')²/(N-2)] = sqrt[(1-r²)*SSY/(N-2)]

where r is the sample value of Pearson's correlation.
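
The effect of the N-2 denominator can be illustrated with a small Python sketch. The helper name `standard_error_from_sse` and its `sample` flag are hypothetical. The inputs SSE = 2.791 and N = 5 are the values quoted in the example on this page; treating those five pairs as a sample is done here only for comparison, since the page treats them as a population.

```python
import math

def standard_error_from_sse(sse, n, sample=False):
    """sqrt(SSE / N) for a population, sqrt(SSE / (N - 2)) for a sample."""
    denominator = n - 2 if sample else n
    if denominator <= 0:
        raise ValueError("not enough pairs of scores for this formula")
    return math.sqrt(sse / denominator)

population_value = standard_error_from_sse(2.791, 5)
sample_value = standard_error_from_sse(2.791, 5, sample=True)

print(round(population_value, 3))  # 0.747
print(round(sample_value, 3))      # 0.965
```

With so few pairs the two values differ noticeably; as N grows, the difference between dividing by N and by N-2 shrinks.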


Example

For the example data,

  • μ_y = 2.06
  • SSY = 4.597
  • ρ = 0.6268


Therefore,

σ_est = sqrt[(1-0.6268²)*4.597/5] = sqrt[2.791/5] = 0.747

which is the same value computed previously.
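
As a quick check of this arithmetic, the following Python sketch plugs the quoted values of SSY, ρ and N into the correlation form of the formula. The variable names are illustrative only.

```python
import math

# Values quoted in the example above
ssy = 4.597
rho = 0.6268
n = 5

sse_from_rho = (1 - rho ** 2) * ssy      # recovers SSE up to rounding
sigma_est = math.sqrt(sse_from_rho / n)  # population form: divide by N

print(round(sse_from_rho, 3), round(sigma_est, 3))  # 2.791 0.747
```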

Quiz

1 In a regression analysis, the ________ the standard error of the estimate is, the more accurate the predictions are.

larger
smaller
The standard error of the estimate is not related to the accuracy of the predictions.

Answer:

smaller

The standard error of the estimate is a measure of the accuracy of predictions. The regression line is the line that minimizes the sum of squared deviations of prediction (also called the sum of squares error), and the standard error of the estimate is the square root of the average squared deviation.


2 Linear regression was used to predict Y from X in a certain population. In this population, SSY is 50, the correlation between X and Y is .5, and N is 100. What is the standard error of the estimate?

Answer:

0.61

The standard error of the estimate for a population is sqrt[(1-ρ²)*SSY/N].

sqrt[(1-.5²)*50/100] = sqrt[0.375] ≈ 0.61


3 You sample 10 people in a high school to try to predict GPA in 10th grade from GPA in 9th grade. You determine that SSE = 5.8. What is the standard error of the estimate?

Answer:

0.85

The standard error of the estimate for a sample is sqrt[SSE/(N-2)].

With N = 10, sqrt[5.8/8] = sqrt[0.725] ≈ 0.85


4 The graph below represents a regression line predicting Y from X. This graph shows the error of prediction for each of the actual Y values. Use this information to compute the standard error of the estimate in this sample.

ClipCapIt-140603-234415.PNG

Answer:

1

The standard error of the estimate for a sample is sqrt[SSE/(N-2)].

SSE is the sum of the squared errors of prediction,

so SSE = (-.2)² + (.4)² + (-.8)² + (1.3)² + (-.7)² = 3.02;

with N = 5, sqrt[3.02/(5-2)] = sqrt[3.02/3] ≈ 1.0
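
The same answer can be checked with a few lines of Python, using the five errors of prediction listed in the answer. This is only a sketch of the arithmetic, with illustrative variable names.

```python
import math

# Errors of prediction (Y - Y') as listed in the answer above
errors = [-0.2, 0.4, -0.8, 1.3, -0.7]

sse = sum(e ** 2 for e in errors)           # sum of squared errors
s_est = math.sqrt(sse / (len(errors) - 2))  # sample form: divide by N - 2

print(round(sse, 2), round(s_est, 1))  # 3.02 1.0
```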