Statistics for Decision Makers - 14.04 - Regression - Inferential Statistics

From Training Material
Title: 14.04 - Regression - Inferential Statistics
Author: Bernard Szlachta (NobleProg Ltd) bs@nobleprog.co.uk

Assumptions

  • No assumptions are needed to determine the best-fitting straight line
  • Assumptions are made in the calculation of inferential statistics
  • These assumptions refer to the population, not the sample
Linearity
The relationship between the two variables is linear
Homoscedasticity
The variance around the regression line is the same for all values of X
Normality of errors
The errors of prediction (the deviations from the regression line) are normally distributed. This does not mean that X or Y is normally distributed.
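
Taken together, the three assumptions are often written as a single population model. This is a standard textbook formulation rather than notation from the original material; the symbols \alpha, \beta, \sigma and \varepsilon_i are introduced here purely for illustration.

```latex
Y_i = \alpha + \beta X_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)
```

Read this way, linearity is the straight-line form of the mean \alpha + \beta X_i, homoscedasticity is the single variance \sigma^2 that does not depend on X, and normality applies to the error term \varepsilon_i only, not to X or Y.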


Significance Test for the Slope (b)

Coefficients:
            Estimate Std. Error  Pr(>|t|)   
(Intercept)  1.63968    0.45150 0.00191
Hours        0.05223    0.02108  0.02334 
Residual standard error: 0.6449 on 18 degrees of freedom
Multiple R-squared:  0.2544,    Adjusted R-squared:  0.213 
F-statistic: 6.142 on 1 and 18 DF,  p-value: 0.02334
  • The sample value of the slope (b) is 0.05223, with a standard error of 0.02108; the hypothesized value is 0
  • The p-value is 0.02334 < 0.05
  • Therefore, the slope is significantly different from 0
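
The test can be reproduced from the two numbers in the Hours row. The following is a minimal sketch, not the code that produced the output above: it assumes the usual t-test for a slope, t = b / SE(b), with the 18 residual degrees of freedom shown in the output, and it obtains the two-sided p-value by numerically integrating the Student t density using only the Python standard library. The function names `t_pdf` and `two_sided_p` are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value P(|T| >= |t|), using Simpson's rule on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central_area


# Values copied from the "Hours" row of the regression output.
b = 0.05223      # sample slope
se_b = 0.02108   # standard error of the slope
df = 18          # residual degrees of freedom

t_stat = (b - 0) / se_b  # the hypothesized slope is 0
p_value = two_sided_p(t_stat, df)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

The printed p-value should be close to the reported 0.02334; any small difference comes from the rounding of the estimate and standard error in the printed table.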

Significance Test for the Correlation

Coefficients:
            Estimate Std. Error  Pr(>|t|)   
(Intercept)  1.63968    0.45150 0.00191
Hours        0.05223    0.02108  0.02334
Residual standard error: 0.6449 on 18 degrees of freedom
Multiple R-squared:  0.2544,    Adjusted R-squared:  0.213 
F-statistic: 6.142 on 1 and 18 DF,  p-value: 0.02334
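  • With a single predictor, the test of the correlation is equivalent to the test of the slope, so the p-value is the same (0.02334)
  • The p-value is < 0.05, therefore the correlation is significant
  • If the correlation is not significant, the data give no evidence of a linear relationship and the model should not be used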
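
The link between the correlation test and the slope test can be checked numerically. This sketch assumes simple linear regression (one predictor), where r squared equals Multiple R-squared and the test statistic for the correlation is t = r * sqrt(n - 2) / sqrt(1 - r^2). Variable names are illustrative, and the sign of r is taken from the positive slope.

```python
import math

r_squared = 0.2544   # Multiple R-squared from the regression output
df = 18              # residual degrees of freedom (n - 2)
slope = 0.05223      # estimate for Hours
se_slope = 0.02108   # standard error for Hours

# In simple regression, r has the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

# t statistic for H0: the population correlation is 0.
t_from_r = r * math.sqrt(df) / math.sqrt(1 - r_squared)

# t statistic for H0: the slope is 0, from the coefficient table.
t_from_slope = slope / se_slope

print(f"r = {r:.3f}")
print(f"t from correlation = {t_from_r:.3f}")
print(f"t from slope       = {t_from_slope:.3f}")
print(f"t squared          = {t_from_r ** 2:.3f}  (compare with the F-statistic)")
```

The two t values agree up to the rounding of the printed figures, and their square matches the F-statistic, which is why all three tests report the same p-value.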

Quiz


1 Which of the following are assumptions made in the calculation of regression inferential statistics?

A:The errors of prediction are normally distributed
B:X is normally distributed
C:Y is normally distributed
D:The variance around the regression line is the same for all values of X
E:The relationship between X and Y is linear

Answer: A, D, E. The assumptions are linearity, homoscedasticity, and normally distributed errors. See the text for more information.


2 All questions below use the following model

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  30.8634     6.9873   4.417 0.000199
Post          0.3628     0.1197   3.032 0.005933
Residual standard error: 8.562 on 23 degrees of freedom
Multiple R-squared:  0.2855,    Adjusted R-squared:  0.2544
F-statistic: 9.191 on 1 and 23 DF,  p-value: 0.005933

The intercept is statistically significant

True
False

3 The slope is statistically significant

True
False

4 The model is statistically significant

True
False

5 All points will be within +/-14 points from the regression line

True
False

6 The model explains 0.005933 of the variation

True
False