Significance Testing
Questions
- How is a probability value used to cast doubt on the null hypothesis?
- What does the phrase "statistically significant" mean?
- What is the difference between statistical significance and practical significance?
- What are the two approaches to significance testing?
Significance level
- A low probability value casts doubt on the null hypothesis
- How low must the probability value be in order to conclude that the null hypothesis is false?
- There is no single right or wrong answer; common conventions are:
- p < 0.05
- p < 0.01
- When a researcher concludes that the null hypothesis is false, the researcher is said to have rejected the null hypothesis
- The probability value below which the null hypothesis is rejected is called the significance level, the α level, or simply α
Statistical significance vs. practical significance
- When the null hypothesis is rejected, the effect is said to be statistically significant
- For example, in the Physicians Reactions case study, the p-value is 0.0057
- Therefore, the effect of obesity is statistically significant and the null hypothesis that obesity makes no difference is rejected
- It is very important to keep in mind that statistical significance means only that the null hypothesis of exactly no effect is rejected; it does not mean that the effect is important, which is what "significant" usually means
- When an effect is significant, you can have confidence the effect is not exactly zero
- Finding that an effect is significant does not tell you about how large or important the effect is.
Do not confuse statistical significance with practical significance. A small effect can be highly significant if the sample size is large enough.
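The point above can be illustrated with a small sketch. Below, a hypothetical one-sample z-test (values chosen for illustration: a difference of 0.2 points on a scale with SD = 15, not taken from the case study) shows the same tiny effect failing to reach significance at n = 100 but becoming highly significant at n = 1,000,000:

```python
import math

def z_test_p_value(effect, sd, n):
    """Two-sided p-value for a one-sample z-test.
    effect: observed difference from the null value; sd: population SD."""
    z = effect / (sd / math.sqrt(n))
    # standard normal CDF via the error function
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - phi)

# Same tiny effect (0.2 points, SD = 15), two sample sizes:
print(z_test_p_value(0.2, 15, 100))        # n = 100: p ≈ 0.89, not significant
print(z_test_p_value(0.2, 15, 1_000_000))  # n = 1,000,000: p < 0.001, significant
```

The effect (0.2 points) is identical in both runs; only the sample size changed, which is why statistical significance alone says nothing about practical importance.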
Why does the word "significant" in the phrase "statistically significant" mean something so different from other uses of the word?
Answer:
- The meaning of "significant" in everyday language has changed
- In the 19th century, something was "significant" if it signified something
- Finding that an effect is statistically significant signifies that the effect is real and not due to chance
- Over the years, the meaning of "significant" changed, leading to the potential misinterpretation.
Two approaches to conducting significance tests
Ronald Fisher Approach
- A significance test is conducted and the probability value reflects the strength of the evidence against the null hypothesis
| P-value | Meaning |
|---|---|
| below 0.01 | the data provide strong evidence that the null hypothesis is false |
| between 0.01 and 0.05 | the null hypothesis is typically rejected, but with less confidence |
| between 0.05 and 0.10 | the data provide only weak evidence against the null hypothesis, not considered low enough to justify rejecting it |
Higher probabilities provide less evidence that the null hypothesis is false.
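The Fisher-style graded reading of p-values in the table above can be sketched as a simple lookup (the labels are paraphrases of the table, not official terminology):

```python
def fisher_evidence(p):
    """Map a p-value to a rough strength-of-evidence label,
    following the Fisher-style conventions described above."""
    if p < 0.01:
        return "strong evidence against H0"
    elif p < 0.05:
        return "moderate evidence against H0 (typically rejected, less confidence)"
    elif p < 0.10:
        return "weak evidence against H0 (not low enough to reject)"
    return "little or no evidence against H0"

print(fisher_evidence(0.003))  # strong evidence against H0
```

Unlike a fixed-α rule, this approach reports a graded strength of evidence rather than a binary decision.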
Neyman-Pearson Approach
- An α level is specified before analyzing the data
| P-value | Null Hypothesis |
|---|---|
| P-value ≤ α | H0 is rejected |
| P-value > α | H0 is not rejected |
- If a result is significant, then it does not matter how significant it is
- If it is not significant, then it does not matter how close to being significant it is
- E.g. if α = 0.05 then P-values of 0.049 and 0.001 are treated identically
- Similarly, probability values of 0.06 and 0.34 are treated identically
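The fixed-α decision rule above reduces to a single comparison; a minimal sketch:

```python
def np_decision(p_value, alpha=0.05):
    """Neyman-Pearson style decision: only the comparison
    with the pre-specified alpha matters, not the p-value's size."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# With alpha = 0.05, p-values of 0.049 and 0.001 are treated identically:
print(np_decision(0.049))  # reject H0
print(np_decision(0.001))  # reject H0
# Likewise 0.06 and 0.34:
print(np_decision(0.06))   # fail to reject H0
print(np_decision(0.34))   # fail to reject H0
```

Note that α must be chosen before looking at the data; picking it afterwards defeats the purpose of the rule.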
Comparison of approaches
The Fisher approach is more suitable for scientific research
- use where there is no need for an immediate decision, e.g. a researcher may conclude that there is some evidence against the null hypothesis
- more research is needed before a definitive conclusion can be drawn
The Neyman-Pearson approach is more suitable for applications in which a yes/no decision must be made
- use if you are less interested in assessing the weight of the evidence than knowing what action should be taken
- e.g. should the machine be shut down for repair?