Significance Testing

From Training Material
Revision as of 17:00, 3 February 2012 by Izabela Szlachta


Prerequisites

Questions

  • How is a probability value used to cast doubt on the null hypothesis?
  • What does the phrase "statistically significant" mean?
  • What is the difference between statistical significance and practical significance?
  • What are the two approaches to significance testing?

Significance level

  • A low probability value casts doubt on the null hypothesis
  • How low must the probability value be in order to conclude that the null hypothesis is false?
    • there is clearly no right or wrong answer; the cutoff is a matter of convention
    • p < 0.05 is the most common convention
    • p < 0.01 is a stricter alternative
  • When a researcher concludes that the null hypothesis is false, the researcher is said to have rejected the null hypothesis
  • The probability value below which the null hypothesis is rejected is called the significance level, the α level, or simply α
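
To make the idea of comparing a probability value with α concrete, here is a minimal sketch. The coin-flip scenario, the numbers (15 heads in 20 flips), and the function name `two_sided_p_value_fair_coin` are illustrative assumptions, not part of the training material. It computes an exact two-sided probability value under the null hypothesis that the coin is fair and then checks it against the two conventional levels.

```python
from math import comb


def two_sided_p_value_fair_coin(heads: int, flips: int) -> float:
    """Exact two-sided p-value for H0: the coin is fair (P(heads) = 0.5).

    Hypothetical helper for illustration only.
    """
    observed_distance = abs(heads - flips / 2)
    # Count every outcome at least as far from flips/2 as the observed one.
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_distance
    )
    # Under H0 each of the 2**flips sequences is equally likely.
    return extreme_outcomes / 2 ** flips


# Assumed example data: 15 heads in 20 flips.
p = two_sided_p_value_fair_coin(heads=15, flips=20)
print(f"p = {p:.4f}")
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: reject H0 -> {p < alpha}")
```

With these assumed data, the null hypothesis would be rejected at α = 0.05 but not at α = 0.01, which shows why the choice of α matters.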

Statistical significance vs. practical significance

  • When the null hypothesis is rejected, the effect is said to be statistically significant
  • For example, in the Physicians' Reactions case study, the p-value is 0.0057
  • Therefore, the effect of obesity is statistically significant and the null hypothesis that obesity makes no difference is rejected
  • It is very important to keep in mind that statistical significance means only that the null hypothesis of exactly no effect is rejected; it does not mean that the effect is important, which is what "significant" usually means
  • When an effect is significant, you can have confidence the effect is not exactly zero
  • Finding that an effect is significant does not tell you how large or important the effect is
  • Do not confuse statistical significance with practical significance
  • A small effect can be highly significant if the sample size is large enough
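
The last point can be sketched numerically. The example below assumes a two-group comparison with a fixed one-point mean difference, a known standard deviation of 15 in each group, and a two-sided z-test. These numbers and the function name `two_sided_p_value` are assumptions chosen for illustration, not values from the case study. It shows how the probability value for the same small effect shrinks as the sample grows.

```python
from math import erfc, sqrt


def two_sided_p_value(mean_difference: float, sd: float, n_per_group: int) -> float:
    """Two-sided z-test p-value for a difference between two group means.

    Assumes equal group sizes and a known, common standard deviation.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = mean_difference / standard_error
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2.0))


# Same one-point effect, increasingly large samples (assumed values).
sample_sizes = (100, 1_000, 10_000, 100_000)
p_values = [two_sided_p_value(1.0, 15.0, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n per group = {n:>7}: p = {p:.3g}")
```

The effect size never changes in this sketch; only the sample size does. A very small probability value therefore says nothing about whether a one-point difference matters in practice.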

Why does the word "significant" in the phrase "statistically significant" mean something so different from other uses of the word?

Answer:

  • The meaning of "significant" in everyday language has changed
  • In the 19th century, something was "significant" if it signified something
  • Finding that an effect is statistically significant signifies that the effect is real and not due to chance
  • Over the years, the meaning of "significant" changed, leading to the potential misinterpretation.

Two approaches to conducting significance tests

Ronald Fisher Approach

  • A significance test is conducted and the probability value reflects the strength of the evidence against the null hypothesis
P-value                 Meaning
below 0.01              the data provide strong evidence that the null hypothesis is false
between 0.01 and 0.05   the null hypothesis is typically rejected, but with less confidence
between 0.05 and 0.10   the data provide weak evidence against the null hypothesis; such values are not considered low enough to justify rejecting it

Higher probabilities provide less evidence that the null hypothesis is false.
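
The graded reading of probability values in the Fisher approach can be sketched as a simple lookup. The function name `fisher_evidence`, the handling of values that fall exactly on a boundary, and the label for values of 0.10 and above are assumptions made for this illustration. The material only describes the three ranges below 0.10.

```python
def fisher_evidence(p_value: float) -> str:
    """Map a p-value to a verbal description of the evidence against H0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "strong evidence against H0"
    if p_value < 0.05:
        return "H0 typically rejected, but with less confidence"
    if p_value < 0.10:
        return "weak evidence against H0, not low enough to reject"
    # Assumed label: the material only says higher values give less evidence.
    return "little or no evidence against H0"


# 0.0057 is the Physicians Reactions p-value; the others are made-up examples.
for p in (0.0057, 0.03, 0.08, 0.34):
    print(f"p = {p}: {fisher_evidence(p)}")
```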

Neyman and Pearson

  • An α level is specified before analyzing the data
P-value         Null hypothesis
P-value < α     H0 is rejected
P-value ≥ α     H0 is not rejected
  • If a result is significant, then it does not matter how significant it is
  • If it is not significant, then it does not matter how close to being significant it is
  • E.g. if α = 0.05 then P-values of 0.049 and 0.001 are treated identically
  • Similarly, probability values of 0.06 and 0.34 are treated identically
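
By contrast, the Neyman and Pearson rule reduces to a single yes/no comparison. The following is a minimal sketch, with `reject_null` as a hypothetical helper name. It uses the probability values from the bullets above to show that values on the same side of α lead to the same decision.

```python
def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Neyman-Pearson style decision: reject H0 only if p is below alpha."""
    return p_value < alpha


alpha = 0.05  # fixed before looking at the data
decisions = {p: reject_null(p, alpha) for p in (0.001, 0.049, 0.06, 0.34)}

for p, rejected in decisions.items():
    outcome = "reject H0" if rejected else "do not reject H0"
    print(f"p = {p}: {outcome}")
```

The function returns only a boolean, so information about how far a value lies from α is discarded on purpose. That is exactly the property the bullets describe.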

Comparison of approaches

The Fisher approach is more suitable for scientific research

  • use where there is no need for an immediate decision, e.g. a researcher may conclude that there is some evidence against the null hypothesis, but that more research is needed before a definitive conclusion can be drawn

The Neyman and Pearson approach is more suitable for applications in which a yes/no decision must be made

  • use if you are less interested in assessing the weight of the evidence than knowing what action should be taken
  • e.g. should the machine be shut down for repair?

Questions

1 In psychology research, it is conventional to reject the null hypothesis if the probability value is lower than what number?

Answer:

It is conventional to conclude the null hypothesis is false if the data analysis results in a probability value less than 0.05.


2 Select all that apply. The probability value below which the null hypothesis is rejected is also called the

key probability.
significance level.
alpha level.
focal value.

Answer:

Two other common names for the probability value below which the null hypothesis is rejected are the alpha level (or just alpha) and the significance level.


3 When comparing test scores of two groups, a difference of one point would never be highly statistically significant, even if you had a really large sample.

True
False

Answer:

False. Do not confuse statistical significance with practical significance. A small effect, like a one-point difference in this case, can be highly statistically significant if the sample size is large enough.


4 There are two main approaches to significance testing. In one approach, the probability value reflects the strength of the evidence against the null hypothesis. The smaller the p value, the more evidence you have that the null hypothesis is false. Which statistician(s) supported this approach?

Fisher
Neyman
Pearson

Answer:

Fisher favored this approach, which is also the approach favored by this text. Neyman and Pearson favored the approach of choosing an alpha level and then making a yes/no decision based on whether the p value is smaller or larger than that alpha level. Thus, different p values that are on the same side of the alpha level are treated the same.

