Statistics for Decision Makers - 13.01 - Power

author
Bernard Szlachta (NobleProg Ltd) bs@nobleprog.co.uk

Prerequisites

Define power

Suppose you created a new design for your website
  • Do people prefer the new design over the old one?
  • Does the experiment have a high probability of providing strong evidence that the new design is better than the old design if the new design is really better?
  • Is the proposed sample size so small that even a fairly large population difference would be difficult to detect?
    • If the sample size is small, then even a fairly large difference in sample means might not be significant
  • When an effect is not significant, the result is inconclusive: the null hypothesis is not rejected, but it is not accepted either
Is the test worth the money if there is no conclusion?

What is Power?

  • Power is defined as the probability of correctly rejecting a false null hypothesis

  • In terms of our example, it is the probability that:
    • given there is a difference between the population means of the new design and the old one
    • the sample means will be significantly different
  • The probability of failing to reject a false null hypothesis is often referred to as β
  • Therefore power can be defined as:
power = 1 - β
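
The definition above can be illustrated with a small simulation. The sketch below is not part of the original material: it assumes normally distributed satisfaction scores, 9 respondents per design, a true difference of 3 points and a standard deviation of 2, and it ignores the fact that real scores are bounded. It estimates power as the share of simulated experiments that reject the null hypothesis, then compares that share with the value reported by power.t.test.

```r
# Monte Carlo sketch: power is the share of simulated experiments that
# reject the null hypothesis when the new design really is better.
set.seed(42)            # arbitrary seed, for reproducibility only
n_per_group <- 9        # assumed respondents per design
true_diff   <- 3        # assumed true difference in mean satisfaction
sd_scores   <- 2        # assumed standard deviation of the scores
alpha       <- 0.05     # significance level
n_sims      <- 5000     # number of simulated experiments

rejected <- replicate(n_sims, {
  old <- rnorm(n_per_group, mean = 5, sd = sd_scores)
  new <- rnorm(n_per_group, mean = 5 + true_diff, sd = sd_scores)
  # One-sided two-sample t test with equal variances: H1 is mean(new) > mean(old)
  t.test(new, old, alternative = "greater", var.equal = TRUE)$p.value < alpha
})

power_sim <- mean(rejected)   # estimated power
beta_sim  <- 1 - power_sim    # estimated probability of missing the real effect

# Analytic power under the same assumptions, for comparison
power_exact <- power.t.test(n = n_per_group, delta = true_diff, sd = sd_scores,
                            sig.level = alpha, alternative = "one.sided")$power

print(c(simulated = power_sim, analytic = power_exact))
```

The two numbers should agree closely; the simulated value fluctuates slightly with the seed and the number of simulated experiments.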


Power and Sample Size

  • A manager can require a high power before money is spent on sampling and research
  • Let us assume that we compare the mean satisfaction with the old design against the new one, and the manager wants a 90% chance of detecting the difference if there is one

How many people should we ask for an opinion?

  • The scale of satisfaction is from 1 to 10
  • We want to be able to detect a difference of at least 3 points
  • H1: mean_new - mean_old > 0, with power calculated for a true difference of 3 points
  • We assume alpha = 5% and a standard deviation of 2
  • Using the power.t.test function from R's built-in stats package, we can calculate the sample size per group needed to achieve a power of 90%
 
 power.t.test(power=0.9, sd=2, sig.level=0.05, delta=3, alternative="one.sided")

     Two-sample t test power calculation

              n = 8.386343
          delta = 3
             sd = 2
      sig.level = 0.05
          power = 0.9
    alternative = one.sided

 NOTE: n is number in *each* group

Therefore, to have a 90% chance that the test will detect a 3-point difference in satisfaction in favour of the new design, we need at least 9 respondents for each design (n is the number in each group, rounded up), which is 18 respondents in total.
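
As a hedged check of the rounding step, the sketch below reruns the same calculation, rounds n up to a whole number of respondents, and compares the power just below and at that size. The helper power_at and the variable names are illustrative only.

```r
# Sample size per group for 90% power under the assumptions used above
res <- power.t.test(power = 0.9, sd = 2, sig.level = 0.05, delta = 3,
                    alternative = "one.sided")

n_per_group <- ceiling(res$n)    # round up: respondents come in whole numbers
n_total     <- 2 * n_per_group   # two groups: old design and new design

# Illustrative helper: power of the same test for a given group size
power_at <- function(n) {
  power.t.test(n = n, sd = 2, sig.level = 0.05, delta = 3,
               alternative = "one.sided")$power
}

power_below <- power_at(n_per_group - 1)   # one respondent fewer per group
power_at_n  <- power_at(n_per_group)       # the rounded-up group size

print(c(n_per_group = n_per_group, n_total = n_total))
print(c(power_below = power_below, power_at_n = power_at_n))
```

Rounding down would leave the power slightly under the requested 90%, which is why the result is always rounded up.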


What if we care about any difference?

  • The smaller the difference you want to detect, the bigger the sample size needs to be
  • For example, to detect a difference of 1 point with 90% probability (keeping alpha = 5% and a standard deviation of 2), we would need at least 70 respondents for each design
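
The relationship between the difference to detect and the required sample size can be sketched by repeating the calculation for several differences. The values in deltas are chosen for illustration only; the other settings (alpha = 5%, standard deviation of 2, one-sided test, 90% power) are carried over from the example above.

```r
# Differences in mean satisfaction we might want to detect (illustrative values)
deltas <- c(3, 2, 1, 0.5)

# Required respondents per group for each difference, rounded up
n_needed <- sapply(deltas, function(d) {
  ceiling(power.t.test(power = 0.9, sd = 2, sig.level = 0.05, delta = d,
                       alternative = "one.sided")$n)
})

print(data.frame(delta = deltas, n_per_group = n_needed))
```

Halving the difference roughly quadruples the required sample size, because n grows with the square of sd / delta.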

Importance of estimating power

Identify situations in which it is important to estimate power
  • It is very important to consider power while designing an experiment
  • You should avoid spending a lot of time and/or money on an experiment that has little chance of finding a significant effect

Quiz

1 Power is:

The probability that the null hypothesis is true
The probability that the null hypothesis is false
The probability a false null hypothesis will be rejected
The probability a true null hypothesis will be rejected

Answer:

It is the probability of correctly rejecting a false null hypothesis.


2 If the power of an experiment is low then

The experiment will likely be inconclusive
Any significant findings obtained are suspect
The results are skewed

Answer:

With low power, the null hypothesis is unlikely to be rejected. When the null hypothesis is not rejected, the experiment is inconclusive.


3 Let us assume that you want a 90% probability of detecting a difference of 1, given that there is one.

The standard deviation in previous surveys was around 50. What kind of sample size would you expect to require?

3 respondents
50 respondents
500 respondents
42000 respondents
1 million respondents
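
One way to check your estimate is to run the calculation in R. The question does not state the significance level or the type of test, so the sketch below assumes alpha = 5% and a one-sided two-sample t test, as in the website example.

```r
# Assumed settings: alpha = 5%, one-sided two-sample t test (not given in the question)
res <- power.t.test(power = 0.9, sd = 50, sig.level = 0.05, delta = 1,
                    alternative = "one.sided")

n_per_group <- ceiling(res$n)   # respondents needed in each group
print(n_per_group)
```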

4 A hard disk company wants to prove that the new design of a hard drive allows it to work under extreme temperatures. Each prototype of the hard drive costs £1 million. Setting up the test environment for each test also costs £1 million.

What can a manager do to make sure that the test will be conclusive without spending too much money?

Start with a small sample and then, if the test is inconclusive, repeat the experiment with a bigger sample
Calculate the sample size so the probability of detecting the difference will be satisfactory
Make the sample as big as possible to minimize the probability of inconclusive test

