Introduction to Hypothesis Testing

title: Hypothesis Testing
author: Bernard Szlachta (NobleProg Ltd)

Prerequisites

Causation, Binomial Distribution

Tool Description

Names: Hypothesis Testing
Usages: Checking whether an observed difference could plausibly be due to chance
Examples: Is the new version of the software better than the previous one?; Do women watch YouTube more often than men?; Does a blue background make people less tired than a red one?

Questions。

How can we distinguish between two things?
What is the probability that the conclusion is not due to pure chance?
What is the difference between:
- Probability of an event
- Probability of a state of the world
How do we define the "null hypothesis" and the "alternative hypothesis"?

Lady Tasting Tea。

Ronald Fisher explained the concept of hypothesis testing with a story of a lady tasting tea.

The lady in question claimed to be able to tell whether the tea or the milk was added first to a cup.
Fisher gave her eight cups, four of each variety, in random order.
The woman got all eight cups correct.
What is the probability that she got all eight cups right by pure chance?

Answer。
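One way to work this out, as a minimal sketch: assume the lady knows that exactly four cups of each kind were prepared, so a pure guess amounts to picking four cups out of eight at random. Only one of those selections is fully correct. The function name `chance_all_correct` is illustrative and not part of the original material.

```python
from math import comb

def chance_all_correct(total_cups: int = 8, milk_first: int = 4) -> float:
    """Probability of labelling every cup correctly by pure guessing.

    Assumes the taster knows how many cups are milk-first, so a guess is
    a random choice of `milk_first` cups out of `total_cups`.
    """
    # Exactly one of the equally likely selections matches the true arrangement.
    return 1 / comb(total_cups, milk_first)

p = chance_all_correct()
print(f"{p:.4f}")  # 1/70, about 0.0143
```

Under these assumptions a perfect score happens by luck only about 1 time in 70.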


James Bond Example。

Problem

James Bond insists that Martinis should be shaken rather than stirred
We want to determine whether Mr. Bond can tell the difference between a shaken and a stirred Martini

Experiment

Suppose we gave Mr. Bond a series of 16 taste tests
In each test, we flipped a fair coin to determine whether to stir or shake the Martini
Then we presented the martini to Mr. Bond and asked him to decide whether it was shaken or stirred

Results

Let's say Mr. Bond was correct on 13 of the 16 taste tests
Can he tell the difference?

Interpretation

This result does not prove that he can!
It could be he was just lucky and guessed right 13 out of 16 times
How plausible is the explanation that he was just lucky?

Answer。
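A minimal sketch of the calculation, assuming that a guessing Mr. Bond is right on each test independently with probability 0.5, so the number of correct answers follows a binomial distribution. The helper name `binomial_tail` is an illustrative choice.

```python
from math import comb

def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Chance of 13 or more correct answers out of 16 when only guessing.
p_value = binomial_tail(13, 16)
print(f"{p_value:.4f}")  # 0.0106
```

Luck alone would produce a result this good only about once in a hundred such experiments, so the "just lucky" explanation is not very plausible.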


Physicians' Reactions。

Problem

Do physicians spend less time with obese patients?

Experiment

Physicians were sampled randomly and each was shown a chart of a patient complaining of a migraine headache
They were then asked to estimate how long they would spend with the patient
The charts were identical except that for half the charts, the patient was obese and for the other half, the patient was of normal weight
The chart a particular physician viewed was determined randomly
31 physicians viewed charts of average-weight patients and 38 physicians viewed charts of obese patients

Results

The reported mean time spent with patients:
- obese: 24.7 min
- average-weight: 31.4 min
How might this difference between means have occurred?

Interpretation。


The Probability Value。

The probability value is also known as "P", "P-value" or "p"
In the James Bond example, the computed probability of 0.0106 is the probability that he would be correct on 13 or more taste tests (out of 16) if he were just guessing (i.e. by pure chance)
The 0.0106 is NOT the probability that he cannot tell the difference
The probability of 0.0106 is the probability of a certain outcome (13 or more out of 16) assuming a certain state of the world (James Bond was only guessing)
It is not the probability that a state of the world is true
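The idea of "the probability of an outcome, assuming a state of the world" can also be illustrated by simulation. The sketch below fixes the state of the world (Bond is only guessing), repeats the 16-test experiment many times, and counts how often the outcome "13 or more correct" turns up. The function name, the number of runs and the seed are illustrative assumptions.

```python
import random

def simulate_p_value(observed: int = 13, trials: int = 16,
                     runs: int = 100_000, seed: int = 0) -> float:
    """Estimate P(correct >= observed) when every answer is a coin-flip guess."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(runs):
        # One simulated experiment: each taste test is right with probability 0.5.
        correct = sum(rng.random() < 0.5 for _ in range(trials))
        if correct >= observed:
            hits += 1
    return hits / runs

estimate = simulate_p_value()
print(estimate)  # should land close to 0.0106
```

Nothing in the simulation says how likely it is that Bond really is guessing; that state of the world is simply assumed from the first line to the last.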

An example - a bird which knows how to divide。

An animal trainer claims that a trained bird can determine whether or not numbers are evenly divisible by 7
In an experiment assessing this claim, the bird is given a series of 16 test trials
On each trial, a number is displayed on a screen and the bird pecks at one of two keys to indicate its choice
The numbers are chosen in such a way that the probability of any number being evenly divisible by 7 is 0.50
The bird is correct on 9/16 choices

Answer。
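A minimal sketch of the calculation, assuming that a bird with no ability picks the right key on each trial with probability 0.5. Exact fractions are used so the result is not affected by rounding; the helper name `guessing_tail` is illustrative.

```python
from fractions import Fraction
from math import comb

def guessing_tail(correct: int, trials: int) -> Fraction:
    """Exact P(X >= correct) when each trial is a 50/50 guess."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return Fraction(favourable, 2**trials)

# Chance of 9 or more correct choices out of 16 by guessing alone.
p_value = guessing_tail(9, 16)
print(p_value, f"{float(p_value):.2f}")  # 26333/65536 0.40
```

A result at least this good would occur about 40% of the time by guessing, so the data give no real evidence that the bird can tell the difference. They do not show that the bird cannot do it either: the value is not the probability that the bird is guessing.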


State of the world vs an outcome。

Hypotheses are the possible states of the world
The probability value is the probability of an outcome given the hypothesis
It is not the probability of the hypothesis given the outcome

If the probability of the outcome given the hypothesis is sufficiently low, we have evidence that the hypothesis is false
However, we do not compute the probability that the hypothesis is false
In the James Bond example, the hypothesis is that he cannot tell the difference between shaken and stirred martinis
The probability value is low (0.0106), thus providing evidence that he can tell the difference
However, we have not computed the probability that he can tell the difference
A branch of statistics called Bayesian statistics provides methods for computing the probabilities of hypotheses
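To make the contrast concrete, here is a rough sketch of what a Bayesian calculation could look like for the James Bond example. It needs inputs that the experiment itself does not supply, and every one of them is an assumption made up for illustration: only two states of the world are considered, each gets a prior probability of 0.5, and a Bond who can tell the difference is assumed to be right 75% of the time.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def posterior_guessing(correct: int = 13, trials: int = 16,
                       prior_guessing: float = 0.5,
                       skilled_rate: float = 0.75) -> float:
    """P(guessing | data) via Bayes' rule with two candidate hypotheses.

    `prior_guessing` and `skilled_rate` are hypothetical inputs, not values
    taken from the experiment.
    """
    like_guess = binomial_pmf(correct, trials, 0.5)
    like_skill = binomial_pmf(correct, trials, skilled_rate)
    evidence = prior_guessing * like_guess + (1 - prior_guessing) * like_skill
    return prior_guessing * like_guess / evidence

print(posterior_guessing())                    # depends on the assumed inputs
print(posterior_guessing(prior_guessing=0.9))  # a more sceptical prior gives a larger value
```

The answer moves when the assumed prior or the assumed success rate changes, which is why this quantity is a different thing from the probability value of 0.0106.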

The Null Hypothesis。

The null hypothesis is that an apparent effect is due to chance

In the Physicians' Reactions example, the null hypothesis is that in the population of physicians, the mean time expected to be spent with obese patients is equal to the mean time expected to be spent with average-weight patients:

H0: μobese = μaverage
or
H0: μobese - μaverage = 0.

In a correlational study of the relationship between high-school grades and college grades, the null hypothesis would be that the population correlation is 0:

H0: ρ = 0

The test for a biased coin:

H0: π = 0.5

The null hypothesis is typically the opposite of the researcher's hypothesis
- the physicians were expected to spend less time with obese patients, but the null hypothesis is that they do not
- if the null hypothesis were true, a difference as large as or larger than the sample difference of 6.7 minutes would be very unlikely to occur
- therefore, the researchers rejected the null hypothesis of no difference and concluded that in the population, physicians intend to spend less time with obese patients
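One way to see how "a difference as large or larger" can be judged against chance is a permutation test: if the null hypothesis holds, the group labels are arbitrary, so shuffling them shows which differences chance alone produces. This is a swapped-in illustration, not necessarily the method the researchers used. The raw times from the study are not given here, so the two lists below are made-up numbers that only demonstrate the mechanics; `permutation_p_value` is a hypothetical helper.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, runs: int = 10_000, seed: int = 0) -> float:
    """Estimate P(mean(b) - mean(a) >= observed difference) under shuffled labels."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(runs):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed:
            hits += 1
    return hits / runs

# Made-up illustrative minutes, NOT the data from the physicians study.
obese = [20, 25, 22, 30, 24, 27]
average_weight = [30, 35, 28, 33, 31, 29]
print(permutation_p_value(obese, average_weight))
```

A small result means that random relabelling rarely reproduces a gap as large as the observed one, which is the kind of evidence that leads to rejecting the null hypothesis.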

The alternative hypothesis。

If the null hypothesis is rejected, then the alternative hypothesis is accepted
The alternative hypothesis is simply the reverse of the null hypothesis

If the null hypothesis
H0: μobese = μaverage
is rejected, then there are two alternatives:
H1: μobese < μaverage
or
H1: μobese > μaverage

The direction of the sample means determines which alternative is adopted

Questions。

Please do the questions on the website (not in presentation mode)

Exercise 1。

Judy says that she can persuade the enquiring customer to buy a course.
The probability that the customer would buy without Judy persuading them is 70%
Statistics show that Judy successfully converted 85 out of 100 enquiries
Is Judy a good saleswoman, or did she just get lucky?

Exercise 2。

You want to check whether the new design of your website increases the booking rate
You conduct an A/B test: with design A, 2% of the 1000 people visiting the site bought the course; with design B, 2.5% of 1000 visitors bought the course
Is the new design B better?
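One possible way to approach this exercise is a two-proportion z-test, sketched below. It is an approximation, not an exact calculation: it assumes visitors act independently and that the normal approximation to the binomial is adequate at these sample sizes. The null hypothesis is that both designs have the same booking rate, and the alternative is that B's rate is higher. The function name is illustrative.

```python
from math import erf, sqrt

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """One-sided p-value for H0: rate_b == rate_a against H1: rate_b > rate_a."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    # Under H0 both groups share one rate, estimated from the pooled data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Upper tail of the standard normal distribution, 1 - Phi(z).
    return 0.5 * (1 - erf(z / sqrt(2)))

# 2% of 1000 is 20 bookings for design A; 2.5% of 1000 is 25 bookings for design B.
print(two_proportion_p_value(20, 1000, 25, 1000))
```

The resulting value is a little over 0.2, so a gap of this size would be fairly common even if the two designs converted equally well. On this sample alone the test does not show that design B is better.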

Questions


	he would get 80% correct if he took the test again.
	he would get this score or better if he were just guessing.
	he was guessing blindly on the test.

	Mean of the 1st graders < Mean of the 2nd graders.
	Mean of the 1st graders > Mean of the 2nd graders.
	Mean of the 1st graders = Mean of the 2nd graders.

	There is no difference! Both must be equally popular with absolute certainty!
	There is a very small probability (6 in 100) that there is a difference
	If there is no difference in reality, it is quite likely (6 in 100) to get this difference by pure chance with this sample size. The test is inconclusive.

	Women watch YouTube more
	More men than women watch YouTube
	We cannot tell