Statistics for Decision Makers - 06.03 - Research Design - Sampling Bias

title
06.03 - Research Design - Sampling Bias
author
Bernard Szlachta (NobleProg Ltd) bs@nobleprog.co.uk

Introduction。

  • Sampling bias is a property of the sampling method, not of any particular sample.
  • Even a random sampling method can, by chance, produce a sample that is not representative of the population.
  • Conversely, not every sample obtained with a biased sampling method will be markedly unrepresentative of the population.

Types of Sampling Bias。

  • Self-Selection Bias
  • Undercoverage Bias
  • Survivorship Bias

Self-Selection Bias。

  • People who "self-select" for the experiment are likely to differ in important ways from the population the experimenter wishes to draw conclusions about.
Typical examples
  • "Non-scientific" polls taken on television.
  • Website polls.


Example 1
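  • A university newspaper ran an ad asking for students to volunteer for a study in which intimate details of their sex lives would be discussed. Students who volunteer for such a study are likely to differ from the student population as a whole.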
Example 2
  • An online survey about computer use is likely to attract people more interested in technology than is typical.
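
The effect in Example 2 can be sketched with a small simulation. Everything here is an assumption made for illustration: the "interest in technology" score, the 0 to 10 scale, and the rule that the chance of responding grows with interest. It is not data from a real survey.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: one "interest in technology" score (0-10) per person.
population = [rng.uniform(0, 10) for _ in range(100_000)]


def responds(interest):
    """Assumed response rule: the chance of answering grows with interest (max 20%)."""
    return rng.random() < interest / 10 * 0.2


# Self-selected sample: everyone who chose to answer the online survey.
self_selected = [x for x in population if responds(x)]

# Random sample of the same size, drawn without regard to interest.
random_sample = rng.sample(population, len(self_selected))

print("population mean:   ", round(statistics.mean(population), 2))
print("random sample mean:", round(statistics.mean(random_sample), 2))
print("self-selected mean:", round(statistics.mean(self_selected), 2))
```

Under these assumptions the random sample should land close to the population mean, while the self-selected sample should overstate the typical interest in technology.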

Undercoverage Bias。

Undercoverage bias occurs when too few observations are sampled from a segment of the population.


Example
  • The poll taken by the Literary Digest in 1936 indicated that Landon would win the presidential election against Roosevelt by a large margin. In fact, it was Roosevelt who won by a large margin.
  • A common explanation is that poorer people were undercovered because they were less likely to have telephones and that this group was more likely to support Roosevelt.
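
A minimal sketch of how this kind of undercoverage can flip a poll result. All numbers are hypothetical (the share of phone owners and the support rates in each group are assumptions, not the 1936 figures), and the candidate is simply called "A".

```python
import random

rng = random.Random(1936)  # fixed seed so the sketch is repeatable


def make_voter():
    """Return (has_phone, supports_a) for one hypothetical voter."""
    has_phone = rng.random() < 0.40           # assumption: 40% own a telephone
    p_support = 0.40 if has_phone else 0.70   # assumption: non-owners favour A more
    return has_phone, rng.random() < p_support


population = [make_voter() for _ in range(200_000)]

# True share of support for candidate A in the whole population.
true_share = sum(supports for _, supports in population) / len(population)

# Sampling frame that covers only telephone owners.
phone_frame = [supports for has_phone, supports in population if has_phone]
phone_poll = rng.sample(phone_frame, 2_000)
poll_share = sum(phone_poll) / len(phone_poll)

print(f"true support for A:  {true_share:.1%}")
print(f"telephone-only poll: {poll_share:.1%}")
```

The poll is a proper random sample, but of the wrong frame: people without telephones have no chance of being selected, so the poll estimates support among telephone owners rather than among all voters.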

Survivorship Bias。

Survivorship bias occurs when the observations recorded at the end of the investigation are a non-random set of those present at the beginning of the investigation.

Example 1
  • In finance, survivorship bias is the tendency for failed companies to be excluded from performance studies because they no longer exist.
  • It often causes the results of studies to skew higher because only companies which were successful enough to survive until the end of the period are included.
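
The upward skew can be sketched with a toy simulation. The return distribution, the ten-year horizon, and the rule that a company "fails" once its value drops below half of its starting value are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable


def simulate_company(years=10):
    """Return (final_value, survived) for one hypothetical company starting at 1.0."""
    value = 1.0
    for _ in range(years):
        value *= 1 + rng.gauss(0.03, 0.25)  # assumed yearly return: mean 3%, sd 25%
        if value < 0.5:                     # assumed failure rule
            return value, False             # value is frozen at the point of failure
    return value, True


companies = [simulate_company() for _ in range(20_000)]

all_values = [value for value, _ in companies]
survivor_values = [value for value, survived in companies if survived]

print("companies that survived:", len(survivor_values), "of", len(companies))
print("mean final value, all companies: ", round(statistics.mean(all_values), 2))
print("mean final value, survivors only:", round(statistics.mean(survivor_values), 2))
```

A study that only sees the companies still in existence at the end of the period is computing the second mean, which in this sketch is always at least as high as the first.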

Survivorship Bias。

Example 2
  • In World War II, the statistician Abraham Wald analyzed the distribution of hits from anti-aircraft fire on aircraft returning from missions.
  • The question: where should extra armour be placed?
  • The obvious answer is at the locations that were hit most often, to reduce the damage there.
  • However, this ignores survivorship bias: only a subset of the aircraft returned, and only those could be examined.
  • If returning planes showed few hits in a certain location, then hits in that location were likely to bring a plane down.
  • Conclusion: the locations without hits on the returning planes should be given extra armour.
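
The reasoning can be illustrated with a toy simulation. The section names and the chance that a hit in each section brings the plane down are hypothetical values, and each plane is assumed to take exactly one hit, equally likely in any section. This is a sketch of the argument, not a reconstruction of Wald's analysis.

```python
import random
from collections import Counter

rng = random.Random(1943)  # fixed seed so the sketch is repeatable

# Assumed probability that a single hit in this section downs the plane.
LETHALITY = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.1, "wings": 0.1}
sections = list(LETHALITY)

hits_all = Counter()       # hits on every plane (never observed in practice)
hits_returned = Counter()  # hits seen on planes that made it back

for _ in range(50_000):
    section = rng.choice(sections)  # hits are spread evenly over the sections
    hits_all[section] += 1
    if rng.random() >= LETHALITY[section]:  # the plane survives the hit
        hits_returned[section] += 1

for section in sections:
    print(f"{section:9s} all planes: {hits_all[section]:6d}   "
          f"returned planes: {hits_returned[section]:6d}")
```

Although every section is hit about equally often, the returned planes show the fewest hits in the most vulnerable section, which is exactly where the extra armour is needed.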

Survivorship Bias。

Example 3
  • In online surveys, a long registration process can discourage some people from finishing the survey.
  • The length of the survey is another factor. People who can spend a couple of hours on a survey are usually not very representative of the population.
  • The ratio of people who finish the process to people who start it may be a good indicator of how severe the problem is.
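
A minimal sketch of the completion-rate indicator from the last point. The function name `completion_rate` and the counts at each step are hypothetical.

```python
def completion_rate(started: int, finished: int) -> float:
    """Share of people who started the process and also finished it."""
    if started <= 0:
        raise ValueError("started must be positive")
    if not 0 <= finished <= started:
        raise ValueError("finished must be between 0 and started")
    return finished / started


# Hypothetical counts at each step of an online survey.
funnel = {
    "opened survey": 1000,
    "completed registration": 620,
    "finished survey": 310,
}

started = funnel["opened survey"]
for step, count in funnel.items():
    print(f"{step}: {completion_rate(started, count):.0%} of those who started")
```

A low rate does not prove that the finishers are unrepresentative, but it signals that a large part of the original sample is missing from the results.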

Survivorship Bias - Solution。

  • What would be the solution when some participants take the pretest but drop out before the posttest?
  • Remove the pretest scores of the participants who did not make it to the posttest?
  • Any other solutions?
    • Compute the mean pretest score of the participants who did not make it to the posttest.
    • Test whether that mean differs substantially from the mean of the remaining pretest scores.
    • If it doesn't, you are safe: the drop-out scores do not matter.
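
The drop-out check can be sketched as follows. The scores are made up, and the choice of Welch's t statistic with a normal approximation for the p-value is an assumption of this sketch (the material does not name a specific test). With samples this small the approximation is rough, so treat the p-value as indicative only.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for the difference between the means of a and b."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Hypothetical pretest satisfaction scores (scale 1-10).
completers = [7, 8, 6, 7, 9, 5, 8, 7, 6, 8, 7, 9]  # also took the posttest
dropouts = [7, 6, 8, 7, 5, 8, 7, 6]                # pretest only

t = welch_t(dropouts, completers)

# Two-sided p-value using a normal approximation to the t distribution.
p_approx = 2 * (1 - statistics.NormalDist().cdf(abs(t)))

print("mean pretest, completers:", round(statistics.mean(completers), 2))
print("mean pretest, dropouts:  ", round(statistics.mean(dropouts), 2))
print("Welch t:", round(t, 2), " approximate p-value:", round(p_approx, 2))
```

If the difference is not significant, there is no evidence that the people who dropped out started from a different level, and comparing pretest and posttest means for the remaining participants is easier to defend.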

Quiz。

1 A researcher conducts a survey by randomly calling landline phones. People who only have cell phones are not sampled. This is an example of

self-selection bias.
undercoverage bias.
survivorship bias.

Answer:

undercoverage bias

This is undercoverage bias since those with only cell phones are not only undercovered but not covered at all.


2 A radio station asks listeners to phone in their choice in a daily poll. This is an example of

self-selection bias.
undercoverage bias.
survivorship bias.

Answer:

self-selection bias

This is self-selection bias since those with strong feelings are most likely to respond.


3 A researcher surveys people who have been in therapy for 5 years with the same psychotherapist. This is an example of

self-selection bias.
undercoverage bias.
survivorship bias.

Answer:

survivorship bias

Those who stay for 5 years may be more satisfied with their therapist than average. They may also have more severe problems if they stay in therapy so long.


4 You want to test whether calling a customer a week after the purchase increases customer satisfaction with software.
First, you run a pretest of satisfaction just after the purchase. Then, a week after the call, you run a posttest.
Not all customers responded to the posttest. What can you do to remove the effect of the bias?

Take into account only the customers who took the posttest.
Nothing can be done, all pretest respondents must perform the posttest to make the research valid.
If the mean pretest of the people who did not take the posttest is not significantly different from the overall pretest mean, you can compare the posttest and pretest means.