Specific Comparisons (Independent Groups)

Learning Objectives

  1. Define linear combination
  2. Specify a linear combination in terms of coefficients
  3. Do a significance test for a specific comparison

Specific Comparisons (Independent Groups)

  • There are many occasions on which the comparisons among means are more complicated than simply comparing one mean with another
  • This section shows how to test these more complex comparisons.
  • The methods in this section assume that the comparison among means was decided on before looking at the data (planned comparisons)
  • A different procedure is necessary for unplanned comparisons

Self-esteem Example

  • Subjects were selected from two populations:
    • 12 from a population of high-self-esteem subjects (esteem = 1)
    • an additional 12 from a population of low-self-esteem subjects (esteem = 2)
  • Subjects then performed a task and (independent of how well they really did) half in each esteem category were told they:
    • succeeded (outcome = 1)
    • the other half were told they failed (outcome = 2)
  • Therefore there were six subjects in each esteem/outcome combination and 24 subjects altogether.

After the task, subjects were asked to rate (on a 10-point scale) how much of their outcome (success or failure) they attributed to themselves as opposed to being due to the nature of the task.

Data from Hypothetical Experiment
outcome esteem attrib
1 1 7
1 1 8
1 1 7
1 1 8
1 1 9
1 1 5
1 2 6
1 2 5
1 2 7
1 2 4
1 2 5
1 2 6
2 1 4
2 1 6
2 1 5
2 1 4
2 1 7
2 1 3
2 2 9
2 2 8
2 2 9
2 2 8
2 2 7
2 2 6


The means of the four conditions are shown in the table below.

Mean ratings of self-attributions of success or failure
Outcome   Esteem             Mean
Success   High Self Esteem   7.333
Success   Low Self Esteem    5.500
Failure   High Self Esteem   4.833
Failure   Low Self Esteem    7.833
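
As a check on the table of means, the four cell means can be recomputed from the raw data. The following is a minimal Python sketch; the names `data` and `cell_means` are illustrative and are not part of the original material.

```python
from statistics import mean

# (outcome, esteem, attrib) rows copied from the data table.
# outcome: 1 = success, 2 = failure; esteem: 1 = high, 2 = low
data = [
    (1, 1, 7), (1, 1, 8), (1, 1, 7), (1, 1, 8), (1, 1, 9), (1, 1, 5),
    (1, 2, 6), (1, 2, 5), (1, 2, 7), (1, 2, 4), (1, 2, 5), (1, 2, 6),
    (2, 1, 4), (2, 1, 6), (2, 1, 5), (2, 1, 4), (2, 1, 7), (2, 1, 3),
    (2, 2, 9), (2, 2, 8), (2, 2, 9), (2, 2, 8), (2, 2, 7), (2, 2, 6),
]

def cell_means(rows):
    """Return {(outcome, esteem): mean attribution rating}."""
    cells = {}
    for outcome, esteem, attrib in rows:
        cells.setdefault((outcome, esteem), []).append(attrib)
    return {key: mean(values) for key, values in cells.items()}

means = cell_means(data)
for (outcome, esteem), m in sorted(means.items()):
    print(outcome, esteem, round(m, 3))
```

Each printed row gives the outcome code, the esteem code, and the cell mean rounded to three decimals, in the same order as the table.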

Does positive outcome boost self-esteem?

  • Did subjects who were told they succeeded differ significantly, on average, from subjects who were told they failed?
  • The means for subjects in the success condition are 7.333 for the high-self-esteem subjects and 5.500 for the low-self-esteem subjects

The mean of all subjects in the:

  • success condition is (7.333 + 5.500)/2 = 6.417
  • failure condition is (4.833 + 7.833)/2 = 6.333
How do we do a significance test for this difference of 6.417 - 6.333 = 0.083 (computed from the unrounded means)?

Linear Combination

The first step is to express this difference in terms of a linear combination of a set of coefficients and the means. We can compute the mean of the success and failure conditions:

Msuccess = (.5)(7.333) + (.5)(5.500) = 6.42
Mfailure = (.5)(4.833) + (.5)(7.833) = 6.33

The difference between the two means can be expressed as

.5 x 7.333 + .5 x 5.500 - (.5 x 4.833 + .5 x 7.833)
= .5 x 7.333 + .5 x 5.500 - .5 x 4.833 - .5 x 7.833
  • We therefore can compute the difference between the "success" mean and the "failure" mean by multiplying each "success" mean by 0.5, each failure mean by -0.5 and adding the results
  • In the table below, the coefficient column is the multiplier and the product column is the result of the multiplication.
  • If we add up the four values in the product column we get
L = 3.667 + 2.750 - 2.417 - 3.917 = 0.083
  • This is the same value we got when we computed the difference between means previously (within rounding error)
  • We call the value "L" for "linear combination."
Coefficients for comparing the success and failure conditions
Outcome   Esteem             Mean    Coeff   Product
Success   High Self Esteem   7.333    0.5     3.667
Success   Low Self Esteem    5.500    0.5     2.750
Failure   High Self Esteem   4.833   -0.5    -2.417
Failure   Low Self Esteem    7.833   -0.5    -3.917

Now, the question is whether our value of L is significantly different from 0. The general formula for L is:

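$$L = \sum_i c_i M_i$$

where c_i is the ith coefficient and M_i is the ith mean. As shown above, L = 0.083. The formula for testing L for significance is shown below

$$t = \frac{L}{\sqrt{\dfrac{\sum_i c_i^2 \, MSE}{n}}}$$

In this example,

$$\sum_i c_i^2 = 0.5^2 + 0.5^2 + (-0.5)^2 + (-0.5)^2 = 1.00$$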

MSE is the mean of the variances. The four variances are shown in the table below. Their mean is 1.625. Therefore MSE = 1.625.

Variances of attributions of success or failure to oneself
Outcome   Esteem             Variance
Success   High Self Esteem   1.867
Success   Low Self Esteem    1.100
Failure   High Self Esteem   2.167
Failure   Low Self Esteem    1.367

The value of n is the number of subjects in each group. Here, n = 6. Putting it all together,

$$t = \frac{0.083}{\sqrt{\dfrac{(1)(1.625)}{6}}} = 0.16$$

The degrees of freedom is

df = N - k
N is the total number of subjects (24)
k is the number of groups (4)
df = 20

We find that the two-tailed probability value is 0.874

Therefore, the difference between the "success" condition and the "failure" condition is not significant.
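
The whole test can be reproduced with the Python standard library. The sketch below is illustrative only: the function names are hypothetical, and the p value is obtained by numerically integrating the t density with Simpson's rule, which is an assumption of this sketch and not the method used in the original material (in practice a statistics library would normally supply the p value).

```python
import math

def contrast_t(L, coeffs, mse, n):
    """t statistic for a linear combination of means (equal group sizes n)."""
    return L / math.sqrt(sum(c * c for c in coeffs) * mse / n)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus the area between -|t| and |t|.

    The area is approximated with Simpson's rule (steps must be even).
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # integral of the density over [0, |t|]
    return 1 - 2 * half_area

coeffs = [0.5, 0.5, -0.5, -0.5]
L = 1 / 12     # unrounded L: 0.5 * (44 + 33 - 29 - 47) / 6, i.e. 0.083
mse = 1.625    # mean of the four cell variances
n = 6          # subjects per group
df = 24 - 4    # N - k

t = contrast_t(L, coeffs, mse, n)
p = two_tailed_p(t, df)
print(round(t, 2), df, round(p, 3))
```

With these inputs the sketch gives t of about 0.16 on 20 degrees of freedom and a two-tailed p of about 0.874, in line with the values reported above.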

Does the effect of outcome (success or failure) differ depending on the self esteem of the subject?

  • Does the effect of outcome (success or failure) differ depending on the self esteem of the subject?
  • For example, success may make high-self-esteem subjects more likely to attribute the outcome to themselves whereas success may make low-self-esteem subjects less likely to attribute the outcome to themselves.
  • First, we have to test a difference between differences
  • Specifically, is the difference between success and failure outcomes for the high-self-esteem subjects different from the difference between success and failure outcomes for the low-self-esteem subjects?
  • The means shown in the table below show that this is the case
  • For the high-self-esteem subjects, the difference between success and failure is 7.333 - 4.833 = 2.5
  • For low-self-esteem subjects, the difference is 5.500 - 7.833 = -2.333
  • The difference between differences is 2.5 - (-2.333) = 4.833.
Coefficients for testing differences between differences
Self Esteem   Outcome   Mean    Coeff   Product
High          Success   7.333    1       7.333
High          Failure   4.833   -1      -4.833
Low           Success   5.500   -1      -5.500
Low           Failure   7.833    1       7.833

To continue the calculations,

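$$L = (1)(7.333) + (-1)(4.833) + (-1)(5.500) + (1)(7.833) = 4.833$$
$$\sum_i c_i^2 = 1^2 + (-1)^2 + (-1)^2 + 1^2 = 4$$
$$t = \frac{4.833}{\sqrt{\dfrac{(4)(1.625)}{6}}} = 4.64$$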
  • The two-tailed p value is 0.0002
  • Therefore, the difference between differences is highly significant.
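
The arithmetic for this second comparison can be checked in a few lines. This is a minimal Python sketch with illustrative variable names; it uses the unrounded cell means (sum of ratings divided by 6) and takes MSE = 1.625 and n = 6 from the text.

```python
import math

# Unrounded cell means, in table order:
# high/success, high/failure, low/success, low/failure
means = [44 / 6, 29 / 6, 33 / 6, 47 / 6]
coefficients = [1, -1, -1, 1]  # difference between differences
mse = 1.625                    # mean of the four cell variances
n = 6                          # subjects per group

L = sum(c * m for c, m in zip(coefficients, means))
sum_sq = sum(c * c for c in coefficients)
t = L / math.sqrt(sum_sq * mse / n)
print(round(L, 3), sum_sq, round(t, 2))
```

The sketch reproduces L = 4.833 and a sum of squared coefficients of 4, and gives t of about 4.64 on the same 20 degrees of freedom as before.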

Interaction

  • In the Analysis of Variance section, you will see that comparisons such as this are testing what is called an interaction
  • In general, there is an interaction when the effect of one variable differs as a function of the level of another variable
  • In this example the effect of the outcome variable is different depending on the subject's self esteem
  • For the high-self-esteem subjects, success led to more self attributions than did failure; for the low-self-esteem subjects, success led to fewer self attributions than did failure

Multiple Comparisons

  • The more comparisons you make, the greater your chance of a Type I error
  • It is useful to distinguish between two error rates:
    1. the per-comparison error rate and
    2. the familywise error rate.

Per Comparison Error Rate

  • The per-comparison error rate is the probability of a Type I error for a particular comparison.
  • The familywise error rate is the probability of making one or more Type I errors in a family or set of comparisons.

In the attribution experiment discussed above, we computed two comparisons. If we use the 0.05 level for each comparison, then the per-comparison rate is simply 0.05.

Bonferroni inequality

  • The familywise error rate can be complex to compute
  • There is a simple approximation that is fairly accurate when the number of comparisons is small

Let us define:

  • α as the per-comparison error rate
  • c as the number of comparisons

The following inequality always holds for the familywise error rate (FW):

FW ≤ cα
  • This inequality is called the Bonferroni inequality.
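
To see how tight the bound is, it can be compared with the exact familywise rate in the special case where the comparisons are independent, which is 1 - (1 - α)^c. The independence assumption and the function names below are assumptions of this sketch, not part of the original material; the Bonferroni inequality itself holds without that assumption.

```python
def familywise_rate_independent(alpha, c):
    """Exact familywise rate when the c comparisons are independent."""
    return 1 - (1 - alpha) ** c

def bonferroni_bound(alpha, c):
    """Upper bound on the familywise rate from the Bonferroni inequality."""
    return c * alpha

for c in (2, 4, 10):
    exact = familywise_rate_independent(0.05, c)
    bound = bonferroni_bound(0.05, c)
    print(c, round(exact, 4), round(bound, 4))
```

For two comparisons at the 0.05 level the exact rate under independence is 0.0975 against a bound of 0.10; the gap widens as the number of comparisons grows.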

Bonferroni correction

  • The Bonferroni inequality can be used to control the familywise error rate as follows:
If you want the familywise error rate to be no greater than α, you use α/c as the per-comparison error rate
  • This correction, called the Bonferroni correction, will generally result in a familywise error rate less than α
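
The correction is a single division. The Python sketch below uses hypothetical function names and, for the second function only, assumes independent comparisons in order to show that the resulting familywise rate falls slightly below the target.

```python
def bonferroni_alpha(familywise_alpha, c):
    """Per-comparison rate that keeps the familywise rate at or below the target."""
    if c < 1:
        raise ValueError("c must be at least 1")
    return familywise_alpha / c

def resulting_familywise_rate(per_comparison_alpha, c):
    """Familywise rate actually obtained, assuming independent comparisons."""
    return 1 - (1 - per_comparison_alpha) ** c

target = 0.05
for c in (2, 6):
    per_comparison = bonferroni_alpha(target, c)
    achieved = resulting_familywise_rate(per_comparison, c)
    print(c, round(per_comparison, 5), round(achieved, 5))
```

With two comparisons the per-comparison rate becomes 0.025 and the familywise rate under independence is 0.049375, just under the 0.05 target.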


Should the familywise error rate be controlled?

  • Unfortunately, there is no clear-cut answer to this question.
  • The disadvantage of controlling the familywise error rate is that it makes it more difficult to obtain a significant result for any given comparison: The more comparisons you do, the lower the per-comparison rate must be and therefore the harder it is to reach significance.
  • That is, the power is lower when you control the familywise error rate.
  • The advantage is that you have a lower chance of making a Type I error.

Family of comparisons

  • One consideration is the definition of a family of comparisons
  • Let's say you conducted a study in which you were interested in whether there was a difference between male and female babies in the age at which they started crawling
  • After you finished analyzing the data, a colleague of yours had a totally different research question: Do babies who are born in the winter differ from those born in the summer in the age they start crawling?
  • Should the familywise rate be controlled or should it be allowed to be greater than 0.05?
  • Our view is that there is no reason you should be penalized (by lower power) just because your colleague used the same data to address a different research question.
  • Therefore, the familywise error rate need not be controlled
  • Consider the two comparisons done on the attribution example at the beginning of this section:
These comparisons are testing completely different hypotheses. Therefore, controlling the familywise rate is not necessary.

Now consider a study designed to investigate the relationship between various variables and the ability of subjects to predict the outcome of a coin flip.

  1. One comparison is between males and females;
  2. Second comparison is between those over 40 and those under 40
  3. Third is between vegetarians and non-vegetarians
  4. Fourth is between firstborns and others

The question of whether these four comparisons are testing different hypotheses depends on your point of view.

On the one hand, there is nothing about whether age makes a difference that is related to whether diet makes a difference

In that sense, the comparisons are addressing different hypotheses

On the other hand, the whole series of comparisons could be seen as addressing the general question of whether anything affects the ability to predict the outcome of a coin flip

If nothing does, then allowing the familywise rate to be high means that there is a high probability of reaching the wrong conclusion.

Orthogonal Comparisons

  • In the preceding sections, we talked about comparisons being independent
  • Independent comparisons are often called orthogonal comparisons
  • There is a simple test to determine whether two comparisons are orthogonal:
If the sum of the products of the coefficients is 0, then the comparisons are orthogonal
  • Consider again the experiment on the attribution of success or failure.
  • The table below shows the coefficients presented earlier in this section, together with their products
  • Note that the sum of the numbers in the Product column is 0
  • Therefore, the two comparisons are orthogonal.
Outcome   Esteem             C1     C2   Product (C1 * C2)
Success   High Self Esteem    0.5    1    0.5
Success   Low Self Esteem     0.5   -1   -0.5
Failure   High Self Esteem   -0.5   -1    0.5
Failure   Low Self Esteem    -0.5    1   -0.5


Non-orthogonal comparisons

  • The table below shows two comparisons that are not orthogonal
  • The first compares the high-self-esteem subjects to the low-self-esteem subjects; the second considers only those in the success group and compares high-self-esteem subjects to low-self-esteem subjects
  • The failure group is ignored by using 0's as coefficients
  • Comparison of these two groups of subjects for the whole sample is not independent of the comparison of them for the success group
  • You can see that the sum of the products of the coefficients is 0.5 and not 0.
Coefficients for two non-orthogonal comparisons
Outcome   Esteem             C1     C2     Product (C1 * C2)
Success   High Self Esteem    0.5    0.5   0.25
Success   Low Self Esteem    -0.5   -0.5   0.25
Failure   High Self Esteem    0.5    0.0   0.0
Failure   Low Self Esteem    -0.5    0.0   0.0
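
The orthogonality test applied in the two tables of coefficients can be expressed as a short check. This is a minimal Python sketch; the function and variable names are illustrative.

```python
def sum_of_products(c1, c2):
    """Sum of the element-wise products of two coefficient sets."""
    if len(c1) != len(c2):
        raise ValueError("coefficient sets must have the same length")
    return sum(a * b for a, b in zip(c1, c2))

def are_orthogonal(c1, c2, tol=1e-12):
    """Two comparisons are orthogonal when the sum of products is 0."""
    return abs(sum_of_products(c1, c2)) < tol

# Rows are in table order: success/high, success/low, failure/high, failure/low
outcome = [0.5, 0.5, -0.5, -0.5]             # success versus failure
interaction = [1, -1, -1, 1]                 # difference between differences
esteem_all = [0.5, -0.5, 0.5, -0.5]          # high versus low, all subjects
esteem_success_only = [0.5, -0.5, 0.0, 0.0]  # high versus low, success group only

print(sum_of_products(outcome, interaction), are_orthogonal(outcome, interaction))
print(sum_of_products(esteem_all, esteem_success_only),
      are_orthogonal(esteem_all, esteem_success_only))
```

The first pair sums to 0 and is therefore orthogonal; the second pair sums to 0.5 and is not.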

Questions

1 Bonferroni adjustments are necessary when making multiple comparisons to avoid inflating the Type I error rate.

True
False

Answer >>

True. The Bonferroni adjustment is designed to control the Type I error rate by reducing the critical p value for all comparisons.


2 You plan to test all pairwise comparisons among 4 means. What is the new critical p value after a Bonferroni adjustment to maintain an experiment-wise alpha of 0.05?

Answer >>

There are 6 pairwise comparisons among 4 means, so the critical p value is 0.05/6 = 0.00833.


3 What coefficients would be used to compare Group 1 to the average of Groups 2-4?

1, 1, 1, 1
1, -.333, -.333, -.333
1, -.5, -.5, -.5

Answer >>

1, -.333, -.333, -.333 because the -.333's result in the average of Groups 2-4.


4 In an experiment with three conditions, the means are 2, 4, and 9. What would be the value of L for the coefficients 2, -1, -1?

Answer >>

(2)(2) + (-1)(4) + (-1)(9) = -9.


5 Are the following two sets of coefficients orthogonal?
-1, 0, 0, 1
-1, 1, 0, 0

Yes
No

Answer >>

No. The sum of the products of the coefficients must be 0 for them to be orthogonal. For these coefficients (-1)(-1) + (0)(1) + (0)(0) + (1)(0) = 1.


6 In an experiment with three conditions and 5 subjects per condition, the means are 2, 4, and 9 and the MSE is 24. Find t for the coefficients 2, -1, -1.

Answer >>

L = -9, the sum of the squared coefficients is 6, n is 5, MSE is 24, and t = -9/sqrt((6)(24)/5) = -1.677.