<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://training-course-material.com/index.php?action=history&amp;feed=atom&amp;title=R_-_Testing_Means</id>
	<title>R - Testing Means - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://training-course-material.com/index.php?action=history&amp;feed=atom&amp;title=R_-_Testing_Means"/>
	<link rel="alternate" type="text/html" href="https://training-course-material.com/index.php?title=R_-_Testing_Means&amp;action=history"/>
	<updated>2026-05-14T01:20:25Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://training-course-material.com/index.php?title=R_-_Testing_Means&amp;diff=52421&amp;oldid=prev</id>
		<title>Bernard Szlachta: /* Exercise */</title>
		<link rel="alternate" type="text/html" href="https://training-course-material.com/index.php?title=R_-_Testing_Means&amp;diff=52421&amp;oldid=prev"/>
		<updated>2017-02-15T08:09:34Z</updated>

		<summary type="html">&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Exercise&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;[[Category:Intro to R|080]]&lt;br /&gt;
&lt;br /&gt;
== Binomial distribution ==&lt;br /&gt;
[[Introduction_to_Hypothesis_Testing#James_Bond_Example]]&lt;br /&gt;
 &amp;gt; pbinom(12,prob=0.5,lower.tail=F,size=16)&lt;br /&gt;
 [1] 0.01063538&lt;br /&gt;
&lt;br /&gt;
or &lt;br /&gt;
&lt;br /&gt;
 &amp;gt; binom.test(13,n=16,p=0.5,alternative=&amp;quot;greater&amp;quot;)&lt;br /&gt;
 &lt;br /&gt;
 	Exact binomial test&lt;br /&gt;
 &lt;br /&gt;
 data:  13 and 16 &lt;br /&gt;
 number of successes = 13, number of trials = 16, p-value =&lt;br /&gt;
 0.01064&lt;br /&gt;
 alternative hypothesis: true probability of success is greater than 0.5 &lt;br /&gt;
 95 percent confidence interval:&lt;br /&gt;
  0.5834277 1.0000000 &lt;br /&gt;
 sample estimates:&lt;br /&gt;
 probability of success &lt;br /&gt;
                0.8125&lt;br /&gt;
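&lt;br /&gt;
Both results are the upper tail of a Binomial(16, 0.5) distribution, i.e. the probability of 13 or more successes. As a minimal sketch (not part of the original session; the name p.upper is illustrative), the same number can be obtained by summing the individual probabilities:&lt;br /&gt;
 # P(X &amp;gt;= 13) for X ~ Binomial(size=16, prob=0.5)&lt;br /&gt;
 p.upper &amp;lt;- sum(dbinom(13:16,size=16,prob=0.5))&lt;br /&gt;
 # pbinom(12, ..., lower.tail=FALSE) gives P(X &amp;gt; 12), which is the same event&lt;br /&gt;
 all.equal(p.upper, pbinom(12,size=16,prob=0.5,lower.tail=FALSE))&lt;br /&gt;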
&lt;br /&gt;
== Difference between means - independent samples ==&lt;br /&gt;
&amp;quot;Do the population means for urban and rural residents differ on a test of energy use?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
 # Create a null hypothesis for a one-tailed and for a two-tailed test&lt;br /&gt;
 # Interpret the result&lt;br /&gt;
Load the data:&lt;br /&gt;
 &amp;gt; e &amp;lt;- read.table(&amp;quot;http://training-course-material.com/images/e/e4/Energy_use.txt&amp;quot;,header=T);&lt;br /&gt;
Check variances:&lt;br /&gt;
 &amp;gt; sapply(e,var)&lt;br /&gt;
   Urban   Rural &lt;br /&gt;
 2915935 1859019 &lt;br /&gt;
&lt;br /&gt;
Or nicely formatted:&lt;br /&gt;
 &amp;gt; format(sapply(e,var),big.mark = &amp;quot;,&amp;quot;)&lt;br /&gt;
      Urban       Rural &lt;br /&gt;
 &amp;quot;2,915,935&amp;quot; &amp;quot;1,859,019&amp;quot; &lt;br /&gt;
&lt;br /&gt;
That is quite a big difference, so let us test whether we can assume the variances are equal:&lt;br /&gt;
 &amp;gt; var.test(e$Urban,e$Rural)&lt;br /&gt;
 &lt;br /&gt;
 	F test to compare two variances&lt;br /&gt;
 &lt;br /&gt;
 data:  e$Urban and e$Rural &lt;br /&gt;
 F = 1.5685, num df = 19, denom df = 19, p-value = 0.3349&lt;br /&gt;
 alternative hypothesis: true ratio of variances is not equal to 1 &lt;br /&gt;
 95 percent confidence interval:&lt;br /&gt;
  0.620845 3.962825 &lt;br /&gt;
 sample estimates:&lt;br /&gt;
 ratio of variances &lt;br /&gt;
           1.568534 &lt;br /&gt;
The p-value (0.3349) is greater than 0.05, so there is not enough evidence to reject the hypothesis that the variances are equal. The t-test can therefore be run with var.equal=T.&lt;br /&gt;
&lt;br /&gt;
Convert the data:&lt;br /&gt;
 &amp;gt; energy &amp;lt;- stack(e)  # Stack the columns into one column of values and one factor&lt;br /&gt;
 &amp;gt; names(energy) &amp;lt;- c(&amp;quot;EnergyUse&amp;quot;,&amp;quot;Type&amp;quot;) &lt;br /&gt;
&lt;br /&gt;
And test the means:&lt;br /&gt;
 &amp;gt; t.test(EnergyUse ~ Type,data=energy,var.equal=T)&lt;br /&gt;
&lt;br /&gt;
 	Two Sample t-test&lt;br /&gt;
&lt;br /&gt;
 data:  EnergyUse by Type &lt;br /&gt;
 t = -4.9907, df = 38, p-value = 1.367e-05&lt;br /&gt;
 alternative hypothesis: true difference in means is not equal to 0 &lt;br /&gt;
 95 percent confidence interval:&lt;br /&gt;
  -3427.706 -1449.394 &lt;br /&gt;
 sample estimates:&lt;br /&gt;
 mean in group Rural mean in group Urban &lt;br /&gt;
            2978.65             5417.20 &lt;br /&gt;
&lt;br /&gt;
* H0: Means are equal&lt;br /&gt;
* H1: Means are not equal&lt;br /&gt;
&lt;br /&gt;
* If the means were really equal, the probability of seeing a difference at least this large by pure chance would be tiny: 0.00001367 &amp;lt; 0.05.&lt;br /&gt;
* Therefore, we reject the null hypothesis.&lt;br /&gt;
* The result is statistically significant.&lt;br /&gt;
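&lt;br /&gt;
The t statistic can also be checked by hand. The following is only a sketch: it re-uses the rounded means and variances printed above and assumes 20 observations per group (19 degrees of freedom each in the F test); the variable names are illustrative.&lt;br /&gt;
 n &amp;lt;- 20                                   # observations per group (assumed from df = 19)&lt;br /&gt;
 m.rural &amp;lt;- 2978.65; m.urban &amp;lt;- 5417.20    # group means from the t.test output&lt;br /&gt;
 v.rural &amp;lt;- 1859019; v.urban &amp;lt;- 2915935    # group variances from sapply(e,var)&lt;br /&gt;
 v.pooled &amp;lt;- ((n-1)*v.rural + (n-1)*v.urban) / (2*n-2)   # pooled variance&lt;br /&gt;
 t.stat &amp;lt;- (m.rural - m.urban) / sqrt(v.pooled*(1/n + 1/n))&lt;br /&gt;
 p.value &amp;lt;- 2*pt(-abs(t.stat),df=2*n-2)    # two-tailed p-value&lt;br /&gt;
 c(t=t.stat,p=p.value)                     # should be close to t = -4.9907, p = 1.367e-05&lt;br /&gt;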
&lt;br /&gt;
== Exercise ==&lt;br /&gt;
&amp;quot;Is there a difference in contribution levels to nonprofits between married and never married females?&amp;quot;&lt;br /&gt;
# Create a null hypothesis and an alternative hypothesis&lt;br /&gt;
# Interpret the result and draw a conclusion&lt;br /&gt;
https://training-course-material.com/images/c/c9/Non-profit-contribution.txt&lt;br /&gt;
 nonprofit &amp;lt;- read.table(&amp;quot;https://training-course-material.com/images/c/c9/Non-profit-contribution.txt&amp;quot;,header=T,fill = T);&lt;br /&gt;
&amp;lt;div class=&amp;quot;toccolours mw-collapsible mw-collapsed&amp;quot; style=&amp;quot;&amp;quot;&amp;gt;&lt;br /&gt;
Answer &amp;gt;&amp;gt;&lt;br /&gt;
&amp;lt;div class=&amp;quot;mw-collapsible-content&amp;quot;&amp;gt;&lt;br /&gt;
 npc &amp;lt;- read.table(&amp;quot;https://training-course-material.com/images/c/c9/Non-profit-contribution.txt&amp;quot;,fill=TRUE,header=TRUE)&lt;br /&gt;
 npcs &amp;lt;- stack(npc)&lt;br /&gt;
 t.test(values~ind, alternative=&amp;#039;two.sided&amp;#039;, conf.level=.95, var.equal=FALSE,data=npcs);&lt;br /&gt;
 p-value = 0.7836&lt;br /&gt;
 Therefore, there is not enough evidence to reject the null hypothesis.&lt;br /&gt;
 In other words, the difference between the means is not statistically significant.&lt;br /&gt;
 There is not enough evidence to say that the contribution levels to nonprofits differ between married and never married females.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Difference between means - paired ==&lt;br /&gt;
&lt;br /&gt;
&amp;quot;Does an intervention program reduce the number of cigarettes smoked each day?&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Assumptions&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* The number of points in each data set must be the same&lt;br /&gt;
* They must be organized in pairs, in which there is a definite relationship between each pair of data points.&lt;br /&gt;
* In our case, the same people were asked before and after the program.&lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
Assumed significance level: alpha = 0.05 (the maximum tolerable probability of rejecting H0 when it is actually true).&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Two Tails&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
* H0 - means are the same  (mb - ma = 0, or mb = ma)&lt;br /&gt;
* H1 - they are different &lt;br /&gt;
&lt;br /&gt;
 smoke &amp;lt;- read.table(&amp;quot;http://training-course-material.com/images/1/14/Smoking.txt&amp;quot;,h=T)&lt;br /&gt;
 t.test(smoke$Before, smoke$After, alternative=&amp;#039;two.sided&amp;#039;, conf.level=.95, paired=TRUE)&lt;br /&gt;
 Paired t-test&lt;br /&gt;
 data:  smoke$Before and smoke$After &lt;br /&gt;
 t = 1.5782, df = 19, p-value = 0.131&lt;br /&gt;
 alternative hypothesis: true difference in means is not equal to 0 &lt;br /&gt;
 95 percent confidence interval:&lt;br /&gt;
  -0.7665942  5.4665942 &lt;br /&gt;
 sample estimates:&lt;br /&gt;
 mean of the differences &lt;br /&gt;
                    2.35 &lt;br /&gt;
&lt;br /&gt;
* P-value =  0.131024&lt;br /&gt;
* This is the probability of seeing a difference between the means at least this large by pure chance, given that they are equal in reality.&lt;br /&gt;
* It is quite probable (more probable than our alpha).&lt;br /&gt;
* Therefore, there is not enough evidence to reject the null hypothesis (H0).&lt;br /&gt;
* There is not enough evidence to say that the program changed the number of cigarettes smoked.&lt;br /&gt;
* It doesn&amp;#039;t mean that the program didn&amp;#039;t work!&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;One Tail&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
 smoke &amp;lt;- read.table(&amp;quot;http://training-course-material.com/images/1/14/Smoking.txt&amp;quot;,h=T)&lt;br /&gt;
 t.test(smoke$Before, smoke$After, alternative=&amp;#039;greater&amp;#039;, conf.level=.95, paired=TRUE)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* H0 - mbefore &amp;lt;= mafter  (i.e. mb - ma &amp;lt;= 0) - number of cigarettes smoked increased or hasn&amp;#039;t changed &lt;br /&gt;
* H1 - mbefore &amp;gt; mafter  (i.e. mb-ma &amp;gt; 0) - people decreased the number of cigarettes smoked &lt;br /&gt;
* P-value = 0.065512&lt;br /&gt;
* It is still quite probable (more probable than our alpha) that the number of cigarettes smoked before the program was higher by pure chance.&lt;br /&gt;
* How would the result change if the significance level were 10%?&lt;br /&gt;
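&lt;br /&gt;
The one-tailed p-value is half of the two-tailed one because the t distribution is symmetric and the observed difference lies in the direction of H1. A minimal sketch, using only the t statistic and degrees of freedom reported in the paired t-test output (variable names are illustrative):&lt;br /&gt;
 t.stat &amp;lt;- 1.5782; dof &amp;lt;- 19&lt;br /&gt;
 p.one &amp;lt;- pt(t.stat,df=dof,lower.tail=FALSE)   # P(T &amp;gt; t), as with alternative=&amp;#039;greater&amp;#039;&lt;br /&gt;
 p.two &amp;lt;- 2*pt(-abs(t.stat),df=dof)            # as with alternative=&amp;#039;two.sided&amp;#039;&lt;br /&gt;
 c(one.tail=p.one,two.tail=p.two)              # roughly 0.0655 and 0.131&lt;br /&gt;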
&lt;br /&gt;
== Exercises ==&lt;br /&gt;
&lt;br /&gt;
=== Exercise 1 ===&lt;br /&gt;
Is there a difference in weekly sales levels in units sold between Region 1 and Region 2?&lt;br /&gt;
&lt;br /&gt;
http://training-course-material.com/images/c/c7/Sales-in-regions.txt&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div style=&amp;quot;color:white !important&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
sales &amp;lt;- read.table(&amp;quot;http://training-course-material.com/images/c/c7/Sales-in-regions.txt&amp;quot;,h=T)&lt;br /&gt;
&lt;br /&gt;
sales.f &amp;lt;- stack(sales[c(&amp;quot;Sales.R1&amp;quot;,&amp;quot;Sales.R2&amp;quot;)])&lt;br /&gt;
&lt;br /&gt;
tapply(sales.f$values,sales.f$ind,mean)&lt;br /&gt;
&lt;br /&gt;
t.test(values~ind, alternative=&amp;#039;less&amp;#039;, conf.level=.95, var.equal=FALSE,data=sales.f)&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Exercise 2 (proportion test) ===&lt;br /&gt;
&lt;br /&gt;
A company has been accused of racism. Only 4 green people had been promoted, compared with 196 pinks.&lt;br /&gt;
It turned out that there were 2310 pink applicants and 32 green applicants.&lt;br /&gt;
# Would this suggest that pink people were discriminated against (12.5% success rate for greens versus 8.5% for pinks)?&lt;br /&gt;
# What is the probability that this would happen by pure chance?&lt;br /&gt;
# What would the situation look like if 3 green people had been promoted instead of 4?&lt;br /&gt;
&amp;lt;div style=&amp;quot;color:white !important&amp;quot;&amp;gt;&lt;br /&gt;
prop.test(c(4,196),c(32,2310))&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bernard Szlachta</name></author>
	</entry>
</feed>