<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://training-course-material.com/index.php?action=history&amp;feed=atom&amp;title=Introduction_to_Simple_Linear_Regression</id>
	<title>Introduction to Simple Linear Regression - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://training-course-material.com/index.php?action=history&amp;feed=atom&amp;title=Introduction_to_Simple_Linear_Regression"/>
	<link rel="alternate" type="text/html" href="https://training-course-material.com/index.php?title=Introduction_to_Simple_Linear_Regression&amp;action=history"/>
	<updated>2026-05-13T10:24:57Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://training-course-material.com/index.php?title=Introduction_to_Simple_Linear_Regression&amp;diff=18039&amp;oldid=prev</id>
		<title>Bernard Szlachta: /* Quiz */</title>
		<link rel="alternate" type="text/html" href="https://training-course-material.com/index.php?title=Introduction_to_Simple_Linear_Regression&amp;diff=18039&amp;oldid=prev"/>
		<updated>2014-06-04T10:53:50Z</updated>

		<summary type="html">&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;Quiz&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{Cat|Regression| 01}}&lt;br /&gt;
=Simple Regression=&lt;br /&gt;
In simple linear regression, we predict scores on one variable from the scores on a second variable.&lt;br /&gt;
*Criterion variable: The variable we are predicting, referred to as Y.&lt;br /&gt;
*Predictor variable: The variable we are basing our predictions on, referred to as X. &lt;br /&gt;
&lt;br /&gt;
When there is only one predictor variable, the prediction method is called &amp;#039;&amp;#039;&amp;#039;simple regression&amp;#039;&amp;#039;&amp;#039;.&lt;br /&gt;
* In simple linear regression, the predictions of Y when plotted as a function of X form a straight line.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Simple Regression Example==&lt;br /&gt;
Data in the table are plotted in the graph below. &lt;br /&gt;
* There is a positive relationship between X and Y. &lt;br /&gt;
* If you were going to predict Y from X, the higher the value of X, the higher your prediction of Y.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align: center; background-color: white;&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
| rowspan=&amp;quot;6&amp;quot; | [[File:ClipCapIt-140603-213413.PNG]]&lt;br /&gt;
! style=&amp;quot;width:70px;&amp;quot;| X&lt;br /&gt;
! style=&amp;quot;width:70px;&amp;quot;| Y&lt;br /&gt;
|-&lt;br /&gt;
| 1.00&lt;br /&gt;
| 1.00&lt;br /&gt;
|-&lt;br /&gt;
| 2.00&lt;br /&gt;
| 2.00&lt;br /&gt;
|-&lt;br /&gt;
| 3.00&lt;br /&gt;
| 1.30&lt;br /&gt;
|-&lt;br /&gt;
| 4.00&lt;br /&gt;
| 3.75&lt;br /&gt;
|-&lt;br /&gt;
| 5.00&lt;br /&gt;
| 2.25&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Linear regression==&lt;br /&gt;
*Linear regression consists of finding the best-fitting straight line through the points. &lt;br /&gt;
*The best-fitting line is called a regression line.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
;Example&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;background-color: white;&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
| [[File:ClipCapIt-140603-213934.PNG]]&lt;br /&gt;
| valign=&amp;quot;top&amp;quot; width=&amp;quot;500px;&amp;quot;|&lt;br /&gt;
*The black diagonal line is the regression line and consists of the predicted score on Y for each possible value of X. &lt;br /&gt;
*The vertical lines from the points to the regression line represent the errors of prediction. &lt;br /&gt;
* The red point is very near the regression line; its error of prediction is small.&lt;br /&gt;
* By contrast, the yellow point is much higher than the regression line and therefore its error of prediction is large.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==The error of prediction==&lt;br /&gt;
The error of prediction for a point is the value of the point minus the predicted value (the value on the line).&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
;Example&lt;br /&gt;
*The table below shows the predicted values (Y&amp;#039;) and the errors of prediction (Y-Y&amp;#039;). &lt;br /&gt;
*The first point has a Y of 1.00 and a predicted Y of 1.21. Therefore its error of prediction is -0.21.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align:right; background-color: white;&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! style=&amp;quot;width:80px;&amp;quot;|X&lt;br /&gt;
! style=&amp;quot;width:80px;&amp;quot;|Y&lt;br /&gt;
! style=&amp;quot;width:80px;&amp;quot;|Y&amp;#039;&lt;br /&gt;
! style=&amp;quot;width:80px;&amp;quot;|Y-Y&amp;#039;&lt;br /&gt;
! style=&amp;quot;width:80px;&amp;quot;|(Y-Y&amp;#039;)&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;&lt;br /&gt;
|-&lt;br /&gt;
| 1.00&lt;br /&gt;
| 1.00&lt;br /&gt;
| 1.210&lt;br /&gt;
| -0.210&lt;br /&gt;
| 0.044&lt;br /&gt;
|-&lt;br /&gt;
| 2.00&lt;br /&gt;
| 2.00&lt;br /&gt;
| 1.635&lt;br /&gt;
| 0.365&lt;br /&gt;
| 0.133&lt;br /&gt;
|-&lt;br /&gt;
| 3.00&lt;br /&gt;
| 1.30&lt;br /&gt;
| 2.060&lt;br /&gt;
| -0.760&lt;br /&gt;
| 0.578&lt;br /&gt;
|-&lt;br /&gt;
| 4.00&lt;br /&gt;
| 3.75&lt;br /&gt;
| 2.485&lt;br /&gt;
| 1.265&lt;br /&gt;
| 1.600&lt;br /&gt;
|-&lt;br /&gt;
| 5.00&lt;br /&gt;
| 2.25&lt;br /&gt;
| 2.910&lt;br /&gt;
| -0.660&lt;br /&gt;
| 0.436&lt;br /&gt;
|}&lt;br /&gt;
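&lt;br /&gt;
The table above can be reproduced with a short Python sketch. It is an illustration only, not part of the original course material: the names xs, ys and predict are chosen here for convenience, and the slope (0.425) and intercept (0.785) are the coefficients of the regression line given for these data further down this page.&lt;br /&gt;

```python
# Data from the table (X, Y).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]

# Slope and intercept of the regression line for these data
# (taken as given here; names b and a are illustrative).
b, a = 0.425, 0.785


def predict(x):
    """Return the predicted score for a given X."""
    return b * x + a


# One row per data point: X, Y, predicted Y, error, squared error.
for x, y in zip(xs, ys):
    y_pred = predict(x)
    error = y - y_pred  # error of prediction: observed minus predicted
    print(f"{x:.2f} {y:.2f} {y_pred:.3f} {error:.3f} {error ** 2:.3f}")
```

Each printed row corresponds to one row of the table: X, Y, Y&amp;#039;, Y-Y&amp;#039; and the squared error.&lt;br /&gt;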
&lt;br /&gt;
=Regression Line=&lt;br /&gt;
==The Best Fitting Line==&lt;br /&gt;
;What is meant by &amp;quot;best fitting line&amp;quot;?&lt;br /&gt;
* By far the most commonly used criterion for the best fitting line is the line that minimizes the sum of the squared errors of prediction. &lt;br /&gt;
* That is the criterion that was used to find the line in the previous regression line graph.&lt;br /&gt;
* The last column in the previous table shows the squared errors of prediction. &lt;br /&gt;
* The sum of the squared errors of prediction shown in the previous table is lower than it would be for any other regression line.&lt;br /&gt;
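&lt;br /&gt;
The criterion can be checked numerically with a small sketch. The helper name sse is hypothetical, and the two comparison lines (a steeper slope, a higher intercept) are arbitrary choices made here for illustration; the coefficients 0.425 and 0.785 are those of the regression line for the example data on this page.&lt;br /&gt;

```python
# Example data from this page.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]


def sse(b, a):
    """Sum of squared errors of prediction for the line with slope b, intercept a."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


best = sse(0.425, 0.785)     # the regression line for these data
steeper = sse(0.5, 0.785)    # same intercept, steeper slope (arbitrary)
higher = sse(0.425, 1.0)     # same slope, higher intercept (arbitrary)

print(round(best, 3), round(steeper, 3), round(higher, 3))
```

Both alternative lines should give a larger sum of squared errors than the regression line. Trying two lines does not prove minimality, but it illustrates what the criterion measures.&lt;br /&gt;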
&lt;br /&gt;
==The Formula for a Regression Line==&lt;br /&gt;
The formula for a regression line is&lt;br /&gt;
 Y&amp;#039; = bX + A&lt;br /&gt;
 where Y&amp;#039; is the predicted score, b is the slope of the line, and A is the Y intercept. &lt;br /&gt;
&lt;br /&gt;
;Example&lt;br /&gt;
The equation for the line in the previous graph is&lt;br /&gt;
:Y&amp;#039; = 0.425X + 0.785&lt;br /&gt;
&lt;br /&gt;
* For X = 1, Y&amp;#039; = (0.425)(1) + 0.785 = 1.21&lt;br /&gt;
* For X = 2, Y&amp;#039; = (0.425)(2) + 0.785 = 1.635&lt;br /&gt;
[[File:ClipCapIt-140603-213934.PNG]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Computing the Regression Line==&lt;br /&gt;
* In the age of computers, the regression line is typically computed with statistical software. &lt;br /&gt;
* However, the calculations are relatively easy and are given here for anyone who is interested. &lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
The calculations are based on the statistics below. &lt;br /&gt;
:M&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt; is the mean of X&lt;br /&gt;
:M&amp;lt;sub&amp;gt;Y&amp;lt;/sub&amp;gt; is the mean of Y &lt;br /&gt;
:s&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt; is the standard deviation of X&lt;br /&gt;
:s&amp;lt;sub&amp;gt;Y&amp;lt;/sub&amp;gt; is the standard deviation of Y&lt;br /&gt;
:r is the correlation between X and Y&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;text-align: center; background-color: white;&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! width=&amp;quot;60px&amp;quot; | M&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt; &lt;br /&gt;
! width=&amp;quot;60px&amp;quot; | M&amp;lt;sub&amp;gt;Y&amp;lt;/sub&amp;gt;&lt;br /&gt;
! width=&amp;quot;60px&amp;quot; | s&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt;&lt;br /&gt;
! width=&amp;quot;60px&amp;quot; | s&amp;lt;sub&amp;gt;Y&amp;lt;/sub&amp;gt;&lt;br /&gt;
! width=&amp;quot;60px&amp;quot; | r&lt;br /&gt;
|-&lt;br /&gt;
| 3&lt;br /&gt;
| 2.06&lt;br /&gt;
| 1.581&lt;br /&gt;
| 1.072&lt;br /&gt;
| 0.627&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==The Slope of the Regression Line ==&lt;br /&gt;
The slope (b) can be calculated as follows:&lt;br /&gt;
 b = r s&amp;lt;sub&amp;gt;Y&amp;lt;/sub&amp;gt;/s&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt;&lt;br /&gt;
and the intercept (A) can be calculated as&lt;br /&gt;
 A = M&amp;lt;sub&amp;gt;Y&amp;lt;/sub&amp;gt; - bM&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt;&lt;br /&gt;
  &lt;br /&gt;
For these data, &lt;br /&gt;
 b = (0.627)(1.072)/1.581 = 0.425&lt;br /&gt;
 A = 2.06 - (0.425)(3)=0.785&lt;br /&gt;
 &lt;br /&gt;
* The calculations have all been shown in terms of sample statistics rather than population parameters. &lt;br /&gt;
* The formulas are the same; simply use the parameter values for means, standard deviations, and the correlation.&lt;br /&gt;
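&lt;br /&gt;
The same calculation can be sketched in Python, starting from the raw example data. This is a minimal illustration using only the standard library; the variable names are chosen here, and the standard deviations are sample standard deviations (n - 1 in the denominator), which is what reproduces the values 1.581 and 1.072 in the table above.&lt;br /&gt;

```python
from statistics import mean, stdev

# Example data from this page.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]

m_x, m_y = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Pearson correlation computed from its definition.
n = len(xs)
r = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x   # slope
a = m_y - b * m_x   # intercept

print(round(b, 3), round(a, 3))
```

Rounded to three decimals, the slope and intercept should agree with the hand calculation above.&lt;br /&gt;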
&lt;br /&gt;
&lt;br /&gt;
==Standardized Variables==&lt;br /&gt;
* The regression equation is simpler if variables are standardized so that their means are equal to 0 and standard deviations are equal to 1, for then b = r and A = 0. &lt;br /&gt;
* This makes the regression line:&lt;br /&gt;
 Z&amp;lt;sub&amp;gt;Y&amp;#039;&amp;lt;/sub&amp;gt; = (r)(Z&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt;)&lt;br /&gt;
 where Z&amp;lt;sub&amp;gt;Y&amp;#039;&amp;lt;/sub&amp;gt; is the predicted standard score for Y, r is the correlation, and Z&amp;lt;sub&amp;gt;X&amp;lt;/sub&amp;gt; is the standardized score for X. &lt;br /&gt;
Note that the slope of the regression equation for standardized variables is r.&lt;br /&gt;
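&lt;br /&gt;
As a quick check on the example data, the sketch below standardizes X and Y and then fits a least-squares line to the standardized scores. The function name standardize is hypothetical, and the slope is computed with the usual least-squares formula (sum of cross-products of deviations divided by the sum of squared deviations of X), which is an assumption about method rather than something stated on this page.&lt;br /&gt;

```python
from statistics import mean, stdev

# Example data from this page.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]


def standardize(values):
    """Convert raw scores to z scores (mean 0, sample standard deviation 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


zx, zy = standardize(xs), standardize(ys)

# Least-squares slope and intercept for the standardized scores.
mzx, mzy = mean(zx), mean(zy)
slope = sum((u - mzx) * (v - mzy) for u, v in zip(zx, zy)) / sum((u - mzx) ** 2 for u in zx)
intercept = mzy - slope * mzx

print(round(slope, 3), round(intercept, 3))
```

The slope should match the correlation for these data (0.627) and the intercept should be zero up to floating-point rounding.&lt;br /&gt;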
&lt;br /&gt;
&lt;br /&gt;
==Example==&lt;br /&gt;
The case study &amp;quot;Predicting GPA&amp;quot; contains high school and university grades for 105 computer science majors at a local state school. &lt;br /&gt;
* We now consider how we could predict a student&amp;#039;s university GPA if we knew his or her high school GPA.&lt;br /&gt;
* The correlation is 0.78. The regression equation is &lt;br /&gt;
 GPA&amp;#039; = (0.675)(High School GPA) + 1.097&lt;br /&gt;
* A student with a high school GPA of 3 would be predicted to have a university GPA of&lt;br /&gt;
 GPA&amp;#039; = (0.675)(3) + 1.097 = 3.12&lt;br /&gt;
  &lt;br /&gt;
:[[File:ClipCapIt-140603-221400.PNG]]&lt;br /&gt;
&lt;br /&gt;
* The graph shows University GPA as a function of High School GPA&lt;br /&gt;
* There is a strong positive relationship between them&lt;br /&gt;
&lt;br /&gt;
==Assumptions==&lt;br /&gt;
* It may surprise you, but the calculations shown in this section are assumption free. &lt;br /&gt;
* Of course, if the relationship between X and Y is not linear, a different shaped function could fit the data better. &lt;br /&gt;
* Inferential statistics in regression are based on several assumptions.&lt;br /&gt;
&lt;br /&gt;
=Quiz=&lt;br /&gt;
&amp;lt;quiz display=simple &amp;gt;&lt;br /&gt;
{ The formula for a regression equation is&lt;br /&gt;
&lt;br /&gt;
[[File:ClipCapIt-140603-222649.PNG]]&lt;br /&gt;
&lt;br /&gt;
What would be the predicted Y score for a person scoring 4 on X? &lt;br /&gt;
&lt;br /&gt;
|type=&amp;quot;{}&amp;quot;}&lt;br /&gt;
{ 10 }&lt;br /&gt;
&lt;br /&gt;
{&lt;br /&gt;
{{Show Answer|&lt;br /&gt;
10&lt;br /&gt;
&lt;br /&gt;
Substitute X equals 4 into the equation: Y&amp;#039; equals 3(4) - 2, which is 10 &lt;br /&gt;
}}&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{ Suppose it is possible to predict a person&amp;#039;s score on Test B from the person&amp;#039;s score on Test A. The regression equation is:&lt;br /&gt;
&lt;br /&gt;
[[File:ClipCapIt-140603-222717.PNG]]&lt;br /&gt;
&lt;br /&gt;
What is a person&amp;#039;s predicted score on Test B assuming this person got a 40 on Test A? &lt;br /&gt;
&lt;br /&gt;
|type=&amp;quot;{}&amp;quot;}&lt;br /&gt;
{ 101.5 }&lt;br /&gt;
&lt;br /&gt;
{&lt;br /&gt;
{{Show Answer|&lt;br /&gt;
101.5&lt;br /&gt;
&lt;br /&gt;
Substitute A equals 40 into the equation: B&amp;#039; equals 2.3(40) + 9.5, which is 101.5 }}&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{ Suppose a person got a score of 32.5 on Test A and a score of 95.25 on Test B. Using the same regression equation as in the previous problem, &lt;br /&gt;
&lt;br /&gt;
[[File:ClipCapIt-140603-222717.PNG]]&lt;br /&gt;
&lt;br /&gt;
what is the error of prediction for this person? &lt;br /&gt;
&lt;br /&gt;
|type=&amp;quot;{}&amp;quot;}&lt;br /&gt;
{ 11 }&lt;br /&gt;
&lt;br /&gt;
{&lt;br /&gt;
{{Show Answer|&lt;br /&gt;
11&lt;br /&gt;
&lt;br /&gt;
Predicted value: B&amp;#039; equals 2.3(32.5) + 9.5, which is 84.25. Error of prediction: B - B&amp;#039; equals 95.25 - 84.25, which is 11 &lt;br /&gt;
}}&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{What is the most common criterion used to determine the best-fitting line? &lt;br /&gt;
&lt;br /&gt;
|type=&amp;quot;()&amp;quot;}&lt;br /&gt;
-The line that goes through the most points&lt;br /&gt;
-The line that has the same number of points above it as below it&lt;br /&gt;
+The line that minimizes the sum of squared errors of prediction&lt;br /&gt;
&lt;br /&gt;
{&lt;br /&gt;
{{Show Answer|&lt;br /&gt;
The line that minimizes the sum of squared errors of prediction&lt;br /&gt;
&lt;br /&gt;
The most common criterion used to determine the best-fitting line is the line that minimizes the sum of squared errors of prediction. This line does not need to go through any of the actual data points, and it can have a different number of points above it and below it. &lt;br /&gt;
}}&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
{ The mean of X is 3 and the mean of Y is 7. The regression line that predicts Y from X necessarily goes through the point (3,7). &lt;br /&gt;
&lt;br /&gt;
|type=&amp;quot;()&amp;quot;}&lt;br /&gt;
+True&lt;br /&gt;
-False&lt;br /&gt;
&lt;br /&gt;
{&lt;br /&gt;
{{Show Answer|&lt;br /&gt;
True&lt;br /&gt;
&lt;br /&gt;
Someone who scored the mean on X would be predicted to score the mean on Y. &lt;br /&gt;
}}&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{  You want to be able to predict a woman&amp;#039;s shoe size from her height. You have gathered this information from your female classmates. The mean height of women in your class is 64 inches, and the standard deviation is 2 inches. The mean shoe size is 8, and the standard deviation is 1. The correlation between these two variables is .5. What is the slope of the regression line? &lt;br /&gt;
&lt;br /&gt;
|type=&amp;quot;()&amp;quot;}&lt;br /&gt;
-0.00&lt;br /&gt;
+0.25&lt;br /&gt;
-0.50&lt;br /&gt;
-0.10&lt;br /&gt;
&lt;br /&gt;
{&lt;br /&gt;
{{Show Answer|&lt;br /&gt;
0.25&lt;br /&gt;
&lt;br /&gt;
b is r(sY/sX). Here sY is 1 (shoe size) and sX is 2 (height), so b is 0.5(1/2), which is 0.25 }}&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/quiz&amp;gt;&lt;/div&gt;</summary>
		<author><name>Bernard Szlachta</name></author>
	</entry>
</feed>