<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://training-course-material.com/index.php?action=history&amp;feed=atom&amp;title=Machine_Learning_with_R</id>
	<title>Machine Learning with R - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://training-course-material.com/index.php?action=history&amp;feed=atom&amp;title=Machine_Learning_with_R"/>
	<link rel="alternate" type="text/html" href="https://training-course-material.com/index.php?title=Machine_Learning_with_R&amp;action=history"/>
	<updated>2026-05-13T23:38:49Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://training-course-material.com/index.php?title=Machine_Learning_with_R&amp;diff=24675&amp;oldid=prev</id>
		<title>Bernard Szlachta at 00:37, 17 February 2015</title>
		<link rel="alternate" type="text/html" href="https://training-course-material.com/index.php?title=Machine_Learning_with_R&amp;diff=24675&amp;oldid=prev"/>
		<updated>2015-02-17T00:37:16Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;[[Category:R]]&lt;br /&gt;
[[Category:Machine Learning]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;slideshow style=&amp;quot;nobleprog&amp;quot; headingmark=&amp;quot;⌘&amp;quot; incmark=&amp;quot;…&amp;quot; scaled=&amp;quot;false&amp;quot; font=&amp;quot;Trebuchet MS&amp;quot; &amp;gt;&lt;br /&gt;
;title: Introduction to R with exercises&lt;br /&gt;
;author: MIHALY BARASZ for NobleProg Ltd&lt;br /&gt;
&amp;lt;/slideshow&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== TABLE OF CONTENTS ⌘==&lt;br /&gt;
* Sources and further reading&lt;br /&gt;
* Machine Learning vs. Statistical Learning&lt;br /&gt;
* Linear regression&lt;br /&gt;
* Exercise for linear regression&lt;br /&gt;
* Exercise for linear regression (contd.)&lt;br /&gt;
* R best practices&lt;br /&gt;
* Logistic regression&lt;br /&gt;
* Testing, cross-validation&lt;br /&gt;
* Classification exercise&lt;br /&gt;
* Presenting the results&lt;br /&gt;
* Deploying your results&lt;br /&gt;
* Generalized Linear Models&lt;br /&gt;
* Generalized Linear Model (cont.)&lt;br /&gt;
* Regularization&lt;br /&gt;
* Regularization more generally&lt;br /&gt;
* Regularization exercise&lt;br /&gt;
* Tree based methods&lt;br /&gt;
* Unsupervised learning&lt;br /&gt;
* Principal components analysis&lt;br /&gt;
* Clustering&lt;br /&gt;
* K-means clustering&lt;br /&gt;
== 1 SOURCES AND FURTHER READING ⌘==&lt;br /&gt;
Source materials&lt;br /&gt;
* “An Introduction to Statistical Learning”&lt;br /&gt;
** Available for free in PDF form online&lt;br /&gt;
** Online course by Trevor Hastie and Rob Tibshirani&lt;br /&gt;
* Andrew Ng&amp;#039;s “Machine Learning” online course&lt;br /&gt;
Further reading&lt;br /&gt;
* “Think Stats” and “Think Bayes”&lt;br /&gt;
**both by Allen B. Downey&lt;br /&gt;
**both available for free online&lt;br /&gt;
**programming in Python&lt;br /&gt;
== 2 MACHINE LEARNING VS. STATISTICAL LEARNING ⌘==&lt;br /&gt;
* Different origins&lt;br /&gt;
* Different focus&lt;br /&gt;
* Highly convergent in recent years&lt;br /&gt;
== 3 LINEAR REGRESSION ⌘==&lt;br /&gt;
The simplest model for estimating a numerical response&lt;br /&gt;
Y=β0+β1X1+β2X2+⋯+βpXp+ε&lt;br /&gt;
Details&lt;br /&gt;
* Understanding the results&lt;br /&gt;
* Assessing the accuracy&lt;br /&gt;
* Interpreting the coefficients&lt;br /&gt;
* Understanding factors&lt;br /&gt;
* Adding higher-order terms and interactions&lt;br /&gt;
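The points above can be sketched with base R's lm(). A minimal example using the built-in mtcars data; the choice of variables is purely illustrative, not from the course materials:&lt;br /&gt;

```r
# Multiple linear regression: mpg modelled by weight and horsepower
fit = lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table, standard errors, R-squared: "understanding the results"
summary(fit)

# Higher-order term (I(wt^2)) and an interaction (wt:hp via wt * hp)
fit2 = lm(mpg ~ wt * hp + I(wt^2), data = mtcars)
coef(fit2)
```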
== 4 EXERCISE FOR LINEAR REGRESSION ⌘==&lt;br /&gt;
* Data file: Advertising.csv (from ISLR)&lt;br /&gt;
* Multivariate linear regression&lt;br /&gt;
**Which variables are important?&lt;br /&gt;
**Do the others have any predictive power?&lt;br /&gt;
**How much precision do we lose by dropping the &amp;quot;unimportant&amp;quot; variables?&lt;br /&gt;
== 5 EXERCISE FOR LINEAR REGRESSION (CONTD.) ⌘==&lt;br /&gt;
* Interactions between variables&lt;br /&gt;
* Regression with all interactions&lt;br /&gt;
**Comparing results&lt;br /&gt;
**What are interactions&lt;br /&gt;
**Visualizing interactions&lt;br /&gt;
&lt;br /&gt;
== 6 R BEST PRACTICES ⌘==&lt;br /&gt;
* Organizing your work (and data)&lt;br /&gt;
* Reusable work&lt;br /&gt;
* Plotting&lt;br /&gt;
* Learning&lt;br /&gt;
== 7 LOGISTIC REGRESSION ⌘==&lt;br /&gt;
Response is categorical: Yes or No.&lt;br /&gt;
f(X)=β0+β1X1+β2X2+⋯+βpXp&lt;br /&gt;
Find a suitable f(X) and classify to Yes if f(X)&amp;gt;0 and to No otherwise. What is a good f?&lt;br /&gt;
* Minimizes the training error? Not fine-grained enough; hard to optimize for.&lt;br /&gt;
* Map f(X) to probabilities and maximize for the likelihood of training data.&lt;br /&gt;
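The second approach is exactly what glm() with family = binomial does: it maps f(X) to a probability via the logistic function and maximises the likelihood of the training data. A sketch on simulated data (all names and coefficients here are illustrative):&lt;br /&gt;

```r
set.seed(1)
n  = 200
x1 = rnorm(n)
x2 = rnorm(n)

# Simulate data whose log-odds really are linear in the predictors
p = plogis(-0.5 + 1.2 * x1 - 0.8 * x2)
y = rbinom(n, 1, p)

# Fit by maximum likelihood
fit = glm(y ~ x1 + x2, family = binomial)

# Classify to Yes when the fitted probability exceeds 0.5,
# i.e. when f(X) is positive
pred = ifelse(predict(fit, type = "response") > 0.5, "Yes", "No")
table(pred, y)
```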
== 8 TESTING, CROSS-VALIDATION ⌘==&lt;br /&gt;
* Training vs. Test-set performance&lt;br /&gt;
* Bias-Variance trade-off (under/overfitting)&lt;br /&gt;
* Strategies for estimating test error; Cross-Validation&lt;br /&gt;
* Bootstrap&lt;br /&gt;
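A hand-rolled k-fold cross-validation loop makes the idea concrete: hold out one fold, fit on the rest, and average the held-out errors. A sketch in base R, again using mtcars as a stand-in data set:&lt;br /&gt;

```r
set.seed(2)
k     = 5
folds = sample(rep(1:k, length.out = nrow(mtcars)))
mse   = numeric(k)

for (i in 1:k) {
  train = mtcars[folds != i, ]
  held  = mtcars[folds == i, ]
  fit   = lm(mpg ~ wt + hp, data = train)
  # Error measured on data the model never saw
  mse[i] = mean((held$mpg - predict(fit, newdata = held))^2)
}

mean(mse)  # cross-validated estimate of the test MSE
```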
== 9 CLASSIFICATION EXERCISE ⌘==&lt;br /&gt;
* Data: &amp;quot;defaulters&amp;quot; from the ISLR package&lt;br /&gt;
== 10 PRESENTING THE RESULTS ⌘==&lt;br /&gt;
* Session in R Markdown&lt;br /&gt;
== 11 DEPLOYING YOUR RESULTS ⌘==&lt;br /&gt;
* Exporting a model to a spreadsheet&lt;br /&gt;
* Porting to a different programming environment&lt;br /&gt;
* Using R as a library&lt;br /&gt;
* Deploying R applications to web&lt;br /&gt;
**Shiny: http://shiny.rstudio.com/&lt;br /&gt;
== 12 GENERALIZED LINEAR MODELS ⌘==&lt;br /&gt;
* What&amp;#039;s common in linear regression and logistic regression?&lt;br /&gt;
* How do they fit under one common assumption?&lt;br /&gt;
* What is the family parameter in glm?&lt;br /&gt;
== 13 GENERALIZED LINEAR MODEL (CONT.) ⌘==&lt;br /&gt;
* Common underlying assumption: a linear function of the predictors determines the distribution of the response.&lt;br /&gt;
* The parameters of the linear function are determined in a way to maximize the likelihood of the observations.&lt;br /&gt;
f(X)=β0+β1X1+β2X2+⋯+βpXp&lt;br /&gt;
For example, given the value of predictors X we assume that the distribution of the response depends only on f(X):&lt;br /&gt;
* Linear regression: N(f(X),σ2) with a constant σ2 (its value doesn&amp;#039;t matter)&lt;br /&gt;
* Two class classification: binomial, with the probability of the Yes class being p, where log(p/(1−p))=f(X)&lt;br /&gt;
Deviance: twice the negative log-likelihood. This is what we actually minimize in practice. In the case of linear regression…&lt;br /&gt;
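The deviance can be read straight off a fitted glm object. For the gaussian family (i.e. linear regression) it coincides, up to an additive constant, with the residual sum of squares, which is why minimising deviance and minimising squared error agree there. A sketch, assuming the mtcars data:&lt;br /&gt;

```r
# Linear regression expressed as a GLM with the gaussian family
fit = glm(mpg ~ wt + hp, data = mtcars, family = gaussian)

# For the gaussian family the deviance equals the residual sum of squares
deviance(fit)
sum(residuals(fit)^2)
```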
== 14 REGULARIZATION ⌘==&lt;br /&gt;
* Prediction accuracy, especially when p&amp;gt;n (more predictors than observations).&lt;br /&gt;
* Model interpretability: removes irrelevant features. Feature selection.&lt;br /&gt;
== 15 REGULARIZATION MORE GENERALLY ⌘==&lt;br /&gt;
Methods&lt;br /&gt;
* Subset selection&lt;br /&gt;
* Shrinkage (a.k.a. regularization): ridge regression, the lasso&lt;br /&gt;
* Dimension reduction: principal components regression; partial least squares.&lt;br /&gt;
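Shrinkage is easy to see from the closed form of ridge regression, which adds lambda to the diagonal of X'X before solving the normal equations. A base-R sketch (in practice one would use a package such as glmnet; the predictors chosen here are illustrative):&lt;br /&gt;

```r
# Ridge regression via its closed form: beta = (X'X + lambda*I)^{-1} X'y
X = scale(as.matrix(mtcars[, c("wt", "hp", "disp")]))
y = mtcars$mpg - mean(mtcars$mpg)

ridge = function(lambda) {
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
}

# lambda = 0 recovers ordinary least squares;
# larger lambda shrinks the coefficients towards zero
cbind(ols = ridge(0), shrunk = ridge(50))
```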
== 16 REGULARIZATION EXERCISE ⌘==&lt;br /&gt;
* Data: regul.csv&lt;br /&gt;
== 17 TREE BASED METHODS ⌘==&lt;br /&gt;
* Decision trees&lt;br /&gt;
* Random forests (bagging, the bootstrap)&lt;br /&gt;
* Boosting&lt;br /&gt;
== 18 UNSUPERVISED LEARNING ⌘==&lt;br /&gt;
* Reasons, goals&lt;br /&gt;
* Methods&lt;br /&gt;
== 19 PRINCIPAL COMPONENTS ANALYSIS ⌘==&lt;br /&gt;
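A minimal PCA sketch with base R's prcomp(), again on the mtcars data as a stand-in; scaling the variables first is usually advisable when they are measured in different units:&lt;br /&gt;

```r
# Principal components of the (all-numeric) mtcars variables
pc = prcomp(mtcars, scale. = TRUE)

# Proportion of variance explained by each component
summary(pc)$importance["Proportion of Variance", ]

# Scores of the observations on the first two components
head(pc$x[, 1:2])
```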
== 20 CLUSTERING ⌘==&lt;br /&gt;
* Goals&lt;br /&gt;
* Examples&lt;br /&gt;
* Challenges&lt;br /&gt;
== 21 K-MEANS CLUSTERING ⌘==&lt;br /&gt;
Demonstration of R magic&lt;/div&gt;</summary>
		<author><name>Bernard Szlachta</name></author>
	</entry>
</feed>