# Machine Learning with R

Title: Introduction to R with exercises
Author: Mihaly Barasz, for NobleProg Ltd

• Machine Learning vs. Statistical Learning
• Linear regression
• Exercise for linear regression
• Exercise for linear regression (contd.)
• R best practices
• Logistic regression
• Testing, cross-validation
• Classification exercise
• Presenting the results
• Generalized Linear Models
• Generalized Linear Model (cont.)
• Regularization
• Regularization more generally

## 1 SOURCES AND FURTHER READING ⌘

Source materials

• “An Introduction to Statistical Learning”
• Online course by Trevor Hastie and Rob Tibshirani
• Andrew Ng's “Machine Learning” online course

• “Think Stats” and “Think Bayes”
• both by Allen B. Downey
• programming in Python

## 2 MACHINE LEARNING VS. STATISTICAL LEARNING ⌘

• Different origins
• Different focus
• Highly convergent in recent years

## 3 LINEAR REGRESSION ⌘

The simplest model for estimating a numerical response:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$$

Details:

• Understanding the results
• Assessing the accuracy
• Interpreting the coefficients
• Understanding factors
• Adding higher-order terms and interactions
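
A minimal sketch with simulated data (variable and data names are made up for illustration) showing how `lm()` reports the coefficients, how a factor predictor enters the model, and how to add a higher-order term:

```r
set.seed(1)
n   <- 200
x1  <- rnorm(n)
grp <- factor(sample(c("A", "B"), n, replace = TRUE))   # a factor predictor
y   <- 1 + 2 * x1 - 0.5 * x1^2 + (grp == "B") * 1.5 + rnorm(n)
d   <- data.frame(y, x1, grp)

fit <- lm(y ~ x1 + I(x1^2) + grp, data = d)   # higher-order term via I()
summary(fit)   # coefficient estimates, standard errors, p-values, R-squared
```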

## 4 EXERCISE FOR LINEAR REGRESSION ⌘

• Data file: Advertising.csv (from ISLR)
• Multivariate linear regression
• Which variables are important?
• Do the unimportant ones have any predictive power at all?
• How much precision do we lose by dropping the "unimportant" variables?
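
One possible workflow, assuming the usual ISLR column names (TV, radio, newspaper, sales); adjust to the actual file:

```r
adv <- read.csv("Advertising.csv")

full    <- lm(sales ~ TV + radio + newspaper, data = adv)   # all predictors
reduced <- lm(sales ~ TV + radio, data = adv)               # drop the "unimportant" one

summary(full)          # which coefficients are significant?
anova(reduced, full)   # how much fit do we lose by dropping newspaper?
```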

## 5 EXERCISE FOR LINEAR REGRESSION (CONTD.) ⌘

• Interactions between variables

Regression with all interactions

• Comparing results
• What are interactions?
• Visualizing interactions
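
Continuing the previous sketch (data frame `adv`, model `full`), one way to fit and compare a model with all pairwise interactions:

```r
inter <- lm(sales ~ (TV + radio + newspaper)^2, data = adv)  # all pairwise interactions
summary(inter)       # TV:radio etc. are the interaction terms
anova(full, inter)   # does adding interactions improve the fit significantly?
```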

## 6 R BEST PRACTICES ⌘

• Organizing your work (and data)
• Reusable work
• Plotting
• Learning

## 7 LOGISTIC REGRESSION ⌘

Response is categorical: Yes or No. Consider

$$f(X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$

Find a suitable f(X) and classify to Yes if f(X) > 0 and to No otherwise. What is a good f?

• Minimizes the training error? Not fine-grained enough; hard to optimize for.
• Map f(X) to probabilities and maximize the likelihood of the training data.
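
A minimal sketch (simulated data, illustrative names): `glm()` with `family = binomial` maps f(X) to probabilities via the logistic function and maximizes the likelihood of the training labels:

```r
set.seed(1)
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n)
p  <- 1 / (1 + exp(-(-1 + 2 * x1 - x2)))   # true class probabilities
y  <- rbinom(n, 1, p)
d  <- data.frame(y, x1, x2)

fit  <- glm(y ~ x1 + x2, data = d, family = binomial)
prob <- predict(fit, type = "response")    # fitted probabilities
pred <- ifelse(prob > 0.5, "Yes", "No")    # f(X) > 0 is the same as prob > 0.5
table(pred, d$y)                           # training confusion matrix
```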

## 8 TESTING, CROSS-VALIDATION ⌘

• Training vs. Test-set performance
• Strategies for estimating test error; Cross-Validation
• Bootstrap
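
Continuing the logistic-regression sketch above (data frame `d`, model `fit`), `cv.glm()` from the boot package gives a cross-validated error estimate; K = 10 and the 0.5 cut-off are illustrative choices:

```r
library(boot)
cost <- function(obs, prob) mean(obs != (prob > 0.5))  # misclassification rate
cv.glm(d, fit, cost = cost, K = 10)$delta              # 10-fold CV error estimate
```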

## 9 CLASSIFICATION EXERCISE ⌘

• Data: "defaulters" from the ISLR package
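
A possible starting point, assuming the intended dataset is `Default` from the ISLR package and that `default ~ balance + income + student` is one reasonable model:

```r
library(ISLR)
data(Default)
fit <- glm(default ~ balance + income + student, data = Default, family = binomial)
summary(fit)
```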

## 10 PRESENTING THE RESULTS ⌘

• Session in R Markdown

## 11 DEPLOYING YOUR RESULTS ⌘

• Exporting a model to a spreadsheet
• Porting to a different programming environment
• Using R as a library
• Deploying R applications to web
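
A minimal sketch of two export routes for any fitted model object `fit` (file names are illustrative): writing the coefficients to a CSV a spreadsheet can read, and serializing the model for reuse from another R process:

```r
write.csv(data.frame(term = names(coef(fit)), estimate = coef(fit)),
          "model_coefficients.csv", row.names = FALSE)
saveRDS(fit, "model.rds")   # later: fit <- readRDS("model.rds")
```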

## 12 GENERALIZED LINEAR MODELS ⌘

• What's common in linear regression and logistic regression?
• How do they fit under one common assumption?
• What is the family parameter in glm?

## 13 GENERALIZED LINEAR MODEL (CONT.) ⌘

• Common underlying assumption: a linear function of the predictors determines the distribution of the response.
• The parameters of the linear function are determined in a way to maximize the likelihood of the observations.

$$f(X) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p$$

For example, given the value of the predictors X, we assume that the distribution of the response depends only on f(X):

• Linear regression: $N(f(X), \sigma^2)$ with a constant $\sigma^2$ (its value doesn't matter)
• Two-class classification: binomial, with the probability of the Yes class being $p$, where $\log\frac{p}{1-p} = f(X)$

Deviance: twice the negative log-likelihood. This is what we actually minimize in practice. In the case of linear regression the deviance reduces to the residual sum of squares, so maximizing the likelihood is the same as least squares.
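
A small check of the common framework (simulated data): with `family = gaussian`, `glm()` reproduces `lm()`, and its deviance equals the residual sum of squares; with `family = binomial` the same function performs logistic regression:

```r
set.seed(1)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)

g <- glm(y ~ x, family = gaussian)
l <- lm(y ~ x)
all.equal(coef(g), coef(l))               # same coefficients as lm()
all.equal(deviance(g), sum(resid(l)^2))   # deviance = RSS for linear regression
```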

## 14 REGULARIZATION ⌘

• Prediction accuracy, especially when p > n (more predictors than observations)
• Model interpretability: removes irrelevant features (feature selection)

## 15 REGULARIZATION MORE GENERALLY ⌘

Methods

• Subset selection
• Shrinkage (a.k.a. regularization): ridge regression, the lasso
• Dimension reduction: principal components regression, partial least squares
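
A sketch using the glmnet package (a package choice assumed here, reusing the Advertising data from the earlier exercise purely for illustration): `alpha = 0` gives ridge regression, `alpha = 1` the lasso, and `cv.glmnet()` picks the penalty by cross-validation:

```r
library(glmnet)
x <- model.matrix(sales ~ TV + radio + newspaper, data = adv)[, -1]  # predictor matrix
y <- adv$sales

ridge <- cv.glmnet(x, y, alpha = 0)   # ridge: shrinks coefficients towards zero
lasso <- cv.glmnet(x, y, alpha = 1)   # lasso: can set coefficients exactly to zero
coef(lasso, s = "lambda.min")         # feature selection via the lasso
```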

## 16 REGULARIZATION EXERCISE ⌘

• Data: regul.csv

## 17 TREE BASED METHODS ⌘

• Decision trees
• Random forests (bagging, bootstrap)
• Boosting
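
Quick sketches with the rpart and randomForest packages (package choices are assumptions), using the built-in iris data for illustration; a random forest averages trees grown on bootstrap samples (bagging):

```r
library(rpart)
library(randomForest)

tree <- rpart(Species ~ ., data = iris)                       # a single decision tree
rf   <- randomForest(Species ~ ., data = iris, ntree = 500)   # forest of bootstrap trees
rf                                                            # out-of-bag error estimate
```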

## 18 UNSUPERVISED LEARNING ⌘

• Reasons and goals
• Methods
• Examples
• Challenges

## 21 K-MEANS CLUSTERING ⌘

Demonstration of R magic
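
A minimal demonstration with `kmeans()` from base R, using the built-in iris data as an illustration:

```r
set.seed(1)
km <- kmeans(iris[, 1:4], centers = 3, nstart = 20)
table(km$cluster, iris$Species)   # compare clusters to the true species
plot(iris$Petal.Length, iris$Petal.Width, col = km$cluster,
     pch = 19, main = "k-means clusters (k = 3)")
```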