Data science with R
- Module 1: Introduction and preliminaries
Making R more friendly, R and available GUIs
The R environment
Related software and documentation
R and statistics
Using R interactively
An introductory session
Getting help with functions and features
R commands, case sensitivity, etc.
Recall and correction of previous commands
Executing commands from or diverting output to a file
Data permanency and removing objects
- Module 2: Simple manipulations; numbers and vectors
Vectors and assignment
Vector arithmetic
Generating regular sequences
Logical vectors
Missing values
Character vectors
Index vectors; selecting and modifying subsets of a data set
Other types of objects
- Module 3: Objects, their modes and attributes
Intrinsic attributes: mode and length
Changing the length of an object
Getting and setting attributes
The class of an object
- Module 4: Ordered and unordered factors
A specific example
The function tapply() and ragged arrays
Ordered factors
- Module 5: Arrays and matrices
Arrays
Array indexing. Subsections of an array
Index matrices
The array() function
Mixed vector and array arithmetic. The recycling rule
The outer product of two arrays
Generalized transpose of an array
Matrix facilities
Matrix multiplication
Linear equations and inversion
Eigenvalues and eigenvectors
Singular value decomposition and determinants
Least squares fitting and the QR decomposition
Forming partitioned matrices, cbind() and rbind()
The concatenation function, c(), with arrays
Frequency tables from factors
- Module 6: Lists and data frames
Lists
Constructing and modifying lists
Concatenating lists
Data frames
Making data frames
attach() and detach()
Working with data frames
Attaching arbitrary lists
Managing the search path
- Module 7: Reading data from files
The read.table() function
The scan() function
Accessing built-in datasets
Loading data from other R packages
Editing data
- Module 8: Grouping, loops and conditional execution
Grouped expressions
Control statements
Conditional execution: if statements
Repetitive execution: for loops, repeat and while
- Module 9: Writing your own functions
Simple examples
Defining new binary operators
Named arguments and defaults
The ‘…’ argument
Assignments within functions
More advanced examples
Efficiency factors in block designs
Dropping all names in a printed array
Recursive numerical integration
Scope
Customizing the environment
Classes, generic functions and object orientation
- Module 10: Statistical models in R
Defining statistical models; formulae
Contrasts
Linear models
Generic functions for extracting model information
Analysis of variance and model comparison
ANOVA tables
Updating fitted models
Generalized linear models
Families
The glm() function
Nonlinear least squares and maximum likelihood models
Least squares
Maximum likelihood
Some non-standard models
- Module 11: Graphical procedures
High-level plotting commands
The plot() function
Displaying multivariate data
Display graphics
Arguments to high-level plotting functions
Low-level plotting commands
Mathematical annotation
Hershey vector fonts
Interacting with graphics
Using graphics parameters
Permanent changes: The par() function
Temporary changes: Arguments to graphics functions
Graphics parameters list
Graphical elements
Axes and tick marks
Figure margins
Multiple figure environment
Device drivers
PostScript diagrams for typeset documents
Multiple graphics devices
Dynamic graphics
- Module 12: Packages
Standard packages
Contributed packages and CRAN
Namespaces
- Module 13: Introduction to Applied Machine Learning
Statistical learning vs. Machine learning
Iteration and evaluation
Bias-Variance trade-off
- Module 14: Regression
Linear regression
Generalizations and Nonlinearity
Exercises
- Module 15: Classification
Bayesian refresher
Naive Bayes
Logistic regression
K-Nearest neighbors
Exercises
- Module 16: Cross-validation and Resampling
Cross-validation approaches
Bootstrap
Exercises
- Module 17: Unsupervised Learning
K-means clustering
Examples
Challenges of unsupervised learning and beyond K-means