Data science with R

  • Module 1: Introduction and preliminaries
    Making R more friendly, R and available GUIs
    The R environment
    Related software and documentation
    R and statistics
    Using R interactively
    An introductory session
    Getting help with functions and features
    R commands, case sensitivity, etc.
    Recall and correction of previous commands
    Executing commands from or diverting output to a file
    Data permanency and removing objects
  • Module 2: Simple manipulations; numbers and vectors
    Vectors and assignment
    Vector arithmetic
    Generating regular sequences
    Logical vectors
    Missing values
    Character vectors
    Index vectors; selecting and modifying subsets of a data set
    Other types of objects
  • Module 3: Objects, their modes and attributes
    Intrinsic attributes: mode and length
    Changing the length of an object
    Getting and setting attributes
    The class of an object
  • Module 4: Ordered and unordered factors
    A specific example
    The function tapply() and ragged arrays
    Ordered factors
  • Module 5: Arrays and matrices
    Arrays
    Array indexing. Subsections of an array
    Index matrices
    The array() function
    Mixed vector and array arithmetic. The recycling rule
    The outer product of two arrays
    Generalized transpose of an array
    Matrix facilities
    Matrix multiplication
    Linear equations and inversion
    Eigenvalues and eigenvectors
    Singular value decomposition and determinants
    Least squares fitting and the QR decomposition
    Forming partitioned matrices, cbind() and rbind()
    The concatenation function, c(), with arrays
    Frequency tables from factors
  • Module 6: Lists and data frames
    Lists
    Constructing and modifying lists
    Concatenating lists
    Data frames
    Making data frames
    attach() and detach()
    Working with data frames
    Attaching arbitrary lists
    Managing the search path
  • Module 7: Reading data from files
    The read.table() function
    The scan() function
    Accessing built-in datasets
    Loading data from other R packages
    Editing data
  • Module 8: Grouping, loops and conditional execution
    Grouped expressions
    Control statements
    Conditional execution: if statements
    Repetitive execution: for loops, repeat and while
  • Module 9: Writing your own functions
    Simple examples
    Defining new binary operators
    Named arguments and defaults
    The ‘...’ argument
    Assignments within functions
    More advanced examples
    Efficiency factors in block designs
    Dropping all names in a printed array
    Recursive numerical integration
    Scope
    Customizing the environment
    Classes, generic functions and object orientation
  • Module 10: Statistical models in R
    Defining statistical models; formulae
    Contrasts
    Linear models
    Generic functions for extracting model information
    Analysis of variance and model comparison
    ANOVA tables
    Updating fitted models
    Generalized linear models
    Families
    The glm() function
    Nonlinear least squares and maximum likelihood models
    Least squares
    Maximum likelihood
    Some non-standard models
  • Module 11: Graphical procedures
    High-level plotting commands
    The plot() function
    Displaying multivariate data
    Display graphics
    Arguments to high-level plotting functions
    Low-level plotting commands
    Mathematical annotation
    Hershey vector fonts
    Interacting with graphics
    Using graphics parameters
    Permanent changes: The par() function
    Temporary changes: Arguments to graphics functions
    Graphics parameters list
    Graphical elements
    Axes and tick marks
    Figure margins
    Multiple figure environment
    Device drivers
    PostScript diagrams for typeset documents
    Multiple graphics devices
    Dynamic graphics
  • Module 12: Packages
    Standard packages
    Contributed packages and CRAN
    Namespaces
  • Module 13: Introduction to Applied Machine Learning
    Statistical learning vs. Machine learning
    Iteration and evaluation
    Bias-Variance trade-off
  • Module 14: Regression
    Linear regression
    Generalizations and Nonlinearity
    Exercises
  • Module 15: Classification
    Bayesian refresher
    Naive Bayes
    Logistic regression
    K-Nearest neighbors
    Exercises
  • Module 16: Cross-validation and Resampling
    Cross-validation approaches
    Bootstrap
    Exercises
  • Module 17: Unsupervised Learning
    K-means clustering
    Examples
    Challenges of unsupervised learning and beyond K-means