Course Outline

I. Introduction and preliminaries

1. Overview

  • Enhancing R usability: R and available graphical user interfaces (GUIs)
  • RStudio
  • Related software and documentation
  • R and statistics
  • Interactive use of R
  • An introductory session
  • Obtaining help with functions and features
  • R commands, case sensitivity, and related concepts
  • Recalling and correcting previous commands
  • Executing commands from or redirecting output to a file
  • Data permanency and object removal
  • Good programming practices: self-contained scripts, readability through structured code, documentation, and markdown
  • Installing packages: CRAN and Bioconductor

2. Reading data

  • Text files (read.delim)
  • CSV files

3. Simple manipulations: numbers, vectors, and arrays

  • Vectors and assignment
  • Vector arithmetic
  • Generating regular sequences
  • Logical vectors
  • Missing values
  • Character vectors
  • Index vectors: selecting and modifying subsets of a data set
  • Arrays
  • Array indexing: subsections of an array
  • Index matrices
  • The array() function and simple array operations, such as multiplication and transposition
  • Other types of objects

4. Lists and data frames

  • Lists
  • Constructing and modifying lists
    • Concatenating lists
  • Data frames
    • Creating data frames
    • Working with data frames
    • Attaching arbitrary lists
    • Managing the search path

5. Data manipulation

  • Selecting and subsetting observations and variables
  • Filtering and grouping
  • Recoding and transformations
  • Aggregation and combining data sets
  • Forming partitioned matrices using cbind() and rbind()
  • The concatenation function, c(), with arrays
  • Character manipulation using the stringr package
  • Introduction to grep and regexpr

6. More on reading data

  • XLS and XLSX files
  • readr and readxl packages
  • SPSS, SAS, Stata, and other data formats
  • Exporting data to text, CSV, and other formats

7. Grouping, loops, and conditional execution

  • Grouped expressions
  • Control statements
  • Conditional execution: if statements
  • Repetitive execution: for loops, repeat, and while
  • Introduction to apply, lapply, sapply, and tapply

8. Functions

  • Creating functions
  • Optional arguments and default values
  • Variable numbers of arguments
  • Scope and its consequences

9. Simple graphics in R

  • Creating graphs
  • Density plots
  • Dot plots
  • Bar plots
  • Line charts
  • Pie charts
  • Boxplots
  • Scatter plots
  • Combining plots

II. Statistical analysis in R

1. Probability distributions

  • R as a collection of statistical tables
  • Examining the distribution of a data set

2. Hypothesis testing

  • Tests about a population mean
  • Likelihood ratio test
  • One- and two-sample tests
  • Chi-square goodness-of-fit test
  • Kolmogorov-Smirnov one-sample statistic
  • Wilcoxon signed-rank test
  • Two-sample test
  • Wilcoxon rank-sum (Mann-Whitney) test
  • Kolmogorov-Smirnov test

3. Multiple hypothesis testing

  • Type I error and false discovery rate (FDR)
  • ROC curves and AUC
  • Multiple testing procedures (Benjamini-Hochberg, Bonferroni, etc.)

4. Linear regression models

  • Generic functions for extracting model information
  • Updating fitted models
  • Generalised linear models
    • Families
    • The glm() function
  • Classification
    • Logistic regression
    • Linear discriminant analysis
  • Unsupervised learning
    • Principal components analysis
    • Clustering methods (k-means, hierarchical clustering, k-medoids)

5. Survival analysis (survival package)

  • Survival objects in R
  • Kaplan-Meier estimate, log-rank test, parametric regression
  • Confidence bands
  • Censored (interval-censored) data analysis
  • Cox proportional hazards models with constant covariates
  • Cox proportional hazards models with time-dependent covariates
  • Simulation: comparing regression models

6. Analysis of variance

  • One-way ANOVA
  • Two-way ANOVA
  • MANOVA

III. Worked problems in bioinformatics

  • Short introduction to the limma package
  • Microarray data analysis workflow
  • Data download from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
  • Data processing (quality control, normalisation, differential expression)
  • Volcano plot
  • Clustering examples and heatmaps

Duration: 28 hours
