timothy leffel, spring 2017


agenda for course:

  • week 1 – R workflow, navigation, programming basics
  • week 2 – working with datasets and external files, data cleaning + manipulation
  • week 3 – summarizing data with dplyr::, visualizing data with ggplot2::
  • week 4 – document authoring with R Markdown, working with the web

course materials will eventually all be on the course website:


each week we'll have slides, notes, and a script. little exercises will be interleaved throughout the notes. the best way to write up solutions is to start a new R script called (e.g.) week1_exercises.r and type directly into that.

there will also be a list of links to useful resources up on the site

types of files we'll be using

R scripts

  • a plain-text file with extension .R or .r
  • like all plain-text files (e.g. .txt), can be opened and edited directly in any text editor
  • contains R code that we'll run interactively in RStudio
  • also contains comments, which are just annotations that explain what the code is doing
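
to make the code/comment distinction concrete, here is a minimal sketch of what an R script might contain. the file name and the numbers are made up for illustration; they are not from the course materials.

```r
# week1_demo.r  (hypothetical file name)
# any line starting with # is a comment: R ignores it when running the script

# store a few made-up numbers in a vector
scores <- c(82, 91, 77)

# compute their average and keep it in a new object
avg_score <- mean(scores)

print(avg_score)  # a comment can also follow code on the same line
```

in RStudio you would typically run a script like this one line at a time, reading the comments as you go.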

types of files we'll be using

datasets

  • all kinds of extensions, e.g. .csv, .tsv, .xls, .xlsx, .dat, .sav, .dta. nowadays, with the right packages, R can read them all. we'll go through examples of several in week 2.
  • working with .csv files is generally preferable, since they are simple and come in plain-text format.
  • Excel formats like .xlsx have certain nice features, but they aren't plain text (you can't inspect them in a text editor), which can make their behavior unpredictable (and depend on the Excel version used to create them).
  • a less common format is .Rdata/.rda, which stores saved R objects (datasets etc.) that load straight into your workspace. (not plain-text so I try to avoid them)
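
as a rough illustration of why .csv is convenient, the sketch below writes a tiny made-up data frame to a .csv file in a temporary folder and reads it back using only base R. the object names (toy, csv_path) and the data are assumptions for the example, not course data.

```r
# a tiny made-up dataset (hypothetical values, for illustration only)
toy <- data.frame(
  id    = 1:3,
  group = c("a", "b", "a"),
  score = c(2.5, 3.1, 4.0)
)

# write it to a .csv file in R's temporary folder
csv_path <- file.path(tempdir(), "toy.csv")
write.csv(toy, csv_path, row.names = FALSE)

# because .csv is plain text, we can look at the raw lines directly
print(readLines(csv_path))

# read the file back in as a data frame
toy_again <- read.csv(csv_path, stringsAsFactors = FALSE)
str(toy_again)

# other formats usually need add-on packages, e.g. (not run here):
#   readxl::read_excel() for .xls/.xlsx
#   haven::read_sav() / haven::read_dta() for .sav/.dta
```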

types of files we'll be using

R Markdown files

  • extension .Rmd or .rmd
  • plain-text format (opens in any text editor)
  • a file that mixes ordinary text with chunks of R code, from which nice, clean documents can be easily generated (in .pdf, .html, or .docx formats)
  • easiest way to compile is with cmd+shift+k (ctrl+shift+k on Windows/Linux) from RStudio
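
for the curious, here is a hedged sketch of what a minimal .Rmd file looks like and how it could be compiled from code instead of the keyboard shortcut. it assumes the rmarkdown package and pandoc are installed (the compile step is skipped otherwise); the file name demo.Rmd and its contents are invented for the example.

```r
# three backticks mark the start and end of a code chunk in an .Rmd file;
# we build them with strrep() so they can sit inside an R string
fence <- strrep("`", 3)

# the lines of a minimal (hypothetical) R Markdown document
rmd_lines <- c(
  "---",
  "title: \"week 1 demo\"",
  "output: html_document",
  "---",
  "",
  "some ordinary text explaining the analysis.",
  "",
  paste0(fence, "{r}"),
  "mean(c(1, 2, 3))",
  fence
)

# save it as a plain-text file with the .Rmd extension
rmd_path <- file.path(tempdir(), "demo.Rmd")
writeLines(rmd_lines, rmd_path)

# compile to .html only if rmarkdown and pandoc are available
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(out_path)  # path of the generated document
}
```

in practice you would write the .Rmd file by hand in RStudio rather than generate it from a script; building it with writeLines() here just keeps the example self-contained.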

firing up R via R Studio

when you're using R, it's "looking" in a specific directory (folder), called the working directory. many tears (mine and those of countless other victims) have been shed over trying to get R to look in the desired directory.

the best way to start an R session is to grab/make a plain-text file with extension .r (e.g. my_script.r), put it in its own folder (e.g. R_folder), and then open it with RStudio (which you should set as the default application for .r files).

if you start R by opening a specific script in RStudio, R will be looking in the folder containing your script and you won't have to mess with working directories.

you can also go to "tools" –> "global options" –> "default working directory" within RStudio to tell R where it should look if you just open RStudio directly
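
if you ever need to check or change where R is looking from inside a session, base R has functions for that. the sketch below uses a throwaway folder inside R's temporary directory; the folder name R_folder just echoes the example above, and the object names are assumptions for illustration.

```r
# where is R looking right now?
print(getwd())

# which files can R see in that folder?
print(list.files())

# make a throwaway folder to experiment with (inside R's temp directory)
demo_dir <- file.path(tempdir(), "R_folder")
dir.create(demo_dir, showWarnings = FALSE)

# point R at the new folder; setwd() hands back the previous location
old_dir <- setwd(demo_dir)
print(getwd())

# switch back so the rest of the session is unaffected
setwd(old_dir)
```

opening a script directly in RStudio, as described above, usually means you never have to call setwd() yourself.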

how to talk to R – via command-line interface (yikes :/)

how to talk to R – via default R GUI (better…)