Provides an intensive, hands-on introduction to the R programming language. Prepares students with the fundamental programming skills required to start their journey toward becoming modern-day data analysts.
Upon successfully completing this course, students will:
“Programming is like kicking yourself in the face, sooner or later your nose will bleed.”
Note: There are other R IDEs available: Emacs, Microsoft R Open, Notepad++, etc.
Numerous help options are available internal and external to R. Within R, you can get help by:
    # provides details for specific function
    help(functionname)

    # provides same information as help(functionname)
    ?functionname

    # provides examples for said function
    example(functionname)
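As a concrete illustration of looking things up from inside R, the sketch below uses two additional base R helpers, `args()` and `apropos()`, which are not part of the list above. The functions being looked up (`round`, `read.csv`) are chosen only for illustration.

```r
# show a function's argument list without opening its help page
args(round)

# find functions by partial name when you only remember part of it
matches <- apropos("^read\\.csv")
matches
```

`apropos()` takes a regular expression, so the dot is escaped to match a literal ".".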
External to R:
    # get your current working directory
    getwd()
    ## [1] "/home/xps13/Dropbox/Programming/R/Intro-to-R-Bootcamp"

    # set your working directory
    setwd("/home/xps13/Dropbox/Programming/R")
    getwd()
    ## [1] "/home/xps13/Dropbox/Programming/R"
Set your working directory to the "R-Bootcamp" folder you downloaded for this course.
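A minimal sketch of switching directories safely, under the assumption that you want to return to where you started. The target here is a stand-in (the session's temporary folder from `tempdir()`), not the course folder, and `example.csv` is a hypothetical file name; substitute your own paths.

```r
# remember the starting directory so we can return to it later
original_wd <- getwd()

# stand-in target: replace with the path to your own course folder
target <- tempdir()

# check that the folder exists first; setwd() throws an error on a bad path
if (dir.exists(target)) {
  setwd(target)
}

# build paths from pieces with file.path() rather than pasting slashes by hand
data_file <- file.path(getwd(), "data", "example.csv")  # hypothetical file

# go back to the starting directory
setwd(original_wd)
```

`file.path()` only builds the path string; it does not create or check the file.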
R can be used as a simple calculator:

    # Uses PEMDAS convention for order of operations
    4 + 3 / 10 ^ 2
    ## [1] 4.03
    4 + (3 / 10 ^ 2)
    ## [1] 4.03
    (4 + 3) / 10 ^ 2
    ## [1] 0.07

    # very large/small numbers will be displayed in scientific notation
    1 / 17 ^ 7
    ## [1] 2.437011e-09

    # Undefined calculations result in Inf or NaN
    1 / 0
    ## [1] Inf
    Inf - Inf
    ## [1] NaN
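A few more calculator cases that often surprise newcomers, sketched with base R operators only. The specific numbers are arbitrary illustrations.

```r
# exponentiation binds tighter than a leading minus sign
-2 ^ 2        # evaluated as -(2 ^ 2)
## [1] -4
(-2) ^ 2
## [1] 4

# integer division and remainder
17 %/% 5
## [1] 3
17 %% 5
## [1] 2

# NaN and Inf are values you can test for
is.nan(0 / 0)
## [1] TRUE
is.finite(1 / 0)
## [1] FALSE
```

When in doubt about precedence, add parentheses; they cost nothing and make the intent explicit.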
Assign values to objects (aka variables) with "<-"
    x <- 3        # assign 3 to x
    x             # evaluate x
    ## [1] 3

    x <- x + 1    # we can increment (build onto) existing objects
    x
    ## [1] 4
Note that there are multiple ways to assign variables but best practice recommends using "<-"

    x = 3     # BAD
    x <- 3    # GOOD
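One commonly cited reason for the recommendation is that "=" has a second job: naming arguments inside a function call. The sketch below shows the difference; `half()` and the object names `value` and `amount` are hypothetical, introduced only for this illustration.

```r
# hypothetical helper used only to demonstrate the difference
half <- function(value) value / 2

# inside a call, "=" only names the argument; no object is created
half(value = 10)
## [1] 5
exists("value", inherits = FALSE)
## [1] FALSE

# "<-" inside a call creates the object amount, then passes its value along
half(amount <- 10)
## [1] 5
exists("amount", inherits = FALSE)
## [1] TRUE
```

Keeping "<-" for assignment and "=" for arguments means each symbol has one meaning wherever it appears.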
Variable names are case sensitive:
    x <- 3
    X
    ## Error: object 'X' not found
Economic Order Quantity Model: \[Q = \sqrt{\frac{2DK}{h}}\]
Calculate Q where:
D = 1000
K = 5
h = 0.25
hint: sqrt(x) \(= \sqrt{x}\)
    D <- 1000
    K <- 5
    h <- .25

    Q <- sqrt((2 * D * K) / h)
    Q
    ## [1] 200
    # list all objects
    ls()
    ## [1] "D" "h" "K" "Q"

    # remove defined object from the environment
    rm(D)

    # removes everything in the working environment -- use with caution!
    rm(list = ls())
Vector: a sequence of data elements of the same basic type
    # the ":" operator can be used to create sequential vectors
    1:10
    ## [1] 1 2 3 4 5 6 7 8 9 10
    -3:5
    ## [1] -3 -2 -1 0 1 2 3 4 5

    # store a vector to variable x
    x <- 1:10
    x
    ## [1] 1 2 3 4 5 6 7 8 9 10

    # the c() function can be used to combine non-sequential elements
    y <- c(2, 5, -1)
    y
    ## [1] 2 5 -1
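To illustrate what "the same basic type" means in practice, here is a short sketch of how `c()` handles mixed inputs, plus two related base R helpers (`seq()` and `length()`) that are not introduced above. The values are arbitrary.

```r
# mixing numbers and text: every element becomes character
mixed <- c(1, "a", TRUE)
mixed
## [1] "1"    "a"    "TRUE"
class(mixed)
## [1] "character"

# mixing numbers and logicals: TRUE/FALSE become 1/0
c(2.5, TRUE, FALSE)
## [1] 2.5 1.0 0.0

# seq() builds sequences with a step other than 1
seq(0, 1, by = 0.25)
## [1] 0.00 0.25 0.50 0.75 1.00

# length() reports how many elements a vector holds
length(-3:5)
## [1] 9
```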
Note: We'll discuss vectors more later but for now you need to understand that…
A key difference between R and many other languages is the idea of vectorization.
In other languages you might have to run a loop to add two vectors together.
    # two vectors to add
    x <- c(1, 3, 4)
    y <- c(1, 2, 4)

    # empty vector
    z <- as.vector(NULL)

    # `for` loop to add corresponding elements in each vector
    for (i in seq_along(x)) {
      z[i] <- x[i] + y[i]
      print(z)
    }
    ## [1] 2
    ## [1] 2 5
    ## [1] 2 5 8
In R, many arithmetic operators such as +, -, *, etc. are vectorized: they operate on entire vectors at once by running compiled C code under the hood.
This significantly reduces the need to write for loops.
    x + y
    ## [1] 2 5 8
    x * y
    ## [1] 1 6 16
    x > y
    ## [1] FALSE TRUE FALSE
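The following sketch checks that a loop and the vectorized operator agree, and shows two further cases of vectorization: a single number applied to every element, and a vectorized built-in function. The vectors repeat the values used above; `sqrt()` on a vector is an added illustration.

```r
x <- c(1, 3, 4)
y <- c(1, 2, 4)

# loop version, with the result vector allocated up front
z <- numeric(length(x))
for (i in seq_along(x)) {
  z[i] <- x[i] + y[i]
}

# the vectorized version gives the same answer in one expression
identical(z, x + y)
## [1] TRUE

# a single number is applied to every element
x * 10
## [1] 10 30 40

# many built-in functions are vectorized too
sqrt(c(4, 9, 16))
## [1] 2 3 4
```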
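Beware of recycling: when two vectors differ in length, R repeats the shorter one until it matches the longer one, and it warns only when the longer length is not a multiple of the shorter.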
    long <- 1:10
    short <- 1:5

    long
    ## [1] 1 2 3 4 5 6 7 8 9 10
    short
    ## [1] 1 2 3 4 5

    long + short
    ## [1] 2 4 6 8 10 7 9 11 13 15
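A small sketch of when recycling is silent and when R complains. The length-3 vector is an added example; `tryCatch()` is used here only to capture the warning text so it can be shown as a value.

```r
long <- 1:10
short <- 1:5

# lengths 10 and 5: 10 is a multiple of 5, so recycling happens silently
length(long + short)
## [1] 10

# lengths 10 and 3: the result is still computed, but R raises a warning
msg <- tryCatch(1:10 + 1:3, warning = function(w) conditionMessage(w))
msg
## [1] "longer object length is not a multiple of shorter object length"

# a simple guard: compare lengths before doing element-wise arithmetic
length(long) == length(short)
## [1] FALSE
```

Because the silent case produces no message at all, an explicit length check is the more reliable safeguard.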
Back to our EOQ Model: \(Q = \sqrt{\frac{2DK}{h}}\)
Calculate Q where:
D = 1000
K = 5
h = vector of values 0.25, 0.50, 0.75
hint: sqrt(x) \(= \sqrt{x}\); c() may be handy here
    D <- 1000
    K <- 5
    h <- c(.25, .50, .75)

    Q <- sqrt((2 * D * K) / h)
    Q
    ## [1] 200.0000 141.4214 115.4701
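The same calculation can be wrapped in a small function so it is easy to reuse with different inputs. This is one possible sketch: `eoq()` and `h_values` are hypothetical names, and the arguments simply mirror the symbols D, K and h in the formula.

```r
# hypothetical helper; argument names follow the symbols in the formula
eoq <- function(D, K, h) {
  sqrt((2 * D * K) / h)
}

# a single value of h
eoq(D = 1000, K = 5, h = 0.25)
## [1] 200

# a vector of h values: vectorization returns one Q per element
h_values <- c(0.25, 0.50, 0.75)
round(eoq(D = 1000, K = 5, h = h_values), 1)
## [1] 200.0 141.4 115.5
```

No loop is needed because every operation inside the function is itself vectorized.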
The fundamental unit of shareable code is the package.
So how do we install these packages?
    # install packages from CRAN
    install.packages("packagename")

    # install packages from Bioconductor
    # (BiocManager replaces the older biocLite() script)
    install.packages("BiocManager")    # only required the first time
    BiocManager::install("packagename")

    # install packages from GitHub
    install.packages("devtools")       # only required the first time
    devtools::install_github("username/packagename")
Download these packages from CRAN: dplyr, tidyr, ggplot2, stringr, lubridate
    install.packages("dplyr")
    install.packages("tidyr")
    install.packages("ggplot2")
    install.packages("stringr")
    install.packages("lubridate")

    # alternative
    install.packages(c("dplyr", "tidyr", "ggplot2", "stringr", "lubridate"))
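If some of these packages may already be on your machine, a sketch like the following installs only the missing ones. `find_missing()` is a hypothetical helper written for this example; it relies on base R's `requireNamespace()`, which returns FALSE when a package is not installed (and loads the package's namespace when it is).

```r
wanted <- c("dplyr", "tidyr", "ggplot2", "stringr", "lubridate")

# hypothetical helper: returns the names in pkgs that are not installed
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_pkgs <- find_missing(wanted)
missing_pkgs

# install only what is missing; the interactive() guard keeps scripts
# run in batch mode from triggering downloads
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Drop the `interactive()` condition if you do want unattended scripts to install packages.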
For a full list of useful packages see this guide
Once the package is downloaded to your computer you can access the functions and resources provided by the package in two different ways:
    # load the package to use in the current R session
    library(packagename)

    # use a particular function within a package without loading the package
    packagename::functionname
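Both access styles can be tried without installing anything, using packages that ship with R. The sketch below uses `tools` and `stats` purely as stand-ins for `packagename`; the file names are made up.

```r
# call one function through its namespace without attaching the package
tools::file_ext("sales_2020.csv")
## [1] "csv"

# attach the package, after which its functions can be called by bare name
library(tools)
file_ext("notes.txt")
## [1] "txt"

# "::" also makes explicit which package a function comes from
stats::median(c(1, 3, 10))
## [1] 3
```

The `::` form is handy in scripts where two attached packages export a function with the same name.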
    # provides details regarding contents of a package
    help(package = "packagename")

    # list vignettes available for a specific package
    vignette(package = "packagename")

    # view specific vignette
    vignette("vignettename")
Operator/Function | Description | Operator/Function | Description |
---|---|---|---|
help() | get help | ls() | list objects in working session |
? | get help | rm() | remove objects in current session |
getwd() | get working directory | :, c() | create vector |
setwd() | set working directory | install.packages() | install package from CRAN |
+, -, *, /, ^ | arithmetic | library() | load package |
<- | assignment | vignette() | view/list package vignette |
5 minutes!