Introduction

About Me

  • 2005-2009 Economist, East Asia (University Tuebingen)
  • 2010 OECD Statistics Directorate, Trade and Business Statistics (SQL)
  • 2011-2015 OECD Directorate for Science, Technology and Innovation (SAS, R)
  • 2016 FAO Statistics Directorate, Methodological Innovations
  • Website: rdata.work
  • GitHub: r4io
  • Email: r4io@rdata.work

Short Course Information

Instructor

  • Bo Werth

Time & Location

  • Time: 9:30 - 17:00
  • Location: OECD IT Training MB MZ289

Website

What are we doing?

  • R Programming literacy
  • Data visualization

Requirements

  • The training accounts provide access to the OECD R server
  • The hands-on scripts are worked through by executing single lines or selected regions

Short Course Description & Objectives

This course provides an intensive, hands-on introduction to the R programming language. It equips students with the fundamental programming skills they need to start their journey toward becoming modern-day data analysts.


Objectives

Upon successfully completing this course, students will:

  • Be up and running with R
  • Understand the different types of data R can work with
  • Understand the different structures in which R holds data
  • Be able to import data into R
  • Perform basic data wrangling activities with R
  • Compute basic descriptive statistics with R
  • Visualize their data with base R and ggplot graphics

Short Course Schedule & Material

R Programming I: Overview

  • Getting started with R
  • Importing data into R
  • Understanding data structures
  • Understanding data types
  • Shaping and transforming your data






Tomorrow

  • Base R graphics
  • ggplot2 graphics library






All required classroom material can be downloaded from the course website:

http://boot.rdata.work/r_bootcamp

Analytics & Programming

Why Program?

Flexibility

  • Frees us from point-and-click analysis software
  • Allows us to customize our analyses
  • Allows us to build analytic applications

Slows us down

  • Forces us to think about our analytic processes

Speeds the analysis up

  • Many statistical programming languages now leverage C++ and Java to speed up computation

Reproducibility

  • Provides reproducibility that spreadsheet analysis cannot
  • Literate statistical programming is on the rise

Why R?

Built for Analytics!

  • flat files: .csv, .txt, .xls, etc.
  • web scraping: XML text nodes, HTML tables (rvest)
  • databases: Microsoft SQL Server, MySQL, Oracle, PostgreSQL, MongoDB, etc.
  • other statistical software: SPSS, Stata, SAS

  • easy to create "tidy" data
  • works well with numerics, characters, dates, missing values
  • robust regex capabilities
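
To illustrate the points above, here is a minimal base R sketch that cleans characters, dates, numerics and missing values with the help of regular expressions. The data frame `raw` and all of its values are invented for illustration; they are not part of the course material.

```r
# Hypothetical raw records; every value here is made up for the example
raw <- data.frame(
  country = c(" France", "japan ", "GERMANY"),
  date    = c("2015-01-31", "2015-02-28", NA),
  value   = c("1,200", "950", "n/a"),
  stringsAsFactors = FALSE
)

clean <- raw

# Characters: strip surrounding whitespace and harmonise case
clean$country <- toupper(trimws(clean$country))

# Dates: convert ISO-formatted strings; the missing entry stays NA
clean$date <- as.Date(clean$date)

# Regex: drop thousands separators, then flag anything non-numeric as missing
num <- gsub(",", "", clean$value)
num[!grepl("^[0-9.]+$", num)] <- NA

# Numerics: convert the cleaned strings
clean$value <- as.numeric(num)

# Missing values are skipped explicitly when summarising
mean_value <- mean(clean$value, na.rm = TRUE)
```

Printing `clean` shows one row per country with typed columns, and `mean_value` averages only the two non-missing values.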

  • joining disparate data sets
  • selecting, filtering, summarizing
  • readable "pipeline" workflows with the pipe operator: %>%
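
As a rough sketch of how these steps chain together, the example below joins two small tables, then filters, selects and summarises them in a single pipeline. It assumes the dplyr package is installed (dplyr re-exports `%>%`); the tables `trade` and `regions` and their values are hypothetical.

```r
library(dplyr)  # assumption: dplyr is installed; it provides %>% and the verbs below

# Hypothetical example data, invented for illustration only
trade <- data.frame(
  iso3    = c("FRA", "FRA", "JPN", "JPN", "DEU"),
  year    = c(2014, 2015, 2014, 2015, 2015),
  exports = c(10, 12, 20, 18, 30),
  stringsAsFactors = FALSE
)
regions <- data.frame(
  iso3   = c("FRA", "JPN", "DEU"),
  region = c("Europe", "Asia", "Europe"),
  stringsAsFactors = FALSE
)

summary_tbl <- trade %>%
  inner_join(regions, by = "iso3") %>%      # join the two data sets on the country code
  filter(year == 2015) %>%                  # keep a single year
  select(region, exports) %>%               # keep only the columns needed
  group_by(region) %>%                      # one group per region
  summarise(total_exports = sum(exports))   # aggregate within each group
```

Each line reads as one step of the analysis, which is the main appeal of the pipe: the output of one verb becomes the input of the next without intermediate variables.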

  • R is known for its visualization capabilities
  • ggplot2 implements the grammar of graphics
  • interactive plotting - easily leverage D3.js libraries using htmlwidgets

  • built for statistical analyses
  • thousands of packages provide a wide range of statistical methods
  • easy to build your own algorithms

  • R Markdown (produce slides, HTML web pages, PDF and Word documents)
  • Shiny allows rapid prototyping of web applications (HTML / CSS / JS)
  • Reproducibility (communicate to your future self!)

Great Community!

Create Cool Stuff!