class: center, middle, inverse, title-slide # dplyr Time Series Manipulation ## Calculate Growth Rates and Ratios ### 2017/12/11 --- class: inverse, center, middle # Calculating Ratios using dplyr --- ## 6A. Value added and its components by activity, ISIC rev4 - http://stats.oecd.org/Index.aspx?DataSetCode=SNA_TABLE6A - Location: OECD Countries - Transaction: VA (B1GA), Output (P1), Intermediate Cons (P2) - Activity: 21 Sections ISIC Rev. 4 and Total Economy - Measure: Current prices, Constant Prices (National Base Year) - Time: 2000-2016 - Value: National Currency (Mio.) ```r sna_dat <- read.csv("../data/sna_table6a.csv") ``` <table> <thead> <tr> <th style="text-align:left;"> LOCATION </th> <th style="text-align:left;"> TRANSACT </th> <th style="text-align:left;"> ACTIVITY </th> <th style="text-align:left;"> MEASURE </th> <th style="text-align:right;"> TIME </th> <th style="text-align:right;"> Value </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 644959 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2001 </td> <td style="text-align:right;"> 690375 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2002 </td> <td style="text-align:right;"> 731107 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2003 </td> <td style="text-align:right;"> 786958 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2004 </td> <td style="text-align:right;"> 847314 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2005 </td> <td style="text-align:right;"> 919054 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2006 </td> <td style="text-align:right;"> 1002660 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2007 </td> <td style="text-align:right;"> 1088972 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2008 </td> <td style="text-align:right;"> 1175982 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2009 </td> <td style="text-align:right;"> 1207173 </td> </tr> </tbody> </table> --- ### Create subset with total economy data Could be any other data set with harmonized dimensions - rename value column to avoid automatic joining on common names ```r sna_dat_tot <- sna_dat %>% filter(ACTIVITY == "VTOT") %>% mutate(value_total = Value) %>% select(LOCATION, TRANSACT, MEASURE, TIME, value_total) ``` <table> <thead> <tr> <th style="text-align:left;"> LOCATION </th> <th style="text-align:left;"> TRANSACT </th> <th style="text-align:left;"> MEASURE </th> <th style="text-align:right;"> TIME </th> <th style="text-align:right;"> value_total </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 644959 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2001 </td> <td style="text-align:right;"> 690375 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2002 </td> <td style="text-align:right;"> 731107 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2003 </td> <td style="text-align:right;"> 786958 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2004 </td> <td style="text-align:right;"> 847314 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2005 </td> <td style="text-align:right;"> 919054 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2006 </td> <td style="text-align:right;"> 1002660 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2007 </td> <td style="text-align:right;"> 1088972 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2008 </td> <td style="text-align:right;"> 1175982 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2009 </td> <td style="text-align:right;"> 1207173 </td> </tr> </tbody> </table> --- ### Calculate activities' share in total economy - join on common columns (here specified explicitly) - calculate ratio using two numeric columns (check reusult for `Inf` and `NA`) ```r sna_dat_share <- sna_dat %>% left_join(sna_dat_tot, by = c("LOCATION", "TRANSACT", "MEASURE", "TIME")) %>% mutate(share = Value / value_total, share_pct = share * 100) ``` <table> <thead> <tr> <th style="text-align:left;"> LOCATION </th> <th style="text-align:left;"> TRANSACT </th> <th style="text-align:left;"> ACTIVITY </th> <th style="text-align:left;"> MEASURE </th> <th style="text-align:right;"> TIME </th> <th style="text-align:right;"> Value </th> <th style="text-align:right;"> value_total </th> <th style="text-align:right;"> share </th> <th style="text-align:right;"> share_pct </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 38908.35 </td> <td style="text-align:right;"> 190624.8 </td> <td style="text-align:right;"> 0.204 </td> <td style="text-align:right;"> 20.4 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2001 </td> <td style="text-align:right;"> 40446.71 </td> <td style="text-align:right;"> 197076.6 </td> <td style="text-align:right;"> 0.205 </td> <td style="text-align:right;"> 20.5 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2002 </td> <td style="text-align:right;"> 40192.86 </td> <td style="text-align:right;"> 202353.4 </td> <td style="text-align:right;"> 0.199 </td> <td style="text-align:right;"> 19.9 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2003 </td> <td style="text-align:right;"> 40467.36 </td> <td style="text-align:right;"> 207247.7 </td> <td style="text-align:right;"> 0.195 </td> <td style="text-align:right;"> 19.5 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2004 </td> <td style="text-align:right;"> 42014.61 </td> <td style="text-align:right;"> 216098.3 </td> <td style="text-align:right;"> 0.194 </td> <td style="text-align:right;"> 19.4 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2005 </td> <td style="text-align:right;"> 44228.92 </td> <td style="text-align:right;"> 225888.1 </td> <td style="text-align:right;"> 0.196 </td> <td style="text-align:right;"> 19.6 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2006 </td> <td style="text-align:right;"> 47735.52 </td> <td style="text-align:right;"> 239076.0 </td> <td style="text-align:right;"> 0.200 </td> <td style="text-align:right;"> 20.0 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2007 </td> <td style="text-align:right;"> 51552.36 </td> <td style="text-align:right;"> 253604.7 </td> <td style="text-align:right;"> 0.203 </td> <td style="text-align:right;"> 20.3 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2008 </td> <td style="text-align:right;"> 51165.88 </td> <td style="text-align:right;"> 262414.8 </td> <td style="text-align:right;"> 0.195 </td> <td style="text-align:right;"> 19.5 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VC </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2009 </td> <td style="text-align:right;"> 47129.75 </td> <td style="text-align:right;"> 256671.0 </td> <td style="text-align:right;"> 0.184 </td> <td style="text-align:right;"> 18.4 </td> </tr> </tbody> </table> --- ### Plot activities' share in total economy (selected countries) ```r sna_dat_share %>% filter(LOCATION%in%c("AUT","BEL") & ACTIVITY!="VTOT" & MEASURE=="C") %>% ggplot(aes(x = TIME, y = share, color = ACTIVITY)) + geom_line() + facet_grid(Country ~ Transaction) + scale_color_discrete(guide = guide_legend(ncol = 2)) ``` ![](11-dplyr-time-series/fig-html/data_plot_sna-1.svg)<!-- --> --- class: inverse, center, middle # Ratio example: Top 3 activities by country in 2016 --- ### Ratio example: Top 3 activities by country in 2016 ```r sna_dat_share_b1ga_2016_top3 <- sna_dat_share %>% filter(ACTIVITY != "VTOT", TRANSACT == "B1GA", MEASURE == "C", TIME == "2016") %>% group_by(LOCATION) %>% top_n(3, wt = share) ``` <table> <thead> <tr> <th style="text-align:left;"> LOCATION </th> <th style="text-align:left;"> TRANSACT </th> <th style="text-align:left;"> MEASURE </th> <th style="text-align:right;"> TIME </th> <th style="text-align:left;"> ACTIVITY </th> <th style="text-align:right;"> Value </th> <th style="text-align:right;"> value_total </th> <th style="text-align:right;"> share </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VC </td> <td style="text-align:right;"> 57258.75 </td> <td style="text-align:right;"> 314692.5 </td> <td style="text-align:right;"> 0.18 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VG </td> <td style="text-align:right;"> 38405.87 </td> <td style="text-align:right;"> 314692.5 </td> <td style="text-align:right;"> 0.12 </td> </tr> <tr> <td style="text-align:left;"> AUT </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VL </td> <td style="text-align:right;"> 30999.79 </td> <td style="text-align:right;"> 314692.5 </td> <td style="text-align:right;"> 0.10 </td> </tr> <tr> <td style="text-align:left;"> BEL </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VC </td> <td style="text-align:right;"> 53858.70 </td> <td style="text-align:right;"> 377635.6 </td> <td style="text-align:right;"> 0.14 </td> </tr> <tr> <td style="text-align:left;"> BEL </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VG </td> <td style="text-align:right;"> 46028.20 </td> <td style="text-align:right;"> 377635.6 </td> <td style="text-align:right;"> 0.12 </td> </tr> <tr> <td style="text-align:left;"> BEL </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VM </td> <td style="text-align:right;"> 36101.20 </td> <td style="text-align:right;"> 377635.6 </td> <td style="text-align:right;"> 0.10 </td> </tr> <tr> <td style="text-align:left;"> CHE </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VC </td> <td style="text-align:right;"> 117303.76 </td> <td style="text-align:right;"> 638981.4 </td> <td style="text-align:right;"> 0.18 </td> </tr> <tr> <td style="text-align:left;"> CHE </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VG </td> <td style="text-align:right;"> 91272.66 </td> <td style="text-align:right;"> 638981.4 </td> <td style="text-align:right;"> 0.14 </td> </tr> <tr> <td style="text-align:left;"> CHE </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VO </td> <td style="text-align:right;"> 68343.49 </td> <td style="text-align:right;"> 638981.4 </td> <td style="text-align:right;"> 0.11 </td> </tr> <tr> <td style="text-align:left;"> CZE </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VC </td> <td style="text-align:right;"> 1162392.00 </td> <td style="text-align:right;"> 4292397.0 </td> <td style="text-align:right;"> 0.27 </td> </tr> <tr> <td style="text-align:left;"> CZE </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VG </td> <td style="text-align:right;"> 471827.00 </td> <td style="text-align:right;"> 4292397.0 </td> <td style="text-align:right;"> 0.11 </td> </tr> <tr> <td style="text-align:left;"> CZE </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2016 </td> <td style="text-align:left;"> VL </td> <td style="text-align:right;"> 360797.00 </td> <td style="text-align:right;"> 4292397.0 </td> <td style="text-align:right;"> 0.08 </td> </tr> </tbody> </table> --- ### Ratio example: Top 3 activities by country in 2016 ```r sna_dat_share_b1ga_2016_top3 %>% mutate(act_label = paste(ACTIVITY, Activity)) %>% ggplot(aes(x = LOCATION, y = share, fill = act_label)) + geom_bar(stat = "identity", position = "stack", width = 0.6) + scale_fill_discrete(guide = guide_legend(nrow = 3)) + theme(legend.position = "top", legend.direction = "horizontal") ``` ![](11-dplyr-time-series/fig-html/plot_top_sna-1.svg)<!-- --> --- class: inverse, center, middle # Calculating Growth Rates using dplyr --- ### The `lag` function - Time dimension must be in `date` format "`yyyy-mm-dd`" ```r sna_dat2 <- sna_dat %>% mutate(date = as.Date(paste0(TIME, "-01-01"))) head(sna_dat2$date, 5) ``` ``` ## [1] "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01" "2004-01-01" ``` ```r class(sna_dat2$date) ``` ``` ## [1] "Date" ``` ```r dplyr::lag(head(sna_dat2$date, 5)) ``` ``` ## [1] NA "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01" ``` ```r stats::lag(head(sna_dat2$date, 5)) ``` ``` ## [1] "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01" "2004-01-01" ``` ```r ## ?lag ``` --- ### Calculate Growth Rate ```r sna_growth <- sna_dat2 %>% group_by(LOCATION, TRANSACT, ACTIVITY, MEASURE) %>% mutate(value_lag = lag(Value), growth = Value / value_lag - 1) ``` <table> <thead> <tr> <th style="text-align:left;"> LOCATION </th> <th style="text-align:left;"> TRANSACT </th> <th style="text-align:left;"> ACTIVITY </th> <th style="text-align:left;"> MEASURE </th> <th style="text-align:right;"> TIME </th> <th style="text-align:right;"> Value </th> <th style="text-align:right;"> value_lag </th> <th style="text-align:right;"> growth </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 644959 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2001 </td> <td style="text-align:right;"> 690375 </td> <td style="text-align:right;"> 644959 </td> <td style="text-align:right;"> 0.070 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2002 </td> <td style="text-align:right;"> 731107 </td> <td style="text-align:right;"> 690375 </td> <td style="text-align:right;"> 0.059 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2003 </td> <td style="text-align:right;"> 786958 </td> <td style="text-align:right;"> 731107 </td> <td style="text-align:right;"> 0.076 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2004 </td> <td style="text-align:right;"> 847314 </td> <td style="text-align:right;"> 786958 </td> <td style="text-align:right;"> 0.077 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2005 </td> <td style="text-align:right;"> 919054 </td> <td style="text-align:right;"> 847314 </td> <td style="text-align:right;"> 0.085 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2006 </td> <td style="text-align:right;"> 1002660 </td> <td style="text-align:right;"> 919054 </td> <td style="text-align:right;"> 0.091 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2007 </td> <td style="text-align:right;"> 1088972 </td> <td style="text-align:right;"> 1002660 </td> <td style="text-align:right;"> 0.086 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2008 </td> <td style="text-align:right;"> 1175982 </td> <td style="text-align:right;"> 1088972 </td> <td style="text-align:right;"> 0.080 </td> </tr> <tr> <td style="text-align:left;"> AUS </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2009 </td> <td style="text-align:right;"> 1207173 </td> <td style="text-align:right;"> 1175982 </td> <td style="text-align:right;"> 0.027 </td> </tr> </tbody> </table> --- class: inverse, center, middle # Growth rate example: countries with maximum absolute growth rates --- ### Countries with maximum absolute growth rates ```r sna_growth_top <- sna_growth %>% filter(ACTIVITY == "VTOT" & TRANSACT == "B1GA") %>% group_by(LOCATION) %>% top_n(n = 1, wt = abs(growth)) %>% arrange(desc(growth)) ``` <table> <thead> <tr> <th style="text-align:left;"> LOCATION </th> <th style="text-align:left;"> TRANSACT </th> <th style="text-align:left;"> ACTIVITY </th> <th style="text-align:left;"> MEASURE </th> <th style="text-align:right;"> TIME </th> <th style="text-align:right;"> Value </th> <th style="text-align:right;"> value_lag </th> <th style="text-align:right;"> growth </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TUR </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2002 </td> <td style="text-align:right;"> 317199.87 </td> <td style="text-align:right;"> 218880.94 </td> <td style="text-align:right;"> 0.449 </td> </tr> <tr> <td style="text-align:left;"> IRL </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2015 </td> <td style="text-align:right;"> 236813.52 </td> <td style="text-align:right;"> 177538.93 </td> <td style="text-align:right;"> 0.334 </td> </tr> <tr> <td style="text-align:left;"> LVA </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2007 </td> <td style="text-align:right;"> 20069.82 </td> <td style="text-align:right;"> 15130.98 </td> <td style="text-align:right;"> 0.326 </td> </tr> <tr> <td style="text-align:left;"> EST </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2007 </td> <td style="text-align:right;"> 14258.48 </td> <td style="text-align:right;"> 11888.96 </td> <td style="text-align:right;"> 0.199 </td> </tr> <tr> <td style="text-align:left;"> ISL </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2008 </td> <td style="text-align:right;"> 1356808.00 </td> <td style="text-align:right;"> 1152608.00 </td> <td style="text-align:right;"> 0.177 </td> </tr> <tr> <td style="text-align:left;"> HUN </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2001 </td> <td style="text-align:right;"> 13335190.00 </td> <td style="text-align:right;"> 11395842.00 </td> <td style="text-align:right;"> 0.170 </td> </tr> <tr> <td style="text-align:left;"> LUX </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2006 </td> <td style="text-align:right;"> 30338.94 </td> <td style="text-align:right;"> 26668.19 </td> <td style="text-align:right;"> 0.138 </td> </tr> <tr> <td style="text-align:left;"> MEX </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2004 </td> <td style="text-align:right;"> 8299895.00 </td> <td style="text-align:right;"> 7302821.00 </td> <td style="text-align:right;"> 0.137 </td> </tr> <tr> <td style="text-align:left;"> SVK </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2006 </td> <td style="text-align:right;"> 50714.29 </td> <td style="text-align:right;"> 44744.18 </td> <td style="text-align:right;"> 0.133 </td> </tr> <tr> <td style="text-align:left;"> NOR </td> <td style="text-align:left;"> B1GA </td> <td style="text-align:left;"> VTOT </td> <td style="text-align:left;"> C </td> <td style="text-align:right;"> 2008 </td> <td style="text-align:right;"> 2344746.00 </td> <td style="text-align:right;"> 2085049.00 </td> <td style="text-align:right;"> 0.125 </td> </tr> </tbody> </table> --- ### Countries with maximum absolute growth rates ```r sna_growth %>% filter(LOCATION%in%sna_growth_top$LOCATION[1:8] & ACTIVITY == "VTOT" & TRANSACT == "B1GA") %>% ggplot(aes(x = date, y = growth, color = Measure)) + geom_line(size = 1) + facet_wrap( ~ LOCATION, ncol = 4) + theme(legend.position = "top", legend.direction = "horizontal") ``` ![](11-dplyr-time-series/fig-html/sna_growth_plot-1.svg)<!-- --> --- class: inverse, center, middle # Introductory Time Series With R ## S P Cowpertwait, Paul & Metcalfe, Andrew (2009) 10.1007/978-0-387-88698-5 --- ## Basic Time Series Manipulation - number of international passenger bookings (in thousands) per month on an airline (Pan Am) in the United States - Federal Aviation Administration, 1949–1960 - used to predict future demand before ordering new aircraft and training aircrew - Brown, R. (1963). Smoothing, Forecasting, and Prediction of Discrete Time Series. Englewood Cliffs, NJ: Prentice-Hall. ```r data(AirPassengers) AP <- AirPassengers head(AP) ``` ``` ## [1] 112 118 132 129 121 135 ``` ```r class(AP) ``` ``` ## [1] "ts" ``` --- ## Methods for `ts` objects ```r start(AP); end(AP); frequency(AP); summary(AP) ``` ``` ## [1] 1949 1 ``` ``` ## [1] 1960 12 ``` ``` ## [1] 12 ``` ``` ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 104.0 180.0 265.5 280.3 360.5 622.0 ``` --- ## Plotting `ts` objects ```r par(mfrow = c(1, 2)); plot(AP, ylab = "Passengers (1000's)"); plot(aggregate(AP)) ``` ![](11-dplyr-time-series/fig-html/cw-example-5-1.svg)<!-- --> --- ## Aggregating `ts` objects - trend analysis: remove seasonal effect by aggregating to the annual level - boxplot: summary of the values for each season - `cycle` function being used to extract the seasons for each item of data ```r par(mfrow = c(1, 2)); plot(aggregate(AP)); boxplot(AP ~ cycle(AP)) ``` ![](11-dplyr-time-series/fig-html/cw-example-7-1.svg)<!-- --> --- ```r ## www <- "http://www.massey.ac.nz/~pscowper/ts/USunemp.dat" www <- "https://raw.githubusercontent.com/AtefOuni/ts/master/Data/USunemp.dat" US.month <- read.table(www, header = T) attach(US.month) US.month.ts <- ts(USun, start=c(1996,1), end=c(2006,10), freq = 12) plot(US.month.ts, ylab = "unemployed (%)") ``` ![](11-dplyr-time-series/fig-html/cw-example-10-1-1.svg)<!-- --> --- ``` ## choc beer elec ## 1 1451 96.3 1497 ## 2 2037 84.4 1463 ## 3 2477 91.2 1648 ## 4 2785 81.9 1595 ``` ```r Elec.ts <- ts(CBE[, 3], start = 1958, freq = 12) Beer.ts <- ts(CBE[, 2], start = 1958, freq = 12) Choc.ts <- ts(CBE[, 1], start = 1958, freq = 12) plot(cbind(Elec.ts, Beer.ts, Choc.ts)) ``` ![](11-dplyr-time-series/fig-html/cw-example-10-3-1.svg)<!-- --> --- class: inverse, center, middle # Growth Rate Example: Quarterly Data --- ## Create Example Data ```r date <- c("2000-01-01","2000-04-01", "2000-07-01", "2000-10-01","2001-01-01","2001-04-01", "2001-07-01","2001-10-01","2002-01-01", "2002-04-01","2002-07-01","2002-10-01") value <- c(1592,1825,1769,1909,2022,2287,2169,2366,2001,2087,2099,2258) df <- data.frame(date,value) ## Convert date column to date format df$date = as.POSIXct(df$date) ``` ``` ## date value ## 1 2000-01-01 1592 ## 2 2000-04-01 1825 ## 3 2000-07-01 1769 ## 4 2000-10-01 1909 ## 5 2001-01-01 2022 ## 6 2001-04-01 2287 ``` --- ## Growth Rate Objective: calculate growth rate comparing quarter 1 from one year to quarter 1 for the following year ```r df_q <- df %>% group_by(month=month(date)) %>% arrange(date) %>% mutate(yearOverYear=value/lag(value,1)) ``` ``` ## # A tibble: 6 x 4 ## # Groups: month [4] ## date value month yearOverYear ## <dttm> <dbl> <dbl> <dbl> ## 1 2000-01-01 1592 1 NA ## 2 2000-04-01 1825 4 NA ## 3 2000-07-01 1769 7 NA ## 4 2000-10-01 1909 10 NA ## 5 2001-01-01 2022 1 1.270101 ## 6 2001-04-01 2287 4 1.253151 ``` ```r ## value[2001-01-01] / value[2000-01-01] 2022 / 1592 ``` ``` ## [1] 1.270101 ``` --- ``` ## R version 3.4.2 (2017-09-28) ## Platform: x86_64-redhat-linux-gnu (64-bit) ## Running under: Fedora 27 (Workstation Edition) ## ## Matrix products: default ## BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so ## ## locale: ## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C ## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 ## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 ## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C ## [9] LC_ADDRESS=C LC_TELEPHONE=C ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C ## ## attached base packages: ## [1] methods stats graphics grDevices utils datasets base ## ## other attached packages: ## [1] bindrcpp_0.2 knitr_1.17 lubridate_1.6.0 tidyr_0.7.1 ## [5] dplyr_0.7.4 ggplot2_2.2.1 rmarkdown_1.6 ## ## loaded via a namespace (and not attached): ## [1] Rcpp_0.12.13 bindr_0.1 magrittr_1.5 munsell_0.4.3 ## [5] colorspace_1.3-2 R6_2.2.2 rlang_0.1.2 highr_0.6 ## [9] stringr_1.2.0 plyr_1.8.4 tools_3.4.2 grid_3.4.2 ## [13] gtable_0.2.0 xaringan_0.3 htmltools_0.3.6 yaml_2.1.14 ## [17] lazyeval_0.2.0 rprojroot_1.2 digest_0.6.12 assertthat_0.2.0 ## [21] tibble_1.3.4 reshape2_1.4.2 purrr_0.2.3 glue_1.1.1 ## [25] evaluate_0.10.1 labeling_0.3 stringi_1.1.5 compiler_3.4.2 ## [29] scales_0.5.0 backports_1.1.0 pkgconfig_2.0.1 ```