Various Methods for Time Series Data

Time Series

Compare Various Methods for Time Series Data

Yeongeun Jeon, Jung In Seo
06-07-2020

Introduction

1. Data

# Create time series

ridership.ts <- ts(Amtrak.data$Ridership, start=c(1991,1), end=c(2004,3), freq=12)


# Split into training and validation data

train.ts <- window(ridership.ts,start=c(1991,1), end=c(2001,3))
valid.ts <- window(ridership.ts,start=c(2001,4))
nValid   <- length(valid.ts)
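As a quick illustration of how `ts()` and `window()` index by (year, period), here is a minimal base-R sketch on a synthetic monthly series (the series `x` below is made up for illustration only):

```r
# Synthetic 4-year monthly series starting January 1991
x <- ts(1:48, start = c(1991, 1), frequency = 12)

# window() slices by (year, month) and keeps the ts attributes
train <- window(x, start = c(1991, 1), end = c(1993, 12))  # first 3 years
valid <- window(x, start = c(1994, 1))                     # final year

length(train)  # 36
length(valid)  # 12
```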

# Create Data.frame for ggplot
                 
xts(ridership.ts, order.by = as.Date(ridership.ts))
               [,1]
1991-01-01 1708.917
1991-02-01 1620.586
1991-03-01 1972.715
1991-04-01 1811.665
1991-05-01 1974.964
1991-06-01 1862.356
1991-07-01 1939.860
1991-08-01 2013.264
1991-09-01 1595.657
1991-10-01 1724.924
1991-11-01 1675.667
1991-12-01 1813.863
1992-01-01 1614.827
1992-02-01 1557.088
1992-03-01 1891.223
1992-04-01 1955.981
1992-05-01 1884.714
1992-06-01 1623.042
1992-07-01 1903.309
1992-08-01 1996.712
1992-09-01 1703.897
1992-10-01 1810.000
1992-11-01 1861.601
1992-12-01 1875.122
1993-01-01 1705.259
1993-02-01 1618.535
1993-03-01 1836.709
1993-04-01 1957.043
1993-05-01 1917.185
1993-06-01 1882.398
1993-07-01 1933.009
1993-08-01 1996.167
1993-09-01 1672.841
1993-10-01 1752.827
1993-11-01 1720.377
1993-12-01 1734.292
1994-01-01 1563.365
1994-02-01 1573.959
1994-03-01 1902.639
1994-04-01 1833.888
1994-05-01 1831.049
1994-06-01 1775.755
1994-07-01 1867.508
1994-08-01 1906.608
1994-09-01 1685.632
1994-10-01 1778.546
1994-11-01 1775.995
1994-12-01 1783.350
1995-01-01 1548.415
1995-02-01 1496.925
1995-03-01 1798.316
1995-04-01 1732.895
1995-05-01 1772.345
1995-06-01 1761.207
1995-07-01 1791.655
1995-08-01 1874.820
1995-09-01 1571.309
1995-10-01 1646.948
1995-11-01 1672.631
1995-12-01 1656.845
1996-01-01 1381.758
1996-02-01 1360.852
1996-03-01 1558.575
1996-04-01 1608.420
1996-05-01 1696.696
1996-06-01 1693.183
1996-07-01 1835.516
1996-08-01 1942.573
1996-09-01 1551.401
1996-10-01 1686.508
1996-11-01 1576.204
1996-12-01 1700.433
1997-01-01 1396.588
1997-02-01 1371.690
1997-03-01 1707.522
1997-04-01 1654.604
1997-05-01 1762.903
1997-06-01 1775.800
1997-07-01 1934.219
1997-08-01 2008.055
1997-09-01 1615.924
1997-10-01 1773.910
1997-11-01 1732.368
1997-12-01 1796.626
1998-01-01 1570.330
1998-02-01 1412.691
1998-03-01 1754.641
1998-04-01 1824.932
1998-05-01 1843.289
1998-06-01 1825.964
1998-07-01 1968.172
1998-08-01 1921.645
1998-09-01 1669.597
1998-10-01 1791.474
1998-11-01 1816.714
1998-12-01 1846.754
1999-01-01 1599.427
1999-02-01 1548.804
1999-03-01 1832.333
1999-04-01 1839.720
1999-05-01 1846.498
1999-06-01 1864.852
1999-07-01 1965.743
1999-08-01 1949.002
1999-09-01 1607.373
1999-10-01 1803.664
1999-11-01 1850.309
1999-12-01 1836.435
2000-01-01 1541.660
2000-02-01 1616.928
2000-03-01 1919.538
2000-04-01 1971.493
2000-05-01 1992.301
2000-06-01 2009.763
2000-07-01 2053.996
2000-08-01 2097.471
2000-09-01 1823.706
2000-10-01 1976.997
2000-11-01 1981.408
2000-12-01 2000.153
2001-01-01 1683.148
2001-02-01 1663.404
2001-03-01 2007.928
2001-04-01 2023.792
2001-05-01 2047.008
2001-06-01 2072.913
2001-07-01 2126.717
2001-08-01 2202.638
2001-09-01 1707.693
2001-10-01 1950.716
2001-11-01 1973.614
2001-12-01 1984.729
2002-01-01 1759.629
2002-02-01 1770.595
2002-03-01 2019.912
2002-04-01 2048.398
2002-05-01 2068.763
2002-06-01 1994.267
2002-07-01 2075.258
2002-08-01 2026.560
2002-09-01 1734.155
2002-10-01 1916.771
2002-11-01 1858.345
2002-12-01 1996.352
2003-01-01 1778.033
2003-02-01 1749.489
2003-03-01 2066.466
2003-04-01 2098.899
2003-05-01 2104.911
2003-06-01 2129.671
2003-07-01 2223.349
2003-08-01 2174.360
2003-09-01 1931.406
2003-10-01 2121.470
2003-11-01 2076.054
2003-12-01 2140.677
2004-01-01 1831.508
2004-02-01 1838.006
2004-03-01 2132.446
ridership_df            <- as.data.frame(ridership.ts)
names(ridership_df)     <- "Observation"
ridership_df$Date       <- as.Date(ridership.ts)  # Add Date column at the end
cln                     <- ncol(ridership_df)
ridership_df            <- ridership_df[, c(cln, 1:(cln-1))]  # Reorder columns so that "Date" comes first
row.names(ridership_df) <- NULL


head(ridership_df)
        Date Observation
1 1991-01-01    1708.917
2 1991-02-01    1620.586
3 1991-03-01    1972.715
4 1991-04-01    1811.665
5 1991-05-01    1974.964
6 1991-06-01    1862.356
# Plot the series
   
ggplot(data=ridership_df, aes(x=Date, y=Observation)) + geom_line() +
labs(y="Ridership") +
scale_x_date(breaks =  seq(as.Date("1991-01-01"), as.Date("2004-03-01"), by="2 year")) + theme_minimal() 

2. Models

2.1 Regression

2.1.1 Model fitting and forecasting

train.lm.trend.season <- tslm(train.ts ~ trend + I(trend^2) + season)  # Fit Trend + Seasonality Model
summary(train.lm.trend.season)

Call:
tslm(formula = train.ts ~ trend + I(trend^2) + season)

Residuals:
     Min       1Q   Median       3Q      Max 
-213.775  -39.363    9.711   42.422  152.187 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.697e+03  2.768e+01  61.318  < 2e-16 ***
trend       -7.156e+00  7.293e-01  -9.812  < 2e-16 ***
I(trend^2)   6.074e-02  5.698e-03  10.660  < 2e-16 ***
season2     -4.325e+01  3.024e+01  -1.430  0.15556    
season3      2.600e+02  3.024e+01   8.598 6.60e-14 ***
season4      2.606e+02  3.102e+01   8.401 1.83e-13 ***
season5      2.938e+02  3.102e+01   9.471 6.89e-16 ***
season6      2.490e+02  3.102e+01   8.026 1.26e-12 ***
season7      3.606e+02  3.102e+01  11.626  < 2e-16 ***
season8      4.117e+02  3.102e+01  13.270  < 2e-16 ***
season9      9.032e+01  3.102e+01   2.911  0.00437 ** 
season10     2.146e+02  3.102e+01   6.917 3.29e-10 ***
season11     2.057e+02  3.103e+01   6.629 1.34e-09 ***
season12     2.429e+02  3.103e+01   7.829 3.44e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 70.92 on 109 degrees of freedom
Multiple R-squared:  0.8246,    Adjusted R-squared:  0.8037 
F-statistic: 39.42 on 13 and 109 DF,  p-value: < 2.2e-16
train.lm.trend.season.pred <- forecast(train.lm.trend.season, h=nValid, level=0) # Forecast
train.lm.trend.season.pred
         Point Forecast     Lo 0     Hi 0
Apr 2001       2004.271 2004.271 2004.271
May 2001       2045.419 2045.419 2045.419
Jun 2001       2008.675 2008.675 2008.675
Jul 2001       2128.560 2128.560 2128.560
Aug 2001       2187.911 2187.911 2187.911
Sep 2001       1875.032 1875.032 1875.032
Oct 2001       2007.896 2007.896 2007.896
Nov 2001       2007.662 2007.662 2007.662
Dec 2001       2053.740 2053.740 2053.740
Jan 2002       1819.752 1819.752 1819.752
Feb 2002       1785.569 1785.569 1785.569
Mar 2002       2098.014 2098.014 2098.014
Apr 2002       2107.922 2107.922 2107.922
May 2002       2150.528 2150.528 2150.528
Jun 2002       2115.242 2115.242 2115.242
Jul 2002       2236.585 2236.585 2236.585
Aug 2002       2297.394 2297.394 2297.394
Sep 2002       1985.972 1985.972 1985.972
Oct 2002       2120.294 2120.294 2120.294
Nov 2002       2121.518 2121.518 2121.518
Dec 2002       2169.054 2169.054 2169.054
Jan 2003       1936.523 1936.523 1936.523
Feb 2003       1903.798 1903.798 1903.798
Mar 2003       2217.701 2217.701 2217.701
Apr 2003       2229.067 2229.067 2229.067
May 2003       2273.131 2273.131 2273.131
Jun 2003       2239.303 2239.303 2239.303
Jul 2003       2362.104 2362.104 2362.104
Aug 2003       2424.371 2424.371 2424.371
Sep 2003       2114.406 2114.406 2114.406
Oct 2003       2250.186 2250.186 2250.186
Nov 2003       2252.868 2252.868 2252.868
Dec 2003       2301.862 2301.862 2301.862
Jan 2004       2070.789 2070.789 2070.789
Feb 2004       2039.522 2039.522 2039.522
Mar 2004       2354.883 2354.883 2354.883
head(train.lm.trend.season.pred)
$model

Call:
tslm(formula = train.ts ~ trend + I(trend^2) + season)

Coefficients:
(Intercept)        trend   I(trend^2)      season2      season3  
 1696.97936     -7.15585      0.06074    -43.24584    260.01492  
    season4      season5      season6      season7      season8  
  260.61746    293.79656    248.96148    360.63400    411.65134  
    season9     season10     season11     season12  
   90.31620    214.60366    205.67114    242.92943  


$mean
          Jan      Feb      Mar      Apr      May      Jun      Jul
2001                            2004.271 2045.419 2008.675 2128.560
2002 1819.752 1785.569 2098.014 2107.922 2150.528 2115.242 2236.585
2003 1936.523 1903.798 2217.701 2229.067 2273.131 2239.303 2362.104
2004 2070.789 2039.522 2354.883                                    
          Aug      Sep      Oct      Nov      Dec
2001 2187.911 1875.032 2007.896 2007.662 2053.740
2002 2297.394 1985.972 2120.294 2121.518 2169.054
2003 2424.371 2114.406 2250.186 2252.868 2301.862
2004                                             

$lower
          Jan      Feb      Mar      Apr      May      Jun      Jul
2001                            2004.271 2045.419 2008.675 2128.560
2002 1819.752 1785.569 2098.014 2107.922 2150.528 2115.242 2236.585
2003 1936.523 1903.798 2217.701 2229.067 2273.131 2239.303 2362.104
2004 2070.789 2039.522 2354.883                                    
          Aug      Sep      Oct      Nov      Dec
2001 2187.911 1875.032 2007.896 2007.662 2053.740
2002 2297.394 1985.972 2120.294 2121.518 2169.054
2003 2424.371 2114.406 2250.186 2252.868 2301.862
2004                                             

$upper
          Jan      Feb      Mar      Apr      May      Jun      Jul
2001                            2004.271 2045.419 2008.675 2128.560
2002 1819.752 1785.569 2098.014 2107.922 2150.528 2115.242 2236.585
2003 1936.523 1903.798 2217.701 2229.067 2273.131 2239.303 2362.104
2004 2070.789 2039.522 2354.883                                    
          Aug      Sep      Oct      Nov      Dec
2001 2187.911 1875.032 2007.896 2007.662 2053.740
2002 2297.394 1985.972 2120.294 2121.518 2169.054
2003 2424.371 2114.406 2250.186 2252.868 2301.862
2004                                             

$level
[1] 0

$x
          Jan      Feb      Mar      Apr      May      Jun      Jul
1991 1708.917 1620.586 1972.715 1811.665 1974.964 1862.356 1939.860
1992 1614.827 1557.088 1891.223 1955.981 1884.714 1623.042 1903.309
1993 1705.259 1618.535 1836.709 1957.043 1917.185 1882.398 1933.009
1994 1563.365 1573.959 1902.639 1833.888 1831.049 1775.755 1867.508
1995 1548.415 1496.925 1798.316 1732.895 1772.345 1761.207 1791.655
1996 1381.758 1360.852 1558.575 1608.420 1696.696 1693.183 1835.516
1997 1396.588 1371.690 1707.522 1654.604 1762.903 1775.800 1934.219
1998 1570.330 1412.691 1754.641 1824.932 1843.289 1825.964 1968.172
1999 1599.427 1548.804 1832.333 1839.720 1846.498 1864.852 1965.743
2000 1541.660 1616.928 1919.538 1971.493 1992.301 2009.763 2053.996
2001 1683.148 1663.404 2007.928                                    
          Aug      Sep      Oct      Nov      Dec
1991 2013.264 1595.657 1724.924 1675.667 1813.863
1992 1996.712 1703.897 1810.000 1861.601 1875.122
1993 1996.167 1672.841 1752.827 1720.377 1734.292
1994 1906.608 1685.632 1778.546 1775.995 1783.350
1995 1874.820 1571.309 1646.948 1672.631 1656.845
1996 1942.573 1551.401 1686.508 1576.204 1700.433
1997 2008.055 1615.924 1773.910 1732.368 1796.626
1998 1921.645 1669.597 1791.474 1816.714 1846.754
1999 1949.002 1607.373 1803.664 1850.309 1836.435
2000 2097.471 1823.706 1976.997 1981.408 2000.153
2001                                             
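For intuition, the `tslm(train.ts ~ trend + I(trend^2) + season)` fit above amounts to an ordinary `lm()` on a time index plus month-of-year dummies. A minimal base-R sketch on a synthetic monthly series (all names and data below are illustrative, not from the forecast package):

```r
set.seed(1)
# Synthetic monthly series with a trend and a seasonal cycle
y      <- ts(100 + 0.5 * (1:60) + 10 * sin(2 * pi * (1:60) / 12) + rnorm(60),
             frequency = 12)
trend  <- seq_along(y)        # what tslm calls "trend"
season <- factor(cycle(y))    # month-of-year factor -> 11 dummy columns
fit    <- lm(y ~ trend + I(trend^2) + season)

length(coef(fit))  # 14: intercept + trend + trend^2 + 11 seasonal dummies
```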
Acf(train.lm.trend.season$residuals, lag.max = 12, main="") 

# Fit AR(1) model to training residuals

train.res.arima <- Arima(train.lm.trend.season$residuals, order=c(1,0,0))
summary(train.res.arima)
Series: train.lm.trend.season$residuals 
ARIMA(1,0,0) with non-zero mean 

Coefficients:
         ar1     mean
      0.5998   0.3728
s.e.  0.0712  11.8408

sigma^2 estimated as 2876:  log likelihood=-663.54
AIC=1333.08   AICc=1333.29   BIC=1341.52

Training set error measures:
                     ME     RMSE     MAE      MPE     MAPE      MASE
Training set -0.1223159 53.19141 39.8277 103.3885 175.2219 0.5721356
                    ACF1
Training set -0.07509305
train.res.arima.pred <- forecast(train.res.arima, h=nValid, level=0)
train.res.arima.pred
         Point Forecast      Lo 0      Hi 0
Apr 2001      7.4111098 7.4111098 7.4111098
May 2001      4.5942671 4.5942671 4.5942671
Jun 2001      2.9047772 2.9047772 2.9047772
Jul 2001      1.8914526 1.8914526 1.8914526
Aug 2001      1.2836792 1.2836792 1.2836792
Sep 2001      0.9191481 0.9191481 0.9191481
Oct 2001      0.7005091 0.7005091 0.7005091
Nov 2001      0.5693735 0.5693735 0.5693735
Dec 2001      0.4907208 0.4907208 0.4907208
Jan 2002      0.4435463 0.4435463 0.4435463
Feb 2002      0.4152520 0.4152520 0.4152520
Mar 2002      0.3982816 0.3982816 0.3982816
Apr 2002      0.3881030 0.3881030 0.3881030
May 2002      0.3819981 0.3819981 0.3819981
Jun 2002      0.3783365 0.3783365 0.3783365
Jul 2002      0.3761403 0.3761403 0.3761403
Aug 2002      0.3748231 0.3748231 0.3748231
Sep 2002      0.3740331 0.3740331 0.3740331
Oct 2002      0.3735592 0.3735592 0.3735592
Nov 2002      0.3732750 0.3732750 0.3732750
Dec 2002      0.3731045 0.3731045 0.3731045
Jan 2003      0.3730023 0.3730023 0.3730023
Feb 2003      0.3729410 0.3729410 0.3729410
Mar 2003      0.3729042 0.3729042 0.3729042
Apr 2003      0.3728821 0.3728821 0.3728821
May 2003      0.3728689 0.3728689 0.3728689
Jun 2003      0.3728610 0.3728610 0.3728610
Jul 2003      0.3728562 0.3728562 0.3728562
Aug 2003      0.3728534 0.3728534 0.3728534
Sep 2003      0.3728517 0.3728517 0.3728517
Oct 2003      0.3728506 0.3728506 0.3728506
Nov 2003      0.3728500 0.3728500 0.3728500
Dec 2003      0.3728496 0.3728496 0.3728496
Jan 2004      0.3728494 0.3728494 0.3728494
Feb 2004      0.3728493 0.3728493 0.3728493
Mar 2004      0.3728492 0.3728492 0.3728492
head(train.res.arima.pred)
$method
[1] "ARIMA(1,0,0) with non-zero mean"

$model
Series: train.lm.trend.season$residuals 
ARIMA(1,0,0) with non-zero mean 

Coefficients:
         ar1     mean
      0.5998   0.3728
s.e.  0.0712  11.8408

sigma^2 estimated as 2876:  log likelihood=-663.54
AIC=1333.08   AICc=1333.29   BIC=1341.52

$level
[1] 0

$mean
           Jan       Feb       Mar       Apr       May       Jun
2001                               7.4111098 4.5942671 2.9047772
2002 0.4435463 0.4152520 0.3982816 0.3881030 0.3819981 0.3783365
2003 0.3730023 0.3729410 0.3729042 0.3728821 0.3728689 0.3728610
2004 0.3728494 0.3728493 0.3728492                              
           Jul       Aug       Sep       Oct       Nov       Dec
2001 1.8914526 1.2836792 0.9191481 0.7005091 0.5693735 0.4907208
2002 0.3761403 0.3748231 0.3740331 0.3735592 0.3732750 0.3731045
2003 0.3728562 0.3728534 0.3728517 0.3728506 0.3728500 0.3728496
2004                                                            

$lower
           Jan       Feb       Mar       Apr       May       Jun
2001                               7.4111098 4.5942671 2.9047772
2002 0.4435463 0.4152520 0.3982816 0.3881030 0.3819981 0.3783365
2003 0.3730023 0.3729410 0.3729042 0.3728821 0.3728689 0.3728610
2004 0.3728494 0.3728493 0.3728492                              
           Jul       Aug       Sep       Oct       Nov       Dec
2001 1.8914526 1.2836792 0.9191481 0.7005091 0.5693735 0.4907208
2002 0.3761403 0.3748231 0.3740331 0.3735592 0.3732750 0.3731045
2003 0.3728562 0.3728534 0.3728517 0.3728506 0.3728500 0.3728496
2004                                                            

$upper
           Jan       Feb       Mar       Apr       May       Jun
2001                               7.4111098 4.5942671 2.9047772
2002 0.4435463 0.4152520 0.3982816 0.3881030 0.3819981 0.3783365
2003 0.3730023 0.3729410 0.3729042 0.3728821 0.3728689 0.3728610
2004 0.3728494 0.3728493 0.3728492                              
           Jul       Aug       Sep       Oct       Nov       Dec
2001 1.8914526 1.2836792 0.9191481 0.7005091 0.5693735 0.4907208
2002 0.3761403 0.3748231 0.3740331 0.3735592 0.3732750 0.3731045
2003 0.3728562 0.3728534 0.3728517 0.3728506 0.3728500 0.3728496
2004                                                            
plot(train.lm.trend.season$residuals, ylim=c(-250,250), ylab="Residuals", xlab="Time", bty="l", xaxt="n", xlim=c(1991,2006.25), main="")
axis(1, at=seq(1991,2006,1), labels=format(seq(1991,2006,1)))
lines(train.res.arima.pred$fitted, lwd=2, col="blue")

Acf(train.res.arima$residuals, lag.max = 12)

2.1.2 Improved forecasts

# Improved forecast (forecast from the fitted regression model + AR(1) forecast of its residuals)

res.arima         <- as.data.frame(train.res.arima.pred)[,1]
fore              <- as.data.frame(train.lm.trend.season.pred)[,1]
Improved_forecast <- apply(cbind(res.arima,fore), 1,sum)

head(Improved_forecast)
[1] 2011.682 2050.014 2011.580 2130.452 2189.195 1875.951
Improved_forecast <- ts(Improved_forecast, start=c(2001,4), end=c(2004,3), freq=12)  # Convert as time-series
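The same two-stage logic, a deterministic fit plus an AR(1) forecast of its residuals, can be sketched with base R's `arima()` instead of the forecast package; the synthetic data below are illustrative only:

```r
set.seed(1)
t_tr <- 1:120
# Linear trend plus AR(1) noise
y    <- 50 + 0.3 * t_tr + as.numeric(arima.sim(list(ar = 0.6), n = 120))
dat  <- data.frame(t_tr = t_tr, y = y)

fit  <- lm(y ~ t_tr, data = dat)               # stage 1: deterministic trend model
ar1  <- arima(resid(fit), order = c(1, 0, 0))  # stage 2: AR(1) on the residuals

h        <- 12
trend_fc <- predict(fit, newdata = data.frame(t_tr = 121:132))
res_fc   <- as.numeric(predict(ar1, n.ahead = h)$pred)
improved <- trend_fc + res_fc                  # improved h-step-ahead forecast

length(improved)  # 12
```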

2.1.3 Plot

# Plot for regression based on forecast 

 regression <- ridership_df %>% mutate("Fitted"= c(train.lm.trend.season$fitted,Improved_forecast)) # Combine original data and fitted data


ggplot(data=regression, aes(x=Date, y=Observation)) + geom_line(aes(color="Data", size="Data", linetype="Data")) + 
geom_line(aes(x=Date, y=Fitted, colour="Fitted model", size="Fitted model", linetype="Fitted model")) + 
scale_color_manual(name="",values=c("Data"="black", "Fitted model"="blue")) +
scale_size_manual("",values=c("Data"=0.5,"Fitted model"=0.9)) + scale_linetype_manual("",values=c("Data"=1,"Fitted model"=2)) +
labs(y="Ridership")  + geom_vline(xintercept = ridership_df[length(train.ts),1], color = "red", linetype = 2) +
annotate('text', x = as.Date("2003-01-01"), y = 2500, label = 'Forecast', color = 'black', size=5)  +
scale_x_date(breaks =  seq(as.Date("1991-01-01"), as.Date("2004-03-01"), by="2 year")) + theme_minimal()

2.2 Dynamic linear model (DLM)

2.2.1 Model fitting

# Fit Dynamic linear model (DLM) 

model1 <- function(p){
  
               mod <- dlmModPoly(2) +  # Local linear trend
                      dlmModSeas(12)   # Monthly seasonal component
  
  V(mod)            <- exp(p[1])       # Observation variance
  diag(W(mod))[1:3] <- exp(p[2:4])     # System variances for the level, slope, and seasonal states
  
  return(mod)  
}

mle1 <- dlmMLE(train.ts, parm=c(0.1,0.1,1,1), build=model1 )   # Estimate the parameters (error variances) by MLE
if (mle1$convergence == 0) print("converge") else print("did not converge")  # Check convergence
[1] "converge"
modelfit1 <- model1(mle1$par)  # Fitting the DLM
V(modelfit1)
        [,1]
[1,] 1413.91
W(modelfit1)
          [,1]      [,2]     [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] 1143.988 0.0000000  0.00000    0    0    0    0    0    0     0
 [2,]    0.000 0.2040261  0.00000    0    0    0    0    0    0     0
 [3,]    0.000 0.0000000 14.47867    0    0    0    0    0    0     0
 [4,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
 [5,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
 [6,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
 [7,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
 [8,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
 [9,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
[10,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
[11,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
[12,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
[13,]    0.000 0.0000000  0.00000    0    0    0    0    0    0     0
      [,11] [,12] [,13]
 [1,]     0     0     0
 [2,]     0     0     0
 [3,]     0     0     0
 [4,]     0     0     0
 [5,]     0     0     0
 [6,]     0     0     0
 [7,]     0     0     0
 [8,]     0     0     0
 [9,]     0     0     0
[10,]     0     0     0
[11,]     0     0     0
[12,]     0     0     0
[13,]     0     0     0

2.2.2 Kalman filtering

# Estimation for Kalman filtering

filtering <- dlmFilter(train.ts, modelfit1)
str(filtering,1)
List of 9
 $ y  : Time-Series [1:123] from 1991 to 2001: 1709 1621 1973 1812 1975 ...
 $ mod:List of 10
  ..- attr(*, "class")= chr "dlm"
 $ m  : Time-Series [1:124, 1:13] from 1991 to 2001: 0 263 1477 2060 2237 ...
 $ U.C:List of 124
 $ D.C: num [1:124, 1:13] 3162.3 26.6 23.4 21.4 19.3 ...
 $ a  : Time-Series [1:123, 1:13] from 1991 to 2001: 0 394 2049 2636 2683 ...
 $ U.R:List of 123
 $ D.R: num [1:123, 1:13] 10916 8163 5192 4498 4165 ...
 $ f  : Time-Series [1:123] from 1991 to 2001: 0 263 1954 2542 2578 ...
 - attr(*, "class")= chr "dlmFiltered"
# Plot estimation for filtering model

plot(dropFirst(filtering$f), col="blue", lwd=2, lty=2, ylab="Ridership")
lines(train.ts ,lty=1, lwd=2, col="black") 
legend("bottomleft", legend=c("Data", "Fitted filtering data"), col=c("black", "blue"), lty=1:2, lwd=2)

# Plot for estimated states

plot(dropFirst(filtering$m[,1]), ylab="Level")
plot(dropFirst(filtering$m[,2]), ylab="Slope")
plot(dropFirst(filtering$m[,3]), ylab="Seasonality")

# Plot for data and estimated level

plot(train.ts, ylim=c(1000,2600), ylab="Ridership")
lines(dropFirst(filtering$m[,1]) ,lty=2, lwd=2, col="blue") 
legend("topright", legend=c("Data", "Filtered level"), col=c("black", "blue"), lty=1:2, lwd=1:2)
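Under the hood, `dlmFilter` runs the standard Kalman recursions. A minimal sketch for the simplest case, a local level model with assumed variances `V` and `W` (a one-state toy model, not the fitted 13-state model above; the function name is illustrative):

```r
# One-dimensional Kalman filter for a local level model:
#   y_t = mu_t + v_t,  v_t ~ N(0, V);   mu_t = mu_{t-1} + w_t,  w_t ~ N(0, W)
kalman_local_level <- function(y, V = 1, W = 0.5, m0 = 0, C0 = 1e7) {
  n <- length(y)
  m <- f <- numeric(n)                   # filtered means and one-step forecasts
  m_prev <- m0; C_prev <- C0
  for (t in seq_len(n)) {
    R    <- C_prev + W                   # predicted (prior) state variance
    f[t] <- m_prev                       # one-step-ahead forecast of y_t
    K    <- R / (R + V)                  # Kalman gain
    m[t] <- m_prev + K * (y[t] - f[t])   # update with the forecast error
    C_prev <- (1 - K) * R
    m_prev <- m[t]
  }
  list(m = m, f = f)
}

set.seed(1)
y  <- cumsum(rnorm(100)) + rnorm(100)    # random walk observed with noise
kf <- kalman_local_level(y)
```

By construction the filtered mean `kf$m` tracks the data at least as closely as the one-step forecast `kf$f`, mirroring what `filtering$m` and `filtering$f` do for the full model.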

2.2.3 Kalman smoother

# Estimation for Kalman smoother

smoothing <- dlmSmooth(filtering)   # dlmSmooth(filtered DLM) or dlmSmooth(train.ts, modelfit1)
str(smoothing,1)
List of 3
 $ s  : Time-Series [1:124, 1:13] from 1991 to 2001: 1900 1899 1885 1878 1835 ...
 $ U.S:List of 124
 $ D.S: num [1:124, 1:13] 46.6 32.5 29.1 28.3 28 ...
# Plot estimation for smoothing model

theta         <- modelfit1$GG%*%t(smoothing$s[1:length(train.ts),])  # smoothing$s has T+1 rows (s_0, ..., s_T); GG %*% s_(t-1) gives the state at time t
fitted_smooth <- modelfit1$FF%*%theta


plot(train.ts, ylab="Ridership", lwd=2)
time <- as.vector(time(train.ts))
lines(time, fitted_smooth ,lty=2, lwd=2, col="blue") 
legend("bottomleft", legend=c("Data", "Fitted smoothing data"), col=c("black", "blue"), lty=1:2, lwd=2)
# Plot for estimated states

 
plot(dropFirst(smoothing$s[,1]), ylab="Level")
plot(dropFirst(smoothing$s[,2]), ylab="Slope")
plot(dropFirst(smoothing$s[,3]), ylab="Seasonality")
# Plot for data and estimated level


plot(train.ts, ylim=c(1000,2600), ylab="Ridership")
lines(dropFirst(smoothing$s[,1]) ,lty=2, lwd=2, col="blue") 
legend("topright", legend=c("Data", "Smoothed level"), col=c("black", "blue"), lty=1:2, lwd=1:2)

# Residual(Check independence and normality)


plot(residuals(filtering,sd = FALSE), ylab="Residual")
abline(h=0)
tsdiag(filtering, main = "Diagnostics for Regression Model")
# Check normality for error

qqnorm(residuals(filtering, sd = FALSE))
qqline(residuals(filtering, sd = FALSE))

2.2.4 Forecasting

# Forecast
  
forecast_DLM <- dlmForecast(filtering, nAhead = nValid)  # Forecast from the filtered model
str(forecast_DLM,1)
List of 4
 $ a: Time-Series [1:36, 1:13] from 2001 to 2004: 1946 1948 1950 1951 1953 ...
  ..- attr(*, "dimnames")=List of 2
 $ R:List of 36
 $ f: Time-Series [1:36, 1] from 2001 to 2004: 1997 2024 1992 2100 2148 ...
  ..- attr(*, "dimnames")=List of 2
 $ Q:List of 36

2.2.5 Plot

# Plot for DLM 

DLM      <- ridership_df %>% mutate("Fitted"=c(filtering$f,forecast_DLM$f))  # Combine original data and fitted data
 
lower_DLM <- c(rep(0,length(train.ts)), forecast_DLM$f - qnorm(0.975)*sqrt(unlist(forecast_DLM$Q)))  # Lower bound of the 95% prediction interval
upper_DLM <- c(rep(0,length(train.ts)), forecast_DLM$f + qnorm(0.975)*sqrt(unlist(forecast_DLM$Q)))  # Upper bound of the 95% prediction interval
    

ggplot(data=DLM, aes(x=Date, y=Observation)) + geom_line(aes(color="Data", size="Data", linetype="Data")) + 
geom_line(aes(x=Date, y=Fitted, colour="Fitted model", size="Fitted model", linetype="Fitted model")) + 
scale_color_manual(name="",values=c("Data"="black", "Fitted model"="blue")) +
scale_size_manual("",values=c("Data"=0.5,"Fitted model"=0.9)) + scale_linetype_manual("",values=c("Data"=1,"Fitted model"=2)) +
labs(y="Ridership")  + geom_vline(xintercept = ridership_df[length(train.ts),1], color = "red", linetype = 2) +
annotate('text', x = as.Date("2003-01-01"), y = 2700, label = 'Forecast', color = 'black', size=5)  +
scale_x_date(breaks =  seq(as.Date("1991-01-01"), as.Date("2004-03-01"), by="2 year")) + theme_minimal() +
geom_ribbon(aes(ymin=lower_DLM, ymax=upper_DLM), fill="#99CCFF", alpha=0.5)

2.3 Bootstrap and Bagging

2.3.1 Model fitting and forecasting

set.seed(10)
bagged          <- train.ts %>% baggedModel(bld.mbb.bootstrap(train.ts,1000), fn=auto.arima) 
forecast_bagged <- bagged %>% forecast(h=nValid)  # Generate bootstrapped series, fit auto.arima to each (more accurate than ets here), then forecast and average
# 95% confidence interval
 
boot.pred  <- forecast_bagged$forecasts_boot
CI.boot    <- apply(boot.pred, 1, function(x) { quantile(x, probs = c(0.025, 0.975) ) }) %>% t() # Pointwise 95% interval over the bootstrap forecasts for the validation period
CI.pred    <- rbind(matrix(rep(0,length(train.ts)*2), nrow=length(train.ts)), CI.boot)
lower_boot <- CI.pred[,1]
upper_boot <- CI.pred[,2]
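`bld.mbb.bootstrap` first removes trend and seasonality (an STL decomposition after a Box-Cox transform), applies a moving-block bootstrap to the remainder, and adds the components back. The core resampling step can be sketched in base R (the function name `mbb` and the block size are illustrative):

```r
# Moving-block bootstrap: resample overlapping blocks and stitch them together
mbb <- function(x, block_size) {
  n      <- length(x)
  n_blk  <- ceiling(n / block_size)
  starts <- sample.int(n - block_size + 1, n_blk, replace = TRUE)
  out    <- unlist(lapply(starts, function(s) x[s:(s + block_size - 1)]))
  out[1:n]                                      # trim to the original length
}

set.seed(10)
x    <- rnorm(120)
boot <- replicate(5, mbb(x, block_size = 24))   # 5 bootstrap replicates

dim(boot)  # 120 x 5
```

Keeping whole blocks intact (rather than resampling single points) preserves the short-run autocorrelation of the remainder series.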

2.3.2 Plot

boots <- ridership_df %>% 
  mutate("Fitted"=c(bagged$fitted,forecast_bagged$mean)) # Combine original data and fitted data
ggplot(data=boots, aes(x=Date, y=Observation)) + geom_line(aes(color="Data", size="Data", linetype="Data")) + 
geom_line(aes(x=Date, y=Fitted, colour="Fitted model", size="Fitted model", linetype="Fitted model")) + 
scale_color_manual(name="",values=c("Data"="black", "Fitted model"="blue")) +
scale_size_manual("",values=c("Data"=0.5,"Fitted model"=0.9)) + scale_linetype_manual("",values=c("Data"=1,"Fitted model"=2)) +
labs(y="Ridership")  + geom_vline(xintercept = ridership_df[length(train.ts),1], color = "red", linetype = 2) +
annotate('text', x = as.Date("2003-01-01"), y = 2700, label = 'Forecast', color = 'black', size=5)  +
scale_x_date(breaks =  seq(as.Date("1991-01-01"), as.Date("2004-03-01"), by="2 year")) + theme_minimal() +
geom_ribbon(aes(ymin=lower_boot, ymax=upper_boot), fill="#99CCFF", alpha=0.5)        

2.4 Neural network

2.4.1 Model fitting and forecasting

set.seed(10)
neural_model <- nnetar(train.ts,  repeats=200, lambda="auto") 
forecast_neu <- forecast(neural_model, PI=TRUE, npaths=1000, h=nValid, level=95) # npaths: number of simulated sample paths (errors assumed normal)

# 95% confidence interval
           
lower_neu <- c(rep(0,length(train.ts)), forecast_neu$lower)
upper_neu <- c(rep(0,length(train.ts)), forecast_neu$upper)
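`nnetar` fits a feed-forward network whose inputs are lagged values of the series: an NNAR(p, P, k) model uses p recent lags and P seasonal lags. The lagged design it trains on can be sketched in base R (the helper `make_lags` is illustrative, not part of the forecast package):

```r
# Build a lagged-input matrix: predict y_t from y_{t-1}, y_{t-2} and y_{t-12}
make_lags <- function(y, lags) {
  n      <- length(y)
  maxlag <- max(lags)
  X <- sapply(lags, function(L) y[(maxlag - L + 1):(n - L)])
  list(X = X, y = y[(maxlag + 1):n])
}

y <- as.numeric(AirPassengers)          # monthly series shipped with base R
d <- make_lags(y, lags = c(1, 2, 12))   # inputs for an NNAR(2, 1, k) model

dim(d$X)  # 132 rows (144 obs - 12 lost to the seasonal lag), 3 lag columns
```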

2.4.2 Plot

neu <- ridership_df %>% mutate("Fitted"=c(neural_model$fitted, forecast_neu$mean)) 
ggplot(data=neu, aes(x=Date, y=Observation)) + geom_line(aes(color="Data", size="Data", linetype="Data")) + 
geom_line(aes(x=Date, y=Fitted, colour="Fitted model", size="Fitted model", linetype="Fitted model")) + 
scale_color_manual(name="",values=c("Data"="black", "Fitted model"="blue")) +
scale_size_manual("",values=c("Data"=0.5,"Fitted model"=0.9)) + scale_linetype_manual("",values=c("Data"=1,"Fitted model"=2)) +
labs(y="Ridership")  + geom_vline(xintercept = ridership_df[length(train.ts),1], color = "red", linetype = 2) +
annotate('text', x = as.Date("2003-01-01"), y = 2700, label = 'Forecast', color = 'black', size=5)  +
scale_x_date(breaks =  seq(as.Date("1991-01-01"), as.Date("2004-03-01"), by="2 year")) + theme_minimal() +
geom_ribbon(aes(ymin=lower_neu, ymax=upper_neu), fill="#99CCFF", alpha=0.5)

3. Model comparison

3.1 Plot

 df <- rbind(
   
    data.frame(Date=ridership_df[-1,1], y=ridership_df[-1,2], series="Original", ymin=0, ymax=0),
    data.frame(Date=ridership_df[-1,1], y=c(dropFirst(train.lm.trend.season$fitted),Improved_forecast), series="Regression", ymin=0, ymax=0),
    data.frame(Date=ridership_df[-1,1], y=c(dropFirst(filtering$f), forecast_DLM$f), series="DLM(Filtering)", ymin=dropFirst(lower_DLM), ymax=dropFirst(upper_DLM)),
    data.frame(Date=ridership_df[-1,1], y=c(dropFirst(bagged$fitted),forecast_bagged$mean), series="Bootstrap and Bagging", ymin=dropFirst(lower_boot), ymax=dropFirst(upper_boot)),  
    data.frame(Date=ridership_df[-1,1], y=c(dropFirst(neural_model$fitted), forecast_neu$mean), series="Neural network" , ymin=dropFirst(lower_neu), ymax=dropFirst(upper_neu))
    
  )
  
  
ggplot(df, aes(x=Date, y=y, group=series,color=series, size=series)) + geom_line() +
geom_ribbon(aes(ymin=ymin, ymax=ymax, fill=series), alpha=0.5, colour = NA, show.legend = FALSE) +
scale_color_manual("",values = c("black","green", "red", "blue", "magenta")) +
scale_size_manual("",values=c(1.2,rep(1,4))) +
scale_fill_manual("",values=c("#FFFFFF","#99FF99","#FFCC99", "#0099FF", "#FF99CC")) +
labs(y="Ridership")  + geom_vline(xintercept = ridership_df[length(train.ts),1], color = "black", linetype = 2) +
annotate('text', x = as.Date("2003-01-01"), y = 2700, label = 'Forecast', color = 'black', size=5)  +
scale_x_date(breaks =  seq(as.Date("1991-01-01"), as.Date("2004-03-01"), by="2 year")) + theme_minimal() 

3.2 Results

3.2.1 MSE of fitted values

# Fit model 
  
filtering$f[1]      <- NA  # Set the first one-step-ahead forecast to NA (it is based only on the diffuse prior)
                
point_fitted_result <- ridership_df[1:length(train.ts),] %>% mutate("Regression"=c(train.lm.trend.season$fitted)) %>% 
                                  mutate("DLM(Filtering)"=filtering$f) %>% mutate("Bootstrap and Bagging"=bagged$fitted) %>% 
                                  mutate("Neural network"=neural_model$fitted)
point_fitted_result
          Date Observation Regression DLM(Filtering)
1   1991-01-01    1708.917   1689.884             NA
2   1991-02-01    1620.586   1639.665        262.920
3   1991-03-01    1972.715   1936.073       1953.806
4   1991-04-01    1811.665   1929.945       2542.089
5   1991-05-01    1974.964   1956.515       2578.135
6   1991-06-01    1862.356   1905.193       2636.291
7   1991-07-01    1939.860   2010.499       2559.954
8   1991-08-01    2013.264   2055.272       2524.776
9   1991-09-01    1595.657   1727.813       2516.373
10  1991-10-01    1724.924   1846.099       2324.983
11  1991-11-01    1675.667   1831.286       2232.946
12  1991-12-01    1813.863   1862.786       2144.421
13  1992-01-01    1614.827   1614.219       2533.874
14  1992-02-01    1557.088   1565.457       1526.730
15  1992-03-01    1891.223   1863.324       1898.173
16  1992-04-01    1955.981   1858.654       1733.055
17  1992-05-01    1884.714   1886.681       2025.580
18  1992-06-01    1623.042   1836.817       1832.052
19  1992-07-01    1903.309   1943.581       1788.410
20  1992-08-01    1996.712   1989.811       1927.770
21  1992-09-01    1703.897   1663.811       1550.400
22  1992-10-01    1810.000   1783.554       1768.653
23  1992-11-01    1861.601   1770.199       1743.813
24  1992-12-01    1875.122   1803.157       1950.380
25  1993-01-01    1705.259   1556.048       1717.550
26  1993-02-01    1618.535   1508.744       1636.483
27  1993-03-01    1836.709   1808.069       1968.354
28  1993-04-01    1957.043   1804.856       1845.458
29  1993-05-01    1917.185   1834.342       1954.275
30  1993-06-01    1882.398   1785.935       1745.782
31  1993-07-01    1933.009   1894.157       2004.339
32  1993-08-01    1996.167   1941.845       2046.631
33  1993-09-01    1672.841   1617.303       1662.751
34  1993-10-01    1752.827   1738.504       1785.547
35  1993-11-01    1720.377   1726.607       1768.711
36  1993-12-01    1734.292   1761.022       1815.898
37  1994-01-01    1563.365   1515.371       1586.540
38  1994-02-01    1573.959   1469.525       1494.872
39  1994-03-01    1902.639   1770.308       1839.784
40  1994-04-01    1833.888   1768.553       1887.641
41  1994-05-01    1831.049   1799.497       1871.495
42  1994-06-01    1775.755   1752.547       1713.669
43  1994-07-01    1867.508   1862.227       1884.130
44  1994-08-01    1906.608   1911.374       1951.432
45  1994-09-01    1685.632   1588.289       1581.900
46  1994-10-01    1778.546   1710.948       1745.575
47  1994-11-01    1775.995   1700.509       1755.732
48  1994-12-01    1783.350   1736.382       1821.999
49  1995-01-01    1548.415   1492.189       1624.354
50  1995-02-01    1496.925   1447.801       1524.911
51  1995-03-01    1798.316   1750.041       1815.238
52  1995-04-01    1732.895   1749.744       1795.564
53  1995-05-01    1772.345   1782.146       1769.381
54  1995-06-01    1761.207   1736.654       1657.413
55  1995-07-01    1791.655   1847.792       1840.032
56  1995-08-01    1874.820   1898.396       1879.097
57  1995-09-01    1571.309   1576.769       1565.207
58  1995-10-01    1646.948   1700.886       1668.964
59  1995-11-01    1672.631   1691.905       1649.227
60  1995-12-01    1656.845   1729.236       1704.771
61  1996-01-01    1381.758   1486.501       1486.364
62  1996-02-01    1360.852   1443.571       1371.718
63  1996-03-01    1558.575   1747.269       1670.654
64  1996-04-01    1608.420   1748.430       1584.176
65  1996-05-01    1696.696   1782.289       1612.430
66  1996-06-01    1693.183   1738.255       1569.849
67  1996-07-01    1835.516   1850.851       1742.445
68  1996-08-01    1942.573   1902.913       1867.054
69  1996-09-01    1551.401   1582.744       1601.545
70  1996-10-01    1686.508   1708.319       1668.318
71  1996-11-01    1576.204   1700.796       1679.321
72  1996-12-01    1700.433   1739.584       1651.080
73  1997-01-01    1396.588   1498.307       1469.337
74  1997-02-01    1371.690   1456.835       1383.042
75  1997-03-01    1707.522   1761.990       1660.411
76  1997-04-01    1654.604   1764.609       1678.985
77  1997-05-01    1762.903   1799.927       1693.031
78  1997-06-01    1775.800   1757.351       1657.212
79  1997-07-01    1934.219   1871.404       1833.422
80  1997-08-01    2008.055   1924.924       1970.264
81  1997-09-01    1615.924   1606.213       1667.501
82  1997-10-01    1773.910   1733.246       1743.867
83  1997-11-01    1732.368   1727.180       1739.239
84  1997-12-01    1796.626   1767.427       1786.765
85  1998-01-01    1570.330   1527.607       1564.790
86  1998-02-01    1412.691   1487.593       1527.906
87  1998-03-01    1754.641   1794.207       1756.789
88  1998-04-01    1824.932   1798.283       1737.385
89  1998-05-01    1843.289   1835.058       1827.893
90  1998-06-01    1825.964   1793.941       1775.680
91  1998-07-01    1968.172   1909.452       1920.357
92  1998-08-01    1921.645   1964.430       2026.409
93  1998-09-01    1669.597   1647.176       1630.716
94  1998-10-01    1791.474   1775.667       1767.309
95  1998-11-01    1816.714   1771.059       1755.758
96  1998-12-01    1846.754   1812.764       1844.312
97  1999-01-01    1599.427   1574.402       1620.497
98  1999-02-01    1548.804   1535.845       1549.632
99  1999-03-01    1832.333   1843.917       1851.400
100 1999-04-01    1839.720   1849.452       1837.085
101 1999-05-01    1846.498   1887.684       1873.979
102 1999-06-01    1864.852   1848.025       1803.831
103 1999-07-01    1965.743   1964.994       1959.681
104 1999-08-01    1949.002   2021.429       2021.165
105 1999-09-01    1607.373   1705.634       1658.950
106 1999-10-01    1803.664   1835.582       1743.359
107 1999-11-01    1850.309   1832.432       1759.353
108 1999-12-01    1836.435   1875.595       1860.963
109 2000-01-01    1541.660   1638.691       1619.308
110 2000-02-01    1616.928   1601.592       1518.112
111 2000-03-01    1919.538   1911.122       1872.280
112 2000-04-01    1971.493   1918.114       1899.065
113 2000-05-01    1992.301   1957.805       1970.615
114 2000-06-01    2009.763   1919.603       1941.086
115 2000-07-01    2053.996   2038.030       2099.478
116 2000-08-01    2097.471   2095.923       2122.863
117 2000-09-01    1823.706   1781.585       1784.138
118 2000-10-01    1976.997   1912.992       1932.182
119 2000-11-01    1981.408   1911.300       1950.244
120 2000-12-01    2000.153   1955.920       2009.787
121 2001-01-01    1683.148   1720.474       1766.339
122 2001-02-01    1663.404   1684.833       1682.651
123 2001-03-01    2007.928   1995.820       1967.525
    Bootstrap and Bagging Neural network
1                1672.872             NA
2                1617.336             NA
3                1908.030             NA
4                1893.846             NA
5                1917.793             NA
6                1862.987             NA
7                1964.097             NA
8                2006.419             NA
9                1674.582             NA
10               1793.660             NA
11               1779.073             NA
12               1812.999             NA
13               1574.009       1748.872
14               1514.496       1606.626
15               1814.154       1875.572
16               1812.596       1847.949
17               1851.417       1981.911
18               1809.408       1885.382
19               1926.027       1875.970
20               1983.089       1969.768
21               1663.944       1695.800
22               1796.449       1765.318
23               1793.257       1731.103
24               1840.777       1846.805
25               1612.958       1665.678
26               1568.811       1586.168
27               1873.225       1839.350
28               1876.987       1922.599
29               1909.497       1929.753
30               1862.276       1695.602
31               1966.243       1917.390
32               2009.328       1983.906
33               1680.160       1780.127
34               1798.459       1809.613
35               1786.797       1840.941
36               1821.840       1844.079
37               1578.443       1731.015
38               1523.750       1612.422
39               1821.908       1786.269
40               1817.100       1937.641
41               1847.364       1902.932
42               1798.645       1879.650
43               1907.730       1896.262
44               1957.619       1959.361
45               1633.379       1716.055
46               1756.222       1766.347
47               1745.961       1754.881
48               1780.128       1760.108
49               1539.817       1600.430
50               1488.870       1554.253
51               1786.071       1815.880
52               1778.688       1825.306
53               1802.792       1819.437
54               1747.621       1788.118
55               1849.369       1849.505
56               1889.593       1882.520
57               1555.650       1713.309
58               1670.497       1756.146
59               1652.162       1756.722
60               1678.066       1772.637
61               1431.176       1554.870
62               1376.499       1467.130
63               1677.204       1700.927
64               1672.316       1692.373
65               1705.382       1744.748
66               1658.461       1759.372
67               1770.653       1785.754
68               1822.512       1879.528
69               1502.132       1651.018
70               1630.767       1652.962
71               1625.499       1691.410
72               1665.566       1660.253
73               1434.339       1466.154
74               1393.075       1408.447
75               1702.480       1542.804
76               1709.964       1636.749
77               1750.172       1710.173
78               1714.905       1732.234
79               1833.843       1830.665
80               1894.638       1952.148
81               1583.825       1669.633
82               1715.944       1720.600
83               1717.950       1607.561
84               1762.614       1732.955
85               1528.209       1502.318
86               1488.879       1446.350
87               1794.321       1658.939
88               1797.921       1694.190
89               1836.185       1795.339
90               1794.055       1813.152
91               1908.518       1915.511
92               1964.764       2002.368
93               1647.794       1693.221
94               1777.777       1779.508
95               1774.592       1766.526
96               1815.955       1817.215
97               1578.385       1621.556
98               1535.665       1474.991
99               1840.996       1721.865
100              1843.629       1833.189
101              1877.245       1857.522
102              1832.569       1849.687
103              1942.801       1947.317
104              1995.583       1955.913
105              1675.385       1738.173
106              1802.236       1780.050
107              1799.498       1823.228
108              1845.026       1864.405
109              1610.752       1638.980
110              1576.090       1540.597
111              1890.785       1791.913
112              1904.403       1878.954
113              1951.137       1917.212
114              1917.900       1943.835
115              2039.240       2013.149
116              2101.885       2030.660
117              1791.607       1789.649
118              1923.879       1858.853
119              1923.927       1917.670
120              1966.671       1921.575
121              1732.342       1667.786
122              1694.137       1641.518
123              2004.197       1866.765
apply(point_fitted_result[,3:6], 2, function(x){ mean((x-train.ts)^2, na.rm=TRUE) } )  # MSE
           Regression        DLM(Filtering) Bootstrap and Bagging 
             4457.089             57363.744              2664.542 
       Neural network 
             6623.718 

3.2.2 MSE of Forecast Values

# Forecast model
  
point_forecast_result <- ridership_df[-(1:length(train.ts)), ] %>%
                           mutate("Regression"            = Improved_forecast,
                                  "DLM"                   = c(forecast_DLM$f),
                                  "Bootstrap and Bagging" = forecast_bagged$mean,
                                  "Neural network"        = forecast_neu$mean)
        
point_forecast_result
          Date Observation Regression      DLM Bootstrap and Bagging
124 2001-04-01    2023.792   2011.682 1996.881              2010.499
125 2001-05-01    2047.008   2050.014 2024.378              2047.238
126 2001-06-01    2072.913   2011.580 1992.103              2003.402
127 2001-07-01    2126.717   2130.452 2100.250              2117.666
128 2001-08-01    2202.638   2189.195 2148.355              2171.674
129 2001-09-01    1707.693   1875.951 1830.904              1854.530
130 2001-10-01    1950.716   2008.596 1959.455              1982.606
131 2001-11-01    1973.614   2008.231 1953.408              1977.169
132 2001-12-01    1984.729   2054.230 1993.073              2018.519
133 2002-01-01    1759.629   1820.195 1743.912              1781.106
134 2002-02-01    1770.595   1785.984 1710.032              1741.799
135 2002-03-01    2019.912   2098.412 2013.061              2048.875
136 2002-04-01    2048.398   2108.310 2016.583              2052.063
137 2002-05-01    2068.763   2150.910 2044.080              2089.364
138 2002-06-01    1994.267   2115.620 2011.805              2045.938
139 2002-07-01    2075.258   2236.961 2119.952              2160.507
140 2002-08-01    2026.560   2297.769 2168.057              2214.189
141 2002-09-01    1734.155   1986.346 1850.606              1897.534
142 2002-10-01    1916.771   2120.668 1979.157              2025.328
143 2002-11-01    1858.345   2121.891 1973.110              2020.250
144 2002-12-01    1996.352   2169.427 2012.775              2061.456
145 2003-01-01    1778.033   1936.896 1763.614              1824.538
146 2003-02-01    1749.489   1904.171 1729.734              1786.104
147 2003-03-01    2066.466   2218.074 2032.762              2093.886
148 2003-04-01    2098.899   2229.440 2036.285              2097.939
149 2003-05-01    2104.911   2273.504 2063.782              2135.530
150 2003-06-01    2129.671   2239.676 2031.507              2092.483
151 2003-07-01    2223.349   2362.476 2139.654              2207.334
152 2003-08-01    2174.360   2424.743 2187.759              2261.192
153 2003-09-01    1931.406   2114.779 1870.308              1944.328
154 2003-10-01    2121.470   2250.559 1998.859              2072.590
155 2003-11-01    2076.054   2253.241 1992.812              2067.175
156 2003-12-01    2140.677   2302.235 2032.477              2108.578
157 2004-01-01    1831.508   2071.162 1783.316              1871.460
158 2004-02-01    1838.006   2039.895 1749.436              1832.846
159 2004-03-01    2132.446   2355.256 2052.464              2139.963
    Neural network
124       1997.554
125       2021.883
126       2043.181
127       2078.808
128       2119.805
129       2000.495
130       2022.231
131       2028.845
132       2043.715
133       1807.887
134       1711.522
135       1924.056
136       1974.236
137       2015.186
138       2054.744
139       2096.959
140       2138.893
141       2110.288
142       2107.135
143       2107.609
144       2114.474
145       1985.672
146       1812.554
147       1912.449
148       1962.936
149       2006.231
150       2053.949
151       2104.552
152       2150.084
153       2160.178
154       2165.345
155       2168.004
156       2171.652
157       2122.786
158       1994.067
159       1984.030
apply(point_forecast_result[,3:6], 2, function(x){ mean((x-valid.ts)^2) } )  # MSE
           Regression                   DLM Bootstrap and Bagging 
            23572.416              4557.938              4614.936 
       Neural network 
            19917.629 
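Because MSE is on a squared scale, it can be easier to compare methods via RMSE, which is in the original units of Ridership. A minimal sketch using the validation MSE values printed above (the vector `forecast_mse` is constructed here for illustration):

```r
# Validation MSE per method, copied from the output above
forecast_mse <- c("Regression"            = 23572.416,
                  "DLM"                   =  4557.938,
                  "Bootstrap and Bagging" =  4614.936,
                  "Neural network"        = 19917.629)

# RMSE = sqrt(MSE), in the original units of Ridership
sqrt(forecast_mse)

# Best-performing method by validation MSE
names(which.min(forecast_mse))  # "DLM"
```

On the validation data, DLM edges out Bootstrap and Bagging, even though the latter had the smallest MSE on the training data above.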

3.2.3 Length of 95% Confidence Intervals on the Validation Data

length_confidence_DLM  <- upper_DLM[-(1:length(train.ts))]-lower_DLM[-(1:length(train.ts))]
length_confidence_boot <- upper_boot[-(1:length(train.ts))]-lower_boot[-(1:length(train.ts))]
length_confidence_neu  <- upper_neu[-(1:length(train.ts))]-lower_neu[-(1:length(train.ts))]
 
length_confidence <- data.frame("DLM"                   = length_confidence_DLM,
                                "Bootstrap and Bagging" = length_confidence_boot,
                                "Neural network"        = length_confidence_neu)
length_confidence
         DLM Bootstrap.and.Bagging Neural.network
1   245.4329              155.9241       303.7068
2   281.6241              157.1769       316.8747
3   315.2843              155.4869       321.0918
4   346.0806              158.7478       323.6515
5   374.6767              151.5382       335.2460
6   401.5385              150.9886       355.7925
7   426.9936              150.7217       329.9089
8   451.2835              146.1745       335.1348
9   474.5971              153.9625       341.4205
10  497.0642              167.3797       391.3477
11  518.7964              175.6921       363.4295
12  538.3738              183.3449       318.4656
13  567.4820              198.5845       372.3355
14  589.7717              193.3104       382.7846
15  612.2222              206.1852       378.0763
16  634.1357              210.4745       408.4248
17  655.5368              200.5349       404.2653
18  676.4709              207.8676       403.7618
19  696.9801              204.4233       402.2748
20  717.1041              209.1535       409.5392
21  736.8907              208.6671       421.4122
22  756.4089              226.5134       527.5300
23  775.4545              235.8952       528.3340
24  793.0199              249.8218       406.0942
25  818.2557              267.6009       414.2663
26  838.4271              267.8685       438.2408
27  858.9189              270.2530       435.3584
28  879.1478              281.9772       416.5980
29  899.1073              265.1256       408.2115
30  918.8118              281.5378       407.7103
31  938.2774              279.5539       416.2228
32  957.5218              276.3729       396.8239
33  976.5780              281.7691       416.3705
34  995.5250              296.8660       507.0597
35 1014.0249              306.4516       570.0954
36 1031.2877              324.0183       527.0656
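To summarize the table above in a single number per method, the per-period interval lengths can be averaged by column with `colMeans(length_confidence)`. A minimal sketch, using a small illustrative data frame built from the first three rows of the table in place of the full `length_confidence` object:

```r
# Illustrative stand-in for length_confidence (first three validation periods);
# with the real 36-row object, colMeans(length_confidence) works the same way.
length_confidence_example <- data.frame(
  "DLM"                   = c(245.4, 281.6, 315.3),
  "Bootstrap.and.Bagging" = c(155.9, 157.2, 155.5),
  "Neural.network"        = c(303.7, 316.9, 321.1)
)

# Mean interval length per method: a smaller value means a tighter interval
colMeans(length_confidence_example)
```

Note that the DLM intervals widen steadily over the forecast horizon, so its mean length grows quickly, while the Bootstrap and Bagging intervals remain the narrowest throughout.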

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".