If you're interested in time series analysis and forecasting, this is the right place to be. The Time Series Lab (TSL) software platform makes time series analysis available to anyone with a basic knowledge of statistics. Future versions will remove the need for a basic knowledge altogether by providing fully automated forecasting systems. The platform is designed and developed in a way such that results can be obtained quickly and verified easily. At the same time, many advanced time series and forecasting operations are available for the experts. In our case studies, we often present screenshots of the program so that you can easily replicate results.
Did you know you can take a screenshot of a TSL program window? Press Ctrl + P to open a window that lets you save a screenshot of the program. The TSL window should be located on your main monitor.
Click on the buttons below to go to our case studies. At the beginning of each case study, the required TSL package is mentioned. Our first case study, about the Nile data, is meant to illustrate the basic workings of the program and we advise you to start with that one.
Date: June 30, 2022
Software: Time Series Lab - Home Edition
Topics: fractional seasonal periods and comparison of forecasting performance
The data for this case study is weekly data on US finished motor gasoline products supplied (in thousands of barrels per day) from February 1991 to May 2005.
It is part of the R package fpp2 and available from the EIA website.
It is also bundled with the installation file of TSL.
The dataset is used in the TBATS paper of De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011).
Furthermore, the dataset is analysed by R.J. Hyndman on his blog.
We quote from this blog post:
The TBATS model is preferable when the seasonality changes over time. The ARIMA approach is preferable if there are covariates that are useful predictors as these can be added as additional regressors.
This gasoline case study illustrates that you don't need to choose between the two methods when you work with Time Series Lab. TSL offers a modelling framework that handles complex seasonal patterns and, at the same time, allows the inclusion of covariates (explanatory variables). We show that TSL can produce more accurate forecasts than the TBATS package. We deliberately compare with TBATS because this package is known to forecast accurately when complex seasonal patterns are present in the data.
In the figure below, the gasoline dataset is loaded into TSL and plotted. The upward trend and the seasonal pattern are clearly visible in the data. The Data characteristics area shows T = 745 observations. At a later stage (Estimation step) we will split the time series into a Training sample and a Validation sample.
Information: On the Database page, you can copy the contents of the blue Data characteristics pane to the clipboard by right-clicking the area and selecting Copy contents, or by selecting the text and pressing Ctrl + C.
TSL Database page with Gasoline dataset loaded
Local Linear Trend model
The seasonal pattern of this time series is important but, for illustrative purposes, we start our analysis without a seasonal component. We select the Local Linear Trend model, which has a trend component but no seasonal component. Select the Local Linear Trend model on the Pre-built models page.
Alternatively, you can go to the Build your own model page and select a time-varying level and a time-varying slope.
Our time series consists of a total of 745 observations (February 1991 to May 2005). For this case study we select the first 484 observations as the Training sample, leaving 261 observations as the Validation sample. Drag (and/or click) the sample bar on the Pre-built models page to set a Training sample of size 484 or, alternatively, set the start and end of the Training sample to 1 and 484 on the Estimation page. Click the Process Dashboard button on the Pre-built models page or the Estimate button on the Estimation page. TSL estimates the model, and if you go to the Text output page you see a green message informing you that:
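For readers who want to reproduce the same split outside TSL, the following is a minimal Python sketch. The array `y` is a placeholder (an assumption, not the actual gasoline data); only the sample sizes 745 and 484 come from the text.

```python
import numpy as np

n_total = 745   # observations in the gasoline series (from the text)
n_train = 484   # Training sample size used in this case study

# Placeholder series; replace with the actual gasoline observations.
y = np.arange(n_total, dtype=float)

train = y[:n_train]        # observations 1..484
validation = y[n_train:]   # observations 485..745

print(len(train), len(validation))
```

The Validation sample is whatever remains after the Training sample, so its size follows directly from the two numbers above.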
All selected models and series were estimated successfully
Furthermore, at the bottom of the Text output page we find the Model fit. For the current model this is:
Variable: gasoline
Model: TSL003 Local Linear Trend

                                            TSL003
Log likelihood                           -3485.355
Akaike Information Criterion (AIC)        6978.710
Bias corrected AIC (AICc)                 6978.794
Bayesian Information Criterion (BIC)      6995.439
in-sample MSE                           1.1177e+05
...       RMSE                             334.317
...       MAE                              268.147
...       MAPE                               3.480
Sample size                                    484
Effective sample size                          482
* based on one-step-ahead forecast errors
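The information criteria in this output can be checked by hand. The sketch below assumes the standard textbook definitions of AIC, AICc and BIC. The parameter count k = 4 is not stated in the output; it is backed out from the gap between the reported AIC and -2 times the log likelihood, so treat it as an assumption about how TSL counts parameters.

```python
import math

def information_criteria(loglik, k, n):
    """Standard AIC, AICc and BIC for log likelihood `loglik`,
    `k` estimated parameters and `n` observations."""
    aic = -2.0 * loglik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = -2.0 * loglik + k * math.log(n)
    return aic, aicc, bic

# Reported values: log likelihood -3485.355, sample size 484.
# k = 4 is inferred from the reported AIC, not taken from TSL.
aic, aicc, bic = information_criteria(loglik=-3485.355, k=4, n=484)
print(round(aic, 3), round(aicc, 3), round(bic, 3))

# RMSE is the square root of the in-sample MSE.
rmse = math.sqrt(1.1177e5)
print(round(rmse, 1))
```

Under these assumptions the computed values agree with the reported ones up to rounding of the printed log likelihood and MSE.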
We report these numbers here to show the improvement of adding a seasonal component later. The graphical output of the current model is shown in the figure below.
Graph page of TSL with Gasoline dataset
With a smoothed level through the data, the seasonal pattern is even more clearly visible. The triangular pattern in the level will appear later as well, when we plot the forecasting performance of the model.
Let's assess the forecast performance of the model by going to the Model comparison page. This page can be viewed by clicking the Model comparison button in the button bar on the left of your screen. Note that this button is only visible when a Validation sample is specified. Click on the green Start loss calculation button in the top right of the window. Under User defined models, a new check-button appears which you should tick. The resulting TSL screen is shown below.
RMSE loss Local Linear Trend model and Gasoline dataset
The pyramid-shaped loss line can be explained by the fact that a forecast from the Local Linear Trend model is a straight line, which is upward sloping for our dataset. The forecasts do not take the seasonal pattern of the data into account, so the loss is highest when the data is at the highest or lowest point of the seasonal cycle. Let's verify this. Go to the Forecasting page and select multi-step-ahead in the top left corner. Navigate to Plot options and set Show forecast to 150 periods ahead. The resulting window should look like the one presented in the figure below.
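The straight-line argument can be illustrated with a small Python sketch. It uses the standard forecast function of a local linear trend model (last level plus h times last slope). The level, slope and seasonal amplitude below are hypothetical numbers chosen for illustration, not estimates from the gasoline data, and the "truth" is a synthetic trend plus a yearly sine wave.

```python
import numpy as np

def llt_forecast(level, slope, horizons):
    """h-step-ahead forecasts of a local linear trend model:
    a straight line through the last level with the last slope."""
    return level + slope * np.asarray(horizons, dtype=float)

# Hypothetical end-of-sample level and slope (illustrative only).
level_T, slope_T = 9000.0, 2.0
h = np.arange(1, 151)                      # 150 periods ahead
forecast = llt_forecast(level_T, slope_T, h)

# Synthetic "truth": same trend plus a yearly cycle in weekly data.
period = 365.25 / 7
amplitude = 300.0                          # hypothetical
truth = level_T + slope_T * h + amplitude * np.sin(2 * np.pi * h / period)

abs_error = np.abs(truth - forecast)
print("largest error at horizon", h[np.argmax(abs_error)])
```

In this synthetic setting the absolute error equals the size of the ignored seasonal term, so it peaks at the seasonal highs and lows and falls back to zero in between.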
Forecasts for h=150 time points ahead
Basic Structural Time Series model
It is time to introduce a seasonal component. We go to the Build your own model page and select a time-varying level, a time-varying slope, and a time-varying seasonal. The resulting model is called the Basic Structural Time Series model by Harvey (1989). Set the Seasonal period length to 365.25/7 ≈ 52.179 (weekly data, taking leap years into account) and the Number of factors to 22.
Information: Seasonal period length is the number of time points after which the seasonal repeats. This can be a fractional number. For example, with daily data, specify a period of 365.25 for a seasonal that repeats each year, taking leap years into account. Number of factors specifies the seasonal flexibility. Note that a higher number is not always better and parsimonious models often perform better in forecasting.
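To make the idea of a fractional period concrete, here is a minimal Python sketch that builds trigonometric seasonal terms, a common way to represent a seasonal with a non-integer period. It assumes that each "factor" corresponds to one cosine/sine pair at a multiple of the base frequency; TSL's internal formulation is not spelled out here, and its seasonal is time-varying, whereas these terms are fixed. The function name `seasonal_harmonics` is hypothetical.

```python
import numpy as np

def seasonal_harmonics(n_obs, period, n_factors):
    """Return an (n_obs, 2 * n_factors) matrix of cosine/sine terms
    for harmonics j = 1..n_factors of a seasonal with the given period.
    The period may be fractional."""
    t = np.arange(1, n_obs + 1)
    columns = []
    for j in range(1, n_factors + 1):
        angle = 2.0 * np.pi * j * t / period
        columns.append(np.cos(angle))
        columns.append(np.sin(angle))
    return np.column_stack(columns)

period = 365.25 / 7          # weekly data, leap years taken into account
X = seasonal_harmonics(n_obs=484, period=period, n_factors=22)
print(round(period, 3), X.shape)
```

Because the terms are sines and cosines of `t / period`, nothing requires the period to be an integer. Fewer factors give a smoother, more parsimonious seasonal shape.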
The Build your own model page should look like the one in the figure below.
Component selection page
Estimate the model and go to the Text output page. We see that the model fit is improved by adding the seasonal component.
Variable: gasoline
Model: TSL004

                                            TSL004
Log likelihood                           -3218.400
Akaike Information Criterion (AIC)        6532.799
Bias corrected AIC (AICc)                 6543.613
Bayesian Information Criterion (BIC)      6733.539
in-sample MSE                           1.0083e+05
...       RMSE                             317.539
...       MAE                              252.502
...       MAPE                               3.244
Sample size                                    484
Effective sample size                          430
* based on one-step-ahead forecast errors
A large improvement in forecasting performance, compared to the Local Linear Trend model, can be seen if we start (and plot) the loss calculation on the Model comparison page. This shows how important it can be to model the seasonal pattern of a time series. We will see an example of multiple seasonal patterns in a time series which makes the correct handling even more important. Plotting both RMSE losses leads to the following figure.
Forecast performance of LLT and Basic Structural model
Next, go back to the Build your own model page and lower the number of factors of the seasonal component. Parsimonious models often perform better in forecasting. A better performing number of factors is 7 although other values might be even better for forecasting. A model comparison between 22 and 7 factors is made in the figure below. The loss corresponding to the model with 7 factors is lower than the one that is presented in Figure 2 of De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011) which is obtained with the TBATS package.
Forecast performance of LLT and Basic Structural model
A figure with the extracted trend and seasonal pattern is obtained from the Graphics and Diagnostics page.
Gasoline data extracted trend and seasonal pattern
- Estimate the model with a Level, Slope, and Seasonal with period length 52. Verify that forecasts become worse when leap years are not taken into account (52 instead of 52.179).
- Forecasts can further be improved by adding explanatory variables. In TSL you can do this with the click of a couple of buttons on the Model setup page. Let us know which variables you have used to boost the forecast precision for the gasoline dataset!
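One way to build intuition for the first exercise is to compute how far a seasonal with period 52 drifts away from the calendar year. The sketch below is simple arithmetic on the numbers in this case study and does not reproduce TSL's forecasts; the variable names are hypothetical.

```python
period_true = 365.25 / 7     # weeks per year, leap years included
period_approx = 52           # integer approximation
n_validation = 261           # Validation sample size in this case study

# A period-52 seasonal falls behind the calendar by this many weeks per year.
drift_per_cycle = period_true - period_approx

# Number of yearly cycles in the Validation sample and accumulated drift.
cycles = n_validation / period_true
total_drift = drift_per_cycle * cycles

print(round(drift_per_cycle, 3), round(cycles, 2), round(total_drift, 2))
```

By the end of the Validation sample the misalignment is close to one full week, which is a plausible reason for the period-52 model to lose forecast accuracy at longer horizons.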
Durbin, J. and S.J. Koopman (2012). Time Series Analysis by State Space Methods. Oxford: Oxford University Press.
Harvey, A. (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press. doi:10.1017/CBO9781107049994
De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011). Forecasting Time Series With Complex Seasonal Patterns Using Exponential Smoothing. Journal of the American Statistical Association 106:496, 1513-1527.