# Case Studies

If you're interested in time series analysis and forecasting, this is the right place to be. The Time Series Lab (TSL) software platform makes time series analysis available to anyone with a basic knowledge of statistics. Future versions will remove the need for a basic knowledge altogether by providing fully automated forecasting systems. The platform is designed and developed in a way such that results can be obtained quickly and verified easily. At the same time, many advanced time series and forecasting operations are available for the experts. In our case studies, we often present screenshots of the program so that you can easily replicate results.

Did you know you can make a screenshot of a TSL program window? Press Ctrl + p to open a window which allows you to save a screenshot of the program. The TSL window should be located on your main monitor.

Click on the buttons below to go to our case studies. At the beginning of each case study, the required TSL package is mentioned. Our first case study, about the Nile data, is meant to illustrate the basic workings of the program and we advise you to start with that one.

# Gasoline

Date: June 30, 2022

Software: Time Series Lab - Home Edition

Topics: fractional seasonal periods and comparison of forecasting performance

Batch program: gasoline.txt

#### Gasoline consumption

The data for this case study is weekly data on US finished motor gasoline products supplied (in thousands of barrels per day) from February 1991 to May 2005.
It is part of the R package fpp2 and available from the EIA website.
It is also bundled with the installation file of TSL.
The dataset is used in the TBATS paper of De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011).
Furthermore, the dataset is analysed by R.J. Hyndman on his blog.
We quote from this blog post: "The TBATS model is preferable when the seasonality changes over time. The ARIMA approach is preferable if there are covariates that are useful predictors as these can be added as additional regressors."

This gasoline case study illustrates that you don't need to choose between the two methods when you work with Time Series Lab. TSL offers a modelling framework that handles complex seasonal patterns and, at the same time, allows the inclusion of covariates (explanatory variables). We show that TSL can produce more accurate forecasts than the TBATS package. We deliberately compare with TBATS since this package produces accurate forecasts when complex seasonal patterns are present in the data.

In the figure below, the gasoline dataset is loaded into TSL and plotted. The upward trend and seasonality patterns are clearly visible in the data.
The **Data characteristics** area shows T = 745 observations. At a later stage (Estimation step) we will split the time series into a Training sample and a Validation sample.

**Information:** On the Database page, you can copy the contents of the blue Data characteristics pane to the clipboard by right-clicking the area and selecting Copy contents, or by selecting the text and pressing Ctrl + c.

#### TSL Database page with Gasoline dataset loaded

#### Local Linear Trend model

The seasonal pattern of this time series is important, but for illustrative purposes we start our analysis without a seasonal component and select the **Local Linear Trend model**, which is a model with a trend component but no seasonal component. Select the **Local Linear Trend model** on the **Pre-built models** page.
Alternatively, you can go to the **Build your own model** page and select a time-varying level and a time-varying slope.

Our time series consists of a total of 745 observations (February 1991 to May 2005).
For this case study we select the first 484 observations as Training sample, leaving the remaining 261
observations as Validation sample. Drag (and/or click) the sample bar on the **Pre-built models**
page to set a Training sample of size 484 or, alternatively, set the start and end of the Training
sample to 1 and 484 on the **Estimation** page.
Click the **Process Dashboard** button on the **Pre-built models** page or the **Estimate** button on
the **Estimation** page. TSL estimates the model and, if you go to the **Text output** page, you
see a **green** colored message informing you that:

All selected models and series were estimated successfully
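
The same split can be mirrored outside TSL when you want to compare results with other software. The snippet below is a minimal sketch: the `observations` list is a hypothetical stand-in for the 745 weekly values, and only the sample sizes come from this case study.

```python
# Hypothetical stand-in for the 745 weekly gasoline observations;
# in practice these values would come from the data file bundled with TSL.
observations = list(range(1, 746))

TRAINING_SIZE = 484  # first 484 observations, as chosen in this case study

training_sample = observations[:TRAINING_SIZE]    # observations 1..484
validation_sample = observations[TRAINING_SIZE:]  # observations 485..745

print(len(training_sample), len(validation_sample))
```

The two lengths add up to the T = 745 observations shown on the Database page.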

Furthermore, at the bottom of the Text output page we find the Model fit. For the current model this is:

```
Variable: gasoline
Model: TSL003 Local Linear Trend
TSL003
Log likelihood -3485.355
Akaike Information Criterion (AIC) 6978.710
Bias corrected AIC (AICc) 6978.794
Bayesian Information Criterion (BIC) 6995.439
in-sample MSE 1.1177e+05
... RMSE 334.317
... MAE 268.147
... MAPE 3.480
Sample size 484
Effective sample size 482
* based on one-step-ahead forecast errors
```
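
The loss measures in this block follow the usual textbook definitions. As a rough sketch, they can be computed from one-step-ahead forecast errors as shown below. The function name and the toy numbers are illustrative assumptions, and reporting MAPE in percent is also an assumption; TSL's exact conventions (for example, which observations enter the average) are not spelled out here.

```python
import math

def forecast_losses(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) of the forecast errors."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}

# Toy numbers for illustration only, not taken from the gasoline data.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]
losses = forecast_losses(actual, predicted)
print(losses)
```

In the output above, the reported RMSE of 334.317 is indeed the square root of the reported MSE of 1.1177e+05, up to rounding.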

We report these numbers here to show the improvement of adding a seasonal component later. The graphical output of the current model is shown in the figure below.

#### Graph page of TSL with Gasoline dataset

With a smoothed level drawn through the data, the seasonal pattern becomes even more clearly visible.
The triangular pattern in the level will appear again later when we plot the forecasting performance of the model.

Let's assess the forecast performance of the model by going to the Model comparison
page. This page can be viewed by clicking the **Model comparison** button in the button bar
on the left of your screen. Note that this button is only visible when a Validation sample is
specified. Click on the green **Start loss calculation** button in the top right of the window.
Under **User defined models**, a new check-button appears which you should tick. The resulting
TSL screen is shown below.

#### RMSE loss Local Linear Trend model and Gasoline dataset

The pyramid-shaped loss line can be explained by the fact that a forecast from the Local Linear Trend model is a **straight line**
that is upward sloping for our data set. The forecasts do not take into account the seasonal
pattern of the data, so when the data is at the highest or lowest point in the seasonal cycle, the
loss is the highest. Let's verify this. Go to the **Forecasting** page and select **multi-step-ahead** in the top left corner. Navigate to **Plot options** and set Show forecast to 150 periods ahead. The
resulting window should look like the one presented in the figure below.
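
Why the forecast is a straight line follows from the forecast function of the local linear trend model: the h-step-ahead forecast is the last estimated level plus h times the last estimated slope. A minimal sketch, in which the level and slope values are hypothetical and chosen only for illustration:

```python
def llt_forecast(last_level, last_slope, horizon):
    """Point forecasts 1..horizon steps ahead from a local linear trend."""
    return [last_level + h * last_slope for h in range(1, horizon + 1)]

# Hypothetical end-of-sample level and slope, chosen only for illustration.
forecasts = llt_forecast(last_level=9000.0, last_slope=2.0, horizon=150)

# Consecutive forecasts differ by the same amount: the path is a straight line.
increments = {round(b - a, 9) for a, b in zip(forecasts, forecasts[1:])}
print(increments)
```

Because every increment equals the slope, no seasonal peak or trough can be reproduced by these forecasts.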

#### Forecasts for h=150 time points ahead

#### Basic Structural Time Series model

It is time to introduce a seasonal component. We go to the **Build your own model** page and
select a time-varying level, a time-varying slope, and a time-varying seasonal. The resulting
model is called the **Basic Structural Time Series model** by Harvey (1989). Set the Seasonal
period length to 365.25/7 ≈ 52.179 (weekly data taking leap years into account) and the Number
of factors to 22.

**Information:** Seasonal period length is the number of time points after which the
seasonal repeats. This can be a fractional number. For example, with daily data,
specify a period of 365.25 for a seasonal that repeats each year, taking leap years
into account. Number of factors specifies the seasonal flexibility. Note that a higher number
is not always better and parsimonious models often perform better in forecasting.
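
One common way to represent a seasonal with a fractional period is through pairs of cosine and sine terms. The sketch below illustrates that idea only. It assumes that the Number of factors plays the role of the number of harmonics, which is not stated here, and the function name `seasonal_regressors` is hypothetical.

```python
import math

def seasonal_regressors(t, period, n_harmonics):
    """Return [cos, sin] pairs for harmonics 1..n_harmonics at time index t."""
    row = []
    for j in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * j * t / period
        row.extend([math.cos(angle), math.sin(angle)])
    return row

PERIOD = 365.25 / 7  # about 52.179 weeks per year

# The same point in the seasonal cycle, one full period apart.
row_now = seasonal_regressors(t=10, period=PERIOD, n_harmonics=7)
row_next_year = seasonal_regressors(t=10 + PERIOD, period=PERIOD, n_harmonics=7)

print(len(row_now))
print(max(abs(a - b) for a, b in zip(row_now, row_next_year)))
```

The terms repeat after exactly one period even though the period is not a whole number of weeks, and each extra harmonic adds one cosine and one sine term, which is why a large number of factors quickly becomes costly in parameters.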

The Build your own model page should look like the one in the figure below.

#### Component selection page

Estimate the model and go to the Text output page. We see that the model fit is improved by adding the seasonal component.

```
Variable: gasoline
Model: TSL004
TSL004
Log likelihood -3218.400
Akaike Information Criterion (AIC) 6532.799
Bias corrected AIC (AICc) 6543.613
Bayesian Information Criterion (BIC) 6733.539
in-sample MSE 1.0083e+05
... RMSE 317.539
... MAE 252.502
... MAPE 3.244
Sample size 484
Effective sample size 430
* based on one-step-ahead forecast errors
```
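
The information criteria in both output blocks can be reproduced from the log likelihood with the standard textbook formulas. The sketch below is a consistency check under assumptions: the parameter counts (4 and 48) are back-solved from the reported AIC values rather than stated by TSL, and the use of the full sample size of 484 is likewise an assumption.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Textbook AIC, AICc and BIC from a maximised log likelihood."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    aicc = aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    bic = -2.0 * log_likelihood + n_params * math.log(n_obs)
    return aic, aicc, bic

# n_params values are assumptions back-solved from the reported AIC,
# not numbers stated by TSL.
llt = information_criteria(-3485.355, n_params=4, n_obs=484)
bsm = information_criteria(-3218.400, n_params=48, n_obs=484)

print("Local Linear Trend:", [round(v, 3) for v in llt])
print("Basic Structural:  ", [round(v, 3) for v in bsm])
```

Under these assumptions the computed values agree with the reported ones up to rounding of the log likelihood. All three criteria are lower for the seasonal model despite its much larger number of parameters.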

A large improvement in forecasting performance, compared to the Local Linear Trend model, can be seen if we start (and plot) the loss calculation on the Model comparison page. This shows how important it can be to model the seasonal pattern of a time series. We will see an example of multiple seasonal patterns in a time series which makes the correct handling even more important. Plotting both RMSE losses leads to the following figure.

#### Forecast performance of LLT and Basic Structural model

Next, go back to the Build your own model page and lower the number of factors of the seasonal component. Parsimonious models often perform better in forecasting. A better performing number of factors is 7 although other values might be even better for forecasting. A model comparison between 22 and 7 factors is made in the figure below. The loss corresponding to the model with 7 factors is lower than the one that is presented in Figure 2 of De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011) which is obtained with the TBATS package.

#### Forecast performance of LLT and Basic Structural model

A figure with the extracted trend and seasonal pattern is obtained from the **Graphics and Diagnostics** page.

#### Gasoline data extracted trend and seasonal pattern

#### Further exploration

- Estimate the model with a Level, Slope, and Seasonal with period length 52. Verify that forecasts become worse when leap years are not taken into account (52 instead of 52.179).
- Forecasts can further be improved by adding explanatory variables. In TSL you can do this with the click of a couple of buttons on the Model setup page. Let us know which variables you have used to boost the forecast precision for the gasoline dataset!
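
A back-of-the-envelope calculation suggests why a period of 52 can hurt over a long forecast horizon: a seasonal that repeats every 52 weeks gradually drifts away from one that repeats every 52.179 weeks. The sketch below only quantifies that drift over the 261-week Validation sample; it does not fit any model, and the variable names are illustrative.

```python
TRUE_PERIOD = 365.25 / 7   # about 52.179 weeks per year
ROUNDED_PERIOD = 52.0      # ignores the leap-year correction
VALIDATION_LENGTH = 261    # weeks in the Validation sample

# Each completed cycle of the rounded seasonal ends (TRUE - ROUNDED) weeks early.
drift_per_cycle = TRUE_PERIOD - ROUNDED_PERIOD
cycles_completed = VALIDATION_LENGTH / ROUNDED_PERIOD
drift_weeks = drift_per_cycle * cycles_completed

print(f"{drift_weeks:.2f} weeks ({drift_weeks * 7:.1f} days)")
```

By the end of the Validation sample the rounded seasonal is out of phase by a bit under one week, which is enough to misplace seasonal peaks and troughs in weekly data.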

# Bibliography

### References

Durbin, J. and Koopman, S. J. (2012). Time Series Analysis by State Space Methods. *Oxford: Oxford University Press*.

Harvey, A. (1989). Forecasting, Structural Time Series Models and the Kalman Filter. *Cambridge: Cambridge University Press*. doi:10.1017/CBO9781107049994

De Livera, A. M., Hyndman, R. J., and Snyder, R. D. (2011). Forecasting Time Series With Complex Seasonal Patterns Using Exponential Smoothing. *Journal of the American Statistical Association* 106:496, 1513-1527.