# Case Studies

If you're interested in time series analysis and forecasting, this is the right place to be. The Time Series Lab (TSL) software platform makes time series analysis available to anyone with a basic knowledge of statistics. Future versions will remove the need for a basic knowledge altogether by providing fully automated forecasting systems. The platform is designed and developed in a way such that results can be obtained quickly and verified easily. At the same time, many advanced time series and forecasting operations are available for the experts. In our case studies, we often present screenshots of the program so that you can easily replicate results.

Did you know you can make a screenshot of a TSL program window? Press Ctrl + p to open a window which allows you to save a screenshot of the program. The TSL window should be located on your main monitor.

Click on the buttons below to go to our case studies. At the beginning of each case study, the required TSL package is mentioned. Our first case study, about the Nile data, is meant to illustrate the basic workings of the program and we advise you to start with that one.

# Nile

Date: June 30, 2022

Software: Time Series Lab - Home Edition

Topics: basic workings of program

#### Nile data

In this first case study we illustrate the fundamentals of TSL using observations from the river Nile. The data set consists of a series of readings of the annual flow volume at Aswan from 1871 to 1970. The Nile dataset is part of any TSL installer file and can be found in the data folder located in the install folder of TSL. Many time series concepts can be explained by the Nile time series alone.

#### Loading data

Let's start the modelling process. First go to the Database page of TSL by clicking the Database button. On this page we load, visually inspect, and prepare our data for the modelling process. The data set is loaded and selected from the file system by pressing the Load data button or by selecting Load data from the File menu. Locate the file Nile.csv in the data folder of the TSL install folder.

**Important**: The data set should be in column format with headers. The file format should be *.xls(x), *.csv, or *.txt, with commas as the field separator. The program (purposely) does not sort the data, so the data should be in the correct time series order before you load it into the program.
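Because TSL does not sort the data, it can be useful to check a file before loading it. The snippet below is a minimal sketch of such a check in Python, outside TSL; the column names `date` and `Nile`, the sample values, and the function name are hypothetical and only illustrate the "column format with headers" requirement.

```python
import csv
import io


def check_column_format(csv_text, time_column=0):
    """Return (headers, n_rows, is_sorted) for comma-separated text with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    headers, data = rows[0], rows[1:]
    # Every data row should have exactly one field per header.
    if any(len(row) != len(headers) for row in data):
        raise ValueError("each row must have one field per header")
    # ISO-style dates (YYYY or YYYY-MM-DD) sort correctly as plain strings.
    times = [row[time_column] for row in data]
    is_sorted = all(a <= b for a, b in zip(times, times[1:]))
    return headers, len(data), is_sorted


# Hypothetical file content: a time column followed by one series.
sample = "date,Nile\n1871,1120\n1872,1160\n1873,963\n"
print(check_column_format(sample))
```

If `is_sorted` comes back as `False`, reorder the rows in the source file before loading it.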

After loading, click on the name **Nile** in the database field. If we click the **arrow bar** at the right side of the screen, a new area unfolds which shows us **Data characteristics** of the selected time series. It shows that the Nile time series has a length of $T = 100$ observations with $0$ missing values, among other characteristics. The TSL window should look like the Figure below.

#### Data inspection and preparation page

The highlighted variable **Nile** also appears in the **Select dependent variable** drop-down
menu. This is the so-called y-variable of the time series equation and it is the time series
variable of interest, i.e. the time series variable you want to model, analyse, and forecast. Optionally, a time series axis can be specified. The program's algorithm tries to auto-detect
the time axis specification (e.g. annual data, daily data) from the first column of the data
set. In the case of the Nile data illustration, it finds an annual time axis specification. If
auto-detection fails, the program selects the Index axis option, which is just a number for each
observation, $1, 2, 3, \ldots$

#### Pre-built models

Click on the **Pre-built models** button in the button bar at the left of your screen. Switch on the Local Level model. Make sure this is the only selected model; see also the **Model selection** summary in the blue pane at the bottom of the screen. Select a 100%/0% ratio for the **Training** and **Validation** samples. The settings are shown in the figure below. Click the **Process Dashboard** button, which is the green arrow located at the bottom right of your screen. After pressing this button, two things happen:

- TSL estimates the selected models and prints results to the **Text output page**. The results are: progress results from the optimizer and **model fit** of the selected models.
- Once processing of the selected models is complete, TSL plots the information it found and shows the **Graphics page**.
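For reference, the Local Level model is usually written in the following textbook form (see Durbin and Koopman, 2012). The notation is the standard one from the state space literature and is not taken from the TSL output: $y_t$ is the observed series, $\mu_t$ the unobserved level, and $\sigma^2_\varepsilon$ and $\sigma^2_\eta$ the two variances that are estimated.

$$
\begin{aligned}
y_t &= \mu_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma^2_\varepsilon), \\
\mu_{t+1} &= \mu_t + \eta_t, & \eta_t &\sim N(0, \sigma^2_\eta), \qquad t = 1, \ldots, T.
\end{aligned}
$$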

#### Model selection page with Local Level model selected

#### Graphical output

After processing the selected models, TSL automatically takes you to the Graphics page. Components, or combinations of components, can be easily plotted and removed from the plot by checking or unchecking the tickboxes in the top left corner of the page. You can add subplots as well to create a grid of plots.

Click on a (sub)plot to activate it. Notice that by clicking on a subplot, the check-boxes in the top left of the window correspond to the current selection of lines in the subplot. If not all checkbox settings correspond with the lines in the subplot, switch tabs to show the rest of the selection.

To see what is meant by the text above, switch from Smoothing to Filtering. You now see
that the Level checkbox is **unchecked** because the level that is currently plotted corresponds
to the Smoothed level and not the Filtered level. The Smoothed level in the plot does not
automatically switch to a Filtered level when you change the type, because we sometimes want to
compare Smoothed, Filtered, and Predicted components in one plot. If you click on Level, the
resulting graph should look like the following figure.

#### Time Series Lab Graph page

#### Missing data

We continue this case study with a version of the Nile data with missing values to illustrate
one of the many advantages of using TSL, namely the capability of **easily** handling missing
values. Missing values can occur in time series for a variety of reasons.

Missing data can cause problems for some time series algorithms. Such algorithms often resort to deleting the missing observations or filling them in with substitute values. In TSL there is no need to rely on such drastic measures. Missing values are part of time series analysis and they should be handled in a correct manner.
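To make this concrete, the sketch below shows how the standard Kalman filter for a local level model treats a missing observation (see Durbin and Koopman, 2012): the update step is simply skipped and the prediction is carried forward. This is a minimal illustration under stated assumptions, not TSL's own implementation; the function name, the variance values, and the short input series are all illustrative.

```python
def local_level_filter(y, var_eps, var_eta, a1=0.0, p1=1e7):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t.

    Missing observations are passed as None. Returns the predicted levels,
    the filtered levels, and the variances of the predicted levels.
    """
    a, p = a1, p1  # large p1: rough stand-in for a diffuse initial level
    predicted, filtered, predicted_var = [], [], []
    for obs in y:
        predicted.append(a)
        predicted_var.append(p)
        if obs is None:
            # Nothing to learn from: skip the update, keep the prediction.
            a_filt, p_filt = a, p
        else:
            gain = p / (p + var_eps)       # Kalman gain
            a_filt = a + gain * (obs - a)  # pull the level towards the observation
            p_filt = p * (1.0 - gain)
        filtered.append(a_filt)
        # Prediction step: the level carries over, its uncertainty grows.
        a, p = a_filt, p_filt + var_eta
    return predicted, filtered, predicted_var


# Illustrative values only (not TSL estimates).
y = [1120.0, 1160.0, None, None, 1210.0]
pred, filt, pvar = local_level_filter(y, var_eps=15000.0, var_eta=1500.0)
for t, (obs, f, v) in enumerate(zip(y, filt, pvar), start=1):
    print(t, obs, round(f, 1), round(v, 1))
```

During a gap the level estimate stays flat while its variance keeps growing, so the first observation after the gap receives a larger gain than it otherwise would.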

Go back to the Database page of TSL and select the **Nile_missing** time series by clicking
on the name. We see that the **Data characteristics** are updated by selecting the new time
series. It shows us 40 missing values, among other characteristics. The TSL window should
now look like the following figure.

#### Data inspection and preparation page

We will estimate two models and compare them. Click on the **Pre-built models**
button in the button bar at the left of your screen and switch on the models **Exponential
Smoothing** and **Local Level**. Make sure these are the only selected models; see also the **Model
selection** summary in the blue pane at the bottom of the screen. Select a 100%/0% ratio for the
**Training** and **Validation** samples and click the **Process Dashboard** button, which is the green
arrow located at the bottom right of your screen.

#### Comparing results

Go to the **Graphics and diagnostics** page and click the **Clear all** button (eraser icon, bottom
right) to start with a clean graph window. From the **Individual** tab select Y data to plot the
Nile_missing time series. From the drop-down menu select the Exp Smoothing model and
plot the **Total signal** from the **Composite** tab. Next, from the drop-down menu select the
Local Level model and make sure the **Type** in the top left corner says **Predicting**, followed
by plotting the **Total signal** from the **Composite** tab. The resulting graph should look like the
figure below. We see that the Local Level model reacts more strongly to changes in the time
series after periods of missing values.

#### Time Series Lab graph page

We can also see the difference in model fit expressed in numbers. Go to the **Text output**
page where at the end of the estimation, model fit of the selected models is summarized.
Looking at **in-sample MSE** we see that the loss of the Local Level model is lower.

```
Variable: Nile_missing
Model(s):
  TSL005  Exp Smoothing
  TSL006  Local Level

                                          TSL005     TSL006
Log likelihood                                 -    -380.01
Akaike Information Criterion (AIC)             -     766.02
Bias corrected AIC (AICc)                      -     766.44
Bayesian Information Criterion (BIC)           -     772.30
in-sample MSE                           23735.33   23069.80
... RMSE                                  154.06     151.89
... MAE                                   119.86     118.60
... MAPE                                   14.06      13.70
Sample size                                  100        100
Effective sample size                         99         99
* based on one-step-ahead forecast errors
```
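The fit measures in this table can be reproduced with a few lines of Python. The sketch below uses the textbook definitions of the error measures and information criteria; it is not TSL's own code, and the function names are illustrative. The values $k = 3$ parameters and $n = 60$ non-missing observations used at the end are inferred from the reported numbers (100 observations minus 40 missing) and are an assumption, not a documented TSL convention.

```python
import math


def error_measures(actual, predicted):
    """MSE, RMSE, MAE and MAPE (in percent), skipping pairs with a missing value."""
    pairs = [(a, p) for a, p in zip(actual, predicted)
             if a is not None and p is not None]
    n = len(pairs)
    mse = sum((a - p) ** 2 for a, p in pairs) / n
    mae = sum(abs(a - p) for a, p in pairs) / n
    mape = 100.0 * sum(abs((a - p) / a) for a, p in pairs) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


def information_criteria(loglik, k, n):
    """Textbook AIC, bias-corrected AIC and BIC for k parameters and n observations."""
    aic = -2.0 * loglik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = -2.0 * loglik + k * math.log(n)
    return aic, aicc, bic


# RMSE is the square root of MSE, for both models in the table.
print(round(math.sqrt(23735.33), 2), round(math.sqrt(23069.80), 2))

# Assumed k = 3 and n = 60; compare with the Local Level column (TSL006).
print([round(v, 2) for v in information_criteria(-380.01, k=3, n=60)])
```

Under these assumptions the AIC and BIC match the table, and the AICc agrees up to rounding of the reported log likelihood.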

If you want to compare models and conclude something like "model A is better than model B", it is important to note that looking only at in-sample (Training sample) model fit can
be misleading. It is often a good idea to take **forecast performance** into account as well. If
model A performs better on both model fit and **forecast performance**, it is a good indication
that model A is to be preferred over model B. We see examples of comparing forecast performance
in other case studies.

The forecasts of both our models can be visually inspected on the **Forecasting** page.
The figure below plots the forecasts of both models in one graph. Since no new data is coming in,
the forecasts are just straight lines, but the level (height) of the lines differs per model. Note
that the local level model is not just a theoretical model; it has practical value as well. For
example, for inflation modelling, the local level model is a strong contender. We will see more
complex forecasting patterns in other case studies.

#### Time Series Lab forecast page

#### Outliers and Structural breaks

**Intervention analysis**, also called **anomaly detection**, is an important part of time series analysis.
We distinguish two types of anomalies: **outliers** and **structural breaks**. For example, early
warning systems rely on outlier and break detection. Could a catastrophic event have been seen in advance? Take for example sensor readings from an important piece of heavy machinery.
The breaking down of this machine would cost a company a lot of money. If anomalies were
detected in the sensor readings, preventive maintenance might have saved the company from
a breakdown of the machine.

Intervention variables are dummy (or indicator) variables which are used to take account
of outlying observations and structural breaks. These data irregularities are usually thought
of as arising from a specific event, for example a strike in the case of an outlier or a change
in policy in the case of a structural break. An outlier can be thought of as an unusually
large value of the irregular disturbance at a particular time. It can be captured by an impulse
intervention variable which takes the value one at the time of the outlier and zero elsewhere.
A structural break in which the level of the series shifts up or down is modelled by a step
intervention variable which is zero before the event and one after. Alternatively, it can be
modelled in exactly the same way by adding an outlier intervention to the level equation. In
other words, the break is identified with an unusually large value of the level disturbance.

TSL is able to propose a set of potential outliers and structural breaks for time series.
It uses an effective multi-step procedure based on the auxiliary residuals; see Harvey and Koopman (1992) for details. First the selected model is estimated and the diagnostics are
investigated. Then a first (larger) set of potential outliers and trend breaks is selected from
the auxiliary residuals. After re-estimation of the model, only those interventions survive
that are sufficiently significant. After the automatic selection, the results are reported. All
considered outliers and breaks are kept in the intervention dialog and they can be deleted
from the model or added to the model.
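The two kinds of intervention variable described above are easy to write down explicitly. The following is a minimal sketch with illustrative function names and a made-up time axis; it assumes the convention that the step variable switches to one from the event time onwards.

```python
def impulse_dummy(times, event):
    """Impulse intervention: one at the time of the outlier, zero elsewhere."""
    return [1.0 if t == event else 0.0 for t in times]


def step_dummy(times, event):
    """Step intervention: zero before the event, one from the event onwards."""
    return [1.0 if t >= event else 0.0 for t in times]


# Made-up annual time axis with an event in 2003.
years = list(range(2001, 2007))
print(impulse_dummy(years, 2003))  # [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(step_dummy(years, 2003))     # [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
```

Each such variable enters the model as a regressor, and its estimated coefficient measures the size of the outlier or of the level shift.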

The Nile time series has some interesting features with regard to intervention analysis.
To see this, go back to the **Database** page and select the Nile time series again without
missing values. Next, go to the **Build your own model** page and select a time-varying level
and time-varying slope. These two model components correspond to a model with the name
**Local Linear Trend model**. On top of that, select **Intervention variables** with the **automatic
setting**. Next, go to the **Estimation page**, make sure the sample starts at $t = 1$ and ends at
$t = 100$ and click the green **Estimate** button. Once TSL is done estimating, you should see
the graph as presented in the figure below.

We see from the figure that TSL finds a structural break and an outlier. We can also
inspect these in more detail by looking at the **Text output** page, where we see:
```
Beta                        Value   Std.Err   t-stat         Prob
beta_outlier_1913-01-01    -389.4    123.92   -3.143       0.0022
beta_break_1899-01-01      -265.5     43.67   -6.079   2.4458e-08
```

TSL locates the structural break at 1899, which is very plausible since the year 1899 corresponds to the building of a dam at Aswan. Interestingly, adding the outlier and the structural break removes certain dynamics from the data: the straight lines in the graph are the result of the (close to) zero variances of the Level and Slope components.
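As a quick check on the regression output, each t-statistic is the estimated coefficient divided by its standard error. The sketch below recomputes the ratios from the rounded values printed by TSL; the function name is illustrative, and small differences in the last digit are expected because the inputs are rounded.

```python
def t_statistic(value, std_err):
    """t-statistic of a coefficient: the estimate divided by its standard error."""
    return value / std_err


# Rounded values copied from the Text output above.
estimates = {
    "beta_outlier_1913-01-01": (-389.4, 123.92),
    "beta_break_1899-01-01": (-265.5, 43.67),
}
for name, (value, std_err) in estimates.items():
    print(f"{name}: t = {t_statistic(value, std_err):.3f}")
```

Both ratios are well above 2 in absolute value, which is in line with the small probabilities in the last column.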

#### Graph page with structural break in Nile data

#### Further exploration

- On the graph page, plot the Autocorrelation Function (ACF) of the **Predicted standardized residuals** for the Local Level model. Are all plotted lags within the confidence bounds?
- Performing diagnostic tests can be done via the **Print diagnostics** button located on the Graph page. Can you print the **Residual diagnostics** for the Exponential Smoothing model? Are all **Probabilities** for the **Normality** test above 0.05?
- Outliers and structural breaks can be added (and removed) manually to (from) the model by selecting the **Manual** option of the Intervention variables. Estimate a Local Level model with only the structural break.
- On the **Estimation** page, specify the end of the estimation sample at 90 instead of 100. You have now created a test sample which you can use to analyse **out-of-sample** forecast accuracy. Estimate the model with the new sample (1 - 90). You should see a new button (**Model comparison**) appear on the bottom left (button bar left) of the screen which allows you to do a forecast comparison with other models.
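The idea behind the last exercise can be sketched outside TSL as follows: hold back the final observations, forecast them from the estimation sample only, and measure the errors. In this minimal sketch the forecast is a naive flat line at the last observed value, which is a simple stand-in and not the forecast TSL produces; the function names and the input series are illustrative.

```python
import math


def split_sample(series, n_train):
    """Split a series into an estimation sample and a test sample."""
    return series[:n_train], series[n_train:]


def flat_forecast_rmse(train, test):
    """Out-of-sample RMSE of a flat forecast equal to the last training value."""
    forecast = train[-1]  # naive stand-in for a model-based forecast
    mse = sum((obs - forecast) ** 2 for obs in test) / len(test)
    return math.sqrt(mse)


# Illustrative series of 100 values; observations 1-90 are used for estimation.
series = [float(t) for t in range(1, 101)]
train, test = split_sample(series, 90)
print(len(train), len(test), round(flat_forecast_rmse(train, test), 3))
```

Comparing such out-of-sample errors across models is what the **Model comparison** button automates.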

# Bibliography

### References

Durbin, J. and Koopman, S. J. (2012). *Time Series Analysis by State Space Methods*. Oxford University Press.

Harvey, A. C. (1989). *Forecasting, Structural Time Series Models and the Kalman Filter*. Cambridge University Press. doi:10.1017/CBO9781107049994

Harvey, A. C. and Koopman, S. J. (1992). Diagnostic checking of unobserved-components time series models. *Journal of Business & Economic Statistics*, 10(4), 377–389.