Case Studies

If you're interested in time series analysis and forecasting, this is the right place to be. The Time Series Lab (TSL) software platform makes time series analysis available to anyone with a basic knowledge of statistics. Future versions will remove even that requirement by providing fully automated forecasting systems. The platform is designed so that results can be obtained quickly and verified easily. At the same time, many advanced time series and forecasting operations are available for experts. In our case studies, we often present screenshots of the program so that you can easily replicate the results.

Did you know you can take a screenshot of a TSL program window? Press Ctrl + p to open a window that lets you save a screenshot of the program. The TSL window should be located on your main monitor.

Click on the buttons below to go to our case studies. At the beginning of each case study, the required TSL package is mentioned. Our first case study, about the Nile data, is meant to illustrate the basic workings of the program and we advise you to start with that one.

Energy

Author: Rutger Lit
Date: July 04, 2022
Software: Time Series Lab - Home Edition
Topics: seasonal effects and intervention variables
Batch program: energy.txt

UK GAS consumption

The Energy data set that comes bundled with TSL contains quarterly data on UK energy consumption. The data set is well suited for illustration because it exhibits time-varying seasonality and strong outliers. Load the Energy data set and select the variable ofuGASl, where the trailing l means that logarithms of the original data were taken. The trend is upwards, with some increase in the rate of growth in the years following the introduction of cheaper natural gas from the North Sea at the end of the 1960s.

Energy consumption without intervention variables

Select a time-varying level, time-varying slope, and time-varying seasonal on the Build your own model page. Since the data is quarterly, the seasonal period is $s = 4$. Go to the Estimation page and click the Estimate button. Once estimation has finished, go to the Text output page, where you find (partially) the following output:


--------------------------------- PARAMETER SUMMARY ---------------------------------

Variance of disturbances:

Variance type                       Value        q-ratio
Level variance                 3.7108e-07     2.2775e-04
Slope variance                 7.4626e-06         0.0046
Seasonal variance              8.3846e-04         0.5146
Irregular variance                 0.0016         1.0000


Seasonal short properties:

Period                              Value        Std.Err        t-stat        Prob
1                                  0.6082         0.0804         7.562  1.7114e-11
2                                 -0.0884         0.0766        -1.153      0.2514
3                                 -0.6695         0.0658       -10.167      0.0000
4                                  0.1497         0.0573         2.611      0.0104

                                    Value                                     Prob
Seasonal chi2 test                  79.15                               4.6816e-17

--------------------------------- MODEL FIT ---------------------------------

Model: TSL001
variable: ofuGASl

                                               TSL001
Log likelihood                                83.1412   
Akaike Information Criterion (AIC)          -148.2824   
Bias corrected AIC (AICc)                   -146.4456   
Bayesian Information Criterion (BIC)        -124.1432   
in-sample MSE                                  0.0108   
... RMSE                                       0.1039   
... MAE                                        0.0696   
... MAPE                                       1.2493   
Sample size                                       108   
Effective sample size                             103  
* based on one-step-ahead forecast errors
                            

Under Seasonal short properties we see the average value of each seasonal period. The seasonal component evolves around zero. The estimates tell us that gas consumption is on average higher in Q1 and Q4 (> 0) and lower in Q2 and Q3 (< 0), which is not surprising given temperature effects. An important statistic is the Seasonal chi2 test, which tests the joint significance of the seasonal component. The individual seasonal effects are not all statistically different from zero (the p-value of Q2 is 0.2514), but the seasonal component as a whole is strongly significant, with a p-value of 4.6816e-17.
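Because ofuGASl is in logarithms, each seasonal effect can also be read as a multiplicative factor on the trend in the original units. The following is a minimal Python sketch of that reading, using only the four values printed under Seasonal short properties. The variable names are illustrative and not part of TSL.

```python
import math

# Average seasonal effects (log scale) copied from the TSL001 output above.
seasonal_effects = {"Q1": 0.6082, "Q2": -0.0884, "Q3": -0.6695, "Q4": 0.1497}

# The seasonal component evolves around zero, so the effects should sum to ~0.
total = sum(seasonal_effects.values())

# exp(effect) is the multiplicative factor relative to the trend in original units.
factors = {q: math.exp(v) for q, v in seasonal_effects.items()}

print(f"sum of effects: {abs(total):.4f}")
for quarter, factor in factors.items():
    print(f"{quarter}: factor {factor:.3f} ({(factor - 1) * 100:+.1f}% vs. trend)")
```

Under this reading, Q1 consumption is roughly 1.84 times the trend and Q3 consumption roughly 0.51 times the trend.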

Next, go to the Graph page and construct a figure with four subplots (use the add subplot button at the bottom right) with the following content:

  • Top left: y data and smoothed level
  • Top right: smoothed seasonal
  • Bottom left: smoothed residuals
  • Bottom right: y data and Total signal, zoomed in on the period 1968 − 1973.

The figure should look like the figure below. The top left panel shows the trend; there is a strong increase in gas usage with the introduction of natural gas from the North Sea in the early 1970s. The top right panel shows the seasonal component; because the data is in logarithms, it can be read as a multiplicative effect on the trend. The greater dispersion in the seasonal pattern over time is due to a higher proportion of gas being used for heating as usage increased in the 1970s. The bottom left panel shows two spikes in the residuals around 1970. The Total signal (bottom right) tracks the data accurately except for a discrepancy in 1970.

Basic Structural Model for energy consumption

The figure shows the result of fitting a model with time-varying level, slope, and seasonal ($s = 4$). The top left panel shows the data and the smoothed level. The top right panel shows the smoothed seasonal. The bottom left panel shows the smoothed residuals. The bottom right panel shows the data and the total signal, zoomed in on the period 1968 − 1973.
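For reference, a model with a time-varying level, slope, and seasonal is commonly written as the basic structural model below. This is the standard textbook form with a dummy-variable seasonal. The symbols are chosen here for illustration, and the exact seasonal parameterization that TSL uses internally is not stated in this case study.

\[ y_t = \mu_t + \gamma_t + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2_\varepsilon), \]
\[ \mu_{t+1} = \mu_t + \nu_t + \eta_t, \qquad \eta_t \sim N(0, \sigma^2_\eta), \]
\[ \nu_{t+1} = \nu_t + \zeta_t, \qquad \zeta_t \sim N(0, \sigma^2_\zeta), \]
\[ \gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j} + \omega_t, \qquad \omega_t \sim N(0, \sigma^2_\omega), \]

where $y_t$ is the observed series (here ofuGASl), $\mu_t$ is the level, $\nu_t$ is the slope, and $\gamma_t$ is the seasonal with $s = 4$. In this notation, $\sigma^2_\eta$, $\sigma^2_\zeta$, $\sigma^2_\omega$, and $\sigma^2_\varepsilon$ correspond to the Level, Slope, Seasonal, and Irregular variance in the parameter summary.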

Let's investigate the residuals further. On the Graph page, press the Print diagnostics button, select Outlier and break diagnostics, and click Continue. TSL prints the following on the Text output page:


Diagnostic output for:
model: TSL001
variable: ofuGASl

Outlier and break diagnostics

Values larger than 3.00 for Irregular residual:
Period                              Value             Prob
1970-07-01                          4.287       1.9773e-05
1970-10-01                         -3.972       6.4463e-05

No values larger than 3.00 for Level residual:

No values larger than 3.00 for Slope residual:
                            

Outliers are identified from large (in absolute value) Irregular residuals and structural breaks from large Level residuals. The outlier periods are identified as 1970-07-01 (Q3) and 1970-10-01 (Q4); no level or slope breaks are flagged.
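The flagging rule can be sketched in a few lines of Python. The sketch assumes the rule is applied to the absolute value of the residual, which is consistent with the negative value for 1970-10-01 being reported. Only the two residuals printed by TSL are used, and the helper name quarter_label is hypothetical.

```python
from datetime import date

# Irregular residuals reported by TSL for model TSL001.
# Only the two flagged periods are printed in the diagnostics output.
reported = {date(1970, 7, 1): 4.287, date(1970, 10, 1): -3.972}

THRESHOLD = 3.00  # cut-off shown in the diagnostics output


def quarter_label(d: date) -> str:
    """Map a period start date to a label such as '1970Q3'."""
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"


# A period is flagged when the absolute residual exceeds the threshold.
outliers = [quarter_label(d) for d, value in reported.items() if abs(value) > THRESHOLD]
print(outliers)  # ['1970Q3', '1970Q4']
```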

Energy consumption with intervention variables

We use the same model as in the last section and add Intervention variables to the model, i.e. switch on Intervention variables on the Build your own model page and select the Automatically option. Estimate the model.
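An outlier intervention is commonly implemented as a pulse dummy: a regressor that equals 1 in the outlier period and 0 elsewhere, so that its coefficient absorbs the one-off deviation. The Python sketch below assumes that form; how TSL constructs its intervention variables internally is not shown in this case study. For brevity it covers only the 1968 to 1973 window used in the zoomed panel, and the function name pulse_dummy is hypothetical.

```python
# Quarterly period labels for 1968Q1 .. 1973Q4 (the window used in the zoomed panel).
periods = [f"{year}Q{q}" for year in range(1968, 1974) for q in range(1, 5)]


def pulse_dummy(periods, outlier_period):
    """Return 1.0 at the outlier period and 0.0 everywhere else."""
    return [1.0 if p == outlier_period else 0.0 for p in periods]


# One regressor per outlier detected in the previous section.
dummies = {p: pulse_dummy(periods, p) for p in ("1970Q3", "1970Q4")}

for name, column in dummies.items():
    print(name, "active at index", column.index(1.0), "sum =", sum(column))
```

Each dummy enters the model as an extra explanatory variable, which is why the estimation output gains one coefficient per outlier.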


--------------------------------- PARAMETER SUMMARY ---------------------------------

Variance of disturbances:

Variance type                       Value        q-ratio
Level variance                 2.4771e-04         0.4515
Slope variance                 5.3951e-06         0.0098
Seasonal variance              5.4863e-04         1.0000
Irregular variance             1.0363e-04         0.1889


Seasonal short properties:

Period                              Value        Std.Err        t-stat        Prob
1                                  0.6117         0.0616         9.933      0.0000
2                                 -0.0862         0.0574        -1.501      0.1363
3                                 -0.6582         0.0479       -13.738      0.0000
4                                  0.1327         0.0412         3.223      0.0017

                                    Value                                     Prob
Seasonal chi2 test                  197.0                               1.8447e-42

Intervention coefficients:

Beta                                Value        Std.Err        t-stat        Prob
beta_outlier_1970-07-01            0.4023         0.0531         7.571  1.7910e-11
beta_outlier_1970-10-01           -0.3375         0.0531        -6.352  6.1672e-09

--------------------------------- MODEL FIT ---------------------------------

Model: TSL002
variable: ofuGASl

                                               TSL002
Log likelihood                               108.4862   
Akaike Information Criterion (AIC)          -194.9724   
Bias corrected AIC (AICc)                   -192.2224   
Bayesian Information Criterion (BIC)        -165.4690   
in-sample MSE                                  0.0095   
... RMSE                                       0.0976   
... MAE                                        0.0659   
... MAPE                                       1.1722   
Sample size                                       108   
Effective sample size                             101  
* based on one-step-ahead forecast errors
                            

We see that TSL correctly finds the outliers that were presented at the end of the last section and that the corresponding probabilities are close to zero, meaning the outliers are strongly significant. Another way of assessing the contribution of the added outliers is to perform a likelihood ratio test (LR test) between the models with and without the outliers. An LR test compares two log-likelihood values and tests whether the difference is statistically significant given a number of degrees of freedom. The null hypothesis of the LR test is
\[ H_0: \text{the smaller model provides as good a fit for the data as the larger model,} \]
with the test statistic given by
\[ -2 [\ell(\theta_0) - \ell(\theta_1)], \]
where $\ell(\theta_i)$ is the log-likelihood of model $i$, with $i = 0$ the smaller model (without outliers) and $i = 1$ the larger model (with outliers).

The LR statistic for our two log-likelihoods is $-2\,(83.1412 - 108.4862) = 50.69$. Under the null hypothesis, the statistic is Chi-square distributed with degrees of freedom equal to the difference in the number of parameters of the two models, which is 2 in our case (2 extra outlier variables). The probability belonging to our LR statistic is 9.8416e-12, so we strongly reject the null hypothesis.

A graph similar to the previous figure, with the outliers added, is presented in the figure below.
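The LR test can be reproduced from the printed output alone. The Python sketch below uses only the standard library and relies on two standard identities: AIC = 2k − 2 log-likelihood, which recovers the number of parameters k of each model, and the fact that the Chi-square survival function with exactly 2 degrees of freedom is exp(−x/2). The helper name implied_num_params is hypothetical.

```python
import math

# Log-likelihoods and AIC values copied from the two MODEL FIT tables.
loglik_without, aic_without = 83.1412, -148.2824   # TSL001
loglik_with, aic_with = 108.4862, -194.9724        # TSL002


def implied_num_params(loglik, aic):
    """Number of parameters k implied by AIC = 2k - 2*loglik."""
    return round((aic + 2.0 * loglik) / 2.0)


# Degrees of freedom: difference in parameter counts between the two models.
df = implied_num_params(loglik_with, aic_with) - implied_num_params(loglik_without, aic_without)

# LR statistic: -2 * [l(theta_0) - l(theta_1)], smaller model minus larger model.
lr_stat = -2.0 * (loglik_without - loglik_with)

# The closed form below is valid only for exactly 2 degrees of freedom.
assert df == 2
p_value = math.exp(-lr_stat / 2.0)

print(f"df = {df}, LR = {lr_stat:.2f}, p-value = {p_value:.4e}")
```

This gives 2 degrees of freedom, an LR statistic of about 50.69, and a p-value of about 9.8e-12, which agrees with the reported figures up to rounding of the printed log-likelihoods.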

Basic Structural Model + Interventions for energy consumption

The figure shows the result of fitting a model with time-varying level, slope, seasonal ($s = 4$), and intervention variables. The top left panel shows the data and the smoothed level + interventions. The top right panel shows the smoothed seasonal. The bottom left panel shows the smoothed residuals. The bottom right panel shows the data and the total signal, zoomed in on the period 1968 − 1973.

Compared to the first figure, we see two important differences. First, the spikes in the residuals are gone. Second, the discrepancy in 1970 in the Total signal has disappeared.