The FTSE 100 and financial time series
12 January 2010 | Tagged as: ARIMA neural-networks
What are Share Indices?
A Share Index is a weighted average of the leading companies in a market, based on their market capitalisation. Market capitalisation is a company's share price multiplied by the number of its shares. This figure is used as a weighting, so that movements in the share price of larger companies have a greater effect on the index than movements in smaller companies.
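To make the weighting concrete, here is a minimal sketch in Python. The three companies, their prices and their share counts are invented for illustration, and a real index calculation involves further adjustments that are ignored here.

```python
# Hypothetical companies: (share price in pounds, number of shares).
companies = {
    "BigCo": (10.0, 1_000_000),
    "MidCo": (5.0, 400_000),
    "SmallCo": (2.0, 100_000),
}

def market_caps(data):
    """Market capitalisation = share price x number of shares."""
    return {name: price * shares for name, (price, shares) in data.items()}

def total_cap_after_move(data, name, pct_change):
    """Total market capitalisation after one company's price moves by pct_change."""
    moved = dict(data)
    price, shares = moved[name]
    moved[name] = (price * (1 + pct_change), shares)
    return sum(market_caps(moved).values())

caps = market_caps(companies)
base = sum(caps.values())
weights = {name: cap / base for name, cap in caps.items()}

# The same 10% price rise moves the total far more for the larger company.
big_effect = total_cap_after_move(companies, "BigCo", 0.10) / base - 1
small_effect = total_cap_after_move(companies, "SmallCo", 0.10) / base - 1
print(weights)
print(f"BigCo +10%: total {big_effect:+.2%}; SmallCo +10%: total {small_effect:+.2%}")
```

With these made-up figures, BigCo carries roughly 82% of the weight, so its 10% rise lifts the total by about 8%, while the same rise in SmallCo barely registers.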
The popularity of investing directly in these indices has increased over recent years. Share Indices provide a useful means for allowing individuals or organisations to invest in the overall market movement rather than take the higher risk of selecting individual securities.
The companies that make up these indices come from all over the business spectrum. As a result of this diversity and also due to additional influences exerted by other markets, there are a great many factors that can affect their value. This makes the task of forecasting their movements an extremely difficult one. Even so, much time and effort is spent by institutions and investors analysing their behaviour.
The FTSE 100 index
The first index to be calculated on the market traded at the London Stock Exchange was known as the Financial Times Ordinary Share Index. This began in 1935 and was based on the top 30 companies.
As time went on, the 30 Ordinary Share Index became an inadequate measure, so the Financial Times Stock Exchange 100 Index was formed in 1984, increasing the number of companies listed to 100. The index began at a level of 1000 and represented 77% of the capitalisation of the whole market. It is calculated every fifteen seconds from 8.30am to 4.30pm.
The figure shows how the index closing price has fluctuated between 26/09/2001 and 02/07/2004.
Financial Time Series Analysis
As the value of a Share Index rises and falls over time, we can view this combination of price change and time as a ‘financial time series’. A time series can be defined as a collection of observations made sequentially through time. It is said to be continuous when observations are made continuously through time and discrete when observations are taken at specific times, usually equally spaced.
These definitions can be applied to the movements of the FTSE 100 closing price over time and available techniques for analysis and prediction are therefore applicable. Traditional methods of analysis are primarily concerned with decomposing the variations in the time series into components representing long term trends, seasonal and other cyclical variations. The figure below shows an example of seasonal and long term variations.
After these variations have been removed, additional fluctuations usually remain. Whether or not these are random, however, is a contentious issue.
Fundamental Analysts study economic and political features: everything that makes prices what they are. They believe that there are far too many random influencing factors, so these fluctuations must be down to the price following a random walk. In other words, the price at time t equals the price at t-1 plus some random element.
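The random walk idea can be sketched in a few lines of Python. The starting level, the step size and the choice of normally distributed steps are illustrative assumptions, not estimates from the FTSE 100 data.

```python
import random

def random_walk(start, steps, sigma, seed=0):
    """Simulate price[t] = price[t-1] + e[t], with e[t] drawn from N(0, sigma^2)."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(steps):
        prices.append(prices[-1] + rng.gauss(0.0, sigma))
    return prices

# Illustrative values only: start at 4400 and take 100 steps of typical size 30.
walk = random_walk(start=4400.0, steps=100, sigma=30.0)

# Each day's change is just the random element, independent of earlier changes.
increments = [b - a for a, b in zip(walk, walk[1:])]
print(walk[:5])
```

Plotting `walk` for a few different seeds tends to produce series that look as if they contain trends, even though every step is random by construction.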
Technical Analysts on the other hand are more concerned with the study of the market itself and trends in the price and traded volume. They have techniques such as Autoregressive (AR) and Moving Averages (MA) modelling to try to find additional shorter trends in this remaining data.
AR refers to using past values of the series to predict itself, and MA refers to the concept of smoothing the data by using an average of the past n days.
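Both ideas can be sketched briefly, following the descriptions above. The function names are made up for this example, the AR fit is a plain least-squares line through consecutive pairs of values, and the ten closing values are taken from the table of index values further down this article. Note that within ARIMA models the MA component is usually defined in terms of past forecast errors rather than past prices; the smoothing version shown here is the simpler notion.

```python
def moving_average(values, n):
    """Average of each run of n consecutive values (the 'past n days')."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]

def fit_ar1(values):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = values[:-1], values[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    return my - phi * mx, phi

# First ten closing values from the table in this article.
closes = [4404.95, 4396.05, 4377.73, 4412.01, 4408.12,
          4461.49, 4442.90, 4515.57, 4515.04, 4524.31]

smoothed = moving_average(closes, n=3)
c, phi = fit_ar1(closes)
next_value = c + phi * closes[-1]   # one-step-ahead AR(1) prediction
print(smoothed[0], phi, next_value)
```

Ten points are far too few for a meaningful fit; the sample is kept short only so the arithmetic is easy to follow.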
A Traditional Methodology for Forecasting Time Series
Box and Jenkins developed a systematic approach for identifying forecasting models which incorporates both AR and MA techniques. Known as Autoregressive Integrated Moving Average (ARIMA), it is made up of three stages: Model Identification, Estimation and Validation.
Model Identification
The first step in the process is to determine if the series is stationary and/or seasonal. A series is said to be stationary if the mean, variance and autocorrelation remain constant over time. This means that the series is flat and not trending. A common way to make a series stationary is to take differences. The table below shows a sample of Index values together with their first difference.
| Date | Index Value | 1st Diff | Date | Index Value | 1st Diff |
|---|---|---|---|---|---|
| 10/02/2004 | 4404.95 | * | 22/04/2004 | 4571.83 | 31.963 |
| 11/02/2004 | 4396.05 | -8.900 | 23/04/2004 | 4569.95 | -1.876 |
| 12/02/2004 | 4377.73 | -18.319 | 26/04/2004 | 4571.85 | 1.891 |
| 13/02/2004 | 4412.01 | 34.283 | 27/04/2004 | 4575.68 | 3.838 |
| 16/02/2004 | 4408.12 | -3.894 | 28/04/2004 | 4524.48 | -51.204 |
| 17/02/2004 | 4461.49 | 53.375 | 29/04/2004 | 4519.53 | -4.947 |
| 18/02/2004 | 4442.90 | -18.591 | 30/04/2004 | 4489.69 | -29.844 |
| 19/02/2004 | 4515.57 | 72.663 | 04/05/2004 | 4547.23 | 57.545 |
| 20/02/2004 | 4515.04 | -0.523 | 05/05/2004 | 4569.53 | 22.298 |
| 23/02/2004 | 4524.31 | 9.271 | 06/05/2004 | 4516.17 | -53.362 |
| 24/02/2004 | 4496.76 | -27.551 | 07/05/2004 | 4498.37 | -17.798 |
| 25/02/2004 | 4507.55 | 10.783 | 10/05/2004 | 4395.16 | -103.210 |
| 26/02/2004 | 4515.89 | 8.341 | 11/05/2004 | 4454.72 | 59.564 |
| 27/02/2004 | 4492.21 | -23.672 | 12/05/2004 | 4412.93 | -41.794 |
| 01/03/2004 | 4537.00 | 44.789 | 13/05/2004 | 4453.81 | 40.880 |
| 02/03/2004 | 4540.11 | 3.111 | 14/05/2004 | 4441.79 | -12.019 |
| 03/03/2004 | 4525.13 | -14.982 | 17/05/2004 | 4403.02 | -38.771 |
| 04/03/2004 | 4559.07 | 33.939 | 18/05/2004 | 4414.41 | 11.391 |
| 05/03/2004 | 4547.08 | -11.990 | 19/05/2004 | 4471.80 | 57.394 |
| 08/03/2004 | 4553.75 | 6.670 | 20/05/2004 | 4428.71 | -43.094 |
| 09/03/2004 | 4542.01 | -11.744 | 21/05/2004 | 4431.43 | 2.724 |
| 10/03/2004 | 4545.33 | 3.327 | 24/05/2004 | 4428.87 | -2.561 |
| 11/03/2004 | 4445.22 | -100.115 | 25/05/2004 | 4418.00 | -10.872 |
| 12/03/2004 | 4467.35 | 22.130 | 26/05/2004 | 4438.29 | 20.283 |
| 15/03/2004 | 4412.93 | -54.418 | 27/05/2004 | 4453.62 | 15.334 |
| 16/03/2004 | 4428.90 | 15.964 | 28/05/2004 | 4430.69 | -22.931 |
| 17/03/2004 | 4456.80 | 27.903 | 01/06/2004 | 4422.68 | -8.009 |
| 18/03/2004 | 4397.87 | -58.931 | 02/06/2004 | 4422.80 | 0.118 |
| 19/03/2004 | 4417.74 | 19.867 | 03/06/2004 | 4435.41 | 12.609 |
| 22/03/2004 | 4333.77 | -83.966 | 04/06/2004 | 4454.45 | 19.042 |
| 23/03/2004 | 4318.51 | -15.259 | 07/06/2004 | 4491.60 | 37.150 |
| 24/03/2004 | 4309.45 | -9.065 | 08/06/2004 | 4504.83 | 13.227 |
| 25/03/2004 | 4373.63 | 64.188 | 09/06/2004 | 4489.47 | -15.352 |
| 26/03/2004 | 4357.53 | -16.104 | 10/06/2004 | 4486.10 | -3.369 |
| 29/03/2004 | 4406.73 | 49.200 | 11/06/2004 | 4483.96 | -2.150 |
| 30/03/2004 | 4412.82 | 6.094 | 14/06/2004 | 4433.17 | -50.782 |
| 31/03/2004 | 4385.67 | -27.150 | 15/06/2004 | 4458.61 | 25.441 |
| 01/04/2004 | 4410.71 | 25.040 | 16/06/2004 | 4491.13 | 32.515 |
| 02/04/2004 | 4465.61 | 54.894 | 17/06/2004 | 4493.29 | 2.162 |
| 05/04/2004 | 4480.70 | 15.090 | 18/06/2004 | 4505.81 | 12.515 |
| 06/04/2004 | 4472.82 | -7.879 | 21/06/2004 | 4502.18 | -3.629 |
| 07/04/2004 | 4468.69 | -4.130 | 22/06/2004 | 4468.49 | -33.682 |
| 08/04/2004 | 4489.67 | 20.983 | 23/06/2004 | 4486.73 | 18.232 |
| 13/04/2004 | 4515.78 | 26.108 | 24/06/2004 | 4503.19 | 16.460 |
| 14/04/2004 | 4485.42 | -30.357 | 25/06/2004 | 4494.05 | -9.136 |
| 15/04/2004 | 4505.50 | 20.078 | 28/06/2004 | 4518.68 | 24.631 |
| 16/04/2004 | 4537.28 | 31.775 | 29/06/2004 | 4512.41 | -6.270 |
| 19/04/2004 | 4546.22 | 8.940 | 30/06/2004 | 4464.07 | -48.343 |
| 20/04/2004 | 4569.02 | 22.803 | 01/07/2004 | 4424.72 | -39.345 |
| 21/04/2004 | 4539.87 | -29.152 | 02/07/2004 | 4407.40 | -17.322 |
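The first difference column is simply each value minus the one before it. A minimal sketch, using the first five index values from the table (the function name is an assumption for this example):

```python
def first_difference(values):
    """Return x[t] - x[t-1] for t = 1 .. len(values) - 1."""
    return [b - a for a, b in zip(values, values[1:])]

# First five index values from the table above.
index_values = [4404.95, 4396.05, 4377.73, 4412.01, 4408.12]
diffs = first_difference(index_values)
print([round(d, 2) for d in diffs])
```

The results agree with the table to two decimal places. The third decimal shown in the table presumably comes from index values that were held to more precision than the two decimals displayed.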
Once it has been established that the series is stationary, you need to identify which model (AR, MA or both) your data fits. This is done by examining plots of its Autocorrelation (ACF) and Partial Autocorrelation (PACF) Functions. These are measures of how related data values are to each other.
An ACF with large spikes at initial lags that decay to zero and a PACF with a large spike at the first and possibly at the second lag indicates an autoregressive process. An ACF with a large spike at the first and possibly at the second lag and a PACF with large spikes at initial lags that decay to zero indicates a moving average process. The ACF and the PACF both exhibiting large spikes that gradually die out indicates that both autoregressive and moving average processes are present.
The figure below shows the Autocorrelation Function of the first difference of our sample data set.
The figure below shows the Partial Autocorrelation Function of the first difference of our sample data set.
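For readers who want to see what lies behind such plots, here is a rough sketch of the sample ACF and PACF in plain Python. The function names are assumptions for this example, the PACF uses the Durbin-Levinson recursion (statistical packages may use a different estimator), and only the first ten differences from the table are used, so the numbers are purely illustrative.

```python
def acf(values, max_lag):
    """Sample autocorrelation r_k for k = 0 .. max_lag."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(values, max_lag):
    """Sample partial autocorrelations for lags 1 .. max_lag (Durbin-Levinson)."""
    r = acf(values, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

# First ten first differences from the table above.
diffs = [-8.900, -18.319, 34.283, -3.894, 53.375,
         -18.591, 72.663, -0.523, 9.271, -27.551]

r = acf(diffs, max_lag=3)
p = pacf(diffs, max_lag=3)
bound = 1.96 / len(diffs) ** 0.5   # rough 95% band for a purely random series
print(r, p, bound)
```

Spikes that stay inside plus or minus `bound` are commonly treated as indistinguishable from noise, which is the sense in which a spike counts as "large" or "small".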
Estimation
ARIMA is usually shown in the format ARIMA(p, d, q), where p is the number of autoregressive terms, d is the number of differences and q is the number of moving average terms. This next step involves the estimation of these coefficients. In general, parameters are selected and then the validation step is performed. If the model is found to be unacceptable, new parameters are tested. Each parameter is a number between 0 and 5, with the sum of all three not exceeding 10.
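The search space described by those limits is small enough to list outright. A quick sketch (the function name and default arguments are assumptions for this example):

```python
from itertools import product

def candidate_orders(max_each=5, max_total=10):
    """All (p, d, q) with each value in 0..max_each and p + d + q <= max_total."""
    return [(p, d, q)
            for p, d, q in product(range(max_each + 1), repeat=3)
            if p + d + q <= max_total]

orders = candidate_orders()
print(len(orders), (1, 1, 1) in orders)
```

In practice the ACF and PACF plots are used to narrow this list to a handful of candidates before anything is fitted, rather than trying every combination.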
Our sample data set shows small spikes throughout the plot. The shortness of these spikes suggests a randomness about the data. This makes the identification of a model quite difficult. However, as both plots are very similar this would suggest that both AR and MA elements exist. Through trial and error, the model ARIMA(1, 1, 1) has been chosen.
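In symbols, a non-seasonal ARIMA(1, 1, 1) model can be written as follows. The notation is an assumption added for illustration rather than taken from the Minitab output: $y_t$ is the index value, $w_t$ its first difference, $c$ a constant, $\phi_1$ and $\theta_1$ the AR and MA coefficients, and $\varepsilon_t$ the random error. The sign placed in front of $\theta_1$ differs between textbooks and software packages.

$$
w_t = y_t - y_{t-1}, \qquad w_t = c + \phi_1 w_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}
$$

Here d = 1 appears as the single differencing step, p = 1 as the one lagged value of $w_t$, and q = 1 as the one lagged error term.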
Validation
The final step is to examine the residuals, which are the data left over once the model has been fitted and should be just noise. Plots of the ACF and PACF of the residuals are examined for any large spikes. If any appear, then new parameters should be selected.
The figure below shows the Autocorrelation Function of residuals for our data set.
The figure below shows the Partial Autocorrelation Function of residuals for our data set.
Once it is established that the remaining spikes are due to noise, the model can be used to forecast. The figure below is a time series plot for our data set which shows a 5-day forecast. The forecast values are displayed as red triangles and the upper and lower 95% confidence limits are displayed as blue triangles.
Listed below is the printout from Minitab showing various key statistics together with the forecasted values.
ARIMA model for IndexValue
Estimates at each iteration
| Iteration | SSE | Parameters | | | | |
|---|---|---|---|---|---|---|
| 0 | 224259 | 0.100 | 0.100 | 0.100 | 0.100 | -1.223 |
| 1 | 167665 | 0.038 | -0.050 | 0.162 | 0.249 | -1.404 |
| 2 | 158166 | -0.112 | 0.029 | 0.020 | 0.385 | -1.488 |
| 3 | 148504 | -0.262 | 0.086 | -0.123 | 0.508 | -1.569 |
| 4 | 137685 | -0.412 | 0.123 | -0.267 | 0.632 | -1.658 |
| 5 | 123858 | -0.562 | 0.116 | -0.406 | 0.766 | -1.812 |
| 6 | 110027 | -0.631 | -0.034 | -0.446 | 0.837 | -2.092 |
| 7 | 105695 | -0.586 | -0.184 | -0.378 | 0.846 | -1.983 |
| 8 | 105283 | -0.594 | -0.229 | -0.398 | 0.850 | -1.898 |
| 9 | 105243 | -0.569 | -0.242 | -0.373 | 0.851 | -1.823 |
| 10 | 105239 | -0.583 | -0.246 | -0.391 | 0.852 | -1.832 |
| 11 | 105239 | -0.569 | -0.247 | -0.374 | 0.852 | -1.808 |
| 12 | 105238 | -0.580 | -0.248 | -0.388 | 0.852 | -1.823 |
| 13 | 105238 | -0.570 | -0.247 | -0.376 | 0.852 | -1.809 |
| 14 | 105238 | -0.579 | -0.248 | -0.386 | 0.852 | -1.821 |
| 15 | 105238 | -0.571 | -0.248 | -0.377 | 0.852 | -1.810 |
| 16 | 105238 | -0.578 | -0.248 | -0.385 | 0.852 | -1.819 |
| 17 | 105238 | -0.572 | -0.248 | -0.378 | 0.852 | -1.811 |
| 18 | 105238 | -0.577 | -0.248 | -0.384 | 0.852 | -1.818 |
| 19 | 105238 | -0.573 | -0.248 | -0.379 | 0.852 | -1.812 |
| 20 | 105238 | -0.576 | -0.248 | -0.384 | 0.852 | -1.817 |
| 21 | 105238 | -0.573 | -0.248 | -0.380 | 0.852 | -1.813 |
| 22 | 105238 | -0.576 | -0.248 | -0.383 | 0.852 | -1.817 |
| 23 | 105238 | -0.574 | -0.248 | -0.380 | 0.852 | -1.814 |
| 24 | 105238 | -0.575 | -0.248 | -0.382 | 0.852 | -1.816 |
| 25 | 105238 | -0.574 | -0.248 | -0.381 | 0.852 | -1.814 |
Final Estimates of Parameters
| Type | Lag | Coef | SE Coef | T | P |
|---|---|---|---|---|---|
| AR | 1 | -0.5744 | 0.3866 | -1.49 | |
| SAR | 13 | -0.2477 | 0.1260 | -1.97 | |
| MA | 1 | -0.3812 | 0.4293 | -0.89 | |
| SMA | 22 | | 0.1109 | 7.69 | 0.000 |
| Constant | | -1.8145 | 0.9366 | -1.94 | 0.056 |
Differencing: 1 regular, 1 seasonal of order 13
Number of observations: Original series 100, after differencing 86
Residuals: SS = 93938.9 (backforecasts excluded), MS = 1159.7, DF = 81
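The counts in this part of the printout can be checked by hand. The sketch below reproduces them under the assumption that the five estimated parameters are the four AR, SAR, MA and SMA terms plus the constant, which is how the degrees of freedom appear to have been derived.

```python
# Figures quoted in the Minitab printout above.
n_original = 100
regular_diffs, seasonal_period = 1, 13
n_params = 5            # AR, SAR, MA, SMA and the constant
ss_residual = 93938.9

# One regular difference costs 1 observation; one seasonal difference of order 13 costs 13.
n_after = n_original - regular_diffs - seasonal_period
df = n_after - n_params
ms = ss_residual / df   # mean square = sum of squares / degrees of freedom
print(n_after, df, round(ms, 1))
```

This gives 86 observations after differencing, 81 degrees of freedom and a mean square of 1159.7, matching the printout.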
Modified Box-Pierce (Ljung-Box) Chi-Square statistic:
| Lag | 12 | 24 | 36 | 48 |
|---|---|---|---|---|
| Chi-Square | 15.7 | 22.4 | 50.6 | 60.7 |
| DF | 7 | 19 | 31 | 43 |
| P-Value | 0.028 | 0.266 | 0.015 | 0.039 |
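The Ljung-Box statistic summarises the residual autocorrelations up to a given lag in a single number. Below is a minimal sketch of the usual formula; the residuals are randomly generated stand-ins, since the actual residuals are not listed in this article, so the value produced will not match the table.

```python
import random

def ljung_box(residuals, max_lag):
    """Ljung-Box Q: n(n+2) * sum over k = 1..max_lag of r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q

# Hypothetical residuals, generated only to exercise the function.
rng = random.Random(1)
fake_residuals = [rng.gauss(0.0, 34.0) for _ in range(86)]
q12 = ljung_box(fake_residuals, max_lag=12)

# Degrees of freedom as in the printout: lag minus the number of estimated parameters.
df12 = 12 - 5
print(round(q12, 2), df12)
```

The statistic is compared against a chi-square distribution with the stated degrees of freedom. A small p-value, conventionally below 0.05, suggests that the residuals still contain some autocorrelation rather than being pure noise.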
Forecasts from period 100. 95 Percent Limits
| Period | Forecast | Lower | Upper | Actual |
|---|---|---|---|---|
| 101 | 4383.90 | 4317.14 | 4450.66 | |
| 102 | 4372.02 | 4286.23 | 4457.80 | |
| 103 | 4361.06 | 4255.64 | 4466.47 | |
| 104 | 4375.08 | 4255.23 | 4494.93 | |
| 105 | 4379.39 | 4245.60 | 4513.17 | |
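The width of the first interval can be roughly reconstructed from the residual mean square. This sketch assumes that the one-step-ahead standard error is approximately the square root of MS and that the limits are the forecast plus or minus 1.96 standard errors; Minitab's exact calculation may differ slightly.

```python
import math

ms_residual = 1159.7     # residual mean square from the printout
forecast_101 = 4383.90   # point forecast for period 101
z_95 = 1.96              # approximate two-sided 95% normal quantile

# Assumption: the one-step-ahead standard error is roughly sqrt(MS).
half_width = z_95 * math.sqrt(ms_residual)
lower, upper = forecast_101 - half_width, forecast_101 + half_width
print(round(lower, 2), round(upper, 2))
```

The result lands within a couple of hundredths of the 4317.14 and 4450.66 shown for period 101. The intervals for later periods are wider because uncertainty accumulates with each additional step ahead.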