forked from mnhrdt/ccmath
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathC11-tseries
725 lines (521 loc) · 27.8 KB
/
C11-tseries
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
Chapter 11
TIME SERIES ANALYSIS
Summary
Time series functions support the use of models
in practical applications requiring analysis
and/or forecasting of series. The basic
time-domain model is a series of the
autoregressive moving average (ARMA) type. The
software can successfully deal with
generalizations, such as integrated (ARIMA)
models and factor models. Specific areas covered
are:
o Generation and Prediction of ARMA Time Series
o Estimation of ARIMA Model Parameters
o Estimation of Factor Model Parameters
o Model Identification and Evaluation
o Time Series Utility Operations
Factor models are a powerful generalization of
the Box-Jenkins approach. They support the
extraction of meaningful quantitative measures
from series with highly correlated noise.
-------------------------------------------------------------------------------
Notes on Contents
The functions of this library segment are used for the analysis and
estimation of Box Jenkins time series models. These models have been
applied to a wide range of problems in forecasting, process control, and
the characterization of time series. Both standard ARIMA models and an
important generalization to factor models can be identified and estimated.
o Generation and Prediction of ARMA Time Series:
sarma -------- generate successive terms in an ARMA series.
parma -------- predict the values of an ARMA time series.
o Estimate the Parameters of an ARIMA Model:
evmod -------- compute the residual of an ARIMA series.
drmod -------- compute the model residual and the derivatives
with respect to model parameters.
seqts -------- perform a sequential update of model parameters.
fixts -------- perform a batch mode, Gauss-Newton update of
the ARIMA model parameters.
o Estimate the Parameters of a Factor Model:
evfmod ------- compute the residual in a time series Factor model.
drfmod ------- compute the model residual and the derivatives
with respect to model parameters for a Factor model.
seqtsf ------- perform a sequential update of Factor model parameters.
fixtsf ------- compute a batch mode, Gauss-Newton, update of Factor
model parameters.
o Model Identification and Evaluation:
sany -------- compute both direct and inverse autocorrelation
coefficients of a time series.
resid ------- analyze the residuals of a time series model.
The residuals of a model should be consistent with the assumption that
the series is driven by "white" noise. Several tests of this hypothesis are
applied in the resid() function.
o Time Series Utility Operations:
xmean ------- subtract the mean value of an input series from
each term.
sdiff ------- apply iterations of the sequential difference operator
to a series.
sintg ------- invert the sequential difference operation (ie.
integrate the series).
-------------------------------------------------------------------------------
General Technical Comments
The fundamental ARMA model underlying such applications is a stationary
stochastic process driven by uncorrelated "white" noise. The model equation
takes the form
x[i] = e[i] + Sum(j=1 to m) a[j]*x[i-la[j]] -
Sum(k=1 to n) b[k]*e[k-lm[k]] ,
where x[i] are the series values and e[i] are the values of the driving
noise. Nonstationary series can be addressed by employing differenced series,
with
x[i] = D^d{ y[i] }, D is the difference operator
D{ y[i] } = y[i] - y[i-1] ,
and d is the order of difference. Models with d>0 are referred to as ARIMA
models, with the I standing for integrated.
Factor models provide a powerful generalization of the time series model
in which a series has both a value y[i] and a condition code c(i) associated
with the "time" index i. The basic model equation takes the form
y[i] = f[c[i]] + v[i] ,
where v[i] is an ARIMA model series. The condition index c[i] specifies a
factor parameter f, which represents a mean value corresponding to the coded
condition. This condition concept is quite general. The index c[i] can be
used to distinguish physical conditions at different times, effects periodic
in time, or epochs separated by a change in the environment. Factor models
are used to estimate a quantitative measure of the effect of changing
conditions from highly correlated input series.
The estimation process for both ARIMA and Factor models employs a mix of
sequential and batch mode passes through the observed series. This
combination combines the powerful search capabilities of the sequential
technique with the refined estimates produced by using good initial
parameters in Gauss-Newton processing. Sequential estimation can also be used
to develop adaptive models.
-------------------------------------------------------------------------------
FUNCTION SYNOPSES
-------------------------------------------------------------------------------
External Variables for ARMA Models:
_______________________________________________________________________________
The header files arma.h and ccmath.h contain definitions of the basic
model specification structure used in the time series functions. One of
these header files must be included.
arma.h
#ifndef NULL
#define NULL 0L
#endif
struct mcof {double cf; int lag;};
The time series model's structure is specified in external arrays for
both autoregressive (par) and moving average (pma) parameters. Each element
of these structure arrays contains a model parameter (p->cf) and the lag
(p->lag) at which it acts. The external variables required for model
generation and estimation are listed below. Their declarations must appear
in any program that uses the ARMA model evaluation or estimation functions.
struct mcof *par,*pma;
int nar,nma,np;
par = pointer to store of structure array of autoregressive
parameters and lags (dimension=nar)
pma = pointer to store of structure array of moving average
parameters and lags (dimension=nma)
nar = number of autoregressive parameters
nma = number of moving average parameters
np = total number of parameters (np=nar+nma)
The estimation functions assume that parameters are stored in a single
structure array. Allocation of this storage is illustrated by the the
following sequence.
np=nar+nma;
par=(struct mcof *)calloc(np,sizeof(*par));
pma=par+nar;
-------------------------------------------------------------------------------
ARMA Model Generation and Prediction:
-------------------------------------------------------------------------------
sarma
Generate successive terms in an autoregressive moving average (ARMA)
series.
double sarma(double er)
double er;
er = value of the driving error input
return value: y = current term of series
The model is generated by
y[k] = er[k] + Sum(i=0 to nar-1){ par[i]->cf*y[k-par[i]->lag]}
- Sum(j=0 to nma-1){ pma[j]->cf*er[k-pma[j]->lag]} .
The simulation must be initialized by a call of setsim.
setsim
void setsim(int k)
k = control flag, with
k=1 -> initialize simulation storage
k=0 -> free the storage allocated by setsim(1)
The storage initialized by setsim contains no history.
Lagged values of er and y are initially zero. The
generator is normally run through a number of initial
steps before storing simulated output. This serves to
eliminate transient startup effects.
-----------------------------------------------------------------
parma
Predict values of an ARMA time series.
double parma(double *x,double *e)
x = pointer to storage of current series value
e = pointer to storage of current residual
return: xp = predicted value of series
Each call of parma generates a single step prediction.
Multi-step forward predictions are generated by multiple
calls with zeros loaded in the current residual. The
following code sequence generates a k-step prediction.
for(i=0; i<k ;++i){
*(x+i+1)=parma(x+i,e+i,m);
*(e+i+1)=0.;
}
This loop alters "future" values of both the series
and the residuals. The starting point (x,e) must be
retained to support repeated predictions.
-------------------------------------------------------------------------------
ARMA Model Evaluation:
-------------------------------------------------------------------------------
evmod
Compute the residual in an ARMA model.
double evmod(double y)
y = observed value of time series
return value: ee = y-yp, where yp is the value predicted by the model
The model prediction yp is based on the lagged values of the
observations and estimated errors computed by evmod.
yp[k] = Sum(i=0 to nar-1) {par[i]->cf*y[k-par[i]->lag]}
- Sum(j=0 to nma-1) {pma[j]->cf*ee[j-pma[j]->lag]} ,
where ee[k] = y[k] - yp[k] at "time" k. The lagged values of
observations and prediction error are automatically updated
by the function.
A call of setev must be used to initialize storage for evmod.
setev
void setev(int k)
k = control flag, with
k=1 -> initialize storage
k=0 -> free storage initialized by setev(1)
The storage initialized by setev contains no history.
Lagged values of ee and y are initially zero.
-----------------------------------------------------------------
drmod
Compute the model residual and the derivatives of the predictor with
respect to model parameters (core function in model estimation).
double drmod(double y,double *dr)
y = observed value of the time series
dr = pointer to array of derivatives of the predictor yp with
respect to model parameters (order nar-autoregressive
followed by nma-moving average, dimension = np)
return value: ee = y-yp, where yp is the value predicted by
the time series model
The derivatives evaluated by this function are
dr[i] = Del(par[i]->cf){yp} for i=0 to nar-1 , and
dr[nar+j] = Del(pma[j]->cf){yp} for j=0 to nma-1 ,
Here Del(t){x} denotes the partial derivative of x
with respect to the parameter t. The prediction
function yp is evaluated in the manner indicated
above (see evmod).
A call to setdr must be used to initialize internal storage.
setdr
setdr(int k)
k = control flag, with
k=1 -> initialize internal storage
k=0 -> free storage initialized by setdr(1)
The storage initialized by setdr contains no history
Lagged values of ee and y are initially zero.
-------------------------------------------------------------------------------
ARMA Model Estimation:
-------------------------------------------------------------------------------
The two estimation functions are complementary, since the sequential
form is an excellent search procedure while the Gauss-Newton algorithm
converges rapidly from a good estimate. In practice, the initial ARMA
parameters can normally be set to zero when estimation is started with
sequential passes.
seqts
Perform a sequential estimation update of the parameters of a
time series model.
double seqts(double *x,int n,double *var,int kf)
x = pointer to input array containing observed values of the series
n = length of the input series
var = pointer to matrix array (np by np) proportional
to the updated parameter variance matrix at exit
(the constant of proportionality = ssq/(n-np),
index order nar autoregressive parameters
followed by nma moving average parameters)
kf = control flag, with
kf=0 -> initialize var to the identity matrix
kf=1 -> use the input var matrix
return value: ssq = sum of the squares of the model residuals
ssq = Sum(i=0 to n-1) { (yp[i] - x[i])^2 } ,
with yp[i] = prediction i-1 -> i
---------------------------------------------------------------
fixts
Perform a fixed Gauss-Newton estimation iteration to fit time series
model parameters.
double fixts(double *x,int n,double *var,double *cr)
x = pointer to input array containing observed values of the series
n = length of the input series
var = pointer to matrix array (np by np) proportional
to the parameter variance matrix at exit
( the constant of proportionality = ssq/(n-np) )
cr = pointer to array of dimension np containing the parameter
corrections estimated by this function call
return value: ssq = sum of the squares of the model residuals
ssq = -1.0 -> singular var matrix
ssq = Sum(i=0 to n-1){ (yp[i] - x[i])^2 }
with yp[i] = prediction i-1 -> i
The index order for both cr and var is nar autoregressive
parameters followed by nma moving average parameters.
-------------------------------------------------------------------------------
External Variables for Factor Models:
-------------------------------------------------------------------------------
The basic factor model variables are delivered in a structure (of type
struct fmod), with components:
y.fac = integer encoding factor; and
y.val = corresponding value of series.
The structures employed in factor model estimation are defined in the
header files armaf.h and ccmath.h. One of these headers must be included.
armaf.h
#ifndef NULL
#define NULL 0L
#endif
struct mcof {double cf; int lag;};
struct fmod {int fac; double val;};
The structure of the model is specified in an external structure array,
including factor parameters (pfc), autoregressive parameters (par), and
moving average parameters (pma). Coefficients (p->cf) and lags (p->lag) are
required for autoregressive and moving average parameters. Only the
coefficient is employed for factors. Model parameters are delivered to
estimation and evaluation functions by the following external variables.
Their declarations must appear in any program that calls the factor model
evaluation and estimation functions.
struct mcof *pfc,*par,*pma;
int nfc,nar,nma,ndif,np;
pfc = pointer to store of structure array containing
factor parameters (dimension=nfc)
par = pointer to store of structure array of autoregressive
parameters and lags (dimension=nar)
pma = pointer to store of structure array of moving average
parameters and lags (dimension=nma)
nfc = number of factor parameters
nar = number of autoregressive parameters
nma = number of moving average parameters
np = total number of parameters (np=nfc+nar+nma)
ndif = order of difference used
The factor model estimation functions assume that the parameters are stored
in a single array. Allocation of external storage in the calling program
should employ a code sequence such as,
np=nfc+nar+nma;
pfc=calloc(np,sizeof(*pfc));
par=pfc+nfc; pma=par+nar;
if the estimation functions are to be used.
-------------------------------------------------------------------------------
Factor Model Evaluation:
-------------------------------------------------------------------------------
evfmod
Compute the residual in a factor model.
double evfmod(struct fmod y)
y = structure containing:
y.val = difference, of order ndif, of observed values
the of time series, and
y.fac = factor corresponding to current value
return value: ee = y.val - yp, where yp is the value predicted
by the model
The model prediction yp is based on the lagged values of the
observations and estimated errors computed by evfmod. The
prediction is simply an ARMA model prediction with a mean,
computed by differencing the factor averages, imposed.
yp = D^d{f(k)} + Sum(i=0 to nar-1){ par[i]->cf*y[k-par[i]->lag] }
- Sum(j=0 to nma-1){ pma[j]->cf*ee[k-par[k]->lag] } .
Here D denotes the difference operator (see sdiff),
pcf[m(t)]->cf = f(k) , y[t]->fac = m(t) , and d = ndif .
The one-step prediction errors ee are identical to those we
would obtain without differencing the values of the input series.
A call to setevf must be used to initialize internal storage.
setevf
void setevf(int k)
k = control flag, with
k=1 -> initialize storage
k=0 -> free storage initialized by setevf(1)
The storage initialized by setevf contains no history.
Lagged values of er, y.val, and factor indices are
initially zero.
-------------------------------------------------------------------
drfmod
Compute the model residual and the derivatives of the predictor with
respect to model parameters (core estimation function).
double drfmod(struct fmod y,double *dr)
y = structure containing:
y.val = difference, of order ndif, of observed values
the of time series, and
y.fac = factor corresponding to current value
dr = pointer to array of derivatives of the predictor yp
with respect to model parameters (order nfc-factor ,
nar-autoregressive , nma-moving average)
return value: ee = y.val - yp, where yp is the value predicted
by the model
The derivatives computed by drfmod() are:
dr[m] = Del(par[m]->cf){ yp} for m=0 to nfc-1,
dr[nfc+i] = Del(par[i]->cf){ yp} for i=0 to nar-1, and
dr[nfc+nar+j] = Del(pma[j]->cf){ yp} for j=0 to nma-1 .
Here Del(t){x} denotes the partial derivative of x
with respect to the parameter t. The prediction yp is
identical to that discussed above (see evfmod).
The internal storage used by drfmod must be initialized by a
call of setdrf.
setdrf
setdrf(int k)
k = control flag, with
k=1 -> initialize storage
k=0 -> free storage initialized by setdrf(1)
The storage initialized by setdrf contains no history.
Lagged values of er, y.val and factor indices are
initially zero.
-------------------------------------------------------------------------------
Factor Model Estimation:
-------------------------------------------------------------------------------
The two estimation functions are complementary, since the sequential
form is an excellent search procedure while the Gauss-Newton algorithm
converges rapidly from a good estimate. In practice, the initial ARMA
parameters can normally be set to zero and a series mean used for initial
factor parameters. The estimation starts with a sequential search, and
fixed estimation is employed for the final passes.
seqtsf
Perform a sequential estimation update of the parameters of a time
series factor model.
double seqtsf(struct fmod *x,int n,double *var,int kf)
x = pointer to input structure array containing
x[i].val = difference, of order ndif, of observed
values the of time series, and
x[i].fac = factor corresponding to "time" i
n = length of the input series
var = pointer to matrix array (np by np) proportional to
the updated parameter variance matrix at exit
(the constant of proportionality = ssq/(n-np),
index order nfc factor parameters, nar
autoregressive parameters, nma moving
average parameters)
kf = control flag, with
kf=0 -> initialize var to the identity matrix
kf=1 -> use the input var matrix
return value: ssq = sum of the squares of the model residuals
ssq = Sum(i=0 to n-1){ (yp[i] - x[i])^2 }
yp[i] = prediction i-1 > i
If ndif>0, the variance matrix, var, is constrained to be
orthogonal to the vector u, defined by:
u[i] = 1 for i=0 to nfc-1, and u[i] = 0 for i >= nfc.
Thus, V*u = 0 , where V[i,j] = var[np*i+j] , and the
computed variance matrix V is singular when ndif>0.
---------------------------------------------------------------------
fixtsf
Perform a fixed Gauss-Newton estimation iteration to update
parameters of a time series factor model.
double fixtsf(struct fmod *x,int n,double *var,double *cr)
x = pointer to input structure array containing
x[i].val = difference, of order ndif, of observed values
the of time series, and
x[i].fac = factor corresponding to "time" i
n = length of the input series
var = pointer to matrix array (np by np) proportional to the
updated parameter variance matrix at exit
(the constant of proportionality = ssq/(n-np) )
cr = pointer to array of dimension np containing the parameter
corrections estimated by this iteration
return value: ssq = sum of the squares of the model residuals
ssq = Sum(i=0 to n-1){ (yp[i] - x[i])^2 }
yp[i] = prediction i-1 -> i
The index order for both cr and var is nfc factor parameters
followed by nar autoregressive parameters followed by nma moving
average parameters. The constraint on var, defined above (see
seqtsf), applies when ndif>0.
-------------------------------------------------------------------------------
Model Identification and Evaluation:
-------------------------------------------------------------------------------
sany
Compute the direct and inverse autocorrelation coefficients
of a time series to support model identification.
int sany(double *x,int n,double *pm,double *cd,double *ci, \
int nd,int ms,int lag)
x = pointer to start of input series
(The series values are altered by the function, at exit the
power spectra is stored in the array, starting at x+ndif. )
n = length of input series
pm = pointer to store for mean value (xm) of series
cd = pointer to array to be loaded with direct
autocorrelation coefficients (i = 1 to lag)
cd[0] = 2nd moment of direct series
ci = pointer array to be loaded with inverse
autocorrelation coefficients (i = 1 to lag)
ci[0]= 2nd moment of inverse series
nd = order of differencing applied to input series
(If nd=0, the mean is subtracted.)
ms = order of smoothing of spectra
lag = maximum lag for which direct and inverse
autocorrelations are computed
return value: n = length of series used in Fourier
analysis of correlations
The direct and inverse autocorrelations are computed by applying
an inverse Fourier transform to the complex series
{ ps(w[j]) , 1./ps(w[j]) }, where ps(w[j])
is the power spectra of the input series. Direct autocorrelations
have a simple pattern for pure moving average series, while inverse
autocorrelations are simple for pure autoregressive series.
----------------------------------------------------------------------
resid
Perform an analysis of the residuals of a time series model.
int resid(double *x,int n,int lag,double **pac,int nbin, \
double xa,double xb,int **phs,int *cks)
x = pointer to start of input series of model residuals
(one-step prediction errors). (This series is altered
to an unsmoothed power spectra by the function.)
n = length of input series
pac = pointer to array containing:
*pac = sum of squares over series
*(pac+k) = kth autocorrelation coefficient, for k=1 to lag
lag = maximum autocorrelation lag desired
nbin = number of histogram bins ( bin size = (xb-xa)/nbin )
xa = lower limit of error histogram
xb = upper limit of error histogram
phs = pointer to store of error histogram, with
*(phs-1) = number of residuals <xa
*(phs+nbin) = number of residuals >xb
*(phs+i) = number in ith bin i=0,1, - ,nbin-1
cks = pointer to array containing Kolmogorav-Smirnov statistics,
based on cumulative power spectra, with
cks[0] = points outside .25 significance threshold
cks[1] = points outside .05 significance threshold
return value: n = series length used in power spectra computation
The autocorrelations have an asymptotic large sample distribution
with variance ra2 = 1/(n-np) , where np is the number of
model parameters employed. The cumulative spectra is computed using
I(j) = I(j-1) + ps(w[j]) + ps(w[j-1]) ,
for j = 1,2, -- ,n/2 , starting with I(0) = 0. Storage allocated
for autocorrelation and histograms may be released with free(pac)
and free(phs-1) respectively.
-------------------------------------------------------------------------------
Utility Series Operations:
-------------------------------------------------------------------------------
xmean
Subtract the mean value of a series from each term.
double xmean(double *x,int n)
x = pointer to start of input series
(at exit each term is modified by subtracting the mean xm)
n = length of the series
return value: xm = series mean
xm = { Sum(i=0 to n-1) x[i] }/n
-----------------------------------------------------------------
sdiff
Apply iterations of the sequential difference operator to a series.
double sdiff(double y,int nd,int k)
y = current value of time series (in sequence)
nd = order of differencing (1 <= nd <= 6)
k = control flag, with
k=0 -> set initial differences to zero
k!=0 -> normal operation
return value: u = value of series with difference order nd
----------------------------------------------------------------
sintg
Invert the differencing of a time series (integration).
double sintg(double y,int ndint ,k)
y = current value of series
nd = order of integration (1<= nd <=6
inverse of difference operation)
k = control flag, with
k=0 -> set initial sums to zero
k!=0 -> normal operation
return value: u = value of series integrated to order nd
The use of zeros for initialization implies that the first
correctly differenced (integrated) term in a sequence will
have index nd. The action of these operators is specified by
D^m{x[j]} = Sum(i=0 to m) { [m!/i!*(m-i)!](-1)^i *x[j-i] }
I^n{x[j]} = Sum(j=0 to n) { [n!/j!(n-j)!] *x[j-i] }