现代谱估计:多通道奇异值谱/主成份分析

最新推荐文章于 2022-05-30 22:06:46 发布

Big_quant

最新推荐文章于 2022-05-30 22:06:46 发布

阅读量1k

点赞数

文章标签：现代谱估计信号处理

Multi-channel Singular-Spectrum/Principal Component Analysis

Multi-channel SSA (or M-SSA) is a natural extension of SSA to a multivariate time series of vectors or maps, such as time-varying temperature or pressure distributions over the globe.

Let {Xl,n: l =1,…,L; n = 1,….,N} be an L-channel time series with N data points given at equally spaced intervals. We assume that each channel l is centered and stationary in the weak sense.

In the meteorological literature, extended EOF (EEOF) analysis is often assumed to be synonymous with M-SSA The two methods are both extensions of classical principal component analysis (PCA) but they differ in emphasis: EEOF analysis typically utilizes a number L of spatial channels much greater than the number M of temporal lags, thus limiting the temporal and spectral information. In M-SSA, on the other hand, based on the single-channel experience, one usually chooses L &le M. Often M-SSA is applied to a few leading PCA components of the spatial data, with M chosen large enough to extract detailed temporal and spectral information from the multivariate time series.

Both SSA (L=1) and conventional PCA (M=1) analysis are special cases of EEOF or M-SSA analysis. Both algorithms can be understood in practice in terms of a “window.”

These windows are illustrated schematically in the following figure:

axes are spatial coordinate (or spatial PC label) x, time t, and lag s; discrete values of these variables are labeled by l, n, and j, respectively.

Standard PCA slides a flat and narrow window, of length 1 and width L, over the data set’s N fields, each of which contains L data points. PCA thus identifies the spatial patterns, i.e., the EOFs, which account for a high proportion of the variance in the N views of the data set thus obtained. Equivalently, PCA can be described as sliding a long and narrow, N x 1 window across the L input channels and identifying high-variance temporal patterns, i.e. the PCs, in the corresponding L views. In the above figure, the former view of things corresponds to sliding the 1 x L window parallel to the x-axis along the t-axis. In the latter view, one slides an N x 1 window that starts out by lying along the t-axis parallel to the x-axis. This latter case is also somewhat analogous to the trajectory-matrix version of single-channel SSA, except that in SSA we reduce the length of this long and narrow window to M, and consider only one channel.

These two different window interpretations carry over into M-SSA. In the first case one proceeds from spatial PCA to EEOFs. To do so, we extend our 1 x L window by M lags to form an M x L window that lies in the horizontal (x,s)-plane. By moving this window along the t-axis, we search for spatio-temporal patterns, i.e. the EEOFs, that maximize the variance in the N’=N-M+1 overlapping views of the time series thus obtained. The EEOFs are the eigenvectors of the LM x LM lag-covariance matrix. This approach is a basis for computing covraiance matrix by Toeplitz or trajectory-matrix method.

The second conceptual route leads us from single-channel SSA to M-SSA. To follow this route, we reduce the length of the N x 1 window to N’ x 1. This window still lies initially along the t-axis and we search for temporal patterns, i.e. the N’-long PCs, that maximize the variance in the M x L views of the time series. This is equivalent to SSA of the univariate time series formed by stringing together all the channels of the original multi-channel time series end-to-end, with the complementary window N’ playing the role of M in SSA. This approach is a basis for computing reduced covraiance matrix in M-SSA.
Toeplitz method

The generalization of Toeplitz method of SSA to a multivariate time series requires the constructiuon of a “grand” block-matrix TX that has form:

Each block Tl,l’ is a matrix that contains estimates of the lag covariance between channels l and l’. The entries (j, j’) of each block Tl,l’ can be obtained with a “least-biased” estimator by using, for each lag m {m = 0,….,M}, the longest-possible segment of each channel:

where the normalizing factor depends on the range of summation:

Note that, unlike in the single-channel case, here Tl,l’ is Toeplitz but not symmetric. Its main diagonal contains the estimate of the lag-zero covariance of channels l and l’. The diagonals in the upper-right triangle of Tl,l’ contain the lag-m covariance of channels l and l’, with l’ leading l, while the diagonals in the lower-left triangle contain the covariances with l leading l’. In result Tl,l’ = (Tl,l’)t so that TX is symmetric, but it is not Toeplitz. More generally, we only expect to obtain a grand matrix TX that is both symmetric and of Toeplitz form provided the spatial field being analyzed is statistically homogeneous, i.e., its statistics are invariant with respect to arbitrary translations and rotations.
Trajectory-matrix method

An alternative approach to computing the lagged cross-covariances is to form the multi-channel trajectory matrix by first augmenting each channel of X with M lagged copies of itself:

and then forming the full augmented trajectory matrix:

Thereafter, one computes the grand covariance matrix:

The blocks of CX are given by

with entries

where I=j+M(l-1) and J=j’+M(l’-1) Here Cl,l’ is neither Toeplitz nor symmetric, but again Cl,l’ = (Cl,l’)t and hence CX is symmetric.

Each block Cl,l’ contains, like Tl,l’ , an estimate of the lag covariances of the two channels l and l’. There is a difference between the Toeplitz method and the trajectory-matrix method in estimating the lag-covariance matrix, when applied to M-SSA. The Toeplitz method extracts, according to the fixed lag m that is being considered at the moment, the two longest matching segments from the two channels l and l’; the matching is determined by the requirement that, for each entry in the designated trailing channel, there exist an entry in the designated leading channel that is m time steps “ahead.” It then uses the same estimated lag covariance for all entries along the appropriate diagnoal in the block and thus yields a Toeplitz-form submatrix.

The trajectory-matrix method gradually slides, regardless of m, the same N’-long windows over the two channels. To compute the first entry in the diagonal that contains the lag-m covariances, the two matching windows are situated so that one starts at the first time point of the trailing channel and at the (m+1)st point of the leading channel. Both windows are then slid forward by one point in time to produce the second entry. This results in slightly different values, from entry to entry, until the last point of the leading channel is covered and the (M-m)th entry of the diagonal, which is the last one, is calculated. This latter method thus retains detailed information on the variation of the lag-covariance estimates from one pair of segments of the channels l and l’ to another, while the Toeplitz method produces a single, and smoother, global estimate for each lag m.

Diagonalizing the LM x LM matrix CX or TX yields LM eigenvectors {Ek: k =1,…,LM} that are not necessarily distinct. The extent to which the eigenpairs {&lambdak, Ek} obtained by diagonalizing CX equal those obtained from TX is a good indication of the robustness of the M-SSA results. Each eigenvector Ek is composed of L consecutive M-long segments, with its elements denoted by Ekl,m. The associated space-time PCs Ak are single-channel time series that are computed by projectiion onto the EOFs:

where t varies from 1 to N’=N-M+1.

Selected RCs are obtained by convolving the corresponding PCs with the EOFs. Thus, the kth RC at time t for channel l is given by:

The normalization factor Mt equals M, except near the ends of the time series, and the sum of all the RCs recovers the original time series, as it does in the single-channel case.
Reduced Covariance

The (reduced) PCs are the eigenvectors of the N’ x N’ reduced covariance matrix with the elements:

reduced covariance

This matrix, unlike the LM x LM matrix CX is given by:

reduced covariance

Whenever N’

References

Allen, M.R., and A.W. Robertson, 1996: Distinguishing modulated oscillations from coloured noise in multivariate datasets. Climate Dynamics, 12 , 775-784.

Ghil, M., and Vautard, R., 1991: Interdecadal oscillations and the warming trend in global temperature time series, Nature, 350, 324-327.

Groth A, Ghil M, 2011: Multivariate singular spectrum analysis and the road to phase synchronization, Phys Rev E 84(3 Pt 2), 036206.

Plaut, G., and R. Vautard, Spells of low-frequency oscillations and weather regimes in the Northern Hemisphere,J. Atmos. Sci., 51, 210 –236, 1994.

Preisendorfer, R. W., Principal Component Analysis in Meteorology and Oceanography, 425 pp., Elsevier Sci., New York, 1988.

Big_quant

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
现代谱估计:多通道奇异值谱/主成份分析

Multi-channel Singular-Spectrum/Principal Component AnalysisMulti-channel SSA (or M-SSA) is a natural extension of SSA to a multivariate time series of vectors or maps, such as time-varying temperat...
复制链接

扫一扫