Estimation of signal parameters via rotational invariance techniques

Figure: Example of separation into subarrays (2-D ESPRIT)

In estimation theory, estimation of signal parameters via rotational invariance techniques (ESPRIT) is a technique to determine the parameters of a mixture of sinusoids in background noise. The technique was first proposed for frequency estimation.[1] However, with the introduction of phased-array systems in everyday technology, it is also used for angle-of-arrival estimation.[2]

General description

System model

The model under investigation is the following (1-D version):

y_m[t] = \sum_{k=1}^{K} a_{m,k}\, x_k[t] + n_m[t]

This model describes a system that is fed with K input signals x_k[t] ∈ ℂ, k = 1, …, K, and produces M output signals y_m[t] ∈ ℂ, m = 1, …, M. The system's output is sampled at discrete time instants t. All K input signals are weighted and summed; there is a separate weight a_{m,k} for each pair of input and output signals. The quantity n_m[t] ∈ ℂ denotes noise added by the system.

The one-dimensional form of ESPRIT can be applied if the weights have the following form:

a_{m,k} = e^{-j(m-1) w_k}

That is, the weights are complex exponentials whose phases are integer multiples of some radial frequency w_k. Note that this frequency depends only on the index of the system's input.

The goal of ESPRIT is to estimate the radial frequencies w_k given the outputs y_m[t] and the number of input signals K.

Since the radial frequencies are the actual objectives, we will change the notation from a_{m,k} to a_m(w_k):

y_m[t] = \sum_{k=1}^{K} a_m(w_k)\, x_k[t] + n_m[t]

Let us now change to a vector notation by putting the weights a_m(w_k) in a column vector a(w_k):

\mathbf{a}(w_k) := [1 \quad e^{-j w_k} \quad e^{-j 2 w_k} \quad \ldots \quad e^{-j(M-1) w_k}]^{\mathrm{T}}

Now, the system model can be rewritten using a(w_k) and the output vector y[t] as follows:

\mathbf{y}[t] = \sum_{k=1}^{K} \mathbf{a}(w_k)\, x_k[t] + \mathbf{n}[t]
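
To make the model concrete, here is a minimal NumPy sketch that synthesizes measurements according to this model. The parameter values (M, K, T, the frequencies in w, and the noise level) are illustrative choices, not part of the derivation:

import numpy as np

# Illustrative sizes and frequencies (our choice, not from the text)
M, K, T = 8, 2, 200
w = np.array([0.7, 1.9])                  # true radial frequencies w_k
rng = np.random.default_rng(0)

# Columns are the weight vectors a(w_k); A is an M x K Vandermonde matrix
A = np.exp(-1j * np.outer(np.arange(M), w))

# Complex source signals x_k[t] and additive noise n_m[t]
X = (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))) / np.sqrt(2)
N = 0.1 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))

# Measurements y[t] = sum_k a(w_k) x_k[t] + n[t], stacked as columns of Y
Y = A @ X + N

This running sketch is continued in the sections below.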

Dividing into virtual sub-arrays

Figure: Maximum overlapping of two sub-arrays (N denotes the number of sensors in the array, m the number of sensors in each sub-array, and J_1 and J_2 are selection matrices)

The basis of ESPRIT is that the weight vector a(w_k) has the property that adjacent entries are related as follows:

[\mathbf{a}(w_k)]_{m+1} = e^{-j w_k}\, [\mathbf{a}(w_k)]_m

In order to write down this property for the whole vector a(w_k), we define two selection matrices J_1 and J_2:

\begin{aligned} \mathbf{J}_1 &= [\mathbf{I}_{M-1} \quad \mathbf{0}] \\ \mathbf{J}_2 &= [\mathbf{0} \quad \mathbf{I}_{M-1}] \end{aligned}

Here, I_{M-1} is an identity matrix of size (M-1) × (M-1) and 0 is a column vector of zeros. The vector J_1 a(w_k) contains all elements of a(w_k) except the last one, and the vector J_2 a(w_k) contains all elements except the first one. Therefore, we can write:

\mathbf{J}_2 \mathbf{a}(w_k) = e^{-j w_k}\, \mathbf{J}_1 \mathbf{a}(w_k)
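
Continuing the NumPy sketch from above, this relation can be checked numerically for a single weight vector; dropping the last or the first entry plays the role of J_1 or J_2:

# a(w_1): first column of the Vandermonde matrix A from the sketch above
a = A[:, 0]

# J1 a keeps all but the last entry, J2 a keeps all but the first entry
assert np.allclose(a[1:], np.exp(-1j * w[0]) * a[:-1])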
In general, we have multiple sinusoids with radial frequencies w_1, w_2, …, w_K. Therefore, we put the corresponding weight vectors a(w_1), a(w_2), …, a(w_K) into a Vandermonde matrix A:

\mathbf{A} := [\mathbf{a}(w_1) \quad \mathbf{a}(w_2) \quad \ldots \quad \mathbf{a}(w_K)]

Moreover, we define a matrix H that has complex exponentials on its main diagonal and zeros in all other places:

\mathbf{H} := \begin{bmatrix} e^{-j w_1} & & & \\ & e^{-j w_2} & & \\ & & \ddots & \\ & & & e^{-j w_K} \end{bmatrix}

Now, the property of a(w_k) can be written down for the whole matrix A:

\mathbf{J}_2 \mathbf{A} = \mathbf{J}_1 \mathbf{A} \mathbf{H}

Note: H is multiplied from the right so that it scales each column of J_1 A by a single factor e^{-j w_k}.

In the next sections, we will use the following matrices:

\mathbf{A}_1 := \mathbf{J}_1 \mathbf{A}, \qquad \mathbf{A}_2 := \mathbf{J}_2 \mathbf{A}

Here, A_1 contains the first M-1 rows of A, while A_2 contains the last M-1 rows of A.

Hence, the basic property becomes:

\mathbf{A}_2 = \mathbf{A}_1 \mathbf{H}

Notice that H applies a rotation to each column of A_1 in the complex plane. ESPRIT exploits similar rotations identified from the covariance matrix of the measured data.
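
In the running NumPy sketch, the selection matrices and this rotational invariance look as follows:

# Selection matrices J1 = [I 0] (first M-1 rows) and J2 = [0 I] (last M-1 rows)
J1 = np.eye(M)[:-1, :]
J2 = np.eye(M)[1:, :]

# Diagonal matrix of complex exponentials
H = np.diag(np.exp(-1j * w))

# Rotational invariance J2 A = J1 A H (up to floating-point error)
assert np.allclose(J2 @ A, J1 @ A @ H)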

Signal subspace

The relation A_2 = A_1 H is the first major observation required for ESPRIT. The second major observation concerns the signal subspace that can be computed from the output signals y[t].

We will now look at multiple time instants t = 1, 2, 3, …, T. For each time instant, we measure an output vector y[t]. Let Y denote the matrix of size M × T comprising all of these measurements:

\mathbf{Y} := \begin{bmatrix} \mathbf{y}[1] & \mathbf{y}[2] & \dots & \mathbf{y}[T] \end{bmatrix}

Likewise, let us put all input signals x_k[t] into a matrix X:

\mathbf{X} := \begin{bmatrix} x_1[1] & x_1[2] & \dots & x_1[T] \\ x_2[1] & x_2[2] & \dots & x_2[T] \\ \vdots & \vdots & \ddots & \vdots \\ x_K[1] & x_K[2] & \dots & x_K[T] \end{bmatrix}

We do the same for the noise components:

\mathbf{N} := \begin{bmatrix} \mathbf{n}[1] & \mathbf{n}[2] & \dots & \mathbf{n}[T] \end{bmatrix}

The system model can now be written as

\mathbf{Y} = \mathbf{A}\mathbf{X} + \mathbf{N}

The singular value decomposition (SVD) of Y is given as

\mathbf{Y} = \mathbf{U}\mathbf{E}\mathbf{V}^{*}

where U and V are unitary matrices of sizes M × M and T × T, respectively. E is a rectangular diagonal matrix of size M × T that holds the singular values in descending order, starting with the largest at the top left. The operator * denotes the complex-conjugate transpose (Hermitian transpose).

Let us assume that T ≥ M, which means that the number of time instants T at which we measure is at least as large as the number of output signals M.

Notice that in the system model, we have K input signals. We presume that the K largest singular values stem from these input signals and that all other singular values stem from noise. That is, if there were no noise, there would only be K non-zero singular values. We will now decompose each SVD matrix into submatrices, where some submatrices correspond to the input signals and some to the noise:

\mathbf{U} = \begin{bmatrix} \mathbf{U}_{\mathrm{S}} & \mathbf{U}_{\mathrm{N}} \end{bmatrix}, \qquad \mathbf{E} = \begin{bmatrix} \mathbf{E}_{\mathrm{S}} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{E}_{\mathrm{N}} & \mathbf{0} \end{bmatrix}, \qquad \mathbf{V} = \begin{bmatrix} \mathbf{V}_{\mathrm{S}} & \mathbf{V}_{\mathrm{N}} & \mathbf{V}_{0} \end{bmatrix}

Here, U_S ∈ ℂ^{M×K} and V_S ∈ ℂ^{T×K} contain the first K columns of U and V, respectively, and E_S ∈ ℂ^{K×K} is a diagonal matrix comprising the K largest singular values. The SVD can be equivalently written as follows:

\mathbf{Y} = \mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}\mathbf{V}_{\mathrm{S}}^{*} + \mathbf{U}_{\mathrm{N}}\mathbf{E}_{\mathrm{N}}\mathbf{V}_{\mathrm{N}}^{*}

U_S, V_S, and E_S represent the contribution of the input signals x_k[t] to Y; therefore, we call U_S the signal subspace. In contrast, U_N, V_N, and E_N represent the contribution of the noise n_m[t] to Y.

Hence, by using the system model we can write:

\mathbf{A}\mathbf{X} = \mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}\mathbf{V}_{\mathrm{S}}^{*}
and
\mathbf{N} = \mathbf{U}_{\mathrm{N}}\mathbf{E}_{\mathrm{N}}\mathbf{V}_{\mathrm{N}}^{*}

By rearranging the former equation, we get:

\begin{aligned}
\mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}\mathbf{V}_{\mathrm{S}}^{*} &= \mathbf{A}\mathbf{X} && \cdot\, \mathbf{V}_{\mathrm{S}} \\
\mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}\underbrace{\mathbf{V}_{\mathrm{S}}^{*}\mathbf{V}_{\mathrm{S}}}_{\mathbf{I}} &= \mathbf{A}\mathbf{X}\mathbf{V}_{\mathrm{S}} \\
\mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}} &= \mathbf{A}\mathbf{X}\mathbf{V}_{\mathrm{S}} && \cdot\, \mathbf{E}_{\mathrm{S}}^{-1} \\
\mathbf{U}_{\mathrm{S}}\underbrace{\mathbf{E}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}^{-1}}_{\mathbf{I}} &= \mathbf{A}\mathbf{X}\mathbf{V}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}^{-1} \\
\mathbf{U}_{\mathrm{S}} &= \mathbf{A}\underbrace{\mathbf{X}\mathbf{V}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}^{-1}}_{=:\mathbf{F}} \\
\mathbf{U}_{\mathrm{S}} &= \mathbf{A}\mathbf{F}
\end{aligned}

That is, the signal subspace U_S is the product of the matrix A and some other matrix F. In the following, it is only important that such an invertible matrix F exists; its actual content will not be important.
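
In the running sketch, the signal subspace and the relation U_S = A F can be inspected numerically; with noise present, the equality holds only approximately:

# Signal subspace: the K dominant left singular vectors of Y
U, s, Vh = np.linalg.svd(Y)
Us = U[:, :K]

# Us = A F for some invertible F; with noise the residual is small but non-zero
F = np.linalg.lstsq(A, Us, rcond=None)[0]
print(np.linalg.norm(A @ F - Us))   # small residual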

Note:

The signal subspace is usually not computed from the measurement matrix Y. Instead, one may use the auto-correlation matrix:

\begin{aligned}
\mathbf{R}_{\mathrm{YY}} :=\ & \tfrac{1}{T}\sum_{t=1}^{T} \mathbf{y}[t]\mathbf{y}^{*}[t] \\
=\ & \tfrac{1}{T}\mathbf{Y}\mathbf{Y}^{*} = \tfrac{1}{T}\mathbf{U}\underbrace{\mathbf{E}\mathbf{E}^{*}}_{=:\mathbf{E}'}\mathbf{U}^{*}
\end{aligned}

Hence, R_YY can be decomposed into a signal subspace and a noise subspace:

\mathbf{R}_{\mathrm{YY}} = \tfrac{1}{T}\mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}'\mathbf{U}_{\mathrm{S}}^{*} + \tfrac{1}{T}\mathbf{U}_{\mathrm{N}}\mathbf{E}_{\mathrm{N}}'\mathbf{U}_{\mathrm{N}}^{*}
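
Equivalently, in the running sketch (R_YY is Hermitian, so an eigendecomposition can be used):

# Auto-correlation matrix and its eigendecomposition (eigenvalues ascending)
R = (Y @ Y.conj().T) / T
eigvals, eigvecs = np.linalg.eigh(R)

# Signal subspace: eigenvectors belonging to the K largest eigenvalues
Us_R = eigvecs[:, -K:]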

Putting things together

These are the two basic properties that are known now:

\mathbf{U}_{\mathrm{S}} = \mathbf{A}\mathbf{F}, \qquad \mathbf{J}_2\mathbf{A} = \mathbf{J}_1\mathbf{A}\mathbf{H}

Let us start with the equation on the right:

\begin{aligned}
\mathbf{J}_2\mathbf{A} &= \mathbf{J}_1\mathbf{A}\mathbf{H} && \text{using } \mathbf{U}_{\mathrm{S}} = \mathbf{A}\mathbf{F} \\
\mathbf{J}_2\mathbf{U}_{\mathrm{S}}\mathbf{F}^{-1} &= \mathbf{J}_1\mathbf{U}_{\mathrm{S}}\mathbf{F}^{-1}\mathbf{H} && \cdot\, \mathbf{F} \\
\mathbf{J}_2\mathbf{U}_{\mathrm{S}}\underbrace{\mathbf{F}^{-1}\mathbf{F}}_{\mathbf{I}} &= \mathbf{J}_1\mathbf{U}_{\mathrm{S}}\mathbf{F}^{-1}\mathbf{H}\mathbf{F} \\
\mathbf{J}_2\mathbf{U}_{\mathrm{S}} &= \mathbf{J}_1\mathbf{U}_{\mathrm{S}}\mathbf{F}^{-1}\mathbf{H}\mathbf{F}
\end{aligned}

Define these abbreviations for the truncated signal subspaces:

\mathbf{S}_1 := \mathbf{J}_1\mathbf{U}_{\mathrm{S}}, \qquad \mathbf{S}_2 := \mathbf{J}_2\mathbf{U}_{\mathrm{S}}

Moreover, define this matrix:

\mathbf{P} := \mathbf{F}^{-1}\mathbf{H}\mathbf{F}

Note that the right-hand side of the last equation has the form of an eigenvalue decomposition of P, where the eigenvalues are stored in the matrix H. As defined earlier, H stores complex exponentials on its main diagonal; their phases are the sought-after radial frequencies w_1, w_2, …, w_K.

Using these abbreviations, the following form is obtained:

\mathbf{S}_2 = \mathbf{S}_1\mathbf{P}

The idea is now that, if we could compute P from this equation, we would be able to find its eigenvalues, which in turn would give us the radial frequencies. However, S_1 is generally not invertible, so a least squares solution is used instead:

\mathbf{P} = (\mathbf{S}_1^{*}\mathbf{S}_1)^{-1}\mathbf{S}_1^{*}\mathbf{S}_2
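
In the running sketch, this least squares step reads:

# Truncated signal subspaces S1 = J1 Us and S2 = J2 Us
S1 = Us[:-1, :]
S2 = Us[1:, :]

# Least squares solution of S2 = S1 P, i.e. P = (S1* S1)^{-1} S1* S2
P = np.linalg.lstsq(S1, S2, rcond=None)[0]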

Estimation of radial frequencies

The eigenvalues λ_1, λ_2, …, λ_K of P are complex numbers:

\lambda_k = \alpha_k\, e^{j\omega_k}

The estimated radial frequencies ω_1, ω_2, …, ω_K are the phases (angles) of the eigenvalues λ_1, λ_2, …, λ_K.
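
Completing the running sketch (note that with the convention a_m(w_k) = e^{-j(m-1)w_k} used above, the eigenvalues come out as approximately e^{-j w_k}, so the phases carry a minus sign):

# Eigenvalues of P are approximately alpha_k * exp(-j w_k)
lam = np.linalg.eigvals(P)

# Recover the radial frequencies; for this sketch they are close to [0.7, 1.9]
w_est = np.sort(-np.angle(lam))
print(w_est)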

Algorithm summary

  1. Collect measurements y[1], y[2], …, y[T].
  2. If not already known: estimate the number of input signals K.
  3. Compute the auto-correlation matrix:
    \mathbf{R}_{\mathrm{YY}} = \tfrac{1}{T}\sum_{t=1}^{T} \mathbf{y}[t]\mathbf{y}^{*}[t]
  4. Compute the singular value decomposition (SVD) of R_YY and extract the signal subspace U_S ∈ ℂ^{M×K}:
    \mathbf{R}_{\mathrm{YY}} = \tfrac{1}{T}\mathbf{U}_{\mathrm{S}}\mathbf{E}_{\mathrm{S}}'\mathbf{U}_{\mathrm{S}}^{*} + \tfrac{1}{T}\mathbf{U}_{\mathrm{N}}\mathbf{E}_{\mathrm{N}}'\mathbf{U}_{\mathrm{N}}^{*}
  5. Compute the matrices S_1 and S_2:
    \mathbf{S}_1 := \mathbf{J}_1\mathbf{U}_{\mathrm{S}}, \qquad \mathbf{S}_2 := \mathbf{J}_2\mathbf{U}_{\mathrm{S}}
    where \mathbf{J}_1 = [\mathbf{I}_{M-1} \quad \mathbf{0}] and \mathbf{J}_2 = [\mathbf{0} \quad \mathbf{I}_{M-1}].
  6. Solve the equation
    \mathbf{S}_2 = \mathbf{S}_1\mathbf{P}
    for P. An example would be the least squares solution
    \mathbf{P} = (\mathbf{S}_1^{*}\mathbf{S}_1)^{-1}\mathbf{S}_1^{*}\mathbf{S}_2
    (here, * denotes the Hermitian (conjugate) transpose); an alternative would be the total least squares solution.
  7. Compute the eigenvalues λ_1, λ_2, …, λ_K of P.
  8. The phases of the eigenvalues λ_k = α_k e^{jω_k} are the sought-after radial frequencies ω_k:
    ω_k = arg λ_k
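
The summary translates into a compact function; a minimal NumPy sketch under the same assumptions and naming as the running sketch above (the minus sign again reflects the e^{-j(m-1)w_k} convention):

def esprit_frequencies(Y, K):
    """Estimate K radial frequencies from the M x T measurement matrix Y."""
    M, T = Y.shape
    R = (Y @ Y.conj().T) / T                    # step 3: auto-correlation matrix
    _, eigvecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    Us = eigvecs[:, -K:]                        # step 4: signal subspace
    S1, S2 = Us[:-1, :], Us[1:, :]              # step 5: S1 = J1 Us, S2 = J2 Us
    P = np.linalg.lstsq(S1, S2, rcond=None)[0]  # step 6: least squares solution
    lam = np.linalg.eigvals(P)                  # step 7: eigenvalues of P
    return -np.angle(lam)                       # step 8: phases of the eigenvalues

# For the synthetic data from the earlier sketch,
# np.sort(esprit_frequencies(Y, 2)) is close to [0.7, 1.9].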

Notes

Choice of selection matrices

In the derivation above, the selection matrices J_1 and J_2 were used. For simplicity, they were defined as \mathbf{J}_1 = [\mathbf{I}_{M-1} \quad \mathbf{0}] and \mathbf{J}_2 = [\mathbf{0} \quad \mathbf{I}_{M-1}]. However, at no point during the derivation was it required that J_1 and J_2 be defined like this. Indeed, any appropriate matrices may be used as long as the rotational invariance

\mathbf{J}_2\mathbf{a}(w_k) = e^{-j w_k}\, \mathbf{J}_1\mathbf{a}(w_k)

(or some generalization of it) is maintained. Accordingly, A_1 := J_1 A and A_2 := J_2 A may contain any rows of A.
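
For example, in the running sketch the two sub-arrays may be displaced by two sensors instead of one; the eigenvalue phases then give 2 w_k:

# Sub-arrays displaced by two sensors: J2 a(w_k) = e^{-j 2 w_k} J1 a(w_k)
S1d = Us[:-2, :]
S2d = Us[2:, :]
Pd = np.linalg.lstsq(S1d, S2d, rcond=None)[0]

# Phases now give 2*w_k; dividing by 2 recovers w_k only up to multiples of pi,
# so this variant is unambiguous only for |w_k| < pi/2
w_d = -np.angle(np.linalg.eigvals(Pd)) / 2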

Generalized rotational invariance

The rotational invariance used in the derivation may be generalized. So far, the matrix H has been defined to be a diagonal matrix that stores the sought-after complex exponentials on its main diagonal. However, H may also exhibit some other structure.[3] For instance, it may be an upper triangular matrix. In this case, \mathbf{P} := \mathbf{F}^{-1}\mathbf{H}\mathbf{F} constitutes a triangularization of P.

Algorithm example

Below is a MATLAB implementation of the ESPRIT algorithm for direction-of-arrival (DOA) estimation with a uniform linear array; elspacing denotes the element spacing in wavelengths.

function doas = esprit(y, model_order, number_of_sources, elspacing)
    m = model_order;
    n = number_of_sources;

    % Covariance matrix (m-by-m) estimated from the noisy measurements y (m-by-T)
    R = (y * y') / size(y, 2);

    % Compute the SVD of R
    [U, ~, ~] = svd(R);

    % Orthonormal basis of the signal subspace (eigenvectors of the sources)
    S = U(:, 1:n);

    % Split the signal subspace into two maximally overlapping sub-arrays
    S1 = S(1:m-1, :);
    S2 = S(2:m, :);

    % Compute P via least squares (MATLAB's backslash operator)
    P = S1 \ S2;

    % Spatial frequencies from the angles of the eigenvalues of P
    w = angle(eig(P)) / (2*pi*elspacing);

    % Return the DOA angles by taking the arcsine in degrees
    doas = asind(w);
end

References

  1. Paulraj, A.; Roy, R.; Kailath, T. (1985). "Estimation of Signal Parameters via Rotational Invariance Techniques - ESPRIT". Nineteenth Asilomar Conference on Circuits, Systems and Computers. pp. 83–89. doi:10.1109/ACSSC.1985.671426. ISBN 978-0-8186-0729-5.
  2. Vasylyshyn, Volodymyr (2009). "The direction of arrival estimation using ESPRIT with sparse arrays". Proc. 2009 European Radar Conference (EuRAD). pp. 246–249.
  3. Hu, Anzhong; Lv, Tiejun; Gao, Hui; Zhang, Zhang; Yang, Shaoshi (2014). "An ESPRIT-Based Approach for 2-D Localization of Incoherently Distributed Sources in Massive MIMO Systems". IEEE Journal of Selected Topics in Signal Processing. 8 (5): 996–1011. arXiv:1403.5352. doi:10.1109/JSTSP.2014.2313409.
