Khmaladze transformation

In statistics, the Khmaladze transformation is a mathematical tool used in constructing convenient goodness-of-fit tests for hypothetical distribution functions. More precisely, suppose $X_1, \ldots, X_n$ are i.i.d., possibly multi-dimensional, random observations generated from an unknown probability distribution. A classical problem in statistics is to decide how well a given hypothetical distribution function $F$, or a given hypothetical parametric family of distribution functions $\{F_\theta : \theta \in \Theta\}$, fits the set of observations. The Khmaladze transformation allows us to construct goodness-of-fit tests with desirable properties. It is named after Estate V. Khmaladze.

Consider the sequence of empirical distribution functions $F_n$ based on a sequence of i.i.d. random variables, $X_1, \ldots, X_n$, as $n$ increases. Suppose $F$ is the hypothetical distribution function of each $X_i$. To test whether the choice of $F$ is correct or not, statisticians use the normalized difference,

$$v_n(x) = \sqrt{n}\,[F_n(x) - F(x)].$$

This $v_n$, as a random process in $x$, is called the empirical process. Various functionals of $v_n$ are used as test statistics. The change of variable $v_n(x) = u_n(t)$, $t = F(x)$, transforms it into the so-called uniform empirical process $u_n$. The latter is an empirical process based on independent random variables $U_i = F(X_i)$, which are uniformly distributed on $[0,1]$ if the $X_i$ do indeed have distribution function $F$.
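As a concrete illustration, the following Python sketch constructs $v_n$ at the order statistics of a sample and the corresponding uniform empirical process $u_n$ via the substitution $t = F(x)$. The choice of a standard normal $F$, the sample size, and the random seed are arbitrary choices made for the example.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.normal(size=n))        # i.i.d. sample, sorted

F = norm.cdf                           # hypothetical distribution function F

# empirical distribution function evaluated at the order statistics
F_n = np.arange(1, n + 1) / n

# empirical process v_n(x) = sqrt(n) * [F_n(x) - F(x)]
v_n = np.sqrt(n) * (F_n - F(x))

# uniform empirical process: the same values viewed on the time scale t = F(x)
t = F(x)
u_n = v_n
```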

This fact was discovered and first utilized by Kolmogorov (1933), Wald and Wolfowitz (1936) and Smirnov (1937) and, especially after Doob (1949) and Anderson and Darling (1952),[1] it led to the standard rule of choosing test statistics based on $v_n$. That is, test statistics $\psi(v_n, F)$ are defined (which possibly depend on the $F$ being tested) in such a way that there exists another statistic $\varphi(u_n)$, derived from the uniform empirical process, such that $\psi(v_n, F) = \varphi(u_n)$. Examples are

$$\sup_x |v_n(x)| = \sup_t |u_n(t)|, \qquad \sup_x \frac{|v_n(x)|}{a(F(x))} = \sup_t \frac{|u_n(t)|}{a(t)}$$

and

$$\int_{-\infty}^{\infty} v_n^2(x)\, dF(x) = \int_0^1 u_n^2(t)\, dt.$$

For all such functionals, their null distribution (under the hypothetical $F$) does not depend on $F$, and can be calculated once and then used to test any $F$.
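Continuing the same setting, both functionals above can be computed directly from the probability integral transform $t_i = F(x_{(i)})$. The following is a minimal self-contained sketch, again with an arbitrary standard normal $F$ and simulated data.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.normal(size=n))      # sample drawn under the hypothesis
t = norm.cdf(x)                      # t_i = F(x_(i)), uniform on [0,1] under F
i = np.arange(1, n + 1)

# Kolmogorov-Smirnov functional: sup_t |u_n(t)| = sqrt(n) * sup_t |G_n(t) - t|,
# where the supremum of the step function is attained at a jump point
ks_stat = np.sqrt(n) * max(np.max(i / n - t), np.max(t - (i - 1) / n))

# Cramer-von Mises functional: integral of u_n(t)^2 dt, using the standard
# closed-form expression for the integral of the squared step process
cvm_stat = 1.0 / (12 * n) + np.sum((t - (2 * i - 1) / (2 * n)) ** 2)
```

Because both statistics are functions of the $t_i$ alone, their null distributions can be tabulated once and reused for any hypothetical $F$.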

However, it is only rarely that one needs to test a simple hypothesis, in which a single fixed $F$ is completely specified. Much more often, one needs to verify parametric hypotheses, where the hypothetical $F = F_{\theta_n}$ depends on some parameters $\theta_n$ which the hypothesis does not specify and which have to be estimated from the sample $X_1, \ldots, X_n$ itself.

Although the estimators $\hat\theta_n$ most commonly converge to the true value of $\theta$, it was discovered that the parametric,[2][3] or estimated, empirical process

$$\hat v_n(x) = \sqrt{n}\,[F_n(x) - F_{\hat\theta_n}(x)]$$

differs significantly from $v_n$, and that the transformed process $\hat u_n(t) = \hat v_n(x)$, $t = F_{\hat\theta_n}(x)$, has a limit distribution, as $n \to \infty$, that depends on the parametric form of $F_\theta$, on the particular estimator $\hat\theta_n$ used and, in general, within one parametric family, on the value of $\theta$.
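For illustration, the following sketch forms the estimated empirical process for an exponential model whose mean is estimated by maximum likelihood from the same sample; the model, the estimator and the simulation settings are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = np.sort(rng.exponential(scale=2.0, size=n))   # data from the hypothesised family

# parameter estimated from the same sample (MLE of the exponential mean)
theta_hat = x.mean()

# fitted hypothetical distribution function F_{theta_hat}
F_hat = 1.0 - np.exp(-x / theta_hat)

# estimated (parametric) empirical process at the order statistics
F_n = np.arange(1, n + 1) / n
v_hat = np.sqrt(n) * (F_n - F_hat)

# time-transformed version: u_hat(t) = v_hat(x), with t = F_{theta_hat}(x)
t = F_hat
u_hat = v_hat
```

Unlike $u_n$ in the simple-hypothesis case, the limit law of $\hat u_n$ computed this way depends on the exponential form of the model and on the use of the maximum likelihood estimator.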

From the mid-1950s to the late 1980s, much work was done to clarify the situation and understand the nature of the process $\hat v_n$.

In 1981,[4] and then in 1987 and 1993,[5] Khmaladze suggested replacing the parametric empirical process $\hat v_n$ by its martingale part $w_n$ only,

$$\hat v_n(x) - K_n(x) = w_n(x),$$

where $K_n(x)$ is the compensator of $\hat v_n(x)$. The following properties of $w_n$ were then established:

  • Although the form of $K_n$, and therefore of $w_n$, depends on $F_{\hat\theta_n}(x)$, as a function of both $x$ and $\theta_n$, the limit distribution of the time-transformed process
$$\omega_n(t) = w_n(x), \qquad t = F_{\hat\theta_n}(x),$$
is that of standard Brownian motion on $[0,1]$, i.e., is again standard and independent of the choice of $F_{\hat\theta_n}$.
  • The relationship between $\hat v_n$ and $w_n$, and between their limits, is one-to-one, so that statistical inference based on $\hat v_n$ or on $w_n$ is equivalent, and nothing is lost in $w_n$ compared to $\hat v_n$.
  • The construction of the innovation martingale $w_n$ could be carried over to the case of vector-valued $X_1, \ldots, X_n$, giving rise to the definition of the so-called scanning martingales in $\mathbb{R}^d$.
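In the simple-hypothesis case (a fully specified $F$ and no estimated parameters) the compensator has a well-known explicit form, and the transform reduces to $w_n(t) = u_n(t) + \int_0^t u_n(s)/(1-s)\,ds$, whose limit is standard Brownian motion. The sketch below approximates this on a grid; the grid size, the cut-off just before $t = 1$ (where the integrand involves $1/(1-s)$), and the Riemann-sum quadrature are arbitrary numerical choices, and the parametric case would additionally require the score function of $F_\theta$, which is not shown here.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 500
u = np.sort(norm.cdf(rng.normal(size=n)))   # U_i = F(X_i), sorted

# uniform empirical process on a grid of t values (stopping short of t = 1)
grid = np.linspace(0.0, 0.99, 1000)
G_n = np.searchsorted(u, grid, side="right") / n
u_n = np.sqrt(n) * (G_n - grid)

# simple-hypothesis martingale (Khmaladze) transform:
#   w_n(t) = u_n(t) + integral_0^t u_n(s) / (1 - s) ds
ds = grid[1] - grid[0]
compensator_term = np.cumsum(u_n / (1.0 - grid)) * ds   # crude Riemann sum
w_n = u_n + compensator_term

# under the hypothesis, w_n behaves approximately like standard Brownian
# motion on [0, 0.99], so functionals of w_n have distribution-free limits
```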

For a long time the transformation, although known, remained little used. Later, the work of researchers such as Koenker, Stute, Bai, Koul, Koening, and others made it popular in econometrics and other fields of statistics.[citation needed]

See also

  • Empirical process

References

  1. ^ Anderson, T. W.; Darling, D. A. (1952). "Asymptotic Theory of Certain "Goodness of Fit" Criteria Based on Stochastic Processes". Annals of Mathematical Statistics. 23 (2): 193–212. doi:10.1214/aoms/1177729437.
  2. ^ Kac, M.; Kiefer, J.; Wolfowitz, J. (1955). "On Tests of Normality and Other Tests of Goodness of Fit Based on Distance Methods". Annals of Mathematical Statistics. 26 (2): 189–211. doi:10.1214/aoms/1177728538. JSTOR 2236876.
  3. ^ Gikhman (1954)[full citation needed]
  4. ^ Khmaladze, E. V. (1981). "Martingale Approach in the Theory of Goodness-of-fit Tests". Theory of Probability & Its Applications. 26 (2): 240–257. doi:10.1137/1126027.
  5. ^ Khmaladze, E. V. (1993). "Goodness of fit Problems and Scanning Innovation Martingales". Annals of Statistics. 21 (2): 798–829. doi:10.1214/aos/1176349152. JSTOR 2242262.

Further reading

  • Koul, H. L.; Swordson, E. (2011). "Khmaladze transformation". International Encyclopedia of Statistical Science. Springer. pp. 715–718. doi:10.1007/978-3-642-04898-2_325. ISBN 978-3-642-04897-5.