In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters).[1] A pivot need not be a statistic — the function and its 'value' can depend on the parameters of the model, but its 'distribution' must not. If it is a statistic, then it is known as an 'ancillary statistic'.

More formally,[2] let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a distribution that depends on a parameter (or vector of parameters) $\theta$. Let $g(X, \theta)$ be a random variable whose distribution is the same for all $\theta$. Then $g$ is called a 'pivotal quantity' (or simply a 'pivot').

Pivotal quantities are commonly used for normalization to allow data from different data sets to be compared. It is relatively easy to construct pivots for location and scale parameters: for the former we form differences so that location cancels, for the latter ratios so that scale cancels.
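As a worked illustration of the cancellation, assume a location-scale model in which each observation is a shifted and scaled copy of a noise term $\varepsilon_i$ whose distribution is fixed and continuous. The symbols $\varepsilon_i$, $\overline{\varepsilon}$ and $s_\varepsilon$ are introduced here for illustration only, and $n \ge 4$ is assumed so that the ratio is defined:

```latex
\begin{align*}
X_i &= \mu + \sigma\,\varepsilon_i, \qquad i = 1, \dots, n, \\
X_1 - X_2 &= \sigma\,(\varepsilon_1 - \varepsilon_2)
  && \text{(difference: $\mu$ cancels)}, \\
\frac{X_1 - X_2}{X_3 - X_4} &= \frac{\varepsilon_1 - \varepsilon_2}{\varepsilon_3 - \varepsilon_4}
  && \text{(ratio of differences: $\sigma$ cancels as well)}, \\
\frac{\overline{X} - \mu}{s} &= \frac{\overline{\varepsilon}}{s_\varepsilon}
  && \text{(depends on $\mu$ in value, not in distribution)}.
\end{align*}
```

The ratio of differences is a statistic whose distribution is free of $\mu$ and $\sigma$, so it is ancillary; the last expression can only be evaluated when $\mu$ is known, so it is a pivot but not a statistic.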

Pivotal quantities are fundamental to the construction of test statistics, because they give the statistic a distribution that does not depend on unknown parameters; for example, Student's t-statistic plays this role for a normal distribution with unknown variance (and mean). They also provide one method of constructing confidence intervals, and using pivotal quantities improves the performance of the bootstrap. In the form of ancillary statistics, they can be used to construct frequentist prediction intervals (predictive confidence intervals).

Examples

Normal distribution

One of the simplest pivotal quantities is the z-score. Given a normal distribution with mean $\mu$ and variance $\sigma^2$, and an observation 'x', the z-score:

$$z = \frac{x - \mu}{\sigma}$$

has distribution $N(0, 1)$, a normal distribution with mean 0 and variance 1. Similarly, since the 'n'-sample sample mean has sampling distribution $N(\mu, \sigma^2/n)$, the z-score of the mean

$$z = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}$$

also has distribution $N(0, 1)$. Note that while these functions depend on the parameters, and thus one can only compute them if the parameters are known (they are not statistics), their distribution does not depend on the parameters.
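The parameter-free distribution of the z-score of the mean can be checked numerically. The following is a minimal simulation sketch using only the Python standard library; the function names `z_of_mean` and `simulate`, the sample size, and the parameter settings are illustrative assumptions, not part of any cited source.

```python
import math
import random
import statistics


def z_of_mean(mu, sigma, n, rng):
    """Draw n observations from N(mu, sigma^2) and return the z-score of their mean."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(sample)
    return (xbar - mu) / (sigma / math.sqrt(n))


def simulate(mu, sigma, n=5, reps=20000, seed=0):
    """Return `reps` simulated values of the pivot for one (mu, sigma) setting."""
    rng = random.Random(seed)
    return [z_of_mean(mu, sigma, n, rng) for _ in range(reps)]


# Three arbitrary parameter settings, each simulated with its own seed.
for seed, (mu, sigma) in enumerate([(0.0, 1.0), (50.0, 7.0), (-3.0, 0.2)]):
    zs = simulate(mu, sigma, seed=seed)
    print(
        f"mu={mu:6.1f} sigma={sigma:4.1f} "
        f"mean={statistics.fmean(zs):+.3f} sd={statistics.stdev(zs):.3f}"
    )
```

If the quantity is pivotal, the printed mean and standard deviation should be close to 0 and 1 for every setting, up to Monte Carlo error.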

Given $n$ independent, identically distributed (i.i.d.) observations $X = (X_1, X_2, \ldots, X_n)$ from the normal distribution with unknown mean $\mu$ and variance $\sigma^2$, a pivotal quantity can be obtained from the function:

$$g(x, X) = \sqrt{n}\,\frac{x - \overline{X}}{s}$$

where

$$\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2$$

are unbiased estimates of $\mu$ and $\sigma^2$, respectively. The function $g(x, X)$ is the Student's t-statistic for a new value $x$, to be drawn from the same population as the already observed set of values $X$.

Using $x = \mu$ the function $g(\mu, X)$ becomes a pivotal quantity, which follows the Student's t-distribution with $\nu = n - 1$ degrees of freedom. As required, even though $\mu$ appears as an argument to the function $g$, the distribution of $g(\mu, X)$ does not depend on the parameters $\mu$ or $\sigma$ of the normal probability distribution that governs the observations $X_1, \ldots, X_n$.

This can be used to compute a prediction interval for the next observation $X_{n+1}$; see Prediction interval: Normal distribution.
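The same pivot, evaluated at the true mean, can also be inverted to give a confidence interval for the mean. The sketch below assumes only the Python standard library, which has no Student's t quantile function, so it estimates the quantile by simulation: because the pivot's distribution does not depend on the parameters, simulating with mean 0 and standard deviation 1 is enough. The helper names `t_pivot`, `pivot_quantile` and `mean_interval` and the data values are hypothetical.

```python
import math
import random
import statistics


def t_pivot(sample, mu):
    """sqrt(n) * (mu - xbar) / s for a sample and a candidate mean mu."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # square root of the (n - 1)-denominator variance
    return math.sqrt(n) * (mu - xbar) / s


def pivot_quantile(n, level=0.95, reps=20000, seed=1):
    """Monte Carlo estimate of c such that P(|pivot| <= c) is about `level`.

    The pivot is simulated under mu = 0, sigma = 1; pivotality is what
    justifies reusing c for any other mu and sigma.
    """
    rng = random.Random(seed)
    draws = sorted(
        abs(t_pivot([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0))
        for _ in range(reps)
    )
    return draws[round(level * reps) - 1]


def mean_interval(sample, level=0.95):
    """Invert |pivot| <= c into an interval for the unknown mean."""
    n = len(sample)
    c = pivot_quantile(n, level)
    xbar = statistics.fmean(sample)
    half_width = c * statistics.stdev(sample) / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Made-up measurements, used only to exercise the functions.
data = [4.9, 5.6, 5.1, 4.4, 5.8, 5.0, 5.3, 4.7, 5.2, 5.5]
print("estimated quantile:", pivot_quantile(len(data)))
print("interval for the mean:", mean_interval(data))
```

In practice one would take the quantile from a table or a statistics library rather than simulate it; the simulation is used here only to make the role of pivotality explicit.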

Bivariate normal distribution

In more complicated cases, it is impossible to construct exact pivots. However, having approximate pivots improves convergence to asymptotic normality.

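Suppose a sample of size $n$ of vectors $(X_i, Y_i)'$ is taken from a bivariate normal distribution with unknown correlation $\rho$.

An estimator of $\rho$ is the sample (Pearson, moment) correlation

$$r = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})(Y_i - \overline{Y})}{s_X s_Y}$$

where $s_X^2, s_Y^2$ are sample variances of $X$ and $Y$. The sample statistic $r$ has an asymptotically normal distribution:

$$\sqrt{n}\,\frac{r - \rho}{1 - \rho^2} \Rightarrow N(0, 1).$$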

However, a variance-stabilizing transformation

$$z = \tanh^{-1} r = \frac{1}{2}\ln\frac{1 + r}{1 - r}$$

known as Fisher's 'z' transformation of the correlation coefficient makes the distribution of $z$ asymptotically independent of unknown parameters:

$$\sqrt{n}\,(z - \zeta) \Rightarrow N(0, 1)$$

where $\zeta = \tanh^{-1} \rho$ is the corresponding distribution parameter. For finite sample sizes $n$, the random variable $z$ will have a distribution closer to normal than that of $r$. An even closer approximation to the standard normal distribution is obtained by using a better approximation for the exact variance: the usual form is

$$\operatorname{Var}(z) \approx \frac{1}{n - 3}.$$
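The approximate pivot built from Fisher's transformation can be turned into an approximate confidence interval for the correlation. The following sketch assumes the variance approximation 1/(n − 3) and a standard normal reference distribution, and uses only the Python standard library; the function names `pearson_r` and `fisher_interval` and the paired data are illustrative assumptions.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Sample Pearson correlation; the 1/(n - 1) factors cancel in the ratio."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_interval(r, n, level=0.95):
    """Approximate interval for rho from z = atanh(r), treating
    (z - atanh(rho)) * sqrt(n - 3) as standard normal."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Map the interval for zeta back to the correlation scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Made-up paired data, used only to exercise the functions.
xs = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.3, 8.0]
ys = [1.4, 1.9, 3.5, 3.9, 5.6, 5.2, 7.9, 7.4]
r = pearson_r(xs, ys)
print("r =", r)
print("approximate 95% interval for rho:", fisher_interval(r, len(xs)))
```

Because the interval is built on the transformed scale and mapped back with `tanh`, its endpoints always lie strictly between −1 and 1, and it is generally not symmetric about `r`.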

Robustness

From the point of view of robust statistics, pivotal quantities are robust to changes in the parameters — indeed, independent of the parameters — but not in general robust to changes in the model, such as violations of the assumption of normality. This is fundamental to the robust critique of non-robust statistics, often derived from pivotal quantities: such statistics may be robust within the family, but are not robust outside it.
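The distinction can be illustrated with a small simulation sketch, again using only the Python standard library. It applies the nominal 95% t-interval for the mean to samples of size 10, first from a normal distribution and then from an exponential distribution, and compares how often the interval covers the true mean. The choice of the exponential as the misspecified model, the function names `covers` and `coverage`, and the simulation settings are assumptions made for illustration.

```python
import math
import random
import statistics

# 97.5% point of Student's t with 9 degrees of freedom (standard table value).
T_975_DF9 = 2.262


def covers(sample, true_mean, crit=T_975_DF9):
    """True if the t-interval computed from `sample` contains `true_mean`."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    half_width = crit * statistics.stdev(sample) / math.sqrt(n)
    return xbar - half_width <= true_mean <= xbar + half_width


def coverage(draw, true_mean, n=10, reps=5000, seed=2):
    """Fraction of simulated samples whose interval covers `true_mean`.

    `draw` takes a random.Random instance and returns one observation.
    """
    rng = random.Random(seed)
    hits = sum(
        covers([draw(rng) for _ in range(n)], true_mean) for _ in range(reps)
    )
    return hits / reps


# Within the normal family the parameter values should not matter.
normal_cov = coverage(lambda rng: rng.gauss(3.0, 2.0), true_mean=3.0)
# Outside the family: exponential with rate 1, whose mean is 1.
expo_cov = coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0)
print(f"normal data:      coverage ~ {normal_cov:.3f}")
print(f"exponential data: coverage ~ {expo_cov:.3f}")
```

Under the normal model the estimated coverage should sit near the nominal 0.95 whatever mean and variance are used, while for the skewed exponential data it would be expected to fall below that level, since the t-statistic is no longer exactly pivotal there.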

References

  1. Shao, J. (2008). "Pivotal quantities". Mathematical Statistics (2nd ed.). New York: Springer. pp. 471–477. ISBN 978-0-387-21718-5.
  2. DeGroot, Morris H.; Schervish, Mark J. (2011). Probability and Statistics (4th ed.). Pearson. p. 489. ISBN 978-0-321-70970-7.