**Tikhonov regularization** is the most commonly used method of **regularization** of ill-posed problems. In some fields, it is also known as **ridge regression**.

In its simplest form, an ill-conditioned system of linear equations

$$A x = b,$$

where *A* is an *m*×*n* matrix, *x* is a column vector with *n* entries and *b* is a column vector with *m* entries, is replaced by the problem of seeking an *x* to minimize

$$\|A x - b\|^2 + \alpha^2 \|x\|^2$$

for some suitably chosen *Tikhonov factor* α > 0. Here $\|\cdot\|$ is the Euclidean norm. This improves the conditioning of the problem, thus enabling a numerical solution. An explicit solution, denoted by $\hat{x}$, is given by

$$\hat{x} = (A^T A + \alpha^2 I)^{-1} A^T b,$$

where *I* is the *n*×*n* identity matrix. For α = 0 this reduces to the least squares solution of an overdetermined problem (*m* > *n*), provided that *A*^{T}*A* is invertible.
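As a minimal sketch of the explicit solution, assuming the functional $\|Ax - b\|^2 + \alpha^2\|x\|^2$ and using NumPy, the regularized normal equations can be solved directly. The function name `tikhonov_solve` and the small test system are illustrative choices, not part of the original text.

```python
import numpy as np

def tikhonov_solve(A, b, alpha):
    """Return the minimizer of ||A x - b||^2 + alpha^2 ||x||^2."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    # Solve (A^T A + alpha^2 I) x = A^T b instead of forming the inverse.
    return np.linalg.solve(A.T @ A + alpha**2 * np.eye(n), A.T @ b)

# Illustrative ill-conditioned system: the two columns are nearly collinear.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
b = np.array([2.0, 2.0001, 1.9999])

x_reg = tikhonov_solve(A, b, alpha=1e-2)
```

Forming *A*^{T}*A* squares the condition number of *A*, so for badly conditioned problems a factorization-based approach (for example via the singular value decomposition) is usually the safer choice; the version above is kept only because it mirrors the formula.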

#### Bayesian interpretation

Although at first the choice of the solution to this regularized problem may look artificial, and indeed the parameter α seems rather arbitrary, the process can be justified from a Bayesian point of view. Note that for an ill-posed problem one must necessarily introduce some additional assumptions in order to get a stable solution. Statistically, we might assume that a priori we know that *x* is a random variable with a multivariate normal distribution. For simplicity we take the mean to be zero and assume that each component is independent with standard deviation σ_{x}. Our data are also subject to errors, and we take the errors in *b* to be independent as well, with zero mean and standard deviation σ_{b}. Under these assumptions the Tikhonov-regularized solution is the most probable solution given the data and the a priori distribution of *x*, according to Bayes' theorem. The Tikhonov parameter is then ...
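The Bayesian argument can be sketched in a few lines. The sketch assumes that the errors in *b* are normally distributed as well as independent, and that the regularized functional is written as $\|Ax - b\|^2 + \alpha^2\|x\|^2$. By Bayes' theorem the posterior density of *x* is proportional to the likelihood times the prior:

$$p(x \mid b) \;\propto\; \exp\!\left(-\frac{\|A x - b\|^2}{2\sigma_b^2}\right) \exp\!\left(-\frac{\|x\|^2}{2\sigma_x^2}\right).$$

Maximizing this density is the same as minimizing the negative logarithm, which after multiplying by $2\sigma_b^2$ becomes

$$\|A x - b\|^2 + \frac{\sigma_b^2}{\sigma_x^2}\,\|x\|^2 .$$

Under these assumptions the most probable *x* therefore coincides with the minimizer of the regularized functional when $\alpha^2 = \sigma_b^2/\sigma_x^2$.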

If the assumption of normality is replaced by assumptions of homoscedasticity and uncorrelatedness of the errors, while the mean is still assumed to be zero, then the Gauss-Markov theorem entails that the solution is still optimal in a certain sense.

#### Generalized Tikhonov regularization

For general multivariate normal distributions for *x* and the data error, one can apply a transformation of the variables to reduce to the case above. Equivalently, one can seek an *x* to minimize

$$\|A x - b\|_P^2 + \alpha^2 \|x - x_0\|_Q^2,$$

where $\|x\|_P^2$ stands for the weighted norm *x*^{T}*P**x*. In the Bayesian interpretation *P* is the inverse covariance matrix of *b*, *x*_{0} is the expected value of *x*, and α^{2}*Q* is the inverse covariance matrix of *x*.

This can be solved explicitly using the formula

$$\hat{x} = x_0 + (A^T P A + \alpha^2 Q)^{-1} A^T P (b - A x_0).$$
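The generalized problem can be sketched in NumPy as follows. The sketch assumes the weighted functional $(Ax-b)^T P (Ax-b) + \alpha^2 (x-x_0)^T Q (x-x_0)$ with symmetric positive definite *P* and *Q*, whose minimizer is $x_0 + (A^T P A + \alpha^2 Q)^{-1} A^T P (b - A x_0)$. The function name and the example data are hypothetical.

```python
import numpy as np

def generalized_tikhonov(A, b, P, Q, x0, alpha):
    """Minimize (Ax-b)^T P (Ax-b) + alpha^2 (x-x0)^T Q (x-x0).

    P and Q are assumed symmetric positive definite.
    """
    lhs = A.T @ P @ A + alpha**2 * Q
    rhs = A.T @ P @ (b - A @ x0)
    # Correction added to the prior mean x0.
    return x0 + np.linalg.solve(lhs, rhs)

# Illustrative data: unequal data weights and a nonzero prior mean.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)
P = np.diag([1.0, 1.0, 4.0, 4.0, 0.25, 0.25])  # inverse data covariance
Q = np.eye(3)
x0 = np.array([0.5, -0.5, 0.0])

x_gen = generalized_tikhonov(A, b, P, Q, x0, alpha=0.3)
```

With *P* and *Q* equal to identity matrices and *x*_{0} = 0, this reduces to the simple form of the problem.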

#### Regularization in Hilbert space

Typically, discrete linear ill-conditioned problems result from the discretization of integral equations, and one can formulate Tikhonov regularization in the original infinite-dimensional context. In the above we can interpret *A* as a compact operator on Hilbert spaces, and *x* and *b* as elements in the domain and range of *A*. The operator *A*^{*}*A* + α^{2}*I* is then a self-adjoint, bounded, invertible operator for α > 0.

#### Relation to singular value decomposition and Wiener filter

Given the singular value decomposition

*A*=*U*Σ*V*^{T}

where Σ is the diagonal matrix of singular values σ_{i} (augmented with zeros so as to be *m*×*n*) and *U* and *V* are, respectively, the matrices of left and right singular vectors, the Tikhonov regularized solution can be expressed as

$$\hat{x} = V D U^T b,$$

where *D* is an *n*×*m* matrix equal to

$$D_{ii} = \frac{\sigma_i}{\sigma_i^2 + \alpha^2}$$

on the diagonal and zero elsewhere. This demonstrates the effect of the Tikhonov parameter on the condition number of the regularized problem. For the generalized case a similar representation can be derived using a generalized singular value decomposition. Finally, it is related to the Wiener filter:

$$\hat{x} = \sum_{i=1}^{q} f_i \, \frac{u_i^T b}{\sigma_i} \, v_i,$$

where *u*_{i} and *v*_{i} are the *i*-th columns of *U* and *V*, the Wiener weights are $f_i = \sigma_i^2 / (\sigma_i^2 + \alpha^2)$, and *q* is the rank of *A*.
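A short NumPy sketch of the SVD form, assuming the diagonal entries $\sigma_i / (\sigma_i^2 + \alpha^2)$ that follow from substituting the decomposition into $(A^T A + \alpha^2 I)^{-1} A^T b$. It uses the reduced SVD, so only the diagonal of *D* is stored; the function name and test system are illustrative.

```python
import numpy as np

def tikhonov_svd(A, b, alpha):
    """Tikhonov solution V D U^T b computed from the reduced SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Diagonal of D: sigma_i / (sigma_i^2 + alpha^2).
    d = s / (s**2 + alpha**2)
    return Vt.T @ (d * (U.T @ b))

def wiener_weights(A, alpha):
    """Filter factors f_i = sigma_i^2 / (sigma_i^2 + alpha^2)."""
    s = np.linalg.svd(A, compute_uv=False)
    return s**2 / (s**2 + alpha**2)

# Illustrative system with one very small singular value.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
b = np.array([2.0, 2.0001, 1.9999])

x_svd = tikhonov_svd(A, b, alpha=1e-2)
f = wiener_weights(A, alpha=1e-2)
```

The weights are close to 1 for singular values much larger than α and close to 0 for those much smaller, which is how the small, noise-amplifying singular values are damped.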

#### Determination of the Tikhonov factor

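The optimal regularization parameter α is usually unknown. Wahba proved that the optimal parameter, in the sense of leave-one-out cross-validation, minimizes

$$G = \frac{\operatorname{RSS}}{\tau^2} = \frac{\|A \hat{x} - b\|^2}{\left[\operatorname{Tr}\left(I - A (A^T A + \alpha^2 I)^{-1} A^T\right)\right]^2},$$

where RSS is the residual sum of squares and τ is the effective number of degrees of freedom.

Using the previous singular value decomposition, with *u*_{i} the *i*-th left singular vector and *q* the rank of *A*, we can simplify the above expression:

$$\operatorname{RSS} = \left\| b - \sum_{i=1}^{q} (u_i^T b)\, u_i \right\|^2 + \left\| \sum_{i=1}^{q} \frac{\alpha^2}{\sigma_i^2 + \alpha^2} (u_i^T b)\, u_i \right\|^2$$

and

$$\tau = m - \sum_{i=1}^{q} \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2} = m - q + \sum_{i=1}^{q} \frac{\alpha^2}{\sigma_i^2 + \alpha^2}.$$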
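The selection criterion can be sketched in NumPy as a grid search. The sketch assumes the score RSS/τ² with RSS = ‖*A*x̂ − *b*‖² and τ = *m* − Σ σ_{i}²/(σ_{i}² + α²); the synthetic data, the grid of candidate values and the function name are illustrative assumptions.

```python
import numpy as np

def gcv_score(A, b, alpha):
    """Return RSS / tau^2 for the Tikhonov solution with parameter alpha."""
    m = A.shape[0]
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    f = s**2 / (s**2 + alpha**2)      # filter factors
    fitted = U @ (f * (U.T @ b))      # A @ x_hat, without forming x_hat
    rss = np.sum((b - fitted) ** 2)   # residual sum of squares
    tau = m - np.sum(f)               # effective degrees of freedom
    return rss / tau**2

# Synthetic overdetermined problem with noisy data.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = rng.standard_normal(5)
b = A @ x_true + 0.1 * rng.standard_normal(20)

# Evaluate the score on a logarithmic grid and keep the minimizer.
alphas = np.logspace(-3, 1, 40)
scores = np.array([gcv_score(A, b, a) for a in alphas])
alpha_best = alphas[int(np.argmin(scores))]
```

A grid search is only one way to locate the minimum; a one-dimensional optimizer over log α would serve equally well.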

#### Relation to probabilistic formulation

The probabilistic formulation of an inverse problem introduces (when all uncertainties are Gaussian) a covariance matrix *C*_{M} representing the a priori uncertainties on the model parameters, and a covariance matrix *C*_{D} representing the uncertainties on the observed parameters (see, for instance, Tarantola, 2005 [1]). In the special case when these two matrices are diagonal and isotropic, $C_M = \sigma_M^2 I$ and $C_D = \sigma_D^2 I$, and the equations of inverse theory reduce to the equations above, with α = σ_{D} / σ_{M}.

#### History

Tikhonov regularization has been invented independently in many different contexts. It became widely known from its application to integral equations from the work of A. N. Tikhonov and D. L. Phillips. Some authors use the term **Tikhonov-Phillips regularization**. The finite dimensional case was expounded by A. E. Hoerl, who took a statistical approach, and by M. Foster, who interpreted this method as a Wiener-Kolmogorov filter. Following Hoerl