Linear And Nonlinear Least-Squares With Math.NET
by Libor Tinka

CONTENTS
1. Introduction
2. Least-Squares in General
3. Linear Least-Squares
3.1 Solution with Normal Equations
3.2 Solution with QR Decomposition
3.3 Solution with Singular Value Decomposition (SVD)
4.3 Levenberg-Marquardt Method
1. INTRODUCTION
This article explains how to solve linear and nonlinear least-squares problems and provides a C# implementation. All the numerical methods used in the demo project are built on top of the vector and matrix operations provided by the Math.NET numerical library. You can, of course, adapt the presented algorithms to work with any other numerical library that handles the commonly used matrix operations (e.g. LAPACK).
The tutorial covers discrete least-squares problems (i.e. fitting a set of points with a known model). We will restrict ourselves to over-determined problems (i.e. where the number of measurements is greater than the number of unknowns) and assume the problems are well-conditioned. The tutorial proceeds from simple methods to more complex and more general ones.
This article was written for programmers with basic knowledge of matrix computations and calculus.
2. LEAST-SQUARES IN GENERAL
The least-squares problem arises in many applications (statistics, finance, physics, ...). It can be viewed as an optimization problem: we have gathered some data and want to find the model parameters for which the model best fits the data in some sense.
The following section is purely mathematical. It is a general description of the problem, without reference to any specific application (e.g. fitting a parabola to a set of points). However, it is crucial for the following derivation of solutions for both linear and nonlinear problems.
The least-squares problem can be written compactly as:

    min over β:  ‖r(β)‖²        (1)

where

    r(β) = (r₁(β), r₂(β), …, r_m(β))ᵀ

is the vector of residuals, one for each of the m measurements. Here

    rᵢ(β) = yᵢ − f(xᵢ, β)

where (xᵢ, yᵢ) are the measured data points, f is the model function, and β = (β₁, …, βₙ)ᵀ is the vector of model parameters. As you can see, the function rᵢ measures the difference between the i-th measured value and the value predicted by the model.
The residual value can be either positive or negative, and in both cases it means that the model deviates from the measured value. We would like the residuals to be as small as possible in absolute value. This is why we look for the least squares of the residuals, rather than for the residual values themselves. Note that there exist variants of the above optimization problem where different norms (e.g. absolute values) are used instead of squares. But the least-squares problem is easy to solve and has a remarkable statistical property: its solution is the so-called maximum likelihood estimator when the errors between measured values and expectations are normally distributed.
The norm in equation (1) is a 2-norm, so we can rewrite this equation in terms of the objective function S in the well-known form:

    S(β) = Σᵢ rᵢ(β)² = Σᵢ (yᵢ − f(xᵢ, β))²
Of course, we would like to find the parameters β for which S(β) is minimal. At a minimum, the gradient of S vanishes. We set the partial derivatives of S with respect to the parameters equal to zero:

    ∂S/∂βⱼ = 0,   j = 1, …, n

and solve the resulting system of equations for β.
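The objective S(β) can be made concrete with a small numeric sketch. Python is used here for brevity (the article's own implementations are in C# with Math.NET), and the data points and candidate parameter values below are made up purely for illustration:

```python
# Model: straight line f(x, b) = b0 + b1*x, with parameters b = (b0, b1).
def f(x, b):
    """Model function evaluated at x."""
    return b[0] + b[1] * x

def objective(xs, ys, b):
    """S(b) = sum of squared residuals r_i = y_i - f(x_i, b)."""
    return sum((y - f(x, b)) ** 2 for x, y in zip(xs, ys))

# Made-up measurements, roughly following y = 1 + 2x with some noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

good = (1.0, 2.0)   # close to the underlying line
bad  = (0.0, 1.0)   # a clearly worse fit

print(objective(xs, ys, good))  # small sum of squares
print(objective(xs, ys, bad))   # much larger sum of squares
```

The better the parameters fit the data, the smaller S(β) becomes; the minimization in equation (1) makes this trade-off precise.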
3. LINEAR LEAST-SQUARES
Now we can be more specific about the problems we solve, but linear least-squares can still be applied to a whole family of functions. For example, when we are fitting a straight line to 2D points (a common example), our model function has the form:

    f(x, β) = β₁ + β₂x
But we are not limited to just straight lines. We can fit any function that is linear in the parameters βⱼ. For example, a polynomial

    f(x, β) = β₁ + β₂x + β₃x² + … + βₙxⁿ⁻¹

can be used in the linear least-squares framework as well.
The general form of the model function is:

    f(x, β) = β₁φ₁(x) + β₂φ₂(x) + … + βₙφₙ(x)

Here the φⱼ(x) are known basis functions (e.g. powers of x, sines and cosines). The basis functions themselves may be nonlinear in x; only the dependence on the parameters βⱼ must be linear.
As we can observe, the derivatives of the residuals with respect to the model parameters are particularly simple:

    ∂rᵢ/∂βⱼ = −φⱼ(xᵢ)
We can put all the basis-function values (the negated partial derivatives) into a single matrix called the design matrix:

    X = ⎡ φ₁(x₁)   φ₂(x₁)   …   φₙ(x₁)  ⎤
        ⎢ φ₁(x₂)   φ₂(x₂)   …   φₙ(x₂)  ⎥
        ⎢   ⋮         ⋮             ⋮    ⎥
        ⎣ φ₁(x_m)  φ₂(x_m)  …   φₙ(x_m) ⎦

so that Xᵢⱼ = φⱼ(xᵢ).
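Building the design matrix is mechanical once the basis functions are chosen. A small sketch in plain Python (illustrative names, not Math.NET), using the basis {1, x, x²} for fitting a parabola:

```python
# Basis functions phi_j for a parabola: {1, x, x^2}.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]

def design_matrix(xs, basis):
    """Return X with entries X[i][j] = phi_j(x_i)."""
    return [[phi(x) for phi in basis] for x in xs]

xs = [0.0, 1.0, 2.0]        # made-up abscissas of the measurements
X = design_matrix(xs, basis)
for row in X:
    print(row)
```

Each row of X corresponds to one measurement point, each column to one basis function; with a numerical library such as Math.NET, the same values would simply be stored in its matrix type.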
The residuals can be written as:

    rᵢ = yᵢ − (β₁φ₁(xᵢ) + β₂φ₂(xᵢ) + … + βₙφₙ(xᵢ))

or in matrix terms:

    r = y − Xβ
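The matrix form r = y − Xβ can be checked with a tiny sketch in plain Python (illustrative, not Math.NET; the data and parameter values are made up):

```python
def matvec(X, beta):
    """Matrix-vector product X*beta."""
    return [sum(Xij * bj for Xij, bj in zip(row, beta)) for row in X]

def residuals(y, X, beta):
    """r = y - X*beta, one residual per measurement."""
    return [yi - pi for yi, pi in zip(y, matvec(X, beta))]

# Straight-line basis {1, x}: row i of X is [1, x_i].
xs = [0.0, 1.0, 2.0]
X = [[1.0, x] for x in xs]
y = [1.0, 3.5, 5.0]          # made-up measurements
beta = [1.0, 2.0]            # model f(x) = 1 + 2x

print(residuals(y, X, beta))
```

The second measurement deviates from the model, so its residual is nonzero while the other two vanish; minimizing ‖r‖² over β is exactly the linear least-squares problem solved in the following sections.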
