Linear regression is a common and powerful method for modelling the relationship
between input vectors and output scalars.
To be more specific, assume we have a data set $(x_1, y_1), \dots, (x_n, y_n)$,
where the $x_i$'s are $d$-element vectors $x_i \in \mathbb{R}^d$ and the $y_i$'s are scalars.
We now seek a function $f : \mathbb{R}^d \to \mathbb{R}$ such that $f(x_i) \approx y_i$ for all $i$.
But what is $f$ and what does $\approx$ mean?
First, $f$ must be linear (which is the reason for the name linear regression).
This means, by definition:
- $f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^d$,
- $f(\alpha x) = \alpha f(x)$ for all $x \in \mathbb{R}^d$ and $\alpha \in \mathbb{R}$.
Using these two rules we get

$$f(x) = f\Big(\sum_{j=1}^d x_j e_j\Big) = \sum_{j=1}^d x_j f(e_j) = w^\top x,$$

where $e_j$ is the $j$-th standard basis vector and $w_j = f(e_j)$,
so $f$ is uniquely determined by the vector $w = (f(e_1), \dots, f(e_d))^\top$.
The output of $f$ will always be a linear combination
$w^\top x = \sum_{j=1}^d w_j x_j$ of the coefficients of the input vector
(all vectors are treated as column vectors).
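As a quick sanity check of this derivation (a minimal sketch; the particular map `f` below is a hypothetical example), we can recover $w$ by evaluating a linear function on the standard basis vectors and confirm that it agrees with the dot product $w^\top x$:

```python
import numpy as np

# A concrete linear map f: R^3 -> R (hypothetical example).
def f(x):
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2]

d = 3
# Recover w by evaluating f on the standard basis vectors e_1, ..., e_d
# (the rows of the identity matrix).
w = np.array([f(e) for e in np.eye(d)])

# f(x) equals the dot product w^T x for any x.
x = np.array([1.0, 4.0, -2.0])
print(w)      # [ 2.  -1.   0.5]
print(f(x))   # -3.0
print(w @ x)  # -3.0
```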
That was $f$, but what about $\approx$?
It means that we would like each error $f(x_i) - y_i$ to be as small as possible.
To be more precise, we wish to solve the following optimization problem:

$$\min_{w \in \mathbb{R}^d} \|Xw - y\|,$$

where $X$ is the $n \times d$ matrix whose rows are the $x_i^\top$,
$y = (y_1, \dots, y_n)^\top$, and $\|\cdot\|$ is some norm
used to measure the magnitude of the error.
Any norm will do, but the most common choice is the Euclidean norm,
also called the 2-norm, $\|v\|_2 = \sqrt{\sum_i v_i^2}$.
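Concretely (a minimal illustration with made-up residual values), the 2-norm of an error vector can be computed directly from its definition or with `numpy.linalg.norm`:

```python
import numpy as np

# A residual vector r = Xw - y for some candidate w (hypothetical values).
r = np.array([3.0, -4.0, 0.0])

# The 2-norm: square root of the sum of squared entries.
print(np.sqrt(np.sum(r**2)))  # 5.0
print(np.linalg.norm(r))      # 5.0 (np.linalg.norm defaults to the 2-norm)
```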
This norm has the advantage that the solution can be computed exactly and in
a fairly efficient way (see, e.g., Section 5.5 in ).
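To illustrate the "exact and fairly efficient" claim (a sketch under the assumption that $X$ has full column rank, not the reference's algorithm), with the 2-norm the minimizer satisfies the normal equations $X^\top X w = X^\top y$, which can be solved directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))     # rows are the input vectors x_i^T
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                  # noise-free targets, for illustration

# For the 2-norm, the minimizer of ||Xw - y|| satisfies the normal
# equations X^T X w = X^T y (X^T X is invertible when X has full rank).
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # recovers w_true: [ 1.  -2.   0.5]
```

With noise-free data the fit is exact; with noisy targets the same formula returns the least-squares estimate.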
A solution to this optimization problem can also be computed directly in NumPy;
`numpy.linalg.lstsq`, for example, solves exactly this least-squares problem.
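As a small worked example (made-up data, roughly following $y = 3x + 1$), `numpy.linalg.lstsq` minimizes $\|Aw - y\|_2$ and returns the solution along with diagnostic information:

```python
import numpy as np

# Noisy one-dimensional data, roughly y = 3x + 1 (hypothetical values).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 3.9, 7.2, 9.8, 13.1])

# Stack a column of ones so the model can include an intercept.
A = np.column_stack([x, np.ones_like(x)])

# lstsq minimizes ||A w - y||_2 and returns
# (solution, residuals, rank, singular values).
w, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = w
print(slope, intercept)  # slope ~ 3, intercept ~ 1
```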
See the following post for examples
of how to apply linear regression in practice.