**Data reduction using the** *least-squares polynomial*

Sometimes you have to work with a fairly large set of data, and manipulating such a set directly is generally difficult.

Suppose your result should be a straight line in the plane, but your measured data only scatter around it. You can see that there is a linear trend, but the computer cannot. Here is an example data set of 25 points:

| x | y | x^{2} | xy |
|---|---|---|---|
| 1 | 0.66496424 | 1 | 0.66496424 |
| 2 | 3.94075718 | 4 | 7.88151436 |
| 3 | 3.35247829 | 9 | 10.0574349 |
| 4 | 3.86552513 | 16 | 15.4621005 |
| 5 | 4.49854689 | 25 | 22.4927345 |
| 6 | 5.80525828 | 36 | 34.8315497 |
| 7 | 5.81779177 | 49 | 40.7245424 |
| 8 | 6.10924665 | 64 | 48.8739732 |
| 9 | 6.48756951 | 81 | 58.3881256 |
| 10 | 5.59638624 | 100 | 55.9638624 |
| 11 | 7.28547502 | 121 | 80.1402252 |
| 12 | 9.0264179 | 144 | 108.317015 |
| 13 | 8.02644055 | 169 | 104.343727 |
| 14 | 10.0489198 | 196 | 140.684877 |
| 15 | 9.21802309 | 225 | 138.270346 |
| 16 | 10.9809859 | 256 | 175.695774 |
| 17 | 8.63910735 | 289 | 146.864825 |
| 18 | 9.44954277 | 324 | 170.09177 |
| 19 | 12.3358053 | 361 | 234.380301 |
| 20 | 11.5203612 | 400 | 230.407224 |
| 21 | 11.4433213 | 441 | 240.309747 |
| 22 | 12.0350019 | 484 | 264.770042 |
| 23 | 13.729675 | 529 | 315.782525 |
| 24 | 12.7141765 | 576 | 305.140236 |
| 25 | 12.7032674 | 625 | 317.581685 |
| **325** | **205.295045** | **5525** | **3268.12112** |

The last row (in bold) holds the column sums.
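The column sums can be reproduced with a few lines of Python. This is only a sketch: the y-values are copied from the table, and the variable names (`xs`, `ys`, `sum_x`, and so on) are chosen here for illustration.

```python
# y-values copied from the table; x simply runs from 1 to 25.
ys = [
    0.66496424, 3.94075718, 3.35247829, 3.86552513, 4.49854689,
    5.80525828, 5.81779177, 6.10924665, 6.48756951, 5.59638624,
    7.28547502, 9.0264179, 8.02644055, 10.0489198, 9.21802309,
    10.9809859, 8.63910735, 9.44954277, 12.3358053, 11.5203612,
    11.4433213, 12.0350019, 13.729675, 12.7141765, 12.7032674,
]
xs = list(range(1, len(ys) + 1))

# The four sums that appear in the last row of the table.
sum_x = sum(xs)
sum_y = sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

print(sum_x, sum_y, sum_xx, sum_xy)
```

Because the tabulated values are rounded, the computed sums agree with the table only up to the last printed digits.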

Suppose you have **N** data points **(x_{i}, y_{i})**.
The equation for the straight line is **y = c_{1} + c_{2}x**.

The error of the measured y_{i} value compared to the straight line is defined as **e_{i} = y_{i} - (c_{1} + c_{2}x_{i})**

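So that large errors weigh more heavily in the computation, and so that positive and negative errors cannot cancel each other, we consider the sum of the squared errors:

$$S = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} \left(y_i - c_1 - c_2 x_i\right)^2$$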
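As a small illustration, the sum of squared errors for a candidate line can be evaluated as follows. The function name `sum_squared_errors` and the three sample points are made up for this sketch and are not part of the data set above.

```python
def sum_squared_errors(xs, ys, c1, c2):
    """Return S, the sum of (y_i - (c1 + c2*x_i))**2 over all data points."""
    return sum((y - (c1 + c2 * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative points lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

print(sum_squared_errors(xs, ys, 1.0, 2.0))  # 0.0: the line hits every point
print(sum_squared_errors(xs, ys, 0.0, 2.0))  # 3.0: every point is off by 1
```

The better the line matches the data, the smaller S becomes, which is why the fit is obtained by minimizing S.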

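As we want to find the values c_{1} and c_{2} that minimize S, we set the partial derivatives of S with respect to c_{1} and c_{2} equal to zero:

$$\frac{\partial S}{\partial c_1} = -2 \sum_{i=1}^{N} \left(y_i - c_1 - c_2 x_i\right) = 0$$

$$\frac{\partial S}{\partial c_2} = -2 \sum_{i=1}^{N} x_i \left(y_i - c_1 - c_2 x_i\right) = 0$$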

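Rearranging these equations we get:

$$\sum_{i=1}^{N} y_i - N c_1 - c_2 \sum_{i=1}^{N} x_i = 0$$

$$\sum_{i=1}^{N} x_i y_i - c_1 \sum_{i=1}^{N} x_i - c_2 \sum_{i=1}^{N} x_i^2 = 0$$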

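Again rearranging we have:

$$N c_1 + c_2 \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} y_i$$

$$c_1 \sum_{i=1}^{N} x_i + c_2 \sum_{i=1}^{N} x_i^2 = \sum_{i=1}^{N} x_i y_i$$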

To extract c_{1} and c_{2}, we only have to solve the resulting system of two simultaneous linear equations.
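The whole procedure for a straight line fits in one short function. The following is a minimal sketch, assuming the data arrive as two equally long Python lists; the name `fit_line` is chosen here, and the system is solved by elimination and back-substitution.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = c1 + c2*x; returns (c1, c2)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # The two simultaneous equations are
    #   n*c1  + sx*c2  = sy
    #   sx*c1 + sxx*c2 = sxy
    # Eliminating c1 gives c2; back-substitution then gives c1.
    c2 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c1 = (sy - c2 * sx) / n
    return c1, c2


# Illustrative points lying exactly on y = 1 + 2x.
print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))  # (1.0, 2.0)
```

The denominator is zero only when all x-values are identical, in which case no unique line exists; the sketch does not guard against that.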

Our example:

25c_{1} + 325c_{2} = 205.295045

325c_{1} + 5525c_{2} = 3268.12112

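Using determinants (Cramer's rule) we get:

$$c_1 = \frac{205.295045 \cdot 5525 - 325 \cdot 3268.12112}{25 \cdot 5525 - 325^2} \approx 2.2189$$

$$c_2 = \frac{25 \cdot 3268.12112 - 325 \cdot 205.295045}{25 \cdot 5525 - 325^2} \approx 0.4610$$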
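The determinant computation for the example can be checked with a short script. This is a sketch only; the helper name `solve_2x2_cramer` and its argument order are assumptions made for this illustration.

```python
def solve_2x2_cramer(a11, a12, b1, a21, a22, b2):
    """Solve  a11*u + a12*v = b1,  a21*u + a22*v = b2  by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    u = (b1 * a22 - a12 * b2) / det
    v = (a11 * b2 - b1 * a21) / det
    return u, v


# Coefficients taken from the two example equations.
c1, c2 = solve_2x2_cramer(25, 325, 205.295045, 325, 5525, 3268.12112)
print(f"c1 = {c1:.4f}, c2 = {c2:.4f}")  # c1 = 2.2189, c2 = 0.4610
```

The fitted line for the 25 data points is therefore approximately y = 2.2189 + 0.4610x.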

This technique can be generalized to a polynomial of degree m:

**y = c_{1} + c_{2}x + c_{3}x^{2} + ... + c_{m+1}x^{m}**

If m = 1 we fit the data to a straight line, if m = 2 to a parabola, and so on.

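The coefficients are determined by solving the m+1 simultaneous equations:

$$\sum_{k=1}^{m+1} c_k \sum_{i=1}^{N} x_i^{\,j+k-2} = \sum_{i=1}^{N} x_i^{\,j-1} y_i, \qquad j = 1, \dots, m+1$$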

Of course, solving a larger system of simultaneous equations requires an adequate algorithm, such as Gauss-Jordan elimination.
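The general case can be sketched in plain Python as follows. The function names `gauss_jordan` and `fit_polynomial` are chosen for this illustration, partial pivoting is added as a common safeguard that the text does not prescribe, and the sample data are made up.

```python
def gauss_jordan(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs stay untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]


def fit_polynomial(xs, ys, degree):
    """Return [c_1, ..., c_(degree+1)] for y = c_1 + c_2*x + c_3*x**2 + ..."""
    size = degree + 1
    # With 0-based j and k, entry (j, k) is the sum of x**(j + k),
    # and right-hand side j is the sum of x**j * y.
    a = [[sum(x ** (j + k) for x in xs) for k in range(size)]
         for j in range(size)]
    b = [sum((x ** j) * y for x, y in zip(xs, ys)) for j in range(size)]
    return gauss_jordan(a, b)


# Illustrative points lying exactly on y = 1 + 2x + 3x^2.
xs = [-2, -1, 0, 1, 2]
ys = [9, 2, 1, 6, 17]
coeffs = fit_polynomial(xs, ys, 2)
print(coeffs)  # approximately [1, 2, 3]
```

For high degrees the sums of large powers of x make this system badly conditioned, so in practice the degree is usually kept small.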