Data reduction using the least-squares polynomial

Sometimes you have to work with a fairly large set of data, and manipulating such a set directly is generally difficult.

Suppose your result should be a straight line in the plane, but of course your measured data only scatters around it. You can guess that there is a linear trend, but the computer can't. Here is an example data set of 25 points; the last row holds the column sums:

x	y	x²	xy
1	0.66496424	1	0.66496424
2	3.94075718	4	7.88151436
3	3.35247829	9	10.0574349
4	3.86552513	16	15.4621005
5	4.49854689	25	22.4927345
6	5.80525828	36	34.8315497
7	5.81779177	49	40.7245424
8	6.10924665	64	48.8739732
9	6.48756951	81	58.3881256
10	5.59638624	100	55.9638624
11	7.28547502	121	80.1402252
12	9.0264179	144	108.317015
13	8.02644055	169	104.343727
14	10.0489198	196	140.684877
15	9.21802309	225	138.270346
16	10.9809859	256	175.695774
17	8.63910735	289	146.864825
18	9.44954277	324	170.09177
19	12.3358053	361	234.380301
20	11.5203612	400	230.407224
21	11.4433213	441	240.309747
22	12.0350019	484	264.770042
23	13.729675	529	315.782525
24	12.7141765	576	305.140236
25	12.7032674	625	317.581685
325	205.295045	5525	3268.12112
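
The derived columns x² and xy and the sums in the last row of the table can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names (xs, ys, sum_x and so on) are chosen here for illustration and do not come from the original text, and the y values are copied from the table.

```python
# y values copied from the table; x simply counts 1, 2, ..., 25
ys = [
    0.66496424, 3.94075718, 3.35247829, 3.86552513, 4.49854689,
    5.80525828, 5.81779177, 6.10924665, 6.48756951, 5.59638624,
    7.28547502, 9.0264179, 8.02644055, 10.0489198, 9.21802309,
    10.9809859, 8.63910735, 9.44954277, 12.3358053, 11.5203612,
    11.4433213, 12.0350019, 13.729675, 12.7141765, 12.7032674,
]
xs = list(range(1, len(ys) + 1))

# Derived columns of the table
x_squared = [x * x for x in xs]
xy = [x * y for x, y in zip(xs, ys)]

# Column sums (the last row of the table)
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x_squared)
sum_xy = sum(xy)

print("sum x  =", sum_x)
print("sum y  =", round(sum_y, 6))
print("sum x2 =", sum_x2)
print("sum xy =", round(sum_xy, 5))
```

Only these four sums and the number of points are needed later on, which is why this counts as a data reduction: 25 pairs of numbers shrink to a handful of values.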

Suppose you have N data points (x_i, y_i). The equation for the straight line is y = c₁ + c₂x, where c₁ and c₂ are unknown.

The error of the measured value y_i relative to the straight line is defined as e_i = y_i - (c₁ + c₂x_i).

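So that large errors weigh more heavily in the computation, and so that positive and negative errors cannot cancel each other, we consider the sum of the squared errors:

S = Σ e_i² = Σ (y_i - (c₁ + c₂x_i))²   (sum over i = 1, ..., N)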
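
To see how the errors e_i = y_i - (c₁ + c₂x_i) and the sum of their squares S behave for a given line, the following Python sketch may help. The function names residuals and sum_squared_errors and the two trial lines are assumptions made for this illustration, not part of the original text; the five data points are the first five rows of the table.

```python
def residuals(xs, ys, c1, c2):
    """Return the errors e_i = y_i - (c1 + c2 * x_i) for all data points."""
    return [y - (c1 + c2 * x) for x, y in zip(xs, ys)]


def sum_squared_errors(xs, ys, c1, c2):
    """Return S, the sum of the squared errors for the line y = c1 + c2 * x."""
    return sum(e * e for e in residuals(xs, ys, c1, c2))


# First five rows of the table
xs = [1, 2, 3, 4, 5]
ys = [0.66496424, 3.94075718, 3.35247829, 3.86552513, 4.49854689]

# Two arbitrary trial lines (hypothetical guesses, not fitted values)
for c1, c2 in [(0.0, 1.0), (1.0, 0.5)]:
    s = sum_squared_errors(xs, ys, c1, c2)
    print(f"c1 = {c1}, c2 = {c2}: S = {s:.4f}")
```

Each choice of c₁ and c₂ gives a different S. The best line in the least-squares sense is the one for which S is smallest.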

Since we want to find the values of c₁ and c₂ that minimize S, we set the partial derivatives of S with respect to c₁ and c₂ equal to zero: