[Scilab-users] the linear least-squares fitting problem

classic Classic list List threaded Threaded
2 messages Options
Heinz Nabielek-3 Heinz Nabielek-3
Reply | Threaded
Open this post in threaded view
|

[Scilab-users] the linear least-squares fitting problem

Dear Scilabers,

for mathematicians, the linear least-squares fitting is a trivial problem, but as an experimental physicist I had problems finding the right correlations for predicting the Confidence Limits of the fitted functional relationship. For a trivial set of experimental data, I wrote a few lines of Scilab code and have 2 questions:

- is the computation of confidence limits correct?

- for large data sets and higher degree polynomials, can we make the computations more efficient and reliable?

Thanks in advance
Heinz
_________________________
x=[0.1269868 0.135477 0.221034 0.45 0.8350086 0.9057919 0.9133759 0.9688678]';
y=[0.0695843 0.0154644 0.0893982 0.5619441 1.3879476 1.5639274  1.6570076 1.9061488]';
plot(x,y,'m.','markersize',14);xgrid(1,1,1);
X=[ones(x) x x^2]; 		// matrix of independent vectors
b=X\y; 	// parameters of linear LSFIT with a 2nd degree polynomial
yh = X*b; 	//Fitted values yh to approximate measured y's
e=y-yh; 	//Errors or residuals
SSE=e'*e; 	//Sum of squared errors
ybar=mean(y); R2=1-SSE/sum((y-ybar)^2);
[m n]=size(X);
MSE = SSE/(m-n-1); 	//Mean square error
se=sqrt(MSE);
C=MSE*inv(X'*X);
seb=sqrt(diag(C));
CONF=.95; alpha=1-CONF;
ta2 = cdft('T',m-n,1-alpha/2,alpha/2); 	//t-value for alpha/2
xx=(0.1:.025:1)';  	// cover the whole abscissa range in detail
XX=[ones(xx) xx xx^2];
yhh=XX*b; 	// predicted extended LSFIT curve
plot(xx,yhh,'kd-','MarkerSize',8);
[mm n]=size(XX);
sY = []; sYp = []; 	//Terms involved in Confidence Interval for Y, Ypred
for i=1:mm; 
sY = [sY; sqrt(XX(i,:)*C*XX(i,:)')]; 		//  standard error in fit
sYp = [sYp; se*sqrt(1+XX(i,:)*(C/se)*XX(i,:)')]; //  standard error in spread in population distribution
end;
plot(xx, yhh+ta2*sYp,'rd-');
plot(xx, yhh+ta2*sY,'bd-');
plot(xx, yhh-ta2*sY,'gd-');
plot(xx, yhh-ta2*sYp,'md-');
title('LINEAR LEAST-SQUARES FIT WITH 95% CONFIDENCE BOUNDS ON FIT AND ON THE POPULATION SPREAD','fontsize',2);
xlabel('x-values','fontsize',5);
ylabel('y-values','fontsize',5);
legend('measurement data','LSFIT of 2nd order polynomial','upper 95% CL population','upper 95% CL fit','lower 95% CL fit','lower 95% CL population',2);






_______________________________________________
users mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/users

LINEAR LEAST-SQUARES FIT WITH 95% CONFIDENCE BOUNDS ON FIT AND ON THE POPULATION SPREAD.pdf (64K) Download Attachment
Antoine Monmayrant Antoine Monmayrant
Reply | Threaded
Open this post in threaded view
|

Re: the linear least-squares fitting problem

Hello Heinz,

Please see the attached tutorial¹ I wrote for the other physicists in my lab on how to use scilab to fit data with a model and calculate the confidence interval.
The reference paper mentioned in the comments is worth reading, it helped me a lot understanding how the confidence interval relates to the covariance matrix.

Hope it helps,

Antoine
¹I made a poor choice for the data/model to fit as sin(x) is not really nice (multiple parameters work equally well). I should have used a gaussian or lorentzian curve...

Le 01/08/2020 à 16:45, Heinz Nabielek a écrit :
Dear Scilabers,

for mathematicians, the linear least-squares fitting is a trivial problem, but as an experimental physicist I had problems finding the right correlations for predicting the Confidence Limits of the fitted functional relationship. For a trivial set of experimental data, I wrote a few lines of Scilab code and have 2 questions:

- is the computation of confidence limits correct?

- for large data sets and higher degree polynomials, can we make the computations more efficient and reliable?

Thanks in advance
Heinz
_________________________
x=[0.1269868 0.135477 0.221034 0.45 0.8350086 0.9057919 0.9133759 0.9688678]';
y=[0.0695843 0.0154644 0.0893982 0.5619441 1.3879476 1.5639274  1.6570076 1.9061488]';
plot(x,y,'m.','markersize',14);xgrid(1,1,1);
X=[ones(x) x x^2]; 		// matrix of independent vectors
b=X\y; 	// parameters of linear LSFIT with a 2nd degree polynomial
yh = X*b; 	//Fitted values yh to approximate measured y's
e=y-yh; 	//Errors or residuals
SSE=e'*e; 	//Sum of squared errors
ybar=mean(y); R2=1-SSE/sum((y-ybar)^2);
[m n]=size(X);
MSE = SSE/(m-n-1); 	//Mean square error
se=sqrt(MSE);
C=MSE*inv(X'*X);
seb=sqrt(diag(C));
CONF=.95; alpha=1-CONF;
ta2 = cdft('T',m-n,1-alpha/2,alpha/2); 	//t-value for alpha/2
xx=(0.1:.025:1)';  	// cover the whole abscissa range in detail
XX=[ones(xx) xx xx^2];
yhh=XX*b; 	// predicted extended LSFIT curve
plot(xx,yhh,'kd-','MarkerSize',8);
[mm n]=size(XX);
sY = []; sYp = []; 	//Terms involved in Confidence Interval for Y, Ypred
for i=1:mm; 
sY = [sY; sqrt(XX(i,:)*C*XX(i,:)')]; 		//  standard error in fit
sYp = [sYp; se*sqrt(1+XX(i,:)*(C/se)*XX(i,:)')]; //  standard error in spread in population distribution
end;
plot(xx, yhh+ta2*sYp,'rd-');
plot(xx, yhh+ta2*sY,'bd-');
plot(xx, yhh-ta2*sY,'gd-');
plot(xx, yhh-ta2*sYp,'md-');
title('LINEAR LEAST-SQUARES FIT WITH 95% CONFIDENCE BOUNDS ON FIT AND ON THE POPULATION SPREAD','fontsize',2);
xlabel('x-values','fontsize',5);
ylabel('y-values','fontsize',5);
legend('measurement data','LSFIT of 2nd order polynomial','upper 95% CL population','upper 95% CL fit','lower 95% CL fit','lower 95% CL population',2);






_______________________________________________
users mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/users

_______________________________________________
users mailing list
[hidden email]
http://lists.scilab.org/mailman/listinfo/users

tuto_fit.sce (15K) Download Attachment