# [Scilab-users] the linear least-squares fitting problem Classic List Threaded 2 messages  Dear Scilabers,for mathematicians, the linear least-squares fitting is a trivial problem, but as an experimental physicist I had problems finding the right correlations for predicting the Confidence Limits of the fitted functional relationship. For a trivial set of experimental data, I wrote a few lines of Scilab code and have 2 questions:- is the computation of confidence limits correct?- for large data sets and higher degree polynomials, can we make the computations more efficient and reliable?Thanks in advanceHeinz_________________________```x=[0.1269868 0.135477 0.221034 0.45 0.8350086 0.9057919 0.9133759 0.9688678]'; y=[0.0695843 0.0154644 0.0893982 0.5619441 1.3879476 1.5639274 1.6570076 1.9061488]'; plot(x,y,'m.','markersize',14);xgrid(1,1,1); X=[ones(x) x x^2]; // matrix of independent vectors b=X\y; // parameters of linear LSFIT with a 2nd degree polynomial yh = X*b; //Fitted values yh to approximate measured y's e=y-yh; //Errors or residuals SSE=e'*e; //Sum of squared errors ybar=mean(y); R2=1-SSE/sum((y-ybar)^2); [m n]=size(X); MSE = SSE/(m-n-1); //Mean square error se=sqrt(MSE); C=MSE*inv(X'*X); seb=sqrt(diag(C)); CONF=.95; alpha=1-CONF; ta2 = cdft('T',m-n,1-alpha/2,alpha/2); //t-value for alpha/2 xx=(0.1:.025:1)'; // cover the whole abscissa range in detail XX=[ones(xx) xx xx^2]; yhh=XX*b; // predicted extended LSFIT curve plot(xx,yhh,'kd-','MarkerSize',8); [mm n]=size(XX); sY = []; sYp = []; //Terms involved in Confidence Interval for Y, Ypred for i=1:mm; sY = [sY; sqrt(XX(i,:)*C*XX(i,:)')]; // standard error in fit sYp = [sYp; se*sqrt(1+XX(i,:)*(C/se)*XX(i,:)')]; // standard error in spread in population distribution end; plot(xx, yhh+ta2*sYp,'rd-'); plot(xx, yhh+ta2*sY,'bd-'); plot(xx, yhh-ta2*sY,'gd-'); plot(xx, yhh-ta2*sYp,'md-'); title('LINEAR LEAST-SQUARES FIT WITH 95% CONFIDENCE BOUNDS ON FIT AND ON THE POPULATION SPREAD','fontsize',2); xlabel('x-values','fontsize',5); ylabel('y-values','fontsize',5); legend('measurement data','LSFIT of 2nd order polynomial','upper 95% CL population','upper 95% CL fit','lower 95% CL fit','lower 95% CL population',2);```_______________________________________________ users mailing list [hidden email] http://lists.scilab.org/mailman/listinfo/users LINEAR LEAST-SQUARES FIT WITH 95% CONFIDENCE BOUNDS ON FIT AND ON THE POPULATION SPREAD.pdf (64K) Download Attachment Hello Heinz, Please see the attached tutorial¹ I wrote for the other physicists in my lab on how to use scilab to fit data with a model and calculate the confidence interval. The reference paper mentioned in the comments is worth reading, it helped me a lot understanding how the confidence interval relates to the covariance matrix. Hope it helps, Antoine ¹I made a poor choice for the data/model to fit as sin(x) is not really nice (multiple parameters work equally well). I should have used a gaussian or lorentzian curve... Le 01/08/2020 à 16:45, Heinz Nabielek a écrit : Dear Scilabers, for mathematicians, the linear least-squares fitting is a trivial problem, but as an experimental physicist I had problems finding the right correlations for predicting the Confidence Limits of the fitted functional relationship. For a trivial set of experimental data, I wrote a few lines of Scilab code and have 2 questions: - is the computation of confidence limits correct? - for large data sets and higher degree polynomials, can we make the computations more efficient and reliable? Thanks in advance Heinz _________________________ ```x=[0.1269868 0.135477 0.221034 0.45 0.8350086 0.9057919 0.9133759 0.9688678]'; y=[0.0695843 0.0154644 0.0893982 0.5619441 1.3879476 1.5639274 1.6570076 1.9061488]'; plot(x,y,'m.','markersize',14);xgrid(1,1,1); X=[ones(x) x x^2]; // matrix of independent vectors b=X\y; // parameters of linear LSFIT with a 2nd degree polynomial yh = X*b; //Fitted values yh to approximate measured y's e=y-yh; //Errors or residuals SSE=e'*e; //Sum of squared errors ybar=mean(y); R2=1-SSE/sum((y-ybar)^2); [m n]=size(X); MSE = SSE/(m-n-1); //Mean square error se=sqrt(MSE); C=MSE*inv(X'*X); seb=sqrt(diag(C)); CONF=.95; alpha=1-CONF; ta2 = cdft('T',m-n,1-alpha/2,alpha/2); //t-value for alpha/2 xx=(0.1:.025:1)'; // cover the whole abscissa range in detail XX=[ones(xx) xx xx^2]; yhh=XX*b; // predicted extended LSFIT curve plot(xx,yhh,'kd-','MarkerSize',8); [mm n]=size(XX); sY = []; sYp = []; //Terms involved in Confidence Interval for Y, Ypred for i=1:mm; sY = [sY; sqrt(XX(i,:)*C*XX(i,:)')]; // standard error in fit sYp = [sYp; se*sqrt(1+XX(i,:)*(C/se)*XX(i,:)')]; // standard error in spread in population distribution end; plot(xx, yhh+ta2*sYp,'rd-'); plot(xx, yhh+ta2*sY,'bd-'); plot(xx, yhh-ta2*sY,'gd-'); plot(xx, yhh-ta2*sYp,'md-'); title('LINEAR LEAST-SQUARES FIT WITH 95% CONFIDENCE BOUNDS ON FIT AND ON THE POPULATION SPREAD','fontsize',2); xlabel('x-values','fontsize',5); ylabel('y-values','fontsize',5); legend('measurement data','LSFIT of 2nd order polynomial','upper 95% CL population','upper 95% CL fit','lower 95% CL fit','lower 95% CL population',2);``` ```_______________________________________________ users mailing list [hidden email] http://lists.scilab.org/mailman/listinfo/users ``` _______________________________________________ users mailing list [hidden email] http://lists.scilab.org/mailman/listinfo/users tuto_fit.sce (15K) Download Attachment