
Tanushree Sharma, M.E.

(Electronics & Instrumentation Control), Thapar University, Patiala

Curve Fitting: The process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. It can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function is constructed that approximately fits the data.

Curve fitting, closely related to regression analysis, is used to find the "best fit" line or curve for a series of data points. Most of the time, the curve fit will produce an equation that can be used to find points anywhere along the curve. In some cases, you may not be concerned about finding an equation. Instead, you may just want to use a curve fit to smooth the data and improve the appearance of your plot.

Let's start with a first degree polynomial equation: $y = ax + b$. This is a line with slope $a$. We know that a line will connect any two points, so a first degree polynomial equation is an exact fit through any two points with distinct x coordinates. If we increase the order of the equation to a second degree polynomial, we get $y = ax^2 + bx + c$. This will exactly fit a simple curve to three points. If we increase the order of the equation to a third degree polynomial, we get $y = ax^3 + bx^2 + cx + d$, which will exactly fit four points.
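The exact-fit property can be checked numerically. The following is a minimal sketch using the base MATLAB functions polyfit and polyval; the three data points are made-up values chosen only for illustration.

```matlab
% Three points with distinct x coordinates (made-up values for illustration)
x = [0 1 2];
y = [1 3 11];

% polyfit returns coefficients in descending powers: y = a*x^2 + b*x + c
p = polyfit(x, y, 2);

% A second degree polynomial should reproduce all three points,
% so the largest deviation is expected to be at rounding-error level
maxErr = max(abs(polyval(p, x) - y));
fprintf('a = %.4f, b = %.4f, c = %.4f, max error = %.2e\n', p, maxErr);
```

For these points the coefficients should come out as a = 3, b = -1, c = 1 up to rounding, since 3x^2 - x + 1 passes through all three points.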

Types of curve fits:
1. Least squares curve fits
2. Nonlinear curve fits
3. Smoothing curve fits

Least Squares Curve Fits: These minimize the sum of the squared errors between the original data and the values predicted by the equation.

Nonlinear Curve Fits: The relationship between the measured values and the measurement variables is nonlinear. These fits also seek the parameter values that minimize the deviations between the observed values and the expected values.

Smoothing Curve Fits: One such method interpolates a given set of data points in a plane and fits a smooth curve to the points.

The method is devised in such a way that the resultant curve will pass through the given points and will appear smooth and natural. It is based on a piecewise function composed of a set of polynomials, each of degree three at most, and applicable to successive intervals of the given points.
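As a rough illustration of a piecewise cubic curve through given points, the sketch below uses the base MATLAB function pchip. This is an assumption made for illustration: pchip is a different piecewise cubic scheme and is not claimed to be the method described above. The data values are made up.

```matlab
% Made-up data points for illustration
x = [0 1 2 3 4 5];
y = [0 0.8 0.9 0.1 -0.8 -1.0];

% Evaluate a piecewise cubic interpolant on a fine grid
xq = linspace(0, 5, 101);
yq = pchip(x, y, xq);

% The interpolant passes through the given points
maxErr = max(abs(pchip(x, y, x) - y));

plot(x, y, 'o', xq, yq, '-');
legend('data points', 'piecewise cubic curve');
```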

Data points: $(x_i, y_i)$, $i = 1, \dots, n$
Choice of fitting function (linear): $y = f(x) = ax + b$
Errors between the function and the data points: $e_i = y_i - (a x_i + b)$
Sum of the squares of the errors: $z = e_1^2 + e_2^2 + \dots + e_n^2$
In compact form: $z = \sum_{i=1}^{n} [y_i - (a x_i + b)]^2$

Our goal is to determine the values of a and b that will minimize z, the sum of the squares of the errors. To find the minimum value of z, we set the partial derivatives of z with respect to a and b to zero and solve the resulting two equations for a and b.
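Setting the two partial derivatives of z to zero gives two linear equations in a and b, whose solution can be written in terms of sums over the data. The following is a minimal sketch of that closed-form solution, cross-checked against polyfit; the data values and the variable names (Sx, Sy, Sxx, Sxy) are assumptions made for illustration.

```matlab
% Made-up sample data for illustration
x = [1 2 3 4 5];
y = [2.2 4.1 6.2 7.9 10.1];
n = numel(x);

% Sums that appear when dz/da = 0 and dz/db = 0 are written out
Sx  = sum(x);
Sy  = sum(y);
Sxx = sum(x.^2);
Sxy = sum(x.*y);

% Closed-form solution of the two resulting linear equations
a = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2);   % slope
b = (Sy - a*Sx) / n;                    % intercept

% Cross-check against the built-in least squares polynomial fit
p = polyfit(x, y, 1);                   % p(1) = slope, p(2) = intercept
fprintf('a = %.4f (polyfit %.4f), b = %.4f (polyfit %.4f)\n', a, p(1), b, p(2));
```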

[Figure: linear fit through the data points; x axis from 0 to 15, y axis from 0 to 6]

This linear fit uses every data point. Notice the very small slope: y = f(x) is almost independent of x.

The degree of the correct approximating function depends on the type of data being analyzed. When a certain behavior is expected, we know what type of function to use and simply have to solve for its coefficients, e.g. Ohm's law (the curve between voltage and current is a straight line). When we don't know what sort of response to expect, we should ensure that the data sample size is large enough to clearly distinguish which degree is the best fit.

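We need to fit a polynomial to data using polynomial regression.

A second-order polynomial or quadratic fit is

$$y = a_0 + a_1 x + a_2 x^2 + e$$

The sum of the squares of the residuals:

$$S_r = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2$$

Differentiate $S_r$ with respect to all parameters:

$$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)$$
$$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)$$
$$\frac{\partial S_r}{\partial a_2} = -2 \sum_{i=1}^{n} x_i^2 \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)$$

Set the partials equal to zero and arrange:

$$n\,a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 = \sum y_i$$
$$\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 = \sum x_i y_i$$
$$\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 = \sum x_i^2 y_i$$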

These equations are called the normal equations. They form a system of linear equations with 3 equations and 3 unknowns. In general, an mth order polynomial requires solving a system of m+1 linear equations.
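The normal equations for the quadratic fit can be assembled as a 3-by-3 matrix system and solved directly. The following is a minimal sketch with sample data chosen for illustration; the names A, r, and coef are assumptions. The result is compared with polyfit, which should agree up to rounding.

```matlab
% Sample data for illustration
x = [0 1 2 3 4 5]';
y = [2.1 7.7 13.6 27.2 40.9 61.1]';
n = numel(x);

% Coefficient matrix and right-hand side of the three normal equations
A = [n,         sum(x),    sum(x.^2);
     sum(x),    sum(x.^2), sum(x.^3);
     sum(x.^2), sum(x.^3), sum(x.^4)];
r = [sum(y); sum(x.*y); sum((x.^2).*y)];

% Solve the linear system for [a0; a1; a2]
coef = A \ r;

% polyfit returns descending powers, so flip it before comparing
pf = fliplr(polyfit(x, y, 2))';
disp([coef pf]);
```

For higher degrees, forming the normal equations explicitly can become numerically ill-conditioned, which is one reason to prefer library routines in practice.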

Not all experimental data can be approximated with polynomial functions. Exponential data can be fit using the least squares method by first converting the data to a linear form. An exponential function, $y = a e^{bx}$, can be rewritten as a linear polynomial by taking the natural logarithm of each side: $\ln y = \ln a + bx$. By finding $\ln y_i$ for each point in a data set, we can solve for $\ln a$ and $b$ using the least squares method, and then recover $a$ by exponentiating.
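The linearization can be sketched in a few lines. The data below is synthetic, generated from y = 2*exp(0.5*x) purely for illustration, so the fit should recover a = 2 and b = 0.5 up to rounding.

```matlab
% Synthetic data generated from y = 2*exp(0.5*x) (values chosen for illustration)
x = 0:0.5:4;
y = 2 * exp(0.5 * x);

% Fit a straight line to (x, ln y): ln y = ln a + b*x
p = polyfit(x, log(y), 1);
b = p(1);          % slope of the line is b
a = exp(p(2));     % intercept of the line is ln a

fprintf('a = %.4f, b = %.4f\n', a, b);
```

Note that this approach minimizes the squared errors in ln y rather than in y, so with noisy data the result can differ from a direct nonlinear least squares fit.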

The Curve Fitting Toolbox is a collection of graphical user interfaces (GUIs) and M-file functions built on the MATLAB technical computing environment. The toolbox provides you with these main features:
1. Data preprocessing such as sectioning and smoothing
2. Parametric and nonparametric data fitting
3. Standard linear least squares, nonlinear least squares, weighted least squares, constrained least squares, and robust fitting procedures

A graphical environment that allows you to:
1. Explore and analyze data sets and fits visually and numerically
2. Save your work in various formats including M-files, binary files, and workspace variables

The Curve Fitting Toolbox consists of two different environments:
1. The Curve Fitting Tool, which is a graphical user interface (GUI) environment
2. The MATLAB command line environment

The Curve Fitting Tool (GUI) allows you to:
- Visually explore one or more data sets and fits as scatter plots
- Graphically evaluate the goodness of fit using residuals and prediction bounds
- Access additional interfaces for:
  - Importing, viewing, and smoothing data
  - Fitting data, and comparing fits and data sets
  - Marking data points to be excluded from a fit
  - Selecting which fits and data sets are displayed in the tool
  - Interpolating, extrapolating, differentiating, or integrating fits

You can explore the Curve Fitting Tool by typing cftool at the MATLAB command line.

Before you can import data into the Curve Fitting Tool, the data variables must exist in the MATLAB workspace. For this example, the data is stored in the file census.mat, which is provided with MATLAB. Hence we first call

load census

The workspace now contains two new variables, cdate and pop:
1. cdate is a column vector containing the years 1790 to 1990 in 10-year increments.
2. pop is a column vector with the US population figures that correspond to the years in cdate.
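Before opening the tool, it can help to confirm what was loaded. The following is a small sketch using only base MATLAB commands; the axis labels are assumptions added for readability.

```matlab
load census                 % creates cdate and pop in the workspace
whos cdate pop              % lists the size and class of both variables

% Quick scatter plot of the raw data
plot(cdate, pop, 'o');
xlabel('Year');
ylabel('Population');
```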

You can import data into the Curve Fitting Tool with the Data GUI. You open this GUI by clicking the Data button on the Curve Fitting Tool.

The Data GUI consists of two panes: Data Sets and Smooth. The Data Sets pane allows you to:
- Import predictor (X) data, response (Y) data, and weights. If you do not import weights, then they are assumed to be 1 for all data points.
- Specify the name of the data set.

To load cdate and pop into the Curve Fitting Tool, select the appropriate variable names from the X Data and Y Data lists. The data is then displayed in the Preview window. Click the Create data set button to complete the data import process.

For this example, begin by fitting the census data with a second degree polynomial. Then continue fitting the data using polynomial equations up to sixth degree, and a single-term exponential equation. The data fitting procedure follows these general steps:

1. From the Fit Editor, click New Fit. Note that this action always defaults to a linear polynomial fit type. You use New Fit at the beginning of your curve fitting session, and when you are exploring different fit types for a given data set.

2. Because the initial fit uses a second degree polynomial, select quadratic polynomial from the Polynomial list. Name the fit poly2.

3. Click the Apply button or select the Immediate apply check box. The library model, fitted coefficients, and goodness of fit statistics are displayed in the Results area.

4. Fit the additional library equations. For fits of a given type (for example, polynomials), you should use Copy Fit instead of New Fit, because copying a fit retains the current fit type state and therefore requires fewer steps than creating a new fit each time.
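The same fits can also be produced from the MATLAB command line environment. The following is a minimal sketch using the toolbox function fit with the library model names 'poly2', 'poly3', and 'exp1'; the variable names (poly2fit, gof2, and so on) are assumptions chosen for illustration, and the sketch is not meant as an exact reproduction of what the GUI does internally.

```matlab
load census                      % provides cdate and pop

% Quadratic polynomial fit; gof2 holds the goodness of fit statistics
[poly2fit, gof2] = fit(cdate, pop, 'poly2');
disp(poly2fit)                   % library model and fitted coefficients

% Further library equations: a cubic polynomial and a single-term exponential
poly3fit = fit(cdate, pop, 'poly3');
exp1fit  = fit(cdate, pop, 'exp1');

% Plot the data together with the quadratic fit
plot(poly2fit, cdate, pop);
```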

You display the residuals as a line plot by selecting the menu item View->Residuals->Line plot from the Curve Fitting Tool.

The residuals indicate that a better fit may be possible. Therefore, you should continue fitting the census data following the procedure outlined in the beginning of this section.

The residuals from a good fit should look random with no apparent pattern. A pattern, such as a tendency for consecutive residuals to have the same sign, can be an indication that a better model exists.

When you fit higher degree polynomials, the Results area displays this warning: Equation is badly conditioned. Remove repeated data points or try centering and scaling.
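At the command line, residuals can be computed as the data minus the fitted values, and centering and scaling can be requested through the 'Normalize' option of fit. The following is a minimal sketch under those assumptions; the variable names are chosen for illustration.

```matlab
load census

% Quadratic fit and its residuals (data minus fitted values)
f2 = fit(cdate, pop, 'poly2');
res2 = pop - f2(cdate);

% Line plot of the residuals; a random scatter about zero suggests a good fit
plot(cdate, res2, '.-');
xlabel('Year');
ylabel('Residual');

% Higher degree fit with the predictor centered and scaled,
% which addresses the "badly conditioned" warning
f5 = fit(cdate, pop, 'poly5', 'Normalize', 'on');
res5 = pop - f5(cdate);
```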

To determine the best fit, you should examine both the graphical and numerical fit results. Your initial approach in determining the best fit should be a graphical examination of the fits and residuals. The graphical fit results indicate that:
- The fits and residuals for the polynomial equations are all similar, making it difficult to choose the best one.
- The fit and residuals for the single-term exponential equation indicate it is a poor fit overall. Therefore, it is a poor choice for extrapolation.

[Figure: residuals of the exponential fit]

Use the Plotting GUI to remove exp1 from the scatter plot display.

The census data and fits are shown above for an upper abscissa limit of 2050. The behavior of the sixth degree polynomial fit beyond the data range makes it a poor choice for extrapolation.

Hence you should exercise caution when extrapolating with polynomial fits because they can diverge wildly outside the data range.
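The divergence can be seen by evaluating the fits beyond the data range. The following is a minimal sketch that extends the abscissa to 2050; the use of 'Normalize' and the variable names are assumptions made for illustration.

```matlab
load census

% Quadratic and sixth degree polynomial fits (predictor centered and scaled)
f2 = fit(cdate, pop, 'poly2', 'Normalize', 'on');
f6 = fit(cdate, pop, 'poly6', 'Normalize', 'on');

% Evaluate both fits out to 2050, well past the last census year in the data
xq = (1790:10:2050)';
plot(cdate, pop, 'o', xq, f2(xq), '-', xq, f6(xq), '--');
legend('census data', 'poly2', 'poly6', 'Location', 'northwest');
xlabel('Year');
ylabel('Population');
```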

There are two types of numerical fit results displayed in the Fitting GUI:
1. Goodness of fit statistics
2. Confidence intervals on the fitted coefficients

The goodness of fit statistics help you determine how well the curve fits the data. The confidence intervals on the coefficients determine their accuracy.

In this example, the sum of squares due to error (SSE) and the adjusted R-square statistics are used to help determine the best fit.

The SSE statistic is the least squares error of the fit, with a value closer to zero indicating a better fit.

The adjusted R-square statistic is generally the best indicator of the fit quality when you add additional coefficients to your model.
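The same statistics are available at the command line through the second output of fit, a structure with fields such as sse and adjrsquare. The following is a minimal sketch that tabulates them for several library models; the loop and variable names are assumptions made for illustration.

```matlab
load census

% Polynomial library models, fitted with the predictor centered and scaled
models = {'poly2', 'poly3', 'poly4', 'poly5', 'poly6'};
for k = 1:numel(models)
    [~, gof] = fit(cdate, pop, models{k}, 'Normalize', 'on');
    fprintf('%-6s SSE = %10.4g   adjusted R-square = %.4f\n', ...
            models{k}, gof.sse, gof.adjrsquare);
end

% Single-term exponential for comparison
[~, gofExp] = fit(cdate, pop, 'exp1');
fprintf('%-6s SSE = %10.4g   adjusted R-square = %.4f\n', ...
        'exp1', gofExp.sse, gofExp.adjrsquare);
```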

The p1, p2, and p3 coefficients for the fifth degree polynomial suggest that it overfits the census data.

However, the confidence bounds for the quadratic fit, poly2, indicate that the fitted coefficients are known fairly accurately.

Therefore, after examining both the graphical and numerical fit results, it appears that you should use poly2 to extrapolate the census data.
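Confidence bounds on the coefficients can also be inspected at the command line with the toolbox function confint, which returns the lower and upper bounds (95% by default) in a two-row matrix. The following is a minimal sketch; the zero-crossing check is an illustrative heuristic added here, not a procedure taken from the GUI.

```matlab
load census

% Confidence bounds for the quadratic fit: row 1 = lower, row 2 = upper
f2 = fit(cdate, pop, 'poly2');
ci2 = confint(f2);
disp(ci2);

% Confidence bounds for the fifth degree fit
f5 = fit(cdate, pop, 'poly5', 'Normalize', 'on');
ci5 = confint(f5);

% Flag coefficients whose interval contains zero; such coefficients
% cannot be distinguished from zero at this confidence level
crossesZero = ci5(1,:) < 0 & ci5(2,:) > 0;
disp(coeffnames(f5)');
disp(crossesZero);
```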

By clicking the Save to workspace button, you can save the selected fit and the associated fit results to the MATLAB workspace. The fit is saved as a MATLAB object and the associated fit results are saved as structures. This example saves all the fit results for the best fit, poly2.
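A comparable result can be obtained at the command line, where fit returns the fit object and two result structures. The following is a minimal sketch; the variable names and the file name poly2_results.mat are hypothetical and chosen only for illustration.

```matlab
load census

% fittedmodel is a cfit object; goodness and output are structures
[fittedmodel, goodness, output] = fit(cdate, pop, 'poly2');

disp(class(fittedmodel));     % cfit
disp(goodness);               % SSE, R-square, adjusted R-square, and so on
disp(output.numobs);          % number of observations used in the fit

% Optionally store the results in a MAT-file (hypothetical file name)
save('poly2_results.mat', 'fittedmodel', 'goodness', 'output');
```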

Working with MATLAB not only makes our job easier but also adds fun to otherwise tedious problems.

It saves time and is very accurate.

The different commands, toolboxes, and GUIs offer a range of operations at the click of a mouse.