next up previous contents
Next: Numerical Differentiation and Integration Up: Numerical Analysis for Chemical Previous: Optimization

Subsections


Curve Fitting

Figure 5.1: Three attempts to fit a best curve.
\includegraphics[scale=.5]{pic/best}

The simplest method for fitting a curve to data is to plot the points and then sketch a line

Simple statistics

Least-Squares Regression

Lest-squares regression is drived from a curve that minimized the discrepancy between the data points and the curve.

Linear Regression

A least-squares approximation is fitting a straight line to a set of paired observation. The mathematical expression for the straight line is

$\displaystyle y=a_0 + a_1 x + e$ (5.1)

The error, or residual, is the discrepancy between the true value of $ y$ and the approximate value, $ a_0 + a_1 x$ and that is

$\displaystyle e=y - a_0 + a_1 x$ (5.2)

The criterion for least-squares regression is

$\displaystyle \min S_r = \sum^n_i e^2_i = \sum^n_i (y_{i,\mathrm{measured}}-y_{i,\mathrm{model}})^2 = \sum^n_i (y_i - a_0 - a_1 x_i)^2$ (5.3)

To determine values of $ a_0$ and $ a_1$, differentiate (5.3)

$\displaystyle \frac{\partial S_r}{\partial a_0}$ $\displaystyle = -2 \sum (y_i - a_0 - a_1 x_i)$ (5.4)
$\displaystyle \frac{\partial S_r}{\partial a_1}$ $\displaystyle = -2 \sum \left[(y_i - a_0 - a_1 x_i)x_i \right]$ (5.5)

And setting these derivatives equal to zero, we get the so-called normal equations

0 $\displaystyle = \sum y_i - \sum a_0 - \sum a_1 x_i$ (5.6)
0 $\displaystyle = \sum y_ix_i - \sum a_0x_i - \sum a_1 x^2_i$ (5.7)

The coefficients of a straight line are

$\displaystyle a_1$ $\displaystyle = \frac{n\sum x_iy_i - \sum x_i \sum y_i}{n\sum x^2_i - (\sum x_i)^2}$ (5.8)
$\displaystyle a_0$ $\displaystyle = \bar{y} - a_1 \bar{x}$ (5.9)

Quantification of error of linear regression

See the figure 17.4 in the textbook

Figure 5.2: The residual in linear regression
\includegraphics[scale=.5]{pic/ls}

General Linear Least-Squares

The general linear least-square model:

$\displaystyle y = a_0 z_0 + a_1 z_1 + \cdots + a_m z_m + e$ (5.15)

In matrix notation

$\displaystyle Y=ZA+E$ (5.16)

Note that Z is not a square matrix but we want to know about $ A$.

$\displaystyle Z^T Z A = Z^T Y$ (5.17)

Now A is

$\displaystyle A = (Z^T Z)^{-1} Z^T Y$ (5.18)

Nonlinear Regression

Gauss-Newton method
  1. Use a Taylor series to linearize a nonlinear function
  2. Apply least-square theorie to obtain new estimate of the parameters that move in the direction of minimizing the residual.

Interpolation

Newton's Divided-Difference Interpolating Polynomials

Linear interpolation : connect two data points with a straight line

$\displaystyle f_1(x) = f(x_0) + \frac{f(x_1) - f(x_0)}{x_1 - x_0}(x - x_0)$ (5.19)

Quadratic interpolation : connect three data points with a second-order polynomial

$\displaystyle f_2(x) = b_0 + b_1(x-x_0) + b_2(x-x_0)(x-x_1)$ (5.20)

where

$\displaystyle b_0 = f(x_0)$    
$\displaystyle b_1 = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$    
$\displaystyle b_2 = \frac{\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} - \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}}{x_2 - x_0}$    

Newton's interpolating polynomial : connect $ n+1$ data with $ n$th-order polynomial

$\displaystyle f_n(x) = b_0 + b_1(x-x_0) + \cdots + b_n(x-x_0)\cdots(x-x_{n-1})$ (5.21)

where the coefficients are

$\displaystyle b_0$ $\displaystyle = f(x_0)$    
$\displaystyle b_1$ $\displaystyle = f[x_1,x_0]$    
  $\displaystyle \vdots$    
$\displaystyle b_n$ $\displaystyle = f[x_n, x_{n-1},\ldots, x_0]$    

where the bracket function evaluations are finite divided differences.

$ n$th finite divided difference is

$\displaystyle f[x_n, x_{n-1},\ldots, x_0] = \frac{f[x_n,\ldots,x_1] - f[x_{n-1},\ldots,x_0]}{x_n - x_0}$ (5.22)

Newton's divided-difference interpolating polynomial is

$\displaystyle f_n(x) = f(x_0) + (x-x_0)f[x_1,x_0] + \cdots (x-x_0)(x-x_1)\cdots(x-x_{n-1})f[x_n,\ldots,x_0]$ (5.23)

Lagrange Interpolating Polynomial

The Lagrange interpolating polynomial is simply a reformulation of the Newton polynomial that avoids the computation of divided differences.

$\displaystyle f_n(x) = \sum^n_{i=0} L_i(x)f(x_i)$ (5.24)

where

$\displaystyle L_i(x) = \prod^n_{\substack{j=0  j \ne i}} \frac{x-x_j}{x_i-x_j}$ (5.25)

where $ \prod$ designates the ``product of.''

Spline Interpolation

Spline interpolation is an alternative approach that lower-order polynomial is applied to subsets of data point. Especially, when third-order curves are employed to connect each pair of data points, it is called cubic spline.

Linear splines : the simplest connection between two points is a straight line.

$\displaystyle f(x)$ $\displaystyle = f(x_0) + m_0(x-x_0) \quad$ $\displaystyle x_0 \le x \le x_1$    
$\displaystyle f(x)$ $\displaystyle = f(x_1) + m_1(x-x_1)$ $\displaystyle x_1 \le x \le x_2$    
  $\displaystyle \vdots$    
$\displaystyle f(x)$ $\displaystyle = f(x_{n-1}) + m_1(x-x_{n-1})$ $\displaystyle x_{n-1} \le x \le x_n$    

where $ m_i$ is the slope of the straight line

$\displaystyle m_i = \frac{f(x_{i+1}) - f(x_i)}{x_{i+1} - x_i}$ (5.26)

Quadratic splines : connect three points with second-order polynomials.

Cubic splines : derive a third-order polynomial for each interval between knots

$\displaystyle f_i(x) = a_i x^3 + b_i x^2 + c_i x + d_i$ (5.27)

Fourier Approximation

In early 1800s, the French mathematician Fourier proposed that ``any function can be represented by an infinite sum of sine and cosine terms.'' There are functions that do not have a representation as a Fourier series, however, most functions can be so represented. Fourier approximation is another representation of a function with trigonometric series.

Trigonometric identities

Fourier series

Assume that $ f(x)$ is a periodic function of period $ 2\pi$ and is integrable over a period.

$\displaystyle f(x) \simeq A_0 + \sum^\infty_{n=1} \left[ A_n \cos(nx) + B_n \sin(nx) \right]$ (5.28)

Fourier series for any period $ p=2L$

Consider the function whose period is $ p=2L$.

$\displaystyle f(x) = A_0 + \sum^\infty_{n=1} \left(A_n \cos\frac{n\pi}{L}x + B_n \sin \frac{n\pi}{L}x \right)$ (5.33)

where the Fourier coefficients of $ f(x)$ are given by the Euler formulas

\begin{displaymath}\begin{split}A_0 &= \frac{1}{2L}\int^L_{-L}f(x)dx   A_n &= ...
...&= \frac{1}{L}\int^L_{-L}f(x)\sin\frac{n\pi}{L}x dx \end{split}\end{displaymath} (5.34)

Fourier series for even and odd functions

Complex form of Fourier series : Real sines and cosines can be expressed in terms of complex exponentials by the formulas

\begin{displaymath}\begin{split}\sin nx &= \frac{e^{inx} - e^{-inx}}{2i}   \cos nx &= \frac{e^{inx} + e^{-inx}}{2} \end{split}\end{displaymath} (5.41)

From this

$\displaystyle A_n \cos nx + B_n \sin nx$ $\displaystyle = \frac{1}{2}A_n(e^{inx}+e^{-inx}) + \frac{1}{2i}B_n(e^{inx}-e^{-inx})$ (5.42)
  $\displaystyle =\frac{1}{2}(A_n - iB_n)e^{inx} + \frac{1}{2}(A_n + iB_n) e^{-inx}$ (5.43)
  $\displaystyle = c_n e^{inx} + c_{-n} e^{-inx}$ (5.44)

With the above equation

$\displaystyle f(x) = \sum^\infty_{-\infty} c_n e^{inx}$ (5.45)

where

$\displaystyle c_n = A_n - iB_n = \frac{1}{\pi} \int^\pi_{-\pi} f(x)(\cos(nx)-i\sin(nx)) dx = \frac{1}{2\pi}\int^\pi_{-\pi} f(x)e^{-inx}dx$    

This is the so-called complex form of the Fourier series, or complex Fourier series of $ f(x)$.

Sinusoidal function : represent any waveform with a sine or cosine

$\displaystyle f(t) = A_0 + C_1 \cos(\omega_0t+\theta)$ (5.46)

where $ A_0$ is the mean value, $ C_1$ is the amplitude, $ \omega_0$ is the angular frequency, and $ \theta$ is the phase angle or phase shift.

The angular frequency is related to frequency $ f$ (in cycles/time)

$\displaystyle \omega_0 = 2 \pi f$ (5.47)

and frequency is

$\displaystyle f = \frac{1}{T}$ (5.48)

The trigonometric identity gives

$\displaystyle f(t) = A_0 + A_1 \cos(\omega_0 t) + B_1 \sin(\omega_0 t)$ (5.49)

where $ A_1 = C_1 \cos(\theta)$, $ B_1 = -C_1 \sin(\theta)$

Curve Fitting with Sinusoidal Functions

Least-squares fit of a sinusoidal function is to determine coefficient values that minimize

$\displaystyle S_r = \sum^N_{i=1} \left\{y_i - \left[A_0 + A_1 \cos(\omega_0 t_i) + B_1 \sin(\omega_0 t_i) \right] \right\}^2$ (5.50)

$\displaystyle \begin{bmatrix}N & \sum \cos(\omega_0 t) & \sum \sin(\omega_0 t) ...
...ix}\sum y   \sum y \cos(\omega_0 t)   \sum y \sin(\omega_0 t) \end{Bmatrix}$ (5.51)

For equispaced system

$\displaystyle \int^T_0 \cos(\omega_0 t) dt = -\frac{1}{\omega_0} \sin(\omega_0 t)\bigg\vert^T_0 = 0$ (5.52)

where $ \omega_0 T = \dfrac{2\pi}{T}T = 2\pi$. These relationhips give

$\displaystyle \begin{bmatrix}N & 0 & 0   0 & N/2 & 0   0 & 0 & N/2 \end{bma...
...ix}\sum y   \sum y \cos(\omega_0 t)   \sum y \sin(\omega_0 t) \end{Bmatrix}$ (5.53)

or

$\displaystyle A_0$ $\displaystyle = \frac{\sum y}{N}$ (5.54)
$\displaystyle A_1$ $\displaystyle = \frac{2}{N} \sum y \cos(\omega_0t)$ (5.55)
$\displaystyle A_2$ $\displaystyle = \frac{2}{N} \sum y \sin(\omega_0t)$ (5.56)

The above equations are similar with the determination of Fourier series.

Figure 5.3: The Fourier series approximation of the square wave.
\includegraphics[scale=.5]{pic/fourier}

Fourier Integral and Transform

Some of phenomenon does not occured repeatedly or it will be a long time until it occurs again. In this case we use Fourier integral that can be used to represent nonperiodic functions, for example a single voltage pulse not repeated, or a flash of light, or a sound which is not repeated. The transition from a periodic to a nonperiodic function can be effected by allowing the period to approach infinity. In other words, as $ T$ becomes infinite, the function never repeats itself and thus becomes aperiodic.

From Fourier series to the Fourier intergral

Consider any periodic function $ f_L(x)$ of period $ 2L$

$\displaystyle f_L(x) = A_0 + \sum^\infty_{n=1}(A_n \cos\omega_n x + B_n \sin\omega_n x)$ (5.57)

where $ \omega_n = n \pi / L$. Insert $ A_n$ and $ B_n$ which are given by the Euler formulas.

\begin{multline}
f_L(x) = \frac{1}{2L} \int^L_{-L} f_L(v) dv \\
+ \frac{1}{L}...
... + \sin \omega_n x \int^L_{-L} f_L(v) \sin \omega_n v
dv \right]
\end{multline}

Now set

$\displaystyle \Delta \omega = \omega_{n+1} - \omega_n = \frac{(n+1)\pi}{L} - \frac{n\pi}{L} = \frac{\pi}{L}$ (5.58)

Then $ 1/L = \Delta \omega / \pi$, and

\begin{multline}
f_L(x) = \frac{1}{2L} \int^L_{-L} f_L(v) dv \\
+ \frac{1}{\p...
...n x \Delta \omega
\int^L_{-L} f_L(v) \sin \omega_n v dv \right]
\end{multline}

Let $ L \longrightarrow \infty$ and assume a periodic function $ f_L(x)$ to be a aperiodic function.

$\displaystyle f(x) = \lim_{L\rightarrow\infty} f_L(x)$ (5.59)

Then $ 1/L \longrightarrow 0$ and the first term of function approaches zero.

$\displaystyle f_L(x) = \frac{1}{\pi} \sum^\infty_{n=1} \left[ \cos \omega_n x \...
...v + \sin \omega_n x \Delta \omega \int^L_{-L} f_L(v) \sin \omega_n v dv \right]$ (5.60)

$ L \longrightarrow 0$ results in $ \Delta \omega \rightarrow 0$ and the sum of infinite series become an integral from 0 to $ \infty$.

$\displaystyle f(x) = \frac{1}{\pi} \int^\infty_0 \left[ \cos \omega x \int^\inf...
... dv + \sin \omega x \int^\infty_{-\infty} f(v) \sin \omega v dv \right] d\omega$ (5.61)

Introduce $ A(\omega)$ and $ B(\omega)$ as

$\displaystyle A(\omega) = \int^\infty_{-\infty} f(v) \cos \omega v dv, \quad B(\omega) = \int^\infty_{-\infty} f(v) \sin \omega v dv$ (5.62)

Finally Fourier series for an aperiodic equation become

$\displaystyle f(x) = \int^\infty_0 \left[ A(\omega) \cos \omega x + B(\omega) \sin \omega x \right] d\omega$ (5.63)

This is called a representation of $ f(x)$ by a Fourier integral.

Alternatively, the Fourier integral can be written as complex Fourier series.

\begin{displaymath}\begin{split}f(x) &= \sum^\infty_{-\infty} c_n e^{i\omega_nx}...
...= \frac{1}{2L} \int^L_{-L} f(u) e^{-i\omega_n u} du \end{split}\end{displaymath} (5.64)

$\displaystyle f(x) = \sum^\infty_{-\infty} \left[ \frac{1}{2L} \int^L_{-L} f(u) e^{-i\omega_n u} du \right] e^{i\omega_nx}$ (5.65)

Use $ 1/L = \Delta \omega / \pi$

$\displaystyle f(x)$ $\displaystyle = \sum^\infty_{-\infty} \left[ \frac{\Delta \omega}{2\pi} \int^L_{-L} f(u) e^{-i\omega_n u} du \right] e^{inx}$ (5.66)
  $\displaystyle = \sum^\infty_{-\infty} \frac{\Delta \omega}{2\pi} \int^L_{-L} f(...
...omega_n(x-u)} du = \frac{1}{2\pi}\sum^\infty_{-\infty} F(\omega_n)\Delta \omega$ (5.67)

where

$\displaystyle F(\omega_n) = \int^L_{-L} f(u) e^{i\omega_n(x-u)} du$ (5.68)

If $ \Delta \omega$ goes to zero, a limit of a sum becomes an integral

$\displaystyle f(x)$ $\displaystyle = \frac{1}{2\pi}\int^\infty_{-\infty} F(\omega)d\omega = \frac{1}{2\pi}\int^\infty_{-\infty} \int^\infty_{-\infty} f(u) e^{i\omega(x-u)} du d\omega$ (5.69)
  $\displaystyle = \frac{1}{2\pi}\int^\infty_{-\infty} e^{i\omega x} d\omega \int^\infty_{-\infty} f(u)e^{-i\omega u} du$ (5.70)

Define $ g(\omega)$ by

$\displaystyle g(\omega) = \frac{1}{2\pi} \int^\infty_{-\infty} f(u)e^{-i\omega u} du$ (5.71)

Then

$\displaystyle f(x) = \int^\infty_{-\infty} g(\omega) e^{i\omega x} d\omega$ (5.72)

Fourier Transform

\begin{displaymath}\begin{split}f(x) &= \int^\infty_{-\infty} g(\omega) e^{i\ome...
...{2\pi} \int^\infty_{-\infty} f(u) e^{-i\omega u} du \end{split}\end{displaymath} (5.73)

$ f(x)$ and $ g(\omega)$ are called a pair of Fourier transforms. Usually, $ g(\omega)$ is called the Fourier transform of $ f(x)$, and $ f(x)$ is called the inverse Fourier transform of $ g(\omega)$.

Discrete Fourier Transform (DFT)

In engineering, functions are often represented by finite sets of discrete values and data is often collected in or converted to such a discrete format. For the discrete time system, a discrete Fourier transform can be written as

$\displaystyle F_k = \sum^{N-1}_{n=0} f_n e^{-i\omega_0 n}$ (5.74)

and the inverse Fourier transform as

$\displaystyle f_n = \frac{1}{N} \sum^{N-1}_{k=0} F_ke^{i\omega_0 n}$ (5.75)

where $ \omega_0 = 2\pi/N$.

Fast Fourier Transform (FFT)

The fast Fourier transform (FFT) is an algorithm that has been developed to compute the DFT in an extremely economical fashion.

The Power Spectrum

A power spectrum is developed from the Fourier transform and it is derived from the analysis of the power output of electrical systems. The power of a periodic signal can be defined as

$\displaystyle P=\frac{1}{T} \int^{T/2}_{-T/2} f^2(t)dt$ (5.76)

A power spectrum can be calculated by the power associated with each frequency component.

Curve Fitting with Libraries and Packagies

Engineering Applications: Curve Fitting

See the textbook


next up previous contents
Next: Numerical Differentiation and Integration Up: Numerical Analysis for Chemical Previous: Optimization
Taechul Lee
2001-11-29