Algorithm for computing polynomial coefficients
In mathematics, divided differences is an algorithm, historically used for computing tables of logarithms and trigonometric functions.[citation needed] Charles Babbage's difference engine, an early mechanical calculator, was designed to use this algorithm in its operation.[1]
Divided differences is a recursive division process. Given a sequence of data points $(x_0,y_0),\ldots,(x_n,y_n)$, the method calculates the coefficients of the interpolation polynomial of these points in the Newton form.
Given $n+1$ data points $(x_0,y_0),\ldots,(x_n,y_n)$, where the $x_k$ are assumed to be pairwise distinct, the forward divided differences are defined as:
$$
\begin{aligned}
[y_k] &:= y_k, && k\in\{0,\ldots,n\} \\
[y_k,\ldots,y_{k+j}] &:= \frac{[y_{k+1},\ldots,y_{k+j}]-[y_k,\ldots,y_{k+j-1}]}{x_{k+j}-x_k}, && k\in\{0,\ldots,n-j\},\ j\in\{1,\ldots,n\}.
\end{aligned}
$$
To make the recursive process of computation clearer, the divided differences can be put in tabular form, where the columns correspond to the value of $j$ above, and each entry in the table is computed from the difference of the entries to its immediate lower left and to its immediate upper left, divided by a difference of corresponding $x$-values:
$$
\begin{matrix}
x_0 & y_0 = [y_0] & & & \\
& & [y_0,y_1] & & \\
x_1 & y_1 = [y_1] & & [y_0,y_1,y_2] & \\
& & [y_1,y_2] & & [y_0,y_1,y_2,y_3] \\
x_2 & y_2 = [y_2] & & [y_1,y_2,y_3] & \\
& & [y_2,y_3] & & \\
x_3 & y_3 = [y_3] & & &
\end{matrix}
$$
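The recursion above translates directly into a short program. The following is a minimal Python sketch (the function names `divided_differences` and `newton_eval` are illustrative, not from any standard library); the top row of the table gives the coefficients of the Newton form mentioned earlier:

```python
import numpy as np

def divided_differences(x, y):
    """Full triangular table of forward divided differences.

    table[k][j] holds [y_k, ..., y_{k+j}]; the top entries table[0][j]
    are the coefficients of the Newton form of the interpolation polynomial.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    table = [[0.0] * n for _ in range(n)]
    for k in range(n):
        table[k][0] = y[k]                 # [y_k] := y_k
    for j in range(1, n):
        for k in range(n - j):
            # [y_k, ..., y_{k+j}] from the two entries in the previous column
            table[k][j] = (table[k + 1][j - 1] - table[k][j - 1]) / (x[k + j] - x[k])
    return table

def newton_eval(x, coeffs, t):
    """Evaluate the Newton-form polynomial with nodes x and coefficients
    coeffs[j] = [y_0, ..., y_j] at the point t (Horner-like scheme)."""
    result = coeffs[-1]
    for j in range(len(coeffs) - 2, -1, -1):
        result = result * (t - x[j]) + coeffs[j]
    return result
```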
Note that the divided difference $[y_k,\ldots,y_{k+j}]$ depends on the values $x_k,\ldots,x_{k+j}$ and $y_k,\ldots,y_{k+j}$, but the notation hides the dependency on the $x$-values. If the data points are given by a function $f$,
$$
(x_0,y_0),\ldots,(x_n,y_n)=(x_0,f(x_0)),\ldots,(x_n,f(x_n)),
$$
one sometimes writes the divided difference in the notation
$$
f[x_k,\ldots,x_{k+j}]\ \stackrel{\text{def}}{=}\ [f(x_k),\ldots,f(x_{k+j})]=[y_k,\ldots,y_{k+j}].
$$
Other notations for the divided difference of the function $f$ on the nodes $x_0,\ldots,x_n$ are:
$$
f[x_0,\ldots,x_n]=[x_0,\ldots,x_n]f=[x_0,\ldots,x_n;f]=D[x_0,\ldots,x_n]f.
$$
Divided differences for $k=0$ and the first few values of $j$:
$$
\begin{aligned}
[y_0] &= y_0 \\
[y_0,y_1] &= \frac{y_1-y_0}{x_1-x_0} \\
[y_0,y_1,y_2] &= \frac{[y_1,y_2]-[y_0,y_1]}{x_2-x_0} = \frac{\frac{y_2-y_1}{x_2-x_1}-\frac{y_1-y_0}{x_1-x_0}}{x_2-x_0} = \frac{y_2-y_1}{(x_2-x_1)(x_2-x_0)} - \frac{y_1-y_0}{(x_1-x_0)(x_2-x_0)} \\
[y_0,y_1,y_2,y_3] &= \frac{[y_1,y_2,y_3]-[y_0,y_1,y_2]}{x_3-x_0}
\end{aligned}
$$
Thus, the table corresponding to these terms up to two columns has the following form:
$$
\begin{matrix}
x_0 & y_0 & & \\
& & \frac{y_1-y_0}{x_1-x_0} & \\
x_1 & y_1 & & \frac{\frac{y_2-y_1}{x_2-x_1}-\frac{y_1-y_0}{x_1-x_0}}{x_2-x_0} \\
& & \frac{y_2-y_1}{x_2-x_1} & \\
x_2 & y_2 & & \vdots \\
& & \vdots & \\
\vdots & & & \vdots \\
& & \vdots & \\
x_n & y_n & &
\end{matrix}
$$
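For example, applying the sketch above to concrete points generated by $f(x)=x^2$ (an arbitrary illustrative choice) reproduces the explicit formulas for the first two columns:

```python
x = [1.0, 2.0, 4.0]
y = [1.0, 4.0, 16.0]           # f(x) = x**2 at the nodes

table = divided_differences(x, y)

print(table[0][0])   # [y_0]           = 1.0
print(table[0][1])   # [y_0, y_1]      = (4 - 1)/(2 - 1)  = 3.0
print(table[1][1])   # [y_1, y_2]      = (16 - 4)/(4 - 2) = 6.0
print(table[0][2])   # [y_0, y_1, y_2] = (6 - 3)/(4 - 1)  = 1.0

# The Newton form built from the top row interpolates the data exactly
print(newton_eval(x, table[0], 3.0))   # 9.0, since the data come from x**2
```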
Divided differences satisfy the following properties.

Linearity:
$$
\begin{aligned}
(f+g)[x_0,\dots,x_n] &= f[x_0,\dots,x_n]+g[x_0,\dots,x_n] \\
(\lambda\cdot f)[x_0,\dots,x_n] &= \lambda\cdot f[x_0,\dots,x_n]
\end{aligned}
$$

Leibniz rule:
$$
(f\cdot g)[x_0,\dots,x_n] = f[x_0]\cdot g[x_0,\dots,x_n] + f[x_0,x_1]\cdot g[x_1,\dots,x_n] + \dots + f[x_0,\dots,x_n]\cdot g[x_n] = \sum_{r=0}^{n} f[x_0,\ldots,x_r]\cdot g[x_r,\ldots,x_n]
$$

Divided differences are symmetric: if $\sigma:\{0,\dots,n\}\to\{0,\dots,n\}$ is a permutation, then
$$
f[x_0,\dots,x_n] = f[x_{\sigma(0)},\dots,x_{\sigma(n)}]
$$

Polynomial interpolation in the Newton form: if $P$ is a polynomial function of degree $\leq n$ and $P[x_0,\dots,x_n]$ denotes its divided difference, then
$$
P(x) = P[x_0] + P[x_0,x_1](x-x_0) + P[x_0,x_1,x_2](x-x_0)(x-x_1) + \cdots + P[x_0,\ldots,x_n](x-x_0)(x-x_1)\cdots(x-x_{n-1})
$$

If $p$ is a polynomial function of degree $< n$, then
$$
p[x_0,\dots,x_n] = 0.
$$

Mean value theorem for divided differences: if $f$ is $n$ times differentiable, then
$$
f[x_0,\dots,x_n] = \frac{f^{(n)}(\xi)}{n!}
$$
for a number $\xi$ in the open interval determined by the smallest and largest of the $x_k$'s.

The divided difference scheme can be put into an upper triangular matrix:
$$
T_f(x_0,\dots,x_n) = \begin{pmatrix}
f[x_0] & f[x_0,x_1] & f[x_0,x_1,x_2] & \ldots & f[x_0,\dots,x_n] \\
0 & f[x_1] & f[x_1,x_2] & \ldots & f[x_1,\dots,x_n] \\
0 & 0 & f[x_2] & \ldots & f[x_2,\dots,x_n] \\
\vdots & \vdots & & \ddots & \vdots \\
0 & 0 & 0 & \ldots & f[x_n]
\end{pmatrix}.
$$
Then the following identities hold:
$$
\begin{aligned}
T_{f+g}(x) &= T_f(x) + T_g(x) \\
T_{\lambda f}(x) &= \lambda\, T_f(x) && \text{if } \lambda \text{ is a scalar} \\
T_{f\cdot g}(x) &= T_f(x)\cdot T_g(x)
\end{aligned}
$$
This follows from the Leibniz rule. It means that multiplication of such matrices is commutative. In summary, the matrices of divided difference schemes with respect to the same set of nodes $x$ form a commutative ring.

Since $T_f(x)$ is a triangular matrix, its eigenvalues are obviously $f(x_0),\dots,f(x_n)$.

Let $\delta_\xi$ be a Kronecker delta-like function, that is
$$
\delta_\xi(t)=\begin{cases}1 &: t=\xi,\\ 0 &: \text{else}.\end{cases}
$$
Obviously $f\cdot\delta_\xi=f(\xi)\cdot\delta_\xi$, so $\delta_\xi$ is an eigenfunction of pointwise function multiplication. That is, $T_{\delta_{x_i}}(x)$ is in a sense an "eigenmatrix" of $T_f(x)$:
$$
T_f(x)\cdot T_{\delta_{x_i}}(x)=f(x_i)\cdot T_{\delta_{x_i}}(x).
$$
However, all columns of $T_{\delta_{x_i}}(x)$ are multiples of each other; the matrix rank of $T_{\delta_{x_i}}(x)$ is 1. So one can compose the matrix of all eigenvectors of $T_f(x)$ from the $i$-th column of each $T_{\delta_{x_i}}(x)$. Denote the matrix of eigenvectors by $U(x)$. Example:
$$
U(x_0,x_1,x_2,x_3)=\begin{pmatrix}
1 & \frac{1}{x_1-x_0} & \frac{1}{(x_2-x_0)(x_2-x_1)} & \frac{1}{(x_3-x_0)(x_3-x_1)(x_3-x_2)} \\
0 & 1 & \frac{1}{x_2-x_1} & \frac{1}{(x_3-x_1)(x_3-x_2)} \\
0 & 0 & 1 & \frac{1}{x_3-x_2} \\
0 & 0 & 0 & 1
\end{pmatrix}
$$
The diagonalization of $T_f(x)$ can be written as
$$
U(x)\cdot\operatorname{diag}(f(x_0),\dots,f(x_n))=T_f(x)\cdot U(x).
$$

Polynomials and power series

The matrix
$$
J=\begin{pmatrix}
x_0 & 1 & 0 & 0 & \cdots & 0 \\
0 & x_1 & 1 & 0 & \cdots & 0 \\
0 & 0 & x_2 & 1 & & 0 \\
\vdots & \vdots & & \ddots & \ddots & \\
0 & 0 & 0 & 0 & \ddots & 1 \\
0 & 0 & 0 & 0 & & x_n
\end{pmatrix}
$$
contains the divided difference scheme for the identity function with respect to the nodes $x_0,\dots,x_n$, so $J^m$ contains the divided differences for the power function with exponent $m$. Consequently, one can obtain the divided differences for a polynomial function $p$ by applying $p$ to the matrix $J$: if
$$
p(\xi)=a_0+a_1\cdot\xi+\dots+a_m\cdot\xi^m \quad\text{and}\quad p(J)=a_0+a_1\cdot J+\dots+a_m\cdot J^m,
$$
then
$$
T_p(x)=p(J).
$$
This is known as Opitz' formula.[2][3]
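As a quick numerical illustration of Opitz' formula (a sketch with illustrative nodes and polynomial; the `divided_differences` helper from the earlier sketch is assumed to be in scope):

```python
import numpy as np

x = np.array([0.0, 1.0, 3.0, 4.0])
n = len(x)

# J: divided difference scheme of the identity function on the nodes x
J = np.diag(x) + np.diag(np.ones(n - 1), k=1)

# p(xi) = 2 + xi + 3*xi**2 applied to the matrix J
p_of_J = 2 * np.eye(n) + J + 3 * (J @ J)

# By Opitz' formula, the first row of p(J) is p[x_0], p[x_0,x_1], ...
p = lambda t: 2 + t + 3 * t**2
table = divided_differences(x, p(x))
print(p_of_J[0])   # expected [2. 4. 3. 0.], the last entry 0 since deg p = 2 < 3
print(table[0])    # matches the first row of p(J)
```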
Now consider increasing the degree of $p$ to infinity, that is, turning the Taylor polynomial into a Taylor series. Let $f$ be a function which corresponds to a power series. One can compute the divided difference scheme for $f$ by applying the corresponding matrix series to $J$: if
$$
f(\xi)=\sum_{k=0}^{\infty} a_k\xi^k \quad\text{and}\quad f(J)=\sum_{k=0}^{\infty} a_k J^k,
$$
then
$$
T_f(x)=f(J).
$$
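The same identity can be checked numerically for a power series, here for the exponential function; using SciPy's `expm` to sum the matrix series is an illustrative choice, and `divided_differences` from above is again assumed:

```python
import numpy as np
from scipy.linalg import expm

x = np.array([0.0, 0.5, 1.0, 2.0])
n = len(x)
J = np.diag(x) + np.diag(np.ones(n - 1), k=1)

# exp(J) sums the power series of exp applied to J, so its first row
# should contain exp[x_0], exp[x_0, x_1], ..., exp[x_0, ..., x_3]
T_exp = expm(J)

table = divided_differences(x, np.exp(x))
print(T_exp[0])
print(np.array(table[0]))   # agrees up to floating-point error
```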
Alternative characterizations

$$
\begin{aligned}
f[x_0] &= f(x_0) \\
f[x_0,x_1] &= \frac{f(x_0)}{x_0-x_1} + \frac{f(x_1)}{x_1-x_0} \\
f[x_0,x_1,x_2] &= \frac{f(x_0)}{(x_0-x_1)(x_0-x_2)} + \frac{f(x_1)}{(x_1-x_0)(x_1-x_2)} + \frac{f(x_2)}{(x_2-x_0)(x_2-x_1)} \\
f[x_0,x_1,x_2,x_3] &= \frac{f(x_0)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)} + \frac{f(x_1)}{(x_1-x_0)(x_1-x_2)(x_1-x_3)} \\
&\quad + \frac{f(x_2)}{(x_2-x_0)(x_2-x_1)(x_2-x_3)} + \frac{f(x_3)}{(x_3-x_0)(x_3-x_1)(x_3-x_2)} \\
f[x_0,\dots,x_n] &= \sum_{j=0}^{n} \frac{f(x_j)}{\prod_{k\in\{0,\dots,n\}\setminus\{j\}}(x_j-x_k)}
\end{aligned}
$$
With the help of the polynomial function $\omega(\xi)=(\xi-x_0)\cdots(\xi-x_n)$ this can be written as
$$
f[x_0,\dots,x_n]=\sum_{j=0}^{n}\frac{f(x_j)}{\omega'(x_j)}.
$$
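A small consistency check of this expanded form against the recursive definition (a sketch; `divided_differences` as above, with arbitrarily chosen nodes and $f=\sin$):

```python
import numpy as np

def divided_difference_expanded(x, y):
    """Highest-order divided difference [y_0, ..., y_n] via the expanded form."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    total = 0.0
    for j in range(len(x)):
        # omega'(x_j) = product over k != j of (x_j - x_k)
        omega_prime = np.prod(x[j] - np.delete(x, j))
        total += y[j] / omega_prime
    return total

x = np.array([0.0, 1.0, 3.0, 4.0])
y = np.sin(x)

print(divided_difference_expanded(x, y))
print(divided_differences(x, y)[0][len(x) - 1])   # same value from the recursion
```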
If $x_0<x_1<\cdots<x_n$ and $n\geq 1$, the divided differences can be expressed as[4]
$$
f[x_0,\ldots,x_n]=\frac{1}{(n-1)!}\int_{x_0}^{x_n} f^{(n)}(t)\,B_{n-1}(t)\,dt
$$
where $f^{(n)}$ is the $n$-th derivative of the function $f$ and $B_{n-1}$ is a certain B-spline of degree $n-1$ for the data points $x_0,\dots,x_n$, given by the formula
$$
B_{n-1}(t)=\sum_{k=0}^{n}\frac{(\max(0,x_k-t))^{n-1}}{\omega'(x_k)}.
$$
This is a consequence of the Peano kernel theorem; it is called the Peano form of the divided differences, and $B_{n-1}$ is the Peano kernel for the divided differences, all named after Giuseppe Peano.
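A numerical sanity check of the Peano form is sketched below, using SciPy quadrature and an illustrative cubic $f$; the nodes, the function, and the use of `quad` are choices for illustration, not prescribed by the theorem:

```python
import numpy as np
from math import factorial
from scipy.integrate import quad

x = np.array([0.0, 1.0, 2.0])   # nodes, so n = 2
n = len(x) - 1

f = lambda t: t**3
f_nth = lambda t: 6.0 * t        # n-th (here: second) derivative of t**3

def omega_prime(k):
    return np.prod(x[k] - np.delete(x, k))

def B(t):
    # B_{n-1}(t) = sum_k max(0, x_k - t)**(n-1) / omega'(x_k)
    return sum(max(0.0, x[k] - t) ** (n - 1) / omega_prime(k) for k in range(n + 1))

integral, _ = quad(lambda t: f_nth(t) * B(t), x[0], x[-1], points=list(x[1:-1]))
print(integral / factorial(n - 1))            # Peano form of f[x_0, x_1, x_2]
print(divided_differences(x, f(x))[0][n])     # recursive value: 3.0
```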
Forward and backward differences

When the data points are equidistantly distributed we get the special case called forward differences. They are easier to calculate than the more general divided differences.
Given $n+1$ data points $(x_0,y_0),\ldots,(x_n,y_n)$ with
$$
x_k=x_0+kh,\quad\text{for } k=0,\ldots,n \text{ and fixed } h>0,
$$
the forward differences are defined as
$$
\begin{aligned}
\Delta^{(0)}y_k &:= y_k, && k=0,\ldots,n \\
\Delta^{(j)}y_k &:= \Delta^{(j-1)}y_{k+1}-\Delta^{(j-1)}y_k, && k=0,\ldots,n-j,\ j=1,\dots,n,
\end{aligned}
$$
whereas the backward differences are defined as
$$
\begin{aligned}
\nabla^{(0)}y_k &:= y_k, && k=0,\ldots,n \\
\nabla^{(j)}y_k &:= \nabla^{(j-1)}y_k-\nabla^{(j-1)}y_{k-1}, && k=j,\ldots,n,\ j=1,\dots,n.
\end{aligned}
$$
Thus the forward difference table is written as:
$$
\begin{matrix}
y_0 & & & \\
& \Delta y_0 & & \\
y_1 & & \Delta^2 y_0 & \\
& \Delta y_1 & & \Delta^3 y_0 \\
y_2 & & \Delta^2 y_1 & \\
& \Delta y_2 & & \\
y_3 & & &
\end{matrix}
$$
whereas the backward difference table is written as:
$$
\begin{matrix}
y_0 & & & \\
& \nabla y_1 & & \\
y_1 & & \nabla^2 y_2 & \\
& \nabla y_2 & & \nabla^3 y_3 \\
y_2 & & \nabla^2 y_3 & \\
& \nabla y_3 & & \\
y_3 & & &
\end{matrix}
$$
The relationship between divided differences and forward differences is[5]
$$
[y_j,y_{j+1},\ldots,y_{j+k}]=\frac{1}{k!h^k}\Delta^{(k)}y_j,
$$
whereas for backward differences:[citation needed]
$$
[y_j,y_{j-1},\ldots,y_{j-k}]=\frac{1}{k!h^k}\nabla^{(k)}y_j.
$$
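A brief check of the forward-difference relationship on equidistant nodes (a sketch; `divided_differences` as above, with illustrative data from $\cos$):

```python
import numpy as np
from math import factorial

h = 0.5
x = np.array([1.0 + k * h for k in range(5)])
y = np.cos(x)

# Forward differences: delta[j][k] holds Delta^(j) y_k
delta = [list(y)]
for j in range(1, len(x)):
    prev = delta[-1]
    delta.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])

table = divided_differences(x, y)

k = 3
print(table[0][k])                          # [y_0, ..., y_k]
print(delta[k][0] / (factorial(k) * h**k))  # Delta^(k) y_0 / (k! h^k): same value
```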
References

1. Isaacson, Walter (2014). The Innovators. Simon & Schuster. p. 20. ISBN 978-1-4767-0869-0.
2. de Boor, Carl, "Divided Differences", Surv. Approx. Theory 1 (2005), 46–69. [1]
3. Opitz, G., "Steigungsmatrizen", Z. Angew. Math. Mech. 44 (1964), T52–T54.
4. Skof, Fulvia (2011). Giuseppe Peano between Mathematics and Logic: Proceedings of the International Conference in honour of Giuseppe Peano on the 150th anniversary of his birth and the centennial of the Formulario Mathematico, Torino (Italy), October 2–3, 2008. Springer Science & Business Media. p. 40. ISBN 978-88-470-1836-5.
5. Burden, Richard L.; Faires, J. Douglas (2011). Numerical Analysis (9th ed.). Cengage Learning. p. 129. ISBN 9780538733519.

Further reading

Milne-Thomson, Louis Melville (2000) [1933]. The Calculus of Finite Differences. American Mathematical Soc. Chapter 1: Divided Differences. ISBN 978-0-8218-2107-7.
Allen, Myron B.; Isaacson, Eli L. (1998). Numerical Analysis for Applied Science. John Wiley & Sons. Appendix A. ISBN 978-1-118-03027-1.
Goldman, Ron (2002). Pyramid Algorithms: A Dynamic Programming Approach to Curves and Surfaces for Geometric Modeling. Morgan Kaufmann. Chapter 4: Newton Interpolation and Difference Triangles. ISBN 978-0-08-051547-2.