Keywords: Volterra integro-differential equation; parallel algorithm
Direct methods for solving nonlinear Volterra integral and integro-differential equations are inherently serial and have therefore received little attention for use on parallel computers. It is possible, however, to make significant gains in speedup by applying some novel approaches to existing methods. It can be observed that, when approximating the integral term, all approximations up to and including the kth step in the interval can be evaluated at the same time. The resulting algorithm is described here and a numerical example illustrates the results. With the four-processor multicomputer utilized, speedups of 3 to 4 were realized.
INTRODUCTION

Consider a nonlinear Volterra integro-differential equation (VIDE) of the form

    y'(x) = F(x, y(x), z(x)),   y(0) = y_0,   0 <= x <= a,   (1)

where

    z(x) = ∫_0^x K(x, t, y(t)) dt

and y(x) is the unknown function. Equations of this type occur in a number of areas such as wave power hydraulics [6], population dynamics [5,12] and reactor dynamics [11]. By setting z(x) = 0, equation (1) reduces to a first order ordinary differential equation (ODE) of the form

    y'(x) = F(x, y(x), 0),   y(0) = y_0.   (2)
In general, a VIDE cannot be reduced to an ODE through differentiation. There have been numerous papers on parallel methods for solving ODEs [2,3,8,9,16,17,18,19,20], but work on Volterra-type equations has so far been limited to a few authors [4,13,14,15,16,20]. This is due to the inherently serial nature of obtaining a numerical approximation to the integral term in (1). Papers by Crisci, van der Houwen, Russo and Vecchio [4] and Vecchio [21] concentrated on the stability aspects of parallel iteration of Volterra-Runge-Kutta (VRK) methods for solving Volterra integral equations on parallel computers. VRK methods are step-by-step methods and can take advantage of parallel architectures. A paper by Sommeijer, Couzy and van der Houwen [16] covered the stability of parallel block methods for ODEs and included equations of the integro-differential type in the discussion. The work in this paper concentrates on modifying the algorithmic side of the numerical solution process for (1) for use on a parallel processor, while consciously utilizing methods that are known to be stable.
To define a class of problems of the form (1), let R1 and R2 be two regions defined by

    R1 = {(x, t, y) : 0 <= t <= x <= a, |y| < ∞},
    R2 = {(x, y, z) : 0 <= x <= a, |y| < ∞, |z| < ∞}.

If

i) K and F are uniformly continuous and bounded for all (x,t,y) in R1 and (x,y,z) in R2, respectively, and

ii) the following Lipschitz conditions are satisfied:

    |K(x, t, y) - K(x, t, y*)| <= L1 |y - y*|,
    |F(x, y, z) - F(x, y, z*)| <= L2 |z - z*|,
    |F(x, y, z) - F(x, y*, z)| <= L3 |y - y*|,

where L1, L2 and L3 are the Lipschitz constants, then it is known [7] that equation (1) has a unique solution.
To find a numerical solution for equations of the type (1) we will use the fourth-order Adams-Bashforth-Moulton (ABM) predictor-corrector (PC) method, with the fourth-order Runge-Kutta method used to obtain the starting values. Newton-Gregory quadrature will be used to approximate the integral term in equation (1), with the fourth-order Runge-Kutta method again supplying the starting values. It is this integral term that leads to the problem of adapting numerical methods for VIDEs for use on parallel processors. This paper suggests a unique approach to this problem that allows the resulting algorithm to be parallelized. The algorithm was coded on a four-processor Pentium Pro system and the numerical example shows actual speedups of 3 to 4. Preliminary work on a 16-processor supercomputer shows similar speedups.
NUMERICAL METHODS

Let IN be a partition of I = [0,a], where IN = {x_n = nh, n = 0(1)N, Nh = a}. The problem is to find approximations y_n to the solution y(x_n) of equation (1) for each x_n in IN. A multistep method for an integro-differential equation is given by

    y_{n+k} = y_{n+k-1} + h Σ_{j=0}^{k} w_j F(x_{n+j}, y_{n+j}, z_{n+j}),

where

    z_{n+j} = h Σ_{i=0}^{n+j} c_{n+j,i} K(x_{n+j}, x_i, y_i),   y_0 = η,

where η is a constant, the weights {w_j} depend on the k-step method selected and the weights {c_{n+j,i}} are those of a Newton-Cotes or Newton-Gregory quadrature rule. The fourth order Adams-Bashforth predictor and Adams-Moulton corrector are the multistep methods used in this paper. On uniprocessor computers multistep PC methods of the Adams type are the most popular [19]. It has been suggested by van der Houwen, Sommeijer and de Swart [19] that the ABM method is one of several efficient parallel PC methods that can be used to solve initial value problems of the form (2). The integral term in equation (1) is solved by Newton-Gregory quadrature.
For the integration of equation (1) over [x_{k-3}, x_{k+1}], the PC method is given by

    predictor:  y*_{k+1} = y_k + (h/24)[55 F_k - 59 F_{k-1} + 37 F_{k-2} - 9 F_{k-3}],
    corrector:  y_{k+1} = y_k + (h/24)[9 F*_{k+1} + 19 F_k - 5 F_{k-1} + F_{k-2}],

where F_i = F(x_i, y_i, z_i), F*_{k+1} denotes F evaluated at the predicted value, and for each z_i the Newton-Gregory quadrature weights are given in Table 2.1. The fourth order Runge-Kutta method is used to obtain starting values for both the ABM PC multistep method and the Newton-Gregory quadrature rules.
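As a sketch of one such PC step (Python for readability; the study's code is Fortran, and the function name abm4_pc_step is ours), assume the arrays x, y and z hold the mesh points, the solution approximations and the quadrature approximations to the integral term, with z_next the approximation to z at x[k+1]:

```python
def abm4_pc_step(F, x, y, z, z_next, k, h):
    """One fourth-order Adams-Bashforth-Moulton PC step: given values at
    x[k-3..k], return the corrected approximation at x[k+1]."""
    # Adams-Bashforth predictor (order 4)
    f0 = F(x[k],     y[k],     z[k])
    f1 = F(x[k - 1], y[k - 1], z[k - 1])
    f2 = F(x[k - 2], y[k - 2], z[k - 2])
    f3 = F(x[k - 3], y[k - 3], z[k - 3])
    y_pred = y[k] + h / 24.0 * (55 * f0 - 59 * f1 + 37 * f2 - 9 * f3)
    # Adams-Moulton corrector (order 4), evaluated at the predicted value
    fp = F(x[k + 1], y_pred, z_next)
    return y[k] + h / 24.0 * (9 * fp + 19 * f0 - 5 * f1 + f2)
```

For a VIDE, z_next would itself be formed by Newton-Gregory quadrature using the predicted value; for a pure ODE of the form (2) the z arguments are simply ignored by F.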
Table 2.1. Scaled Newton-Gregory quadrature weights 24 c_{n,i}: the weight applied to K(x_n, x_i, y_i) is h/24 times the tabulated entry.

| n | i=0 | i=1 | i=2 | i=3 | i=4 | i=5 | i=6 |
| 2 | 8 | 32 | 8 | 0 | 0 | 0 | 0 |
| 3 | 9 | 27 | 27 | 9 | 0 | 0 | 0 |
| 4 | 9 | 28 | 22 | 28 | 9 | 0 | 0 |
| 5 | 9 | 28 | 23 | 23 | 28 | 9 | 0 |
| 6 | 9 | 28 | 23 | 24 | 23 | 28 | 9 |
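The rows of Table 2.1 can be generated programmatically rather than stored. As an illustration (the study's code is Fortran 77; this Python sketch and the name ng_weights are ours), each row for n >= 4 is a run of interior weights 24 with the Gregory end-correction pattern 9, 28, 23 applied at each boundary, while n = 2 and n = 3 fall back to Simpson's rules:

```python
def ng_weights(n):
    """Return the scaled Newton-Gregory weights 24*c_{n,i}, i = 0..n,
    for integrating over [x_0, x_n]; the true quadrature weight is
    h/24 times each entry.  Valid for n >= 2."""
    if n == 2:
        return [8, 32, 8]            # Simpson's 1/3 rule, times 24
    if n == 3:
        return [9, 27, 27, 9]        # Simpson's 3/8 rule, times 24
    w = [24] * (n + 1)               # interior weights
    for end, step in ((0, 1), (n, -1)):
        # end corrections turning 24 into 9, 28, 23 at each boundary
        for offset, corr in enumerate((-15, 4, -1)):
            w[end + step * offset] += corr
    return w
```

Note that for n = 4 the two corrections of -1 overlap on the centre point, producing the entry 22 seen in the table.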
When solving VIDEs, the integral term in equation (1) must be considered before attempting to convert the algorithm for parallel processing. Any direct method of solution requires that the approximation to each z_i involve all the points x_0, ..., x_{i-1} in the interval of integration. This is obviously a serial process and not a good candidate for parallelization. It can be observed, however, that once the starting values are obtained and the value y_k for the k-step method is known, the contribution of that value to every approximation z_i, i = k(1)N, can be evaluated at the same time. As each subsequent value y_{k+1}, ..., y_N becomes available, the corresponding contributions to the remaining z values can likewise be added simultaneously. This results in an algorithm that can take advantage of a parallel processing computer.
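A sketch of this observation in Python (illustrative only; the actual implementation uses Fortran with OpenMP-style directives, and the helper name update_tails is ours): once y[k] is available, its contribution to every partial sum z[i] with i >= k involves no other unknowns, so the loop body is independent across i and can be distributed over the processors.

```python
def update_tails(z, c, K, x, y, k, h):
    """Add the newly available term involving y[k] to every partial
    quadrature sum z[i], i >= k.  Each iteration is independent of the
    others, so this loop parallelizes (e.g. an OpenMP parallel do)."""
    for i in range(k, len(z)):          # independent iterations
        z[i] += h * c[i][k] * K(x[i], x[k], y[k])
```

Repeating this update as each new y value is produced accumulates the full quadrature sums without ever re-traversing [x_0, x_{i-1}] serially.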
To accommodate this approach for solving equation (1), a matrix of weights can be assigned to correspond to the quadrature rule employed. The matrix that results from the Newton-Gregory rules is given by Table 2.1. Note that, since the fourth order ABM method requires four starting values, z4 is the first integral term where the modification to allow for parallel processing can be utilized. Using the convention of Akl [1], where braces indicate comments, gives the following algorithm:
ALGORITHM

1) {Set up matrix of weights for quadrature rule}
2) {Set up N+1 mesh points}
3) {Find starting values}
4) {Use a k-step method to approximate equation (1) over [xk,xN]}
It should be noted that setting up the matrix of weights can also be a parallel procedure. This, combined with the unique approach used to approximate the integral term in step 4 of the algorithm, results in an algorithm that gives speedups of up to 4 as compared with no speedup in the original algorithm.
NUMERICAL EXAMPLES

In this section, a nonlinear problem is used to illustrate the method described. The code is run in both scalar and parallel mode for different values of N, where N is the number of mesh points. The computer system used is a four-processor Pentium Pro running Linux. Portland Group's pgf77 Fortran compiler is used with OpenMP style directives. The maximum error in the approximation over the interval [0,1] and the mean CPU times in milliseconds are given in Table 4.1 for the multistep method described. The speedups in Table 4.1 for successively larger values of N illustrate the relationship between speedup and number of mesh points. The scaled efficiency of the processors, a better measure of the quality of the parallel algorithm, is included as well and illustrates up to a 96% efficiency for large values of N.
Example:

    y'(x) = x e^{1-y(x)} - 1/(1+x)^2 - x - ∫_0^x [x/(1+t)^2] e^{1-y(t)} dt,   0 <= x <= 1.
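In the notation of equation (1), the example's right-hand side F and kernel K can be coded directly. The following is a Python transcription for clarity (the study's implementation is Fortran; the initial condition is not reproduced in this excerpt, so none is coded here):

```python
import math

def K(x, t, y):
    """Kernel of the integral term: (x / (1+t)^2) * exp(1 - y)."""
    return x / (1.0 + t) ** 2 * math.exp(1.0 - y)

def F(x, y, z):
    """Right-hand side F(x, y, z) with z the value of the integral term."""
    return x * math.exp(1.0 - y) - 1.0 / (1.0 + x) ** 2 - x - z
```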
Table 4.1. Maximum error, mean CPU times, speedup and scaled efficiency for the example.

| N | max |y(x_i) - y_i| | Mean scalar CPU time (ms) | Mean parallel CPU time (ms) | Speedup | Scaled efficiency |
| 160 | 8.007×10⁻⁹ | 0.033 | 0.012 | 2.75 | 68.75% |
| 320 | 5.165×10⁻¹⁰ | 0.120 | 0.043 | 2.79 | 69.75% |
| 640 | 3.265×10⁻¹¹ | 0.478 | 0.145 | 3.30 | 82.50% |
| 1280 | 2.053×10⁻¹² | 1.890 | 0.520 | 3.63 | 90.87% |
| 2560 | 1.287×10⁻¹³ | 7.560 | 2.000 | 3.78 | 94.50% |
| 5120 | 8.157×10⁻¹⁵ | 30.290 | 7.850 | 3.86 | 96.46% |
For the multistep method described, the results indicate that as the number of mesh points increases the speedup becomes more significant. It can also be noted that, for larger values of N, the numerical approximation to the solution of (1) becomes more accurate. In addition, the scaled efficiency of the processors reaches about 96% for large N. This indicates that processor utilization for this algorithm is very high.
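The derived columns of Table 4.1 follow directly from the timing columns: speedup is the ratio of scalar to parallel CPU time, and scaled efficiency divides the speedup by the number of processors (p = 4 here). A quick check in Python (illustrative; the function names are ours):

```python
def speedup(t_scalar, t_parallel):
    """Speedup = scalar time / parallel time."""
    return t_scalar / t_parallel

def scaled_efficiency(t_scalar, t_parallel, p):
    """Scaled efficiency (per cent) = 100 * speedup / number of processors."""
    return 100.0 * speedup(t_scalar, t_parallel) / p

# e.g. the N = 5120 row of Table 4.1:
#   speedup(30.290, 7.850)               -> about 3.86
#   scaled_efficiency(30.290, 7.850, 4)  -> about 96.46
```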
ACKNOWLEDGEMENT

This work is supported by NSERC. The author would also like to thank Danny Lizotte, a 1999 NSERC student, for his programming efforts.