This paper presents a new parallel ILU preconditioner. It is based on the technique of Sequential Staging of Tasks (SST), which overcomes the recursiveness inherent in ILU factorization. The new parallel preconditioner is easy to derive from the sequential version: the only requirement is to insert synchronization code for the stages of tasks into the sequential version, and no other modifications to the original serial code are needed. Compatibility with various ordering schemes is maintained, and the preconditioner gains the ability to run on different numbers of processors. Numerical results were obtained using a thermal model with different grid systems. The parallel speedup is satisfactory.
Simulation of thermal recovery processes using a fully implicit treatment of component concentrations, phase saturations, pressure, and temperature requires the solution of large systems of linear equations. Currently, the most robust techniques for solving these systems are preconditioned conjugate-gradient-like methods such as ORTHOMIN, which is widely used in traditional solvers for sequential computers.
The major computation in ORTHOMIN is the vector inner product, which is easy to parallelize. However, the most robust preconditioners, such as ILU factorization and nested factorization, are not well suited to parallel computers because of their intrinsic recursiveness, and it is difficult to parallelize an ILU preconditioner directly. At present, the general approach to parallelizing preconditioners is to adopt new preconditioning methods such as parallel nested factorization. These new methods achieve high parallel efficiency on parallel computers, but they also have limitations:
limited types of ordering schemes;
extensive modifications to the sequential code.
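The contrast drawn above, between an inner product that parallelizes easily and a triangular solve that is inherently recursive, can be illustrated with a minimal Python sketch. The function names (`partial_dot`, `parallel_dot`, `forward_substitution`) and the bidiagonal test matrix are assumptions made for illustration only; they do not come from the paper's code.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_dot(x, y, lo, hi):
    """Inner product restricted to the index range [lo, hi)."""
    return sum(x[i] * y[i] for i in range(lo, hi))


def parallel_dot(x, y, num_workers=2):
    """Split the inner product into independent partial sums.

    No partial sum needs a result from any other, so the chunks can be
    handed to different workers and combined at the end.
    """
    n = len(x)
    bounds = [(p * n // num_workers, (p + 1) * n // num_workers)
              for p in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        parts = pool.map(lambda b: partial_dot(x, y, b[0], b[1]), bounds)
        return sum(parts)


def forward_substitution(diag, sub, rhs):
    """Solve L y = rhs for a lower bidiagonal L (illustrative case).

    diag[i] is L[i][i] and sub[i] is L[i][i-1] (sub[0] is unused).
    Each y[i] needs y[i-1], so the loop cannot be split into
    independent chunks the way the inner product can.
    """
    y = [0.0] * len(rhs)
    for i in range(len(rhs)):
        prev = sub[i] * y[i - 1] if i > 0 else 0.0
        y[i] = (rhs[i] - prev) / diag[i]
    return y
```

In `parallel_dot` the order in which chunks finish does not matter, whereas in `forward_substitution` the loop order is dictated by the data dependency. That dependency is the recursiveness that makes a direct parallelization of ILU difficult.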
This paper presents a new parallel ILU preconditioner based on the technique of Sequential Staging of Tasks (SST). Using the SST technique, the new preconditioner exploits the small-scale parallelism of ILU factorization and achieves a temporal, larger-scale parallelism within a certain computing domain, thereby yielding a practical parallel preconditioner. The new parallel preconditioner maintains all the characteristics of the sequential version and is easy to derive from it. It is applicable to computers with different numbers of processors. Numerical experiments show that the parallel speedup is satisfactory: on an NP 1/52 Mini-Supercomputer System (produced by GOULD Co. in 1989, with shared main memory and the symmetric operating system UTX/32) with two processors, the parallel speedup is 1.85.
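As a rough picture of what inserting stage synchronization into a sequential solve might look like, the following Python sketch pipelines a lower-triangular solve over a 2D grid. It is a generic pipelined illustration under stated assumptions, not the paper's SST implementation. The constant-coefficient five-point factor, the strip partitioning, the names `update`, `solve_sequential`, and `solve_staged`, and the use of `threading.Event` are all hypothetical choices.

```python
import threading


def update(y, r, j, i, d, w, s):
    """One step of the recurrence: y[j][i] needs its left and lower neighbours."""
    left = y[j][i - 1] if i > 0 else 0.0
    below = y[j - 1][i] if j > 0 else 0.0
    y[j][i] = (r[j][i] - w * left - s * below) / d


def solve_sequential(r, d, w, s):
    """Sequential forward solve on an ny-by-nx grid in natural ordering."""
    ny, nx = len(r), len(r[0])
    y = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            update(y, r, j, i, d, w, s)
    return y


def solve_staged(r, d, w, s, num_workers=2):
    """Same loop body, with each worker owning a strip of columns.

    A stage is one grid row inside one strip. Worker p may start
    stage j only after worker p-1 has finished its own stage j.
    """
    ny, nx = len(r), len(r[0])
    y = [[0.0] * nx for _ in range(ny)]
    bounds = [(p * nx // num_workers, (p + 1) * nx // num_workers)
              for p in range(num_workers)]
    done = [[threading.Event() for _ in range(ny)]
            for _ in range(num_workers)]

    def worker(p):
        lo, hi = bounds[p]
        for j in range(ny):
            if p > 0:
                done[p - 1][j].wait()   # inserted synchronization: wait
            for i in range(lo, hi):     # unchanged sequential loop body
                update(y, r, j, i, d, w, s)
            done[p][j].set()            # inserted synchronization: signal

    threads = [threading.Thread(target=worker, args=(p,))
               for p in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return y
```

The arithmetic and its ordering within each strip are identical to the sequential loop, so both functions return the same values; only the wait and signal calls are new. In CPython the global interpreter lock prevents a real speedup from threads, so the sketch shows the synchronization structure rather than performance.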
Consider the linear system,
As a preconditioner for ORTHOMIN, ILU factorization provides a matrix M that is a "good" approximation to the coefficient matrix A and is easy to factor. Convergence may then be accelerated by solving the equivalent system,
Such preconditioning should reduce the number of iterations enough to offset the added cost of factoring M and performing a forward and backward solution with each matrix-vector multiplication.
The main stages of ILU factorization are as follows:
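For reference, the two systems mentioned above can be written in conventional notation as follows. Only $A$ and $M$ are named in the text; the unknown vector $x$, the right-hand side $b$, the factors $L$ and $U$, and the choice of left preconditioning are assumptions.

```latex
\begin{aligned}
A x &= b, \\
M^{-1} A x &= M^{-1} b, \qquad M = L U \approx A .
\end{aligned}
```

Under this reading, applying $M^{-1}$ to a vector amounts to a forward solve with $L$ followed by a back solve with $U$, which is the per-iteration cost referred to above. A right-preconditioned form, $A M^{-1} y = b$ with $x = M^{-1} y$, is an equally plausible reading.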
A symbolic factorization, defining the non-zero structure of the incomplete factorization.
For k=1, NB Do
Inverting the main elements