## Numerical and Computational Challenges in Science and Engineering

### Workshop on Numerical Linear Algebra in Scientific and Engineering Applications

October 29 - November 2, 2001

The Fields Institute, Second Floor

## Speaker Abstracts

**Dhavide Aruliah**, The Fields Institute

*Multigrid Preconditioning for Time-harmonic Maxwell's Equations
in 3D*

We consider three-dimensional electromagnetic problems that arise in
forward-modelling of Maxwell's equations in the frequency domain. Traditional
formulations and discretizations of Maxwell's equations for this class
of problems lead to large, sparse systems of linear algebraic equations
that are difficult to solve: iterative methods applied to the linear
system converge slowly, which is a major obstacle in solving practical
inverse problems.

A finite volume discretization on a staggered grid, derived from a
potential-current formulation of Maxwell's equations, yields a linear
system with a diagonally dominant block structure. The block structure
stems directly from the choice of potential formulation for Maxwell's
frequency-domain equations. The resulting system is strongly elliptic,
which in turn suggests various strategies for preconditioning the
Krylov-subspace iterations used to solve it.

A constant-coefficient, periodic, local-mode analysis suggests
that exact inversion of the diagonal blocks is an effective
preconditioning strategy. In particular, the local-mode analysis bounds
the eigenvalue range and the $l_2$-condition number of the preconditioned
matrix. For Hermitian problems, this implies that Krylov-subspace iterations
will achieve rates of convergence independent of the mesh size; that
is, a fixed reduction of the relative residual is attainable in a constant
number of iterations regardless of the degree to which the mesh is refined.
The analysis extends readily to the case where a single multigrid V-cycle
is applied for preconditioning rather than inverting the diagonal blocks
exactly, giving similar convergence properties. The multigrid preconditioning
strategy is tested within Krylov-subspace iterations in a variety of
experiments. In spite of the non-Hermitian character of the underlying
linear system, the numerical experiments support theoretical predictions
and demonstrate the efficacy of multigrid preconditioning for low-to-moderate
frequency electromagnetic simulations.

This is work done in collaboration with U. Ascher and E. Haber.
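As an illustrative sketch of the block preconditioning idea (not the authors' code; the matrices below are toy stand-ins for the discretized Maxwell operator), the exact inverses of the diagonal blocks can be supplied to a Krylov method as a preconditioner:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 50                                   # size of each diagonal block (toy value)
# Two diagonally dominant blocks with a weak off-diagonal coupling.
L1 = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
L2 = sp.diags([-1.0, 5.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
C = 0.1 * sp.eye(n, format="csc")        # weak coupling block
A = sp.bmat([[L1, C], [C, L2]], format="csc")
b = np.ones(2 * n)

# Preconditioner: exact LU factorization of each diagonal block.
lu1, lu2 = spla.splu(L1), spla.splu(L2)
M = spla.LinearOperator(A.shape,
                        matvec=lambda r: np.concatenate([lu1.solve(r[:n]),
                                                         lu2.solve(r[n:])]))
x, info = spla.gmres(A, b, M=M)          # preconditioned Krylov iteration
```

In practice the block solves would themselves be replaced by a multigrid V-cycle, as the abstract describes, rather than by exact factorizations.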

**Mark Baertschy**, University of Colorado,
Boulder

**Solution of a three-body problem in quantum mechanics using sparse
linear algebra on parallel computers**

A complete description of two outgoing electrons following an ionizing
collision between a single electron and an atom or molecule has long
stood as one of the unsolved fundamental problems in quantum collision
theory.

In this paper we describe our use of distributed-memory parallel
computers to calculate a fully converged wave function describing the
electron-impact ionization of hydrogen. Our approach hinges on a
transformation of the Schrödinger equation that simplifies the boundary
conditions but requires solving very ill-conditioned systems of a few
million complex, sparse linear equations.

We developed a two-level iterative algorithm that requires repeated
solution of sets of a few hundred thousand linear equations. These are
solved directly by $LU$-factorization using a specially tuned, distributed
memory parallel version of the sparse $LU$-factorization library SuperLU.
In smaller cases, where direct solution is technically possible, our
iterative algorithm still gives significant savings in time despite
lower megaflop rates.
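As a small illustration of this compute pattern (a toy matrix, not the actual Schrödinger discretization), SciPy's sparse LU, which wraps the serial SuperLU library, factors a complex sparse system once and then reuses the factors for repeated cheap solves:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
# Toy complex sparse matrix standing in for the discretized operator.
A = sp.diags([-1.0, 2.0 + 0.5j, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
lu = spla.splu(A)                        # one (expensive) LU factorization...
b1 = np.ones(n, dtype=complex)
b2 = np.arange(n, dtype=float).astype(complex)
x1, x2 = lu.solve(b1), lu.solve(b2)      # ...then many cheap triangular solves
```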

**Zhaojun Bai**, University of California

*Second-order Dynamical Systems: quadratic eigenvalue problem and
reduced-order modeling*

Second-order models arise naturally in the study of many types of physical
systems, common examples being electrical and structural dynamics
systems. In this talk, we will discuss two tightly connected problems
associated with second-order systems, namely the quadratic eigenvalue
problem and reduced-order modeling. We will focus on the development of
numerical techniques that exploit and preserve underlying problem
structures and mathematical properties for solving the quadratic
eigenvalue problem and extracting reduced-order models. Case studies
from structural dynamics and microelectromechanical systems (MEMS) will
be discussed.
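A common entry point to the quadratic eigenvalue problem, sketched here with random toy matrices (this is the standard first companion linearization, not necessarily the structure-preserving techniques of the talk): the QEP $(\lambda^2 M + \lambda C + K)x = 0$ is converted to a linear generalized eigenvalue problem of twice the size.

```python
import numpy as np
from scipy.linalg import eig

n = 3
rng = np.random.default_rng(0)
M = np.eye(n)                                   # mass matrix
C = rng.standard_normal((n, n)); C = C + C.T    # damping matrix
K = rng.standard_normal((n, n)); K = K + K.T    # stiffness matrix

# First companion linearization: A z = lam * B z with z = [x; lam*x].
Z, I = np.zeros((n, n)), np.eye(n)
A = np.block([[Z, I], [-K, -C]])
B = np.block([[I, Z], [Z, M]])
lam, V = eig(A, B)                              # the 2n eigenvalues of the QEP
x = V[:n, 0]                                    # top block recovers a QEP eigenvector
```

A structure-preserving method would avoid this doubling-and-forgetting of structure; the sketch only shows the basic reduction.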

**Xiao-Wen Chang**, McGill University, and **C. C. Paige**

*A Recursive Least Squares Approach for GPS Based Positioning*

The Global Positioning System (GPS) is an all-weather, worldwide,
continuous-coverage, satellite-based navigation system. GPS satellites
transmit signals that allow one to determine, with great accuracy, the
location of GPS receivers. In GPS, a typical technique for dynamic
position estimation is differential positioning, in which two receivers
are used: one is stationary with its exact position known, while the
other is roving and its position is to be estimated. In this talk we
describe the physical situation and derive the mathematical model. We
then present a recursive least squares approach for position estimation
based on the difference of the measurements at the stationary and roving
receivers. We take full account of the structure of the problem to make
our algorithm efficient, and use orthogonal transformations to ensure
its numerical reliability. Simulation results will be presented to
demonstrate the performance of the algorithm. A comparison with the van
Graas and Lee positioning algorithm will be given.
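The flavor of such an approach can be sketched as follows (a generic QR-updating recursive least squares loop with invented dimensions and data, not the Chang-Paige algorithm itself): each new measurement row is folded into a triangular factor by an orthogonal re-triangularization, so no normal equations are ever formed.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 3                                   # number of unknowns (assumed)
x_true = np.array([1.0, -2.0, 0.5])     # "position" parameters to recover

R = np.zeros((p, p))                    # triangular factor of the data seen so far
d = np.zeros(p)                         # correspondingly transformed right-hand side
for _ in range(20):                     # stream of measurements
    a = rng.standard_normal(p)          # new measurement row
    y = a @ x_true                      # noiseless observation, for the demo
    # Fold the new row in with an orthogonal (Householder QR) transformation.
    stacked = np.vstack([R, a])
    rhs = np.append(d, y)
    Q, R = np.linalg.qr(stacked)        # re-triangularize [R; a]
    d = Q.T @ rhs                       # keep the transformed right-hand side

x_est = np.linalg.solve(R, d)           # current least squares estimate
```

Using orthogonal updates, as above, preserves the conditioning of the underlying least squares problem, which is the numerical-reliability point made in the abstract.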

**Christina C. Christara**, University of
Toronto

*Fast Fourier Transform Solvers and Preconditioners for Quadratic Spline Collocation*

Joint work with Kit Sun Ng.

Quadratic Spline Collocation (QSC) methods of optimal order of convergence
have recently been developed for the solution of elliptic Partial Differential
Equations (PDEs). In this paper, linear solvers based on Fast Fourier
Transforms (FFT) are developed for the solution of the QSC equations.
The complexity of the FFT solvers is $O(N^2 \log N)$, where $N$ is the
gridsize in one dimension. These direct solvers can handle PDEs with
coefficients in one variable or constant, and Dirichlet, Neumann and
periodic boundary conditions, along at least one direction of a rectangular
domain. General variable coefficient PDEs are handled by preconditioned
iterative solvers. The preconditioner is the QSC matrix arising from
a constant coefficient PDE. The convergence analysis of the preconditioner
is presented. It is shown that, under certain conditions, the convergence
rate is independent of the gridsize. The preconditioner is solved by
FFT techniques, and integrated with one-step or acceleration methods,
giving rise to asymptotically almost optimal linear solvers, with complexity
$O(N^2 \log N)$. Numerical experiments verify the effectiveness of the
solvers and preconditioners, even on problems more general than the
analysis assumes. The development and analysis of FFT solvers and preconditioners
is extended to QSC equations corresponding to systems of elliptic PDEs.
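The mechanism behind such FFT solvers can be illustrated on the simplest case (a one-dimensional, constant-coefficient, periodic toy problem, not the QSC discretization itself): the matrix is then circulant, so the FFT diagonalizes it and the solve costs $O(N \log N)$.

```python
import numpy as np

N = 64
first_col = np.zeros(N)
# Shifted periodic second-difference stencil (diagonal 6 keeps it nonsingular).
first_col[0], first_col[1], first_col[-1] = 6.0, -1.0, -1.0
rng = np.random.default_rng(2)
b = rng.standard_normal(N)

# The eigenvalues of a circulant matrix are the FFT of its first column,
# so the system is solved by pointwise division in Fourier space.
eigs = np.fft.fft(first_col)
x = np.real(np.fft.ifft(np.fft.fft(b) / eigs))

# Dense circulant matrix, built only to verify the FFT solve.
Adense = np.array([[first_col[(i - j) % N] for j in range(N)] for i in range(N)])
x_ref = np.linalg.solve(Adense, b)
```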

**Lieven De Lathauwer**, Université de Cergy-Pontoise

*Independent component analysis and multilinear algebra*

This talk is intended to provide an introduction to the concept of
Independent Component Analysis (ICA), also known as Blind Source Separation
(BSS). The goal of ICA is the decomposition of a set of multi-sensor
data into an a priori unknown linear mixture of a priori unknown source
signals, relying on the assumption that the source signals are mutually
statistically independent. We will follow an algebraic approach: in
a natural way, ICA poses the question of generalizing matrix algebraic
techniques to multilinear algebra, i.e., the algebra of "multi-way
matrices" or "higher-order tensors".
We will discuss four orthogonal tensor decompositions that can be interpreted
as higher-order counterparts of the symmetric matrix Eigenvalue Decomposition
(EVD). Like for instance the EVD and the Singular Value Decomposition
(SVD) of matrices, these decompositions can be considered as tools,
useful for a wide range of applications.
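As a minimal taste of the algebraic viewpoint (this sketch uses the classical second-order, time-lagged-covariance idea sometimes called AMUSE, with invented signals; the talk itself concerns higher-order tensor decompositions): whiten the mixtures, then eigendecompose a symmetrized lagged covariance, and sources with distinct spectra separate.

```python
import numpy as np

t = np.arange(2000)
S = np.vstack([np.sin(0.05 * t), np.sin(0.23 * t + 1.0)])   # independent sources
A_mix = np.array([[1.0, 0.6], [0.4, 1.0]])                  # unknown mixing matrix
X = A_mix @ S                                               # observed mixtures

X = X - X.mean(axis=1, keepdims=True)
C0 = X @ X.T / X.shape[1]
d, E = np.linalg.eigh(C0)
W = E @ np.diag(1.0 / np.sqrt(d)) @ E.T                     # whitening matrix
Z = W @ X                                                   # whitened mixtures

tau = 5
C_tau = Z[:, :-tau] @ Z[:, tau:].T / (Z.shape[1] - tau)     # lagged covariance
C_tau = 0.5 * (C_tau + C_tau.T)                             # symmetrize
_, U = np.linalg.eigh(C_tau)
S_hat = U.T @ Z            # recovered sources, up to order, sign and scale
```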

**Oleg Diyankov**, Russian Federal Nuclear
Center - VNIITF

**The ILU-type preconditioner with 2 drop tolerance parameters**

Joint Work with V. Y. Pravilnikov, Russian Federal Nuclear Center -
VNIITF.

This paper describes an ILU-type preconditioner with two drop tolerance
parameters, named $FILU(\epsilon_1, \epsilon_2)$. This preconditioner
$M$ gives a good approximation of the original matrix $A$, $M = L \cdot
U \approx A$, and requires less memory than a standard one-parameter
preconditioner.

The procedure for creating the preconditioner matrix $M$ consists of the
following steps. First, we calculate special left and right (with respect
to the main diagonal) norms for each row and store them in two vectors.
Then, during elimination of the $i$-th row, we compare each added element
(a new element in the row) with the corresponding (left or right) norm
multiplied by $\epsilon_2$. After elimination of the row is finished,
all additional elements are compared with the corresponding norm
multiplied by $\epsilon_1$, and elements with value less than the norm
are dropped.

Thus each additional element is checked twice: first when it is added,
and a second time after elimination of the row. This approach allows us
to diminish the number of nonzero elements in the preconditioner matrix
without worsening the quality of the preconditioner.
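A schematic, dense illustration of this two-stage dropping rule (the function name, the single row norm, and the test matrix below are invented for the demo; the actual FILU works on sparse storage and uses separate left/right norms):

```python
import numpy as np

def filu(A, eps1, eps2):
    n = A.shape[0]
    M = A.astype(float).copy()
    pattern = A != 0.0                       # original sparsity pattern
    rownorm = np.linalg.norm(A, axis=1)      # stand-in for the left/right norms
    for i in range(1, n):
        fill = []                            # "additional" (fill-in) entries of row i
        for k in range(i):
            if M[i, k] == 0.0:
                continue
            M[i, k] /= M[k, k]
            for j in range(k + 1, n):
                u = M[i, k] * M[k, j]
                if u == 0.0:
                    continue
                if pattern[i, j] or M[i, j] != 0.0:
                    M[i, j] -= u             # update an existing entry
                elif abs(u) >= eps2 * rownorm[i]:
                    M[i, j] = -u             # first check, when fill is created
                    fill.append(j)
        for j in fill:                       # second check, after elimination
            if abs(M[i, j]) < eps1 * rownorm[i]:
                M[i, j] = 0.0
    return M                                 # unit-lower L and U packed together

# Sparse, diagonally dominant random test matrix.
n = 40
rng = np.random.default_rng(9)
A = np.where(rng.random((n, n)) < 0.15, rng.standard_normal((n, n)), 0.0)
np.fill_diagonal(A, 0.0)
A += np.diag(np.abs(A).sum(axis=1) + 1.0)
M = filu(A, eps1=1e-2, eps2=1e-3)
L = np.tril(M, -1) + np.eye(n)
U = np.triu(M)
```

With both tolerances set to zero the routine reduces to an exact (IKJ) LU factorization; positive tolerances trade a little accuracy in $L \cdot U \approx A$ for fewer stored nonzeros.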


**Peter Forsyth**, University of Waterloo

*Penalty Methods For American Options*

The convergence of a penalty method for solving the discrete regularized
American option valuation problem is studied.

Sufficient conditions are derived which both guarantee convergence of
the nonlinear penalty iteration and ensure that the iterates converge
monotonically to the solution. In addition, we show that the penalty
iteration has finite termination, which means that the American constraint
can be enforced with very high precision in a small number of iterations.

The efficiency and quality of solutions obtained using the implicit
penalty method are compared with those produced with the commonly used
technique of handling the American constraint explicitly. Convergence
rates are studied as the timestep and mesh size tend to zero.
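The penalty iteration can be sketched on a generic discrete obstacle problem (the tridiagonal matrix and "payoff" below are toy stand-ins, not a Black-Scholes discretization): the constraint $v \ge g$ is enforced by adding a large penalty on the currently active set and re-solving until the active set settles.

```python
import numpy as np

n = 50
# Toy M-matrix playing the role of the discretized operator.
A = (np.diag(np.full(n, 2.2))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
b = np.zeros(n)
x_grid = np.linspace(-1.0, 1.0, n)
g = 0.5 - np.abs(x_grid)               # obstacle ("payoff") function

rho = 1e7                              # large penalty parameter
v = np.maximum(g, 0.0)
for _ in range(50):
    P = np.diag((v < g).astype(float))                  # current active set
    v_new = np.linalg.solve(A + rho * P, b + rho * P @ g)
    if np.array_equal(v_new < g, v < g) and np.linalg.norm(v_new - v) < 1e-10:
        v = v_new                                       # active set has settled
        break
    v = v_new
```

At convergence $v \ge g$ holds to within $O(1/\rho)$, and the linear complementarity conditions $Av \ge b$, $(Av - b)_i = 0$ wherever $v_i > g_i$ are satisfied.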

**Wilfried Gansterer**, University of Tennessee

**Approximating Eigenpairs in Quantum Chemistry**

An algorithmic framework for approximating eigenpairs of dense, but
diagonally dominant symmetric matrices is presented. It is based on
a new divide-and-conquer approach for computing approximate eigenvalues
and eigenvectors of a block-tridiagonal matrix. Application of this
method to eigenproblems arising in Materials Science and Quantum Chemistry
is discussed. It is shown that for many of these problems low rank approximations
of the off-diagonal blocks as well as relaxation of deflation criteria
permit the computation of approximate eigenpairs with prescribed accuracy
at significantly reduced computational costs compared to standard methods
as, for example, implemented in LAPACK. (Joint work with Robert C. Ward,
California Institute of Technology)

**Chen Greif**, Parametric Technology Corporation

**Techniques for solving indefinite linear systems **

In a large number of applications one needs to solve a linear system
whose associated matrix can be written as a 2x2 block matrix with
a zero (2,2) block. Such linear systems arise in constrained optimization,
least squares problems, the discrete Stokes equations, and other areas
of application.

In this talk we discuss techniques for solving such systems, with a
focus on situations where the (1,1) block is possibly singular. In such
cases straightforward Schur complement techniques cannot be applied. Two
different methods are described: a deflation approach, and an augmented
Lagrangian technique. The first approach is based on identifying the
nullity of the (1,1) block and eliminating a small number of unknowns
so as to generate a 'reduced' indefinite system with a nonsingular (1,1)
block.

In contrast, the augmented Lagrangian technique does not reduce the
size of the linear system; rather, it modifies the (1,1) block so as
to eliminate a possible singularity. As some of our analytical observations
and numerical examples illustrate, the numerical properties of the
resulting linear system are considerably different from those of the
original one.

This is joint work with Gene Golub at Stanford University.
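The augmented Lagrangian idea can be sketched in a few lines (toy random data, not the authors' test problems): adding $\gamma B^T B$ to the singular (1,1) block, and $\gamma B^T g$ to the right-hand side, leaves the solution unchanged while removing the singularity.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 6, 2
Z = rng.standard_normal((n, n - 1))
A = Z @ Z.T                        # symmetric positive semidefinite, rank n-1
B = rng.standard_normal((m, n))    # full-row-rank constraint block
f, g = rng.standard_normal(n), rng.standard_normal(m)

gamma = 10.0
A_aug = A + gamma * B.T @ B        # nonsingular when null(A) and null(B) only share 0
K = np.block([[A_aug, B.T], [B, np.zeros((m, m))]])
rhs = np.concatenate([f + gamma * B.T @ g, g])
sol = np.linalg.solve(K, rhs)
x, y = sol[:n], sol[n:]            # solves the ORIGINAL saddle-point system
```

The equivalence is immediate: adding $\gamma B^T$ times the constraint $Bx = g$ to the first block row changes the matrix but not the solution set.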

**Francois Gygi**, Lawrence Livermore National
Laboratory

*Numerical challenges of large-scale first-principles molecular
dynamics*

First-principles molecular dynamics (FPMD) is emerging as a very powerful
atomistic simulation approach, which combines a classical description
of nuclei with a quantum mechanical description of electrons. Recent
advances in electronic structure methods, notably Density Functional
Theory (DFT), have made FPMD a truly predictive approach, which provides
information on the structural, dynamical and electronic properties of
a physical system. FPMD has been successfully applied to several areas
of research in materials science, chemistry and biochemistry. The computational
cost of FPMD simulations is high, due to the detailed description of
electronic structure that is required. Straightforward implementations
of FPMD incur a computational cost of $O(N^3)$ for $N$ atoms. More recent
approaches have been proposed to reduce this cost to $O(N)$. We will present
recent progress in the development of $O(N)$ algorithms based on real-space,
finite-difference methods, as well as the challenges that arise when
implementing conventional $O(N^3)$ FPMD on large, massively parallel computers.

**Eldad Haber**, University of British Columbia

*Techniques for Forward and Inverse Electromagnetic Problems in
3D*

Inverse problems involving recovery of distributed parameter functions
arise in many applications. Many practical instances require electromagnetic
data inversion, where the forward problem comprises Maxwell's equations
for moderate frequencies. Realistic instances of such problems in 3D
can be very computationally intensive and require care in the selection
and development of appropriate algorithms.

In this talk I will describe work we have been doing in the context
of a project involving geophysical mining applications with the objective
of making such computations practically feasible.

For Maxwell's equations we use a Helmholtz decomposition reformulated
with an additional differentiation. Following a finite volume discretization
suitable for discontinuous coefficients we apply multigrid or ILU block
preconditioning to obtain rapid convergence of a Krylov space method.

For the inverse problem a nonlinear constrained optimization formulation
is obtained. The necessary conditions for the inverse optimization problem
yield a large, nonlinear, tightly coupled set of PDEs. We devise preconditioned
Krylov solvers for the rapid solution of such systems.

**Marko Huhtanen**, Massachusetts Institute
of Technology

*On generating discrete orthogonal bivariate polynomials*

In this talk we present an algorithm for recursively generating orthogonal
bivariate polynomials on a discrete set of the plane. For this purpose
we employ commuting pairs of real symmetric matrices to obtain, in a
certain sense, a two-dimensional Hermitian Lanczos method.
The resulting algorithm relies on a recurrence having a slowly growing
length. Practical implementation issues and applications are considered.
The method can be generalized to compute orthogonal polynomials depending
on an arbitrary number of variables.
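For a flavor of discrete orthogonal bivariate polynomials (plain QR orthogonalization of monomials on invented points, not the Lanczos-style recurrence of the talk): orthonormality here is with respect to the discrete inner product $\langle p, q\rangle = \sum_k p(x_k, y_k)\, q(x_k, y_k)$ over the point set.

```python
import numpy as np

rng = np.random.default_rng(4)
pts = rng.standard_normal((30, 2))      # discrete point set in the plane
x, y = pts[:, 0], pts[:, 1]

# Monomial basis 1, x, y, x^2, xy, y^2 evaluated at the points.
V = np.column_stack([np.ones_like(x), x, y, x**2, x*y, y**2])
Q, R = np.linalg.qr(V)
# Columns of Q hold the values of the orthonormal polynomials at the points;
# R records how they are expressed in terms of the monomials.
```

The Lanczos-style approach of the talk achieves the same orthogonality through short recurrences rather than a full orthogonalization, which is what makes it attractive computationally.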

**Richard B. Lehoucq**, Sandia National Labs

**An Automated Multilevel Substructuring Method for Eigenspace
Computation in Linear Elastodynamics**

Dynamic analysis of complex two- and three-dimensional structures frequently
involves finite element discretizations with over a million unknowns.
The discretization is typically used for frequency response at many
frequencies so that a dimensional reduction step is advantageous or
even necessary. The standard reduction approach is modal truncation,
which requires a costly partial eigensolution but reduces the number
of unknowns by orders of magnitude.

While modal truncation is justified in the continuous setting (higher
eigenfunctions have much lower participation in the response than lower
ones), there is another important reason: the modes that are well
approximated by the finite element discretization are the only modes
that should be retained. Hence the cost of the frequency response is
reduced greatly without a significant loss of accuracy.

The cost of the partial eigensolution required for modal truncation
increases dramatically as the frequency range for the analysis increases
because the number of eigenpairs needed can easily reach into the thousands.
High modal density (close spacing of eigenvalues) also contributes to
the cost. An alternative to this approach is the Automated Multi-Level
Substructuring (AMLS) method in which the structure is recursively divided
into thousands of subdomains. Eigenvectors associated with these subdomains
are used to represent the structure's response rather than the traditional
global eigenvectors. Reduction of the finite element discretization is
based on many small, local, inexpensive eigenvalue problems.

This presentation examines the mathematical basis for AMLS in the continuous
variational setting. Differential eigenvalue problems are defined on
subdomains and on interfaces between subdomains. For the interface eigenvalue
problems, an operator is defined that acts on interface trace functions
and consistently represents mass associated with extensions of those
trace functions. With this new operator, all
of the differential eigenvalue problems are projections of the global
eigenvalue problem onto a hierarchy of mutually orthogonal subspaces.

The objective of our dimensional reduction is to retain only as many
eigenfunctions as are needed so that the eigenspace truncation error
is consistent with the finite element discretization error. AMLS is
an alternative to conventional approaches for solving large systems
of equations arising from a finite element discretization: instead of
approximating the solution of a system of equations, identify the subspace
that contains good approximate solutions of the partial differential
equation and project the problem onto that subspace. This approach is
advantageous particularly when many solutions are needed from one discretization.

(Joint work with Jeff Bennighof, University of Texas, Aerospace and
Engineering Mechanics)

**Gregory Lewis**, The Fields Institute

**The numerical approximation of eigenvalues in the differentially
heated rotating fluid annulus**

Many laboratory experiments have been performed with the differentially
heated rotating fluid annulus, where various flow patterns are observed
as the rotation and temperature difference between the annulus walls
are varied. A rich variety of dynamical behaviour has been observed
in the experiments that has not been theoretically explained. Bifurcation
analysis is a promising means of gaining insight into these unexplained
dynamics. For example, an analysis of the double Hopf bifurcations that
occur in a mathematical model of the annulus along the transition from
steady flow to wave motion indicates the mechanism by which the experimentally
observed hysteresis of the wave motion occurs.

To perform such a bifurcation analysis, the boundaries of the region
of linear stability of a steady solution of the dynamical equations
must be found. This involves the computation of certain eigenvalues
at many locations in the space of parameters. For the present model
of the differentially heated rotating annulus, the eigenvalues (and
eigenfunctions) satisfy a partial differential generalized eigenvalue
problem (in two spatial dimensions) that cannot be solved analytically.
Discretization leads to a generalized matrix eigenvalue problem, with
large sparse matrices, from which approximations of the eigenvalues
of the continuous problem may be found.

Because the approximation of the eigenvalues is the most computationally
intensive step in the bifurcation analysis, much can be gained by increasing
the efficiency of these approximations. I will discuss the eigenvalue
approximation in the context of the differentially heated rotating annulus
and the possibility of using the nature of the problem to make improvements.
Such improvements should be directly applicable to many other systems.

**Xiezhang Li**, Georgia Southern University

**The Representation and Iterative Methods for Drazin Inverse
with Complex Spectrum**

A general iterative method for the Drazin inverse of a square matrix
with complex spectrum, together with its error bound, is derived. The
Euler-Knopp and Newton-Raphson methods for the real spectrum case are
generalized. Semiiterative methods for the Drazin inverse are also
developed. Numerical examples are given to illustrate the theoretical
results.
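One standard Newton-type iteration for the Drazin inverse, shown here for a small index-one matrix (an illustration, not the specific Euler-Knopp or semiiterative schemes of the talk): starting from $X_0 = \alpha A^l$ with $l \ge \operatorname{ind}(A)$ and suitable $\alpha$, the iteration $X \leftarrow X(2I - AX)$ converges to $A^D$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 0.0]])        # singular, index 1, eigenvalues {2, 0}
alpha = 0.1
X = alpha * A                     # X0 = alpha * A^1, with 1 >= ind(A)
I = np.eye(2)
for _ in range(25):
    X = X @ (2 * I - A @ X)       # Newton-Schulz-type step
```

Convergence can be checked against the defining Drazin properties: $XA = AX$, $XAX = X$, and $A^{k+1}X = A^k$ for $k = \operatorname{ind}(A)$.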

**Nicola Mastronardi**, Katholieke Universiteit
Leuven

**Fast algorithms for Structured Total Least Squares in Speech Compression**

We present a fast implementation of a recently proposed speech compression
scheme, based on an all-pole model of the vocal tract. Each frame of
the speech signal is analyzed by storing the parameters of the complex
damped exponentials deduced from the all-pole model and its initial
conditions. In mathematical terms, the analysis stage corresponds to
solving a structured total least squares (STLS) problem. It is shown
that by exploiting the displacement rank structure of the involved matrices
the STLS problem can be solved in a very fast way. Synthesis is computationally
very cheap since it consists of adding the complex damped exponentials
based on the transmitted parameters. The compression scheme is applied
on a speech signal. The speed improvement of the fast vocoder analysis
scheme is demonstrated. Furthermore, the quality of the compression
scheme is comparable with that of a standard coding algorithm at comparable
compression ratios, by using the segmental Signal-to-Noise Ratio (SNR).

**James G. Nagy**, Emory University

*Preconditioned Iterative Methods for Image Restoration*

Image restoration is an example of an "ill-posed problem": a problem
that does not have a unique solution, or whose solution is not a
continuous function of the data. Such problems are extremely sensitive
to perturbations (noise) in the data; that is, small perturbations of
the data can lead to arbitrarily large perturbations in the solution.
Many algorithms have been developed to compute approximate solutions
of ill-posed problems, but they may differ in a variety of ways. For
example, there are several different regularization methods one could
use, and for each of these, various methods for choosing a regularization
parameter. For structured, large-scale problems, an attractive approach
is to use certain iterative methods, such as the conjugate gradient
algorithm, where regularization is enforced through early termination
of the iterations. In some cases, preconditioning can be used to
accelerate convergence. However, if not done carefully, preconditioning
can lead to erratic convergence behavior and result in fast convergence
to a poor approximate solution. In addition to discussing aspects of
preconditioning conjugate gradients for ill-posed problems, an
alternative scheme based on steepest descent is considered. The steepest
descent approach has the additional advantage of enforcing nonnegativity,
which may be important in some image restoration applications.
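A schematic one-dimensional deblurring example of these ideas (the Gaussian blur and test signal are invented): regularization comes from terminating a steepest-descent iteration early, and nonnegativity is enforced by projecting each iterate.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
t = np.linspace(0.0, 1.0, n)
x_true = np.exp(-200.0 * (t - 0.5) ** 2)              # nonnegative test "image"
idx = np.arange(n)
A = np.exp(-((idx[:, None] - idx[None, :]) / 3.0) ** 2)   # Gaussian blurring matrix
A /= A.sum(axis=1, keepdims=True)
b = A @ x_true + 1e-3 * rng.standard_normal(n)        # blurred, noisy data

step = 1.0 / np.linalg.norm(A, 2) ** 2                # safe steepest-descent step
def restore(n_iter):
    x = np.zeros(n)
    for _ in range(n_iter):
        x = x - step * (A.T @ (A @ x - b))            # descent step on ||Ax - b||^2
        x = np.maximum(x, 0.0)                        # project onto x >= 0
    return x

x_restored = restore(200)
```

Iterating far beyond the optimal stopping point eventually fits the noise (the semiconvergence phenomenon), which is why the termination index itself acts as the regularization parameter.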

**John Reid**, Rutherford Appleton Laboratory

**Frontal and multifrontal codes in HSL 2000 for direct solution
of large sparse sets of linear equations**

The frontal method was first conceived (Irons, 1970) for symmetric
positive-definite finite-element problems, but it has since been extended
to handle unsymmetric matrices and matrices that do not originate from
finite-element problems. It has also been extended to the multifrontal
method that is better able to preserve the sparsity and is used now,
for example, by Nastran.

The first part of this talk will explain the methods, assuming only
that the audience is familiar with Gaussian elimination (all these methods
are variants of Gaussian elimination).

The second part of the talk will review the frontal and multifrontal
work of the Numerical Analysis Group at the Rutherford Appleton Laboratory.
This has led to many Fortran codes that are available in HSL (formerly
Harwell Subroutine Library) and to a Fortran/MPI code that was one of
the deliverables of the ESPRIT IV project PARASOL.

**Jianbo Shi**, Robotics Institute, CMU

**Image Segmentation and Grouping with Normalized Cuts**

This is joint work with J. Malik, S. Yu, and M. Meila.

I will describe an approach for solving the image segmentation and
grouping problem with Normalized Cuts, which exploits the connections
between graph partitioning, eigenvalue systems, and Markov random walks.

In our approach, image segmentation is formulated as a graph partitioning
problem where the pixels are the graph nodes and pairwise similarity
measures between pixels are the graph edge weights. A global graph
partitioning criterion, called the normalized cut, is proposed for
finding minimally connected subgroups. This criterion also ensures that
within-group similarity is maximized. While the Normalized Cut problem
is NP-hard, we show that in the continuous value space it can be solved
exactly by computing a generalized eigenvalue problem. Results of this
algorithm on a large number of real images with simple brightness,
texture and motion cues have been very encouraging.
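The relaxed computation can be sketched on a toy affinity matrix (six "pixels" in two clusters, not an image): solve the generalized eigenproblem $(D - W)y = \lambda D y$ and threshold the eigenvector for the second smallest eigenvalue.

```python
import numpy as np
from scipy.linalg import eigh

# Two clusters of "pixels" with strong intra-cluster affinity.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

D = np.diag(W.sum(axis=1))          # degree matrix
lam, Y = eigh(D - W, D)             # generalized symmetric eigenproblem
fiedler = Y[:, 1]                   # eigenvector of second smallest eigenvalue
labels = (fiedler > 0).astype(int)  # sign gives the two-way partition
```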

In this talk, I will highlight two current research areas where we
are building on this equivalence between normalized graph cuts and
generalized eigensystems for solving higher-order vision problems. In
the first area, we show how more complex pairwise graph relationships,
such as the directed relationships that arise from object occlusion,
can be encoded in this system. We show that the solution can be found
by solving an eigenvalue problem in the complex value domain. In the
second area, we have developed a probabilistic interpretation of the
Normalized Cuts criterion in terms of Markov random walks. We show how
this interpretation can lead to an algorithm for supervised-learning-based
image segmentation.

**Gene Golub, Rasmus Larsen, Justin Wan** (Speaker: Justin Wan)

**Solving Complex Symmetric Linear Systems with Multiple Right-Hand
Sides**

We consider the simultaneous solution of large linear systems of the
form AX=B, where A is complex symmetric and B is a set of right-hand-side
vectors. We present a new CG-type iterative method, CCG, which is based
on a tridiagonalization of A by a unitary matrix. Moreover, we combine CCG
and its minimum residual variant CSYM with the single-seed technique
to solve linear systems with multiple right-hand sides; we solve the
seed system using CCG or CSYM and approximate solutions to the non-seed
systems are obtained by Galerkin projection. We demonstrate the effectiveness
of the proposed method by solving an electromagnetic application.
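The single-seed Galerkin projection can be sketched as follows (with a real symmetric positive definite stand-in rather than the complex symmetric CCG/CSYM setting, and invented data): build a Krylov basis from the seed system, then obtain the non-seed solution by projecting onto that subspace.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 40, 20
Q0 = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = Q0 @ np.diag(np.linspace(1.0, 5.0, n)) @ Q0.T     # SPD stand-in operator
b_seed = rng.standard_normal(n)                       # seed right-hand side
b_other = b_seed + 0.05 * rng.standard_normal(n)      # nearby non-seed side

# Krylov basis for (A, b_seed) via Arnoldi with modified Gram-Schmidt.
V = np.zeros((n, m))
V[:, 0] = b_seed / np.linalg.norm(b_seed)
for k in range(m - 1):
    w = A @ V[:, k]
    for j in range(k + 1):
        w -= (V[:, j] @ w) * V[:, j]
    V[:, k + 1] = w / np.linalg.norm(w)

# Galerkin approximations for both systems from the same subspace.
H = V.T @ A @ V
x_seed = V @ np.linalg.solve(H, V.T @ b_seed)
x_other = V @ np.linalg.solve(H, V.T @ b_other)
```

The non-seed solve costs only a small projected system; how good it is depends on how close the non-seed right-hand side is to the seed's Krylov subspace.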

**Jacob White**, Massachusetts Institute of Technology

*Model Order Reduction Applied to the IC Design Problem*

Designing an integrated circuit is complicated because each design
involves specifying millions of transistors and wires. The way IC designers
make such complicated system designs more tractable is to use hierarchy,
in which wires and transistors are grouped together into functional
cells, and those cells are then grouped into functional blocks. The
numerical problem of extracting functional cell descriptions from transistors
and wires, or extracting block descriptions from cells, is often referred
to as model-order reduction because the problem can usually be recast
as one of generating low-order dynamical systems which preserve the
input-output behavior of much higher-order dynamical systems. In this
talk we will survey recent developments in numerical model-order
reduction for linear problems, trying to connect Krylov-subspace based
methods, Padé approximants, and truncated balanced realizations. We will
then describe some of the challenges in extending these approaches to
nonlinear problems.
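A toy sketch of the Krylov-subspace idea (moment matching about $s = 0$ with invented data; not any specific circuit extractor): project $(A, b, c)$ onto a Krylov subspace and compare the transfer functions $H(s) = c^T(sI - A)^{-1}b$ of the full and reduced models.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 30, 8
A = -np.diag(np.linspace(1.0, 3.0, n))     # stable toy system matrix
b = rng.standard_normal(n)                 # input vector
c = rng.standard_normal(n)                 # output vector

# Orthonormal basis for span{A^-1 b, ..., A^-m b} (moments about s = 0).
cols = [np.linalg.solve(A, b)]
for _ in range(m - 1):
    cols.append(np.linalg.solve(A, cols[-1]))
V = np.linalg.qr(np.column_stack(cols))[0]

# One-sided projection gives an order-m model matching m moments at s = 0.
Ar, br, cr = V.T @ A @ V, V.T @ b, V.T @ c

def transfer(s, Amat, bv, cv):
    return cv @ np.linalg.solve(s * np.eye(len(bv)) - Amat, bv)

h_full = transfer(0.1j, A, b, c)
h_red = transfer(0.1j, Ar, br, cr)
```

Evaluating the reduced transfer function costs an $8 \times 8$ solve instead of a $30 \times 30$ one; in circuit extraction the gap is orders of magnitude larger.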

**Petter Wiberg**, University of Toronto

**Hedging with Value-at-risk**

The *value-at-risk* is the maximum loss that a portfolio might
suffer over a given holding period with a certain confidence level.
In recent years, value-at-risk has become a benchmark for measuring
financial risk used by both practitioners and regulators. We present
an efficient algorithm for computing value-at-risk and the value-at-risk
gradient for portfolios of derivative securities. In particular, we
discuss dimensional reduction of the model and applications to hedging
of derivatives portfolios.
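A minimal Monte Carlo value-at-risk computation for a toy two-factor portfolio (invented numbers; the talk concerns far larger derivative portfolios and gradient computations): value-at-risk at level $\alpha$ is the loss quantile of the simulated profit-and-loss distribution over the holding period.

```python
import numpy as np

rng = np.random.default_rng(8)
n_scenarios = 100_000
# Two correlated risk factors; portfolio P&L is linear plus quadratic in them,
# mimicking delta and gamma exposures of a derivatives book.
dz = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]],
                             size=n_scenarios)
pnl = 0.5 * dz[:, 0] - 0.8 * dz[:, 1] + 0.1 * dz[:, 0] ** 2

alpha = 0.99
var_99 = -np.quantile(pnl, 1 - alpha)   # 99% value-at-risk, a positive loss
```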

**Chao Yang**, Lawrence Berkeley National Laboratory

*Computational Challenges in Cryo-Electron Microscopy Image Reconstruction*

The problem of computing 3-D density maps of macromolecules from a set
of 2-D projection images taken by an electron microscope is becoming
increasingly important in biological research. The problem is challenging
in a number of ways:

1) Limited by radiation damage, the 2-D images are obtained by taking
a single shot of many identical but randomly oriented molecules embedded
in a thin layer of vitreous ice (a process known as cryo-electron microscopy).
As a result, the relative orientations of the 2-D images are unknown,
and must be determined as part of the solution to the 3-D image reconstruction
problem.

2) To avoid radiation damage, the electron dose of the microscope has
to be kept low. This limited exposure of molecule to electron beams
results in a typical low signal-to-noise ratio in the 2-D projection
images. These 2-D images must also be corrected by a contrast transfer
function to account for defocus and aberration artifacts introduced
by the electron microscope.

3) To obtain a high resolution reconstruction, a large number of 2-D
projection images must be collected to ensure a uniform and detailed
coverage of projection directions.

In this talk, we will give an overview of the mathematical formulation
of the image reconstruction problem and provide a survey of numerical
techniques currently employed at various stages of the reconstruction
process.