# THEMATIC PROGRAMS

March  8, 2014

## Speaker Abstracts

 Dhavide Aruliah Chen Greif Justin Wan Mark Baertschy Francois Gygi Jacob White Zhaojun Bai Eldad Haber Petter Wiberg Tucker Carrington Marko Huhtanen Chao Yang Xiao-Wen Chang Richard B. Lehoucq Christina Christara Gregory Lewis Lieven De Lathauwer Xiezhang Li Oleg Diyankov Nicola Mastronardi Kenneth Driessel James G. Nagy Peter Forsyth John Reid Wilfried Gansterer Jianbo Shi

Dhavide Aruliah, The Fields Institute
Multigrid Preconditioning for Time-harmonic Maxwell's Equations in 3D

We consider three-dimensional electromagnetic problems that arise in forward-modelling of Maxwell's equations in the frequency domain. Traditional formulations and discretizations of Maxwell's equations for this class of problems leads to a large, sparse, system of linear algebraic equations that is difficult to solve. That is, iterative methods applied to the linear system are slow to converge which is a major obstacle in solving practical inverse problems.

A finite volume discretization on a staggered grid derived from a potential-current formulation of Maxwell's equations with a block structure that is diagonally dominant. The block structure of the linear system stems directly from the choice of potential formulation for Maxwell's frequency domain equations. The system obtained is strongly elliptic which in turn suggests various strategies for preconditioning Krylov space iterations used to solve the linear system.

A constant-coefficient, periodic, local-mode analysis common suggests that exact inversion of the diagonal blocks can be used as an effective preconditioning strategy. In particular, local-mode analysis shows that the preconditioned matrix has eigenvalues range and of the $l_2$-conditioned number. For Hermitian problems, this implies Krylov-subspace iterations will achieve rates of convergence independent of the mesh size; that is, a fixed reduction of the relative residual is attainable in a constant number of iterations regardless of the degree to which the mesh is refined. The analysis extends readily to the case where a single multigrid V-cycle is applied for preconditioning rather than invertinf the diagonal blocks exactly giving similar convergence properties. The multigrid preconditioning strategy is tested within Krylov-subspace iterations in a variety of experiments. In spite of the non-Hermitian character of the underlying linear system, the numerical experiments support theoretical predictions and demonstrate the efficacy of multigrid preconditioning for low-to-moderate frequency electromagnetic simulations.

This is work done in collaboration with U. Ascher and E. Haber.

Mark Baertschy, University of Colorado, Boulder
Solution of a three-Body problem in quantum mechanics using sparse linear algebra on parallel computers

A complete description of two outgoing electrons following an ionizing collision between a single electron and an atom
or molecule has long stood as one of the unsolved fundamental problems in quantum collision theory.

In this technical paper we describe our use of distributed memory parallel computers to calculate a fully converged wave function
describing the electron-impact ionization of hydrogen. Our approach hinges on a transformation of the Schr\"{o}dinger
equation that simplifies the boundary conditions but requires solving very ill-conditioned systems of a few million complex, sparse linear equations.

We developed a two-level iterative algorithm that requires repeated solution of sets of a few hundred thousand linear equations. These are solved directly by $LU$-factorization using a specially tuned, distributed memory parallel version of the sparse $LU$-factorization library SuperLU. In smaller cases, where direct solution is technically possible, our iterative algorithm still gives significant savings in time despite lower megaflop rates.

Zhaojun Bai, University of California
Second-order Dynamical Systems: quadratic eigenvalue problem and reduced-order modeling

Second-order models arise naturally in the study of many types of physical systems, with common examples being electrical and structural dynamics systems. In this talk, we will discuss two tightly connected problems associated with second-order systems, namely, quadratic eigenvalue problem and reduced-order modeling. We will focus on the development of numerical techniques which exploit and preserve underlying problem structures and mathematical properties for solving quadratic eigenvalue problem and extracting reduced-order model. Case studies from structural dynamics and microelectromechanical systems (MEMS) will be discussed.

Xiao-Wen Chang, McGill University and C. C. Paige
A Recursive Least Squares Approach for GPS Based Positioning

The Global Positioning System (GPS) is an all-weather, worldwide, continuous coverage, satellite based navigation system. GPS satellites transmit signals that allow one to determine, with great
accuracy, the location of GPS receivers. In GPS a typical technique for dynamic position estimation
is differential positioning where two receivers are used, one receiver is stationary and its exact position is known,
the other is roving and its position is to be estimated. In this talk we describe the physical situation and derive the
mathematical model. We then present a recursive least squares approach for position estimation based on the difference of the measurements at the stationary and roving receivers. We take full account of the structure of the problem to make our algorithm efficient and use orthogonal transformations to ensure numerical reliability
of the algorithm. Simulation results will be presented to demonstrate the performance of the algorithm. A comparison with the van Graas and Lee positioning algorithm will be given.

Christina C. Christara, University of Toronto
Fast Fourier Transform Solvers and Preconditioners

Joint work with Kit Sun Ng.

Quadratic Spline Collocation (QSC) methods of optimal order of convergence have been recently developed for the solution of elliptic Partial differential Equations (PDEs). In this paper, linear solvers based on Fast Fourier Transforms (FFT) are developed for the solution of the QSC equations. The complexity of the FFT solvers is $O(N^2 \log N)$, where $N$ is the gridsize in one dimension. These direct solvers can handle PDEs with coefficients in one variable or constant, and Dirichlet, Neumann and periodic boundary conditions, along at least one direction of a rectangular domain. General variable coefficient PDEs are handled by preconditioned iterative solvers. The preconditioner is the QSC matrix arising from a constant coefficient PDE. The convergence analysis of the preconditioner is presented. It is shown that, under certain conditions, the convergence rate is independent of the gridsize. The preconditioner is solved by FFT techniques, and integrated with one-step or acceleration methods, giving rise to asymptotically almost optimal linear solvers, with complexity $O(N^2 \log N)$. Numerical experiments verify the effectiveness of the solvers and preconditioners, even on problems more general than the analysis assumes. The development and analysis of FFT solvers and preconditioners is extended to QSC equations corresponding to systems of elliptic PDEs.

Lieven De Lathauwer, Universite de Cergy-Pontoise
Independent component analysis and multilinear algebra

This talk is intended to provide an introduction to the concept of Independent Component Analysis (ICA), also known as Blind Source Separation (BSS). The goal of ICA is the decomposition of a set of multi-sensor data in an a priori unknown linear mixture of a priori unknown source signals, relying on the assumption that the source signals are mutually statistically independent. We will follow an algebraic approach: in a natural way, ICA poses the question of
generalizations of matrix algebraic techniques to multilinear algebra, i.e., the algebra of multi-way matrices'' or higher-order tensors''. We will discuss four orthogonal tensor decompositions that can be interpreted as higher-order counterparts of the symmetric matrix Eigenvalue Decomposition (EVD). Like for instance the EVD and the Singular Value Decomposition (SVD) of matrices, these decompositions can be considered as tools, useful for a wide range of applications.

Oleg Diyankov, Russian Federal Nuclear Center - VNIITF
The ILU-type preconditioner with 2 drop tolerance parameters

Joint Work with V. Y. Pravilnikov, Russian Federal Nuclear Center - VNIITF.

This paper is devoted to description of the ILU-type preconditioner with two
drop tolerance parameters, named $FILU(\epsilon_1, \epsilon _2)$. This preconditioner $M$ gives
good approximation of the original matrix $A$: $M = L \cdot U \approx A$ and requires less
memory then a standard one - parameter preconditioner.
The procedure of creating of the preconditioner's matrix M consists of the
following steps.
At first we calculate special left and right (with respect to main diagonal)
norms for each row, and store them in 2 vectors.
Then during elimination of $i$-th row we compare adding element (new element
in the row) with corresponding (left or right) norm, multiplied by $\epsilon_2$ .
After elimination of the row is finished, all additional elements are compared
with the corresponding norm, multiplied by $\epsilon_1$, and elements with value less then
norm are dropped.
Thus we check an additional element two times. First time we check it when
it is added, and second time - after elimination of the row.
Such approach allows us to diminish a number of nonzero elements in the
preconditioner's matrix without worsening of quality of the preconditioner.
1

Peter Forsyth, University of Waterloo
Penalty Methods For American Options

The convergence of a penalty method for solving the discrete regularized American option valuation problem is studied.
Sufficient conditions are derived which both guarantee convergence of the nonlinear penalty iteration and ensure that the iterates converge monotonically to the solution. In addition, we show that the penalty iteration has finite termination, which means that the American Constraint can be enforced with very high precsion with a small number of iterations.
The efficiency and quality of solutions obtained using the implicit penalty method are compared with those produced with the commonly used technique of handling the American constraint explicitly. Convergence rates are studied as the timestep and mesh size tend to zero.

Wilfried Gansterer, University of Tennessee
Approximating Eigenpairs in Quantum Chemistry

An algorithmic framework for approximating eigenpairs of dense, but diagonally dominant symmetric matrices is presented. It is based on a new divide-and-conquer approach for computing approximate eigenvalues and eigenvectors of a block-tridiagonal matrix. Application of this method to eigenproblems arising in Materials Science and Quantum Chemistry is discussed. It is shown that for many of these problems low rank approximations of the off-diagonal blocks as well as relaxation of deflation criteria permit the computation of approximate eigenpairs with prescribed accuracy at significantly reduced computational costs compared to standard methods as, for example, implemented in LAPACK. (Joint work with Robert C. Ward, California Institute of Technology)

Chen Greif, Parametric Technology Corporation
Techniques for solving indefinite linear systems

In a large number of applications one needs to solve a linear system whose associated matrix can be introduced as a 2x2 block matrix with a zero (2,2) block. Such linear systems arise in constrained optimization, least squares problems, the discrete Stokes equation, and other areas of applications.

In this talk we discuss solution techniques for solving such systems, with focus on situations where the (1,1) block is possibly singular. In such cases straightforward Schur complement techniques cannot be applied. Two
different methods are described: a deflation approach, and an augmented Lagrangian technique. The first approach is based on identifying the nullity of the (1,1) block and eliminating a small number of unknowns so as to generate a 'reduced' indefinite system with a nonsingular (1,1) block.

In contrast, the augmented Lagrangian technique does not reduce the size of the linear system; rather, it modifies the (1,1) block so as to eliminate a possible singularity. As some of our analytical observations
and numerical examples illustrate, the numerical properties of the resulting linear system are considerably different than those of the original one.

This is joint work with Gene Golub at Stanford University.

Francois Gygi, Lawrence Livermore National Laboratory
Numerical challanges of large-scale first-principles molecular dynamics

First-principles molecular dynamics (FPMD) is emerging as a very powerful atomistic simulation approach, which combines a classical description of nuclei with a quantum mechanical description of electrons. Recent advances in electronic structure methods, notably Density Functional Theory (DFT), have made FPMD a truly predictive approach, which provides information on the structural, dynamical and electronic properties of a physical system. FPMD has been succesfully applied to several areas of research in materials science, chemistry and biochemistry. The computational cost of FPMD simulations is high, due to the detailed description of electronic structure that is required. Straightforward implementations of FPMD incur a computational cost of O(N3) for N atoms. More recent approaches have been proposed to reduce this cost to O(N). We will present recent progress in the development of O(N) algorithms based on real-space, finite-difference methods, as well as the challenges that arise when implementing conventional O(N3) FPMD on large, massively parallel computers.

Eldad Haber, University of British Columbia
Techniques for Forward and Inverse Electromagnetic Problems in 3D

Inverse problems involving recovery of distributed parameter functions arise in many applications. Many practical instances require electromagnetic data inversion, where the forward problem comprises Maxwell's equations for moderate frequencies. Realistic instances of such problems in 3D can be very computationally intensive and require care in the selection and development of appropriate algorithms.

In this talk I will describe work we have been doing in the context of a project involving geophysical mining applications with the objective of making such computations practically feasible.

For Maxwell's equations we use a Helmholtz decomposition reformulated with an additional differentiation. Following a finite volume discretization suitable for discontinuous coefficients we apply multigrid or ILU block preconditioning to obtain rapid convergence of a Krylov space method.

For the inverse problem a nonlinear constrained optimization formulation is obtained. The necessary conditions for the inverse optimization problem yield a large, nonlinear, tightly coupled set of PDEs. We devise preconditioned Krylov solvers for the rapid solution of such systems.

Marko Huhtanen, Massachusetts Institute of Technology
On generating discrete orthogonal bivariate polynomials

In this talk we present an algorithm for recursively generating orthogonal bivariate polynomials
on a discrete set of the plane. For this purpose we employ commuting pairs of real symmetric matrices
to obtain, in a certain sense, a two dimensional Hermitian Lanczos method. The resulting algorithm relies on a recurrence having a slowly growing length. Practical implementation issues and applications are considered. The method can be generalized to compute orthogonal polynomials depending on an arbitrary number of variables.

Richard B. Lehoucq, Sandia National Labs
An Automated Multilevel Substructuring Method for Eigenspace Computation in Linear Elastodynamics

Dynamic analysis of complex two- and three-dimensional structures frequently involves finite element discretizations with over a million unknowns. The discretization is typically used for frequency response at many frequencies so that a dimensional reduction step is advantageous or even necessary. The standard reduction approach is modal truncation, which requires a costly partial eigensolution but reduces the number of unknowns by orders of magnitude.

While modal truncation is justified in the continuous setting-higher eigenfunctions have much lower participation in the response than lower ones-there is another important reason. The modes that are well approximated by the finite element discretization are the only modes that should be retained. Hence the cost of the frequency response is reduced greatly without a significant loss of accuracy.

The cost of the partial eigensolution required for modal truncation increases dramatically as the frequency range for the analysis increases because the number of eigenpairs needed can easily reach into the thousands. High modal density (close spacing of eigenvalues) also contributes to the cost. An alternative to this approach is the Automated Multi-Level Substructuring (AMLS) method in which the structure is recursively divided into thousands of subdomains. Eigenvectors associated with these subdomains are used to represent the structure's response rather than the traditional global
eigenvectors. Reduction of the finite element discretization is based on many small, local, inexpensive eigenvalue problems.

This presentation examines the mathematical basis for AMLS in the continuous variational setting. Differential eigenvalue problems are defined on subdomains and on interfaces between subdomains. For the interface eigenvalue problems, an operator is defined that acts on interface trace functions and consistently represents mass associated
with extensions of those trace functions. With this new operator, all of the differential eigenvalue problems are projections of the global eigenvalue problem onto a hierarchy of mutually orthogonal subspaces.

The objective of our dimensional reduction is to retain only as many eigenfunctions as are needed so that the eigenspace truncation error is consistent with the finite element discretization error. AMLS is an alternative to conventional approaches for solving large systems of equations arising from a finite element discretization: instead of
approximating the solution of a system of equations, identify the subspace that contains good approximate solutions of the partial differential equation and project the problem onto that subspace. This approach is advantageous particularly when many solutions are needed from one discretization.

(Joint work with Jeff Bennighof, University of Texas, Aerospace and Engineering Mechanics)

Gregory Lewis, The Fields Institute
The numerical approximation of eigenvalues in the differentially heated rotating fluid annulus

Many laboratory experiments have been performed with the differentially heated rotating fluid annulus, where various flow patterns are observed as the rotation and temperature difference between the annulus walls are varied. A rich variety of dynamical behaviour has been observed in the experiments that has not been theoretically explained. Bifurcation analysis is a promising means of gaining insight into these unexplained dynamics. For example, an analysis of the double Hopf bifurcations that occur in a mathematical model of the annulus along the transition from steady flow to wave motion indicates the mechanism by which the experimentally observed hysteresis of the wave motion occurs.

To perform such a bifurcation analysis, the boundaries of the region of linear stability of a steady solution of the dynamical equations must be found. This involves the computation of certain eigenvalues at many locations in the space of parameters. For the present model of the differentially heated rotating annulus, the eigenvalues (and eigenfunctions) satisfy a partial differential generalized eigenvalue problem (in two spatial dimensions) that cannot be solved analytically. Discretization leads to a generalized matrix eigenvalue problem, with large sparse matrices, from which approximations of the eigenvalues of the continuous problem may be found.

Because the approximation of the eigenvalues is the most computationally intensive step in the bifurcation analysis, much can be gained by increasing the efficiency of these approximations. I will discuss the eigenvalue approximation in the context of the differentially heated rotating annulus and the possibility of using the nature of the problem to make improvements. Such improvements should be directly applicable to many other systems.

Xiezhang Li, Georgia Southern University
The Representation and Iterative Methods for Drazin Inverse with Complex Spectrum

A general iterative method for the Drazin inverse of a squre matrix with complex spectrum and its error bound are derived. Euler-Knopp and Newton-Raphson methods for the real spectrum case are generalized. Semiiterative methods for the Drazin inverse is also developed. Numerical examples are given to illustrate the theoretic results.

Nicola Mastronardi, Katholieke Universiteit Leuven
Fast algorithms for Structured Total Least Squares in Speech Compression

We present a fast implementation of a recently proposed speech compression scheme, based on an all-pole model of the vocal tract. Each frame of the speech signal is analyzed by storing the parameters of the complex damped exponentials deduced from the all-pole model and its initial conditions. In mathematical terms, the analysis stage corresponds to solving a structured total least squares (STLS) problem. It is shown that by exploiting the displacement rank structure of the involved matrices the STLS problem can be solved in a very fast way. Synthesis is computationally very cheap since it consists of adding the complex damped exponentials based on the transmitted parameters. The compression scheme is applied on a speech signal. The speed improvement of the fast vocoder analysis scheme is demonstrated. Furthermore, the quality of the compression scheme is comparable with that of a standard coding algorithm at comparable compression ratios, by using the segmental Signal-to-Noise Ratio (SNR).

James G. Nagy, Emory University
Preconditioned Iterative Methods for Image Restoration

Image restoration is an example of an "ill-posed problem", which can be defined as a problem that does not have a unique solution, or the solution is not a continuous function of the data. Such problems are extremely sensitive to perturbations (noise) in the data; that is, small perturbation of the data can lead to arbitrarily large perturbations in
the solution. Many algorithms have been developed to compute approximate solutions of ill-posed problems, but they may differ in a variety of ways. For example, there are several different regularization methods one could use, and for each of these, various different methods for choosing a regularization parameter. For structured, large-scale problems, an
attractive approach is to use certain iterative methods, such as the conjugate gradient algorithm, where regularization is enforced through early termination of the iterations. In some cases, preconditioning can be used to accelerate convergence. However, if not done carefully, preconditioning can lead to erratic convergence behavior and result in
fast convergence to a poor approximate solution. In addition to discussing aspects of preconditioning conjugate gradients for ill-posed problems, an alternative scheme based on steepest descent is considered. The steepest descent approach has the additional advantage of enforcing nonnegativity, which may be important in some image restoration
applications.

John Reid, Rutherford Appleton Laboratory
Frontal and multifrontal codes in HSL 2000 for direct solution of large sparse sets of linear equations

The frontal method was first conceived (Irons, 1970) for symmetric positive-definite finite-element problems, but it has since been extended to handle unsymmetric matrices and matrices that do not originate from finite-element problems. It has also been extended to the multifrontal method that is better able to preserve the sparsity and is used now, for example, by Nastran.

The first part of this talk will explain the methods, assuming only that the audience is familiar with Gaussian elimination (all these methods are variants of Gaussian elimination).

The second part of the talk will review the frontal and multifrontal work of the Numerical Analysis Group at the Rutherford Appleton Laboratory. This has led to many Fortran codes that are available in HSL (formerly Harwell Subroutine Library) and to a Fortran/MPI code that was one of the deliverables of the ESPRIT IV project PARASOL.

Jianbo Shi, Robotics Institute, CMU
Image Segmentation and Grouping with Normalize Cut

This is a joint work with J. Malik, S. Yu, M. Maila.

I will describe an approach for solving image segmentation and grouping problem with Normalized Cuts, which exploits the connections between graph partitioning, eigen-value system, and Markov Random Walk.

In our approach, image segmentation is formulated as a graph partition problem where the pixels are the graph nodes, and pair-wise similarity measures between pixels are the graph edge weights. A global graph partitioning criteria, called normalized cut, is proposed for finding minimally connected subgroups. This criterion also ensures that within group similarity is maximized. While the Normalized Cut is NP-hard, we show that in the continuous value space, it can be solved exactly by computing a generalized Eigen-value problem. Results of this algorithm on a large number of real images with simple brightness, texture and motion cues have been very encouraging.

In this talk, I will highlight two current research areas where we are building on this equivalence between the normalized graph cuts and generalized eigen-system for solving higher order vision problems. In the first area, we show how more complex pair-wise graph relationships such as directed relationships, which arise from object occlusion for example, can be encoded in this system. We show the solution can be found by solving a eigen-value problem in the complex value domain. In the second area, we have developed a probabilistic interpretation of the Normalized Cuts criterion in terms of Markov Random Walk. We show how this interpretation can lead to an algorithm for supervised learning based image segmentation.

Author: Gene Golub, Rasmus Larsen, Justin Wan (Speaker)
Solving Complex Symmetric Linear Systems with Multiple Right-Hand Sides

We consider the simultaneous solution of large linear systems of the form AX=B, where A is complex symmetric and B is a set of RHS vectors. We present a new CG-type iterative method, CCG, which is based on a
tridiagonalization of A by a unitary matrix. Moreover, we combine CCG and its minimum residual variant CSYM with the single-seed technique to solve linear systems with multiple right-hand sides; we solve the seed system using CCG or CSYM and approximate solutions to the non-seed systems are obtained by Galerkin projection. We demonstrate the effectiveness of the proposed method by solving an electromagnetic application.

Jacob White, Massachusetts Institute of Technology
Model Order Reduction Applied ot the IC Design Problem

Designing an integrated circuit is complicated because each design involves specifying millions of transistors and wires. The way IC designers make such complicated system designs more tractable is to use hierarchy, in which wires and transistors are grouped together into functional cells, and those cells are then grouped into functional blocks. The numerical problem of extracting functional cell descriptions from transistors and wires, or extracting block descriptions from cells, is often referred to as model-order reduction because the problem can usually be recast as one of generating
low-order dynamical systems which preserve the input-output behavior of much higher-order dynamical systems. In this talk we will survey the recent developments in numerical model-order reduction for linear problems, by trying to connect Krylov-subspace based methods, Pade approximates, and truncated balanced realizations. We will then describe some of the challenges in extending these approaches to nonlinear problems.

Petter Wiberg, University of Toronto
Hedging with Value-at-risk

The \emph{value-at-risk} is the maximum loss that a portfolio might suffer over a given holding period with a certain confidence level. In recent years, value-at-risk has become a benchmark for measuring financial risk used by both practitioners and regulators. We present an efficient algorithm for computing value-at-risk and the value-at-risk gradient for portfolios of derivative securities. In particular, we discuss dimensional reduction of the model and applications to hedging of derivatives portfolios.

Chao Yang, Lawrence Berkeley National Laboratory
Computational Challenges in Cryo-Electron Microscopy Image Reconstruction

The problem of computing 3-D density maps of macromolecules from a set of 2-D projection images taken by an electron microscope is becoming increasingly important in biological research. The problem is challenging in a number of ways:
1) Limited by radiation damage, the 2-D images are obtained by taking a single shot of many identical but randomly oriented molecules embedded in a thin layer of vitreous ice (a process known as cryo-electron microscopy). As a result, the relative orientations of the 2-D images are unknown, and must be determined as part of the solution to the 3-D image reconstruction problem.

2) To avoid radiation damage, the electron dose of the microscope has to be kept low. This limited exposure of molecule to electron beams results in a typical low signal-to-noise ratio in the 2-D projection images. These 2-D images must also be corrected by a contrast transfer function to account for defocus and aberration artifacts introduced by the electron
microscope.

3) To obtain a high resolution reconstruction, a large number of 2-D projection images must be collected to ensure a uniform and detailed coverage of projection directions. In this talk, we will give an overview on the mathematical
formulation of the image reconstruction problem, and provide a survey on numerical techniques currently employed
at various stages of the reconstruction process.