
Distinguished Lecture Series in Statistical Science
Jianqing Fan

Frederick L. Moore Professor of Finance
Director of Committee of Statistical Studies
Department of Operations Research and Financial Engineering, Princeton
University
May 3, 2010  3:30 p.m.
Vast-dimensionality and sparsity
May 4, 2010  3:30 p.m.
ISIS: A vehicle for the universe of sparsity
Room 230, Fields Institute


May 3, 2010
Vast-dimensionality and sparsity
Technological innovations have revolutionized the process
of scientific research and knowledge discovery. The
availability of massive data and challenges from the frontiers
of research and development have reshaped statistical
thinking, data analysis and theoretical studies. The
challenges of dimensionality arise from diverse fields
of science and the humanities, ranging from computational
biology and health studies to economics and finance.
A comprehensive overview will be given of the statistical
challenges posed by vast dimensionality. The impact of dimensionality
and spurious correlation will be addressed. What makes
high-dimensional problems feasible is the notion
of sparsity.
While the dimensionality can be much higher than the
sample size, the intrinsic dimensionality is much smaller.
A unified framework exploiting sparsity will be outlined.
Other related problems with vast dimensionality will
also be discussed. The effectiveness of the method will
be illustrated by forecasting home price indexes at the
zip-code level.
May 4, 2010
ISIS: A vehicle for the universe of sparsity
Vast dimensionality characterizes many contemporary statistical
problems, from genomics and genetics to finance and economics.
The challenges are tackled by exploiting the sparsity
of the problems. We outline a unified framework for ultrahigh-dimensional
variable selection problems: iterative applications
of vast-scale screening followed by moderate-scale variable
selection, resulting in a viable procedure called ISIS.
The framework is widely applicable to many statistical
contexts, from multiple regression, generalized linear
models and survival analysis to machine learning and compressed
sensing.
The fundamental building blocks are marginal variable
screening and penalized likelihood methods. How high a dimensionality
can such methods handle? How large can the numbers of false positives
and false negatives be with marginal screening methods? What
is the role of penalty functions? This talk will provide
some fundamental insights into these problems. The focus
will be on the sure screening property, false selection
size, model selection consistency and oracle properties.
The advantages of using folded-concave over convex penalties
will be clearly demonstrated. The methods will be
illustrated by carefully designed simulation studies and
by empirical studies on disease classification and survival
analysis using microarray and eQTL data.
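
For readers unfamiliar with marginal variable screening, the following
is a minimal sketch of a single screening pass: each candidate variable
is ranked by the absolute value of its sample correlation with the
response, and only the top few are kept. It is an illustration under
stated assumptions, not the ISIS procedure presented in the lecture. The
function names, the simulated data and the `keep` cutoff are all
hypothetical, and the iteration and penalized likelihood steps are
omitted.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant sequence carries no marginal signal
    return sxy / (sxx * syy) ** 0.5


def marginal_screen(columns, y, keep):
    """Return the sorted indices of the `keep` columns whose absolute
    marginal correlation with y is largest."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Simulated example (assumed setup): p = 500 candidate variables,
# n = 100 observations, and only three variables actually matter.
rng = random.Random(0)
n, p = 100, 500
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
true_idx = [3, 42, 170]
y = [sum(5.0 * columns[j][i] for j in true_idx) + rng.gauss(0, 1)
     for i in range(n)]

selected = marginal_screen(columns, y, keep=20)
print("kept variables:", selected)
print("true variables kept:", [j for j in true_idx if j in selected])
```

In this sketch the screening step reduces 500 candidates to 20, a size
at which a moderate-scale selection method (for example, a penalized
regression) could then be applied to the retained variables.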

The Distinguished Lecture Series in Statistical Science
was established in 2000 and takes place annually.
It consists of two lectures by a prominent statistical
scientist. The first lecture is intended for a broad mathematical
sciences audience. The series occasionally takes place
at a member university and is tied to any current thematic
program related to statistical science; in the absence
of such a program the speaker is chosen independently
of current activity at the Institute. A nominating committee
of representatives from the member universities solicits
nominations from the Canadian statistical community and
makes a recommendation to the Fields Scientific Advisory
Panel, which is responsible for the selection of speakers.

