**Program and Abstracts**

**Tuesday, February 1, 2000**

**11:00-11:30 - ***Opening address*

Claudine Simson, V.P., Disruptive Technology, Network and Business Solutions,
Nortel Networks

**11:30-12:30 - ***Modern data analysis and its application to Nortel Networks data*

Otakar Fojt, The University of York

In this talk we outline an approach to the analysis
of sequential manufacturing and telecom traffic data from industry using
techniques from nonlinear dynamics. The aim of the talk is to show the
potential of nonlinear techniques for processing real world data and developing
new advanced methods of commercial data analysis.

The basic idea is to consider a factory as a dynamical
system. A process in the factory generates data, which contains information
about the state of the system. If it is possible to analyse this data
in such a way that knowledge of the system is increased, control and decision-making
processes can be improved. If applied, this gives the factory a basis for
competitive advantage.

First, we give details of the general idea and the type
of recorded data together with the necessary preprocessing techniques.
We follow this with a description of our analysis. Our approach consists
of state space reconstruction, applications of principal component analysis
and nonlinear deterministic prediction algorithms. The talk will conclude
with our results and with suggestions for future work.
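
As a rough illustration of two of the ingredients named above, state space reconstruction and nonlinear deterministic prediction, the sketch below builds delay vectors from a scalar series and predicts the next value from the successors of the nearest past states. It is a minimal sketch under stated assumptions, not the speakers' method: the function names `delay_embed` and `predict_next`, the embedding dimension, the lag and the neighbour count are all hypothetical choices made for illustration, and the principal component step is omitted.

```python
import math


def delay_embed(series, dim, lag=1):
    """Return delay vectors (s[t], s[t+lag], ..., s[t+(dim-1)*lag])."""
    span = (dim - 1) * lag
    return [
        tuple(series[t + k * lag] for k in range(dim))
        for t in range(len(series) - span)
    ]


def predict_next(series, dim=3, lag=1, k=1):
    """Predict the next value as the mean successor of the k nearest past states.

    Assumption: a plain nearest-neighbour (local constant) predictor stands in
    for whatever prediction algorithm the talk actually uses.
    """
    vectors = delay_embed(series, dim, lag)
    if len(vectors) < 2:
        raise ValueError("series too short for this embedding")
    span = (dim - 1) * lag
    query = vectors[-1]  # the current state
    candidates = []
    for t, vec in enumerate(vectors[:-1]):
        successor = series[t + span + 1]  # value that followed this past state
        candidates.append((math.dist(vec, query), successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(s for _, s in nearest) / len(nearest)


if __name__ == "__main__":
    # Logistic map in its chaotic regime as a stand-in for recorded data.
    x, series = 0.3, []
    for _ in range(500):
        series.append(x)
        x = 4.0 * x * (1.0 - x)
    print("predicted:", predict_next(series, dim=2, k=3), "actual:", x)
```

On a strictly periodic series the predictor reproduces the next value exactly, which makes a convenient sanity check before trying it on noisy data.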

**1:30-2:00 - ***The need for
real-time data analysis in telecommunications*

Chris Hobbs, Sr. Mgr., System Architecture, Nortel Networks

A telecommunications network typically comprises many
independently-controlled layers: from the physical fibre interconnectivity,
through wavelengths, STS connexions, ATM Virtual Channels, MPLS Paths
to the end-to-end connexions established for user services. Each of these
layers generates statistics that, in a large network, may easily be measured
in tens of gigabytes per hour.

Traditionally, the layers have been controlled individually
since the complexity of "tuning" a lower layer to the traffic it is carrying
has been too great for human operators (particularly where the carried
traffic itself has complex statistics) and since the work involved in
moving connexions (particularly fibres and wavelengths) has been prohibitive.

Technological advances in optical switches, capable
of logically relaying fibre or wavelengths in microseconds, have made
flexible network rebalancing possible, and carriers, the owners of these
large networks, are demanding lower costs by combining layers and exploiting
this new agility. To address this problem, the terabytes of data
being extracted daily from the large networks need to be analysed: initially
statically to determine the gross inter-related behaviours, and then dynamically
to detect and react to changing traffic patterns.

**2:30-3:30 - ***Noise reduction
for human speech using chaos-like features*

Holger Kantz, Max-Planck-Institut für Physik komplexer Systeme

A local projective noise reduction scheme, originally
developed for low-dimensional stationary signals, is successfully applied
to human speech. This is possible by exploiting properties of the speech
signal which mimic structure exhibited by deterministic chaotic systems.
In high-dimensional embedding spaces, the strong non-stationarity is resolved
as a sequence of different dynamical regimes of moderate complexity. This
filtering technique does not make use of the spectral contents of the
signal and is far superior to the Ephraim-Malah adaptive filter.

**4:00-5:00 - ***Scaling phenomena
in telecommunications*

Murad Taqqu, Boston University (Lecture co-sponsored by Dept. of Statistics,
University of Toronto)

Ethernet local area network traffic appears to be approximately
statistically self-similar. This discovery, made about eight years ago,
has had a profound impact on the field. I will try to explain what statistical
self-similarity means and how it is detected. I will also indicate how
its presence can be explained physically, by aggregating a large number
of "on-off" renewal processes, whose distributions are heavy-tailed. As
the number of aggregated processes becomes large, the rescaled aggregate
converges to the Gaussian self-similar process called fractional
Brownian motion. If, however, the rewards, instead of being 0 and 1, are
heavy-tailed as well, then the limit is a stable non-Gaussian process
with infinite variance and dependent increments. Since linear fractional
stable motion is the stable counterpart of the Gaussian fractional Brownian
motion, a natural conjecture is that the limit process is linear fractional
stable motion. This conjecture, it turns out, is false. The limit is a
new type of infinite variance self-similar process.
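
For reference, the notion of statistical self-similarity mentioned above is usually stated as follows. This is the standard textbook definition, not material taken from the talk, and the symbols $X$, $H$, $c$, $B_H$, $\sigma$ and $\alpha$ are introduced here for illustration. A process $X$ is self-similar with index $H$ if rescaling time by any $c > 0$ is equivalent in distribution to rescaling amplitude by $c^H$. Fractional Brownian motion $B_H$ is the Gaussian process with stationary increments that has this property. For aggregated on-off sources with tail index $1 < \alpha < 2$, the commonly cited relation between the tail index and the self-similarity index is given in the last line.

```latex
\{X(ct)\}_{t \ge 0} \overset{d}{=} \{c^{H} X(t)\}_{t \ge 0},
\qquad c > 0,\; 0 < H < 1

\operatorname{Cov}\bigl(B_H(s), B_H(t)\bigr)
  = \frac{\sigma^{2}}{2}\left(|s|^{2H} + |t|^{2H} - |t - s|^{2H}\right)

H = \frac{3 - \alpha}{2}, \qquad 1 < \alpha < 2
```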


**Wednesday, February 2, 2000**

**9:30-10:30 - ***Electrical/Biological
networks of nonlinear neurons*

Henry Abarbanel, Institute for Nonlinear Science at UCSD, San Diego

Using analysis tools for time series from nonlinear
sources, we have been able to characterize the chaotic oscillations of
individual neurons in a small biological network that controls simple
behavior in an invertebrate. Using these characteristics, we have built
computer simulations and simple analog electronic circuits, which reproduce
the biological oscillations. We have performed experiments in which biological
neurons are replaced by the electronic neurons retaining the functional
behavior of the biological circuits. We will describe the nonlinear analysis
tools (widely applicable), the electronic neurons, and the experiments
on neural transplants.

**11:00-11:30 - ***E-commerce
and data mining challenges*

Weidong Kou, IBM Centre for Advanced Studies

E-commerce over the Internet is having a profound impact
on the global economy. Goldman, Sachs & Co. estimates B2B e-commerce
revenue alone will grow to $1.5 trillion (US) over the next five years.
Electronic commerce is becoming a major channel for conducting business,
with an increasing number of organizations developing, deploying and installing
e-commerce products, applications and solutions.

With rapid e-commerce growth, there are many challenges:
for example, how to analyze e-commerce data and provide an organization
with meaningful information to improve its product and service offerings
to target customers, and how to group the millions of web users who access a
web site so that the organization can serve each group of users better,
reduce business costs and increase revenue. These challenges
bring many opportunities for data mining researchers to develop
better intelligent algorithms and systems to solve practical e-commerce
problems. In this talk, we will use IBM Net.Commerce as an example to explain
the e-commerce development and challenges that we face today.

**11:30-12:00 - ***Occurrence
of ill-defined probability distribution in real-world data*

John Hudson, Advisor, Radio Technology, Nortel Networks

In many communications problems the statistics of the
data, communication channels, and behaviour of users are ill defined and
not handled well by the simpler concepts in classical probability theory.
We can have data with alpha-stable (infinite variance) characteristics,
long-tailed and large-variance log-normal distributions, self-similarity
in the time domain, and so on. If the higher moments of the underlying
distributions do not exist or have disproportionate values then laws of
large numbers and the central limit theorem may not be safely applied
to a surprising number of problems. The behaviour of some control mechanisms
can begin to take on a chaotic appearance when driven by such data.
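
The point about missing higher moments can be seen numerically. The sketch below draws Pareto samples with Python's standard `random.paretovariate` and compares how the sample variance behaves as the sample grows for a tail index below 2 (infinite variance) and for a much lighter tail. It is an illustrative sketch only: the helper names, the seed, the tail indices 1.5 and 5.0 and the sample sizes are assumptions, not values from the talk.

```python
import random


def pareto_sample(n, alpha, seed=0):
    """Draw n Pareto(alpha) values (all >= 1) from a seeded generator."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]


def running_mean(values):
    """Return the sequence of partial means of `values`."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    for alpha in (1.5, 5.0):  # 1.5: infinite variance, 5.0: finite variance
        for n in (1_000, 10_000, 100_000):
            data = pareto_sample(n, alpha)
            print(f"alpha={alpha} n={n} "
                  f"mean={running_mean(data)[-1]:.3f} "
                  f"variance={sample_variance(data):.3f}")
```

With the heavier tail the variance estimate typically keeps jumping as single extreme observations arrive, whereas it settles for the lighter tail. That instability is what undermines arguments based on the central limit theorem.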

In this talk, some of the properties of data, channels
and systems that are confronting workers in the communication field are
discussed. The discussion is illustrated with examples taken from network data traffic,
Internet browsing, radio propagation, video images, speech statistics
and so on.

**1:30-2:30 - ***The analysis
of experimental time series*

Tom Mullin, The University of Manchester

We will discuss the application of modern dynamical
systems time series analysis methods to data from experimental systems.
These will include vibrating beams, nonlinear oscillators and physiological
measures. The emphasis will be placed on obtaining quantitative estimates
of the essential dynamics. We will also describe the application of data
synergy methods to multivariate data.

**2:30-3:00 - ***Fuzzy-pharmacology:
Rationale and applications*

Beth Sproule, Faculty of Pharmacy and Department of Psychiatry Psychopharmacology,
SunnyBrook Health Sciences Centre, Toronto

Pharmacological investigations are undertaken in order
to optimize the use of medications. The complexity and variability associated
with biological data have prompted our explorations into the use of fuzzy
logic for modeling pharmacological systems. Fuzzy logic approaches have
been used in other areas of medicine (e.g., imaging technologies, control
of biomedical devices, decision support systems); however, their use
in pharmacology is incipient. The results of our preliminary studies
will be presented, in which we assessed the feasibility of fuzzy logic:
a) to predict serum lithium concentrations in elderly patients; and b)
to predict the response of alcohol dependent patients to citalopram in
attempting to reduce their drinking. Since then, many further projects
have evolved. Approaches to this line of investigation will be presented.

**3:30-4:30 - ***Geospatial backbones
of environmental monitoring programs: the challenges of timely data acquisition,
processing and visualization*

Chad P. Gubala, Director, The Scientific Assessment Technologies Laboratory,
University of Toronto

When considering ‘environmental’ issues or legalities,
a general and useful description of a pollutant is an element or entity
in the wrong place at the wrong time and perhaps in the wrong amount.
Prior to the establishment of cost-effective global positioning, monitoring
the fate and transport of environmental pollutants was limited to reduced
scale and statistically based sampling programs. Whole systems models
developed from parcels of environmental studies have been limited in predictive
capability due to unnoticed attributes, undocumented synergies or antagonisms
and un-quantifiable spatial and temporal variances.

Advances in the areas of commercial geospatial technologies
and high-speed sensor arrays now offer the possibility of assessing
a whole ecosystem in near real time and in a spatially complete manner.
This capacity should then greatly improve quantitative environmental modeling
and the adaptive management process, further ‘tuning’ the balance between
global environments and economies. However, the promise of increased knowledge
about our natural resources is now limited by our capacity to move the
data collected from integrated geopositioning and sensor systems into
meaningful management products. This talk describes these limitations
and addresses the needs for developments in the areas of real time analytical
protocols.

**4:30-5:00 - ***Data mining and
its challenges in the banking industry*

Chen Wei Xu, Manager, Statistical Modeling, Customer Knowledge Management,
Bank of Montreal

**Thursday, February 3, 2000**

**9:30-10:30 - ***Elements of
fuzzy system modeling*

I.B. Turksen, University of Toronto

In most system modeling methodologies, we attempt
to find out, in an inductive manner, how a particular system behaves.
That is, we essentially try to determine how the input factors affect
the performance measure of our concern. There are at least three approaches
to system modeling: (1) personal experience, (2) expert interviews and
teachings, and (3) data mining with historical data.

In all these approaches, there are two fundamental
theoretical base structures for system modeling: (1) classical two-valued
set and logic theory based functional analyses and/or (2) novel
(35 years old) infinite (fuzzy)-valued set and logic based super functional
analyses. Furthermore, there are two basic learning methods in these two
approaches: (1) unsupervised learning and (2) supervised learning. The
basic difference between these two methods of learning is that the first
has no goal whereas the second has a goal. Generally, the goal of supervised
learning is to ensure that the error between the model result and the actual
value is minimized.

In classical two-valued set and logic based functional
analyses, the world and its systems are seen through the two-valued,
black and white, restricted view of what are called clear patterns.
Unfortunately, first, the two-valued dichotomy forces one to make arbitrary
choices when there are many alternatives to choose from. Secondly, the functional
view can, by its very definition, represent only many-to-one mappings.
Thirdly, the combinations of variables are assumed to be additive and
multiplicative, leading to a linear superposition schema in the functional
representation of systems. In this view, logical “OR ness” is simply
mapped to “algebraic plus” and “AND ness” to “algebraic multiplication”.
Fourthly, imprecision in data is generally assumed to originate from
random occurrences.

By contrast, in fuzzy (infinite)-valued set and logic
based super functional analyses, the world and its systems are seen
through information granules which admit an unrestricted view of fuzzy
patterns. Fortunately, first, we are not forced to make arbitrary choices
but have the freedom to choose the gradation that is appropriate for
a given situation. Secondly, the super functional view allows us to make
many-to-many mappings. That is, membership functions are identified to
specify patterns via fuzzy cluster analyses, and we can then establish
cluster-to-cluster mappings over these functions that give us super
functional representations. Thirdly, the combinations of variables are
generally super additive or sub additive, requiring highly nonlinear
representations. In fuzzy theory there are infinitely many ways to represent
“AND ness” (conjunction) and “OR ness” (disjunction) depending on context
and the behavior of a given system. Fourthly, imprecision in data is
generally deterministic, owing to the limitations of our measurement devices.
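
To make the remark about the many representations of “AND ness” and “OR ness” concrete, the sketch below lists three widely used conjunction operators (t-norms) with their dual disjunctions (t-conorms), plus a simple blend of the two. These are standard textbook operators chosen for illustration, not necessarily the ones used in the talk. The function names and the blending parameter `gamma` are assumptions.

```python
def and_min(a, b):
    """Minimum t-norm, the largest possible fuzzy conjunction."""
    return min(a, b)


def and_product(a, b):
    """Algebraic product t-norm."""
    return a * b


def and_lukasiewicz(a, b):
    """Lukasiewicz (bounded difference) t-norm."""
    return max(0.0, a + b - 1.0)


def or_max(a, b):
    """Maximum t-conorm, dual of the minimum t-norm."""
    return max(a, b)


def or_probabilistic(a, b):
    """Probabilistic sum, dual of the algebraic product."""
    return a + b - a * b


def or_lukasiewicz(a, b):
    """Bounded sum, dual of the Lukasiewicz t-norm."""
    return min(1.0, a + b)


def compensatory(a, b, gamma, t_norm=and_min, t_conorm=or_max):
    """Blend of AND and OR: gamma = 0 gives pure AND, gamma = 1 pure OR.

    Assumption: a convex combination is only one simple way to compromise
    between the two extremes.
    """
    return (1.0 - gamma) * t_norm(a, b) + gamma * t_conorm(a, b)


if __name__ == "__main__":
    a, b = 0.3, 0.8  # two membership degrees
    for name, op in [("min", and_min), ("product", and_product),
                     ("lukasiewicz", and_lukasiewicz)]:
        print(f"AND ({name}):", op(a, b))
    for name, op in [("max", or_max), ("prob. sum", or_probabilistic),
                     ("bounded sum", or_lukasiewicz)]:
        print(f"OR ({name}):", op(a, b))
```

All of these agree with classical logic when the memberships are exactly 0 or 1. They differ only in between, which is where the choice of operator becomes a modeling decision.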

In our integrated fuzzy system modeling approach,
we first use fuzzy clustering techniques to learn patterns with fuzzy
scatter matrices and diagrams to determine the essential fuzzy clusters,
i.e., the effective rules of system behavior. This is an unsupervised
learning method. Next we fit membership functions to these clusters.
We also determine significant and critical variables, those that affect
the system behavior drastically, moderately, and so on.

Later, we apply supervised learning to determine the
nonlinear operators that combine the fuzzy clusters in many to many
maps of input and output variables in order to achieve minimum system
model error. In this supervised learning we also implement compensation
and compromise between the extreme values of formulas that specify combination
of concepts and hence the appropriate combination of variables as well
as alternate inference schemas.

Real-life system model building examples include:
(1) a continuous caster model that attempts to balance tardiness of
customer delivery due dates versus mixed grade steel production and
(2) pharmacological models that attempt to determine the effects of
medication on humans. Simulated system model building examples include:
(1) utilization of Internet data links, (2) analyses of traffic characteristics,
and (3) discard rate prediction.

**11:00-12:00 - ***A Steel Industry
Viewpoint on Fuzzy Technology - Scheduling Analysis Application*

Michael Dudzic, Manager, Process Automation Technology, Dofasco Inc.

This presentation will discuss experiences in
the use of fuzzy expert system technologies as applied in a proof-of-concept
project looking at two specific issues in scheduling the #1 Continuous
Caster at Dofasco. This talk complements I. B. Turksen’s talk on Elements
of Fuzzy System Modeling.

&

**The Application of Multivariate
Statistical Technologies at Dofasco**

This presentation will discuss experiences in
the use of multivariate statistics (Principal Component Analysis and
Partial Least Squares) in applications at Dofasco. The focus example
will be the on-line monitoring system at the #1 Continuous Caster.

**1:00-2:00 - ***Recent developments
in decision tree models*

Hugh Chipman, University of Waterloo

Decision trees are an appealing predictive model because
of their interpretability and flexibility. In this talk, I will outline
some recent developments in decision tree modeling, including improvements
in model search techniques, and enrichments to the tree model, such
as linear models within terminal nodes.

**2:30-3:30 - ***A hybrid predictive
model for database marketing*

Zhen Mei, Generation 5

We discuss a simple hybrid approach for predicting
response rate in mailing campaigns and for predicting certain demographic
and expenditure characteristics in a customer database. This method is
based on cluster analysis and predictive modeling. As an example we
model home ownership for the State of New York.

&

*Missing value filling*

Wenxue Huang, Generation 5

The talk is about the missing value filling methodology
and software being developed by Generation 5, and focuses on
the mathematics for interval-scaled target data. A local-and-global
(or vertical-and-horizontal) balanced approach in a multivariate,
large-database setting will be discussed. The methodology and software
may also be applied to prediction: filling in missing values is equivalent
to predicting instant target values based on reliable complete historical
records and current incomplete input.

**4:00-5:00 - ***Challenges
in the development of segmentation solutions in the banking industry
and a genetic algorithms approach*

Chris Ralph, Senior Manager, Market Segmentation, Bank of Montreal

The Bank of Montreal team is in the process of building
market segmentation solutions for a few different lines of business
using syndicated survey data. The dataset consists of 4,200 responses
from households across Canada (geographically unbiased sample), and
contains detailed information on their financial holdings across all
institutions, as well as channel usage, banking habits, and household
profile information. The process we typically follow in the development
of a segmentation solution consists of the following steps:

1) Standard preprocessing (treating outliers, missing values, standardization)
--> 3-5 days

2) Data reduction via factor analysis, PCA, or simple cross-correlations
to help avoid redundancy in the cluster runs --> 2-3 days

3) Brainstorming sessions with the lines of business to help us understand
key business issues, and generate a list of potential driver variables
--> 1-2 weeks

4) Alternative cluster runs using the brainstorming suggestions and
data reduction output to generate potential solutions through trial
and error --> 2-4 weeks

The evaluation of solutions in Step 4 involves making
trade-offs between the number of clusters, cluster size, cluster overlap,
and the degree to which the current solution meets the needs of the
business as determined through the brainstorming sessions. This is usually
a painful process that relies heavily on the experience of the analyst
to bridge the gap between cluster solution statistics and relevance
to the business. Given the highly manual nature of this task, we can
only evaluate a very small subset of the universe of possible solutions,
and different analysts will generate very different solutions.

The discussion will focus on the development of an
objective function which captures both business rules and cluster statistics,
and which allows for the evaluation and ranking of a much larger number
of potential solutions. The elements of the objective function will
be described in fairly simple terms, which apply to any segmentation
problem, and we will show how genetic algorithms may be used to “evolve” potential
solutions. An open discussion will be encouraged of ways to improve
the encoding of the problem and the objective function, as well as a
discussion of the challenges associated with the integration of business
rules. There are also plenty of issues surrounding the use of genetic
algorithms to help optimize the search through the space of possible
solutions.

The current objective function captures the business
rules simply by measuring the average variance of key “business driver”
variables across the clusters, where these variables have been selected
ahead of time in cooperation with the line of business. The higher the
variance of these variables across the segments, the more distinct and
relevant the clusters should be. Average cluster overlap is calculated
by building n-dimensional hyperspheres (where n = # of cluster drivers)
around the centroids of the clusters, where the radius of the hypersphere
is between 2 and 3 RMS standard deviations. Overlap is defined as occurring
when any single observation falls within the hypersphere of a cluster
to which it has not been assigned. Cluster size may be integrated into
the objective function, where solutions are penalized for having clusters
that are either too large or too small.
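
The three ingredients described above (variance of the driver variables across clusters, hypersphere overlap, and a size penalty) can be sketched as a single score for one candidate solution. This is a minimal sketch under stated assumptions, not the Bank of Montreal implementation. The way the terms are combined, the weights `w_overlap` and `w_size`, the size limits `min_share` and `max_share`, and the reading of "RMS standard deviation" as the RMS distance of members from their centroid are all hypothetical. The genetic search itself is not shown.

```python
import math


def centroid(points):
    """Mean of each driver variable over the cluster members."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def rms_radius(points, center):
    """RMS distance of the members from their centroid (assumed reading)."""
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in points) / len(points))


def driver_variance(centroids):
    """Variance of the cluster means, averaged over the driver variables."""
    k, dims = len(centroids), len(centroids[0])
    total = 0.0
    for d in range(dims):
        values = [c[d] for c in centroids]
        mean = sum(values) / k
        total += sum((v - mean) ** 2 for v in values) / k
    return total / dims


def overlap_fraction(clusters, centroids, radii):
    """Share of observations inside the hypersphere of a cluster other than their own."""
    hits = total = 0
    for i, members in enumerate(clusters):
        for p in members:
            total += 1
            if any(j != i and math.dist(p, centroids[j]) <= radii[j]
                   for j in range(len(clusters))):
                hits += 1
    return hits / total


def size_penalty(clusters, min_share, max_share):
    """Sum of how far each cluster's share falls outside [min_share, max_share]."""
    n = sum(len(c) for c in clusters)
    shares = [len(c) / n for c in clusters]
    return sum(max(0.0, min_share - s) + max(0.0, s - max_share) for s in shares)


def objective(clusters, radius_mult=2.5, w_overlap=1.0, w_size=1.0,
              min_share=0.05, max_share=0.5):
    """Higher is better: reward separation, penalize overlap and extreme sizes."""
    centroids = [centroid(c) for c in clusters]
    radii = [radius_mult * rms_radius(c, m) for c, m in zip(clusters, centroids)]
    return (driver_variance(centroids)
            - w_overlap * overlap_fraction(clusters, centroids, radii)
            - w_size * size_penalty(clusters, min_share, max_share))
```

A genetic algorithm would call `objective` on each candidate assignment of observations to clusters and keep the higher-scoring candidates. In practice the driver variables would need to be standardized first so that the variance term and the penalties are on comparable scales.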


**Friday, February 4, 2000**

**9:30-10:30 - ***Interdisciplinary
application of time series methods inspired by chaos theory*

Thomas Schreiber, University of Wuppertal

We report on real world applications of time series
methods developed on the basis of the theory of deterministic chaos. First,
we demonstrate statistical criteria for the necessity of a nonlinear approach.
Nonlinear processes are not in general purely deterministic. Then we discuss
modified methods that can cope with noise and nonstationarities. In particular,
we will discuss nonlinear filtering, signal classification, and the detection
of nonlinear coherence between processes.

**11:00-12:00 - ***Symbolic data compression concepts for analyzing experimental
data*

Matt Kennel, Institute for Nonlinear Science at UCSD, San Diego

**1:00-2:00 - ***Geometric time series analysis*

Mark Muldoon, University of Science and Technology in Manchester

A discussion of a circle of techniques, all developed
within the last 20 years and all loosely organized around the idea that
one can extract detailed information about a dynamical system (say, the
equations of motion governing some industrial process...) by forming vectors
out of successive entries in a time series of measurements.

**2:30-3:30 - ***Chaotic communication using optical and wireless devices*

Henry Abarbanel, Institute for Nonlinear Science at UCSD, San Diego

**3:30-4:30 - ***Status of cosmic microwave background data analysis:
motivations and methods*

Simon Prunet, CITA (Canadian Institute for Theoretical Astrophysics),
University of Toronto

After a brief review of the physics that motivates
measurements of Cosmic Microwave Background anisotropies, I will present
the current observational status, the analysis methods used so far, and
the challenge posed by the upcoming huge data sets from future satellite
experiments.