Seminar: “A Framework for Solving Complex Biomedical Problems”
Columbia University
December 3, 2007
2:30-3:30 PM
341 Coverdell Center
ABSTRACT
Scientific advances have presented the statistical sciences with many
interesting challenges. One of these is large dimensions, both in the number
of data vectors and in the dimension of each data vector. Assessing the fit
of a model in such settings can be difficult.
We will discuss a framework for high-dimensional model assessment that
has the potential to be useful in solving complex biomedical problems. This
framework is based on the concept of distances and has connections with the
usual goodness-of-fit literature.
What distances are useful for high-dimensional data? We answer this question by
considering a wide and flexible class of distances characterized by a simple
quadratic-form structure that can be adapted through the choice of a
non-negative definite kernel. We illustrate connections with the field of
machine learning and discuss potential applications in the field of genomics
where data may be represented as long binary vectors, and in clinical medicine,
thereby indicating a vision for research and training in biostatistics.
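
To make the idea of a kernel-based quadratic distance concrete, here is a
minimal sketch, not the speaker's implementation. It assumes the common
empirical form mean K(x, x') - 2 mean K(x, y) + mean K(y, y') for two samples
of binary vectors. The kernel exp(-gamma * Hamming distance), the value of
gamma, the function names, and the toy samples are all illustrative
assumptions that do not come from the abstract.

```python
import math
from itertools import product


def hamming_kernel(u, v, gamma=0.5):
    """Assumed kernel: exp(-gamma * Hamming distance).

    For 0/1 vectors the Hamming distance equals the squared Euclidean
    distance, so this is a Gaussian kernel and is non-negative definite.
    """
    hamming = sum(a != b for a, b in zip(u, v))
    return math.exp(-gamma * hamming)


def mean_kernel(xs, ys, kernel):
    """Average of kernel(x, y) over all pairs (x in xs, y in ys)."""
    total = sum(kernel(x, y) for x, y in product(xs, ys))
    return total / (len(xs) * len(ys))


def quadratic_distance(xs, ys, kernel=hamming_kernel):
    """Empirical quadratic distance between two samples.

    Computes mean K(x, x') - 2 * mean K(x, y) + mean K(y, y'), the quadratic
    form of the kernel applied to the difference of the empirical measures.
    """
    return (
        mean_kernel(xs, xs, kernel)
        - 2.0 * mean_kernel(xs, ys, kernel)
        + mean_kernel(ys, ys, kernel)
    )


# Hypothetical toy data: two small samples of binary vectors.
sample_a = [(0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 1, 0)]
sample_b = [(1, 1, 0, 0), (1, 0, 0, 0), (1, 1, 0, 1)]

print(quadratic_distance(sample_a, sample_a))
print(quadratic_distance(sample_a, sample_b))
```

With a non-negative definite kernel the quantity is never negative, and it is
zero when the two samples coincide. This two-sample statistic has the same
form as kernel-based discrepancies used in machine learning, which may be one
of the connections the abstract alludes to.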

