KDD-2000 Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Tutorial: Time Series Similarity Measures
Gautam Das, Dimitrios Gunopulos
Abstract: Time series data arise in a variety of domains, such
as stock market analysis, environmental data, telecommunications data,
and medical and financial data. Typically, each time series describes the evolution
of an object as a function of time at a given data collection station.
Examples include the daily price fluctuations of a stock, or web data that count
the number of clicks at different sites. Higher dimensional time series can
be used to describe the evolution of more complex objects, for example
digital image sequences. Time series data currently account for a large
fraction of the data stored in commercial databases. Recognition of this
fact is growing, and commercial database management systems increasingly
support time series as a new data type. IBM DB2, for example, implements
support for time series using data-blades.

A fundamental problem of interest is to determine
whether two given time series display similar behavior. The problem is
interesting (and difficult) because the similarity measures should allow for
imprecise matches. Such measures have several applications. For example, they
can be used to cluster time series into groups of similar series, or to
classify a time series based on a set of known examples.

Another problem of interest is the indexing problem: given a set of time
series Q, prepare an index offline such that, given a query series q, the time
series in Q that are most similar to q can be reported quickly. As an
application, an investor may wish to know which stocks behave similarly to a
certain query stock. In the database and data mining communities, various
similarity measures and indexing techniques for time series have been
proposed. In this tutorial we describe the state of the art in this area by
comparing and summarizing several of these techniques in detail.

Biographies of Organizers:

Gautam Das received a Ph.D. in Computer Science from
the University of Wisconsin-Madison in 1990, and a B.Tech from the Indian
Institute of Technology, Kanpur. Dr. Das is currently a Researcher in the
Data Mining and Exploration group at Microsoft Research. He has also held
positions at Compaq Computer Corp. and the University of Memphis. His research
interests include data mining, databases, algorithms, and computational geometry.
His current research focuses on techniques for defining context-based
similarity measures between complex data objects, on sequence analysis, and
on database indexing techniques.

Dimitrios Gunopulos received a Ph.D. in Computer Science from
Princeton University in 1995. Prior to that he received an M.A. in Computer
Science from Princeton and a Diploma in Computer Engineering from the
University of Patras. Dr. Gunopulos
is currently an Assistant Professor in the Department of Computer Science and
Engineering at the University of California, Riverside. He has also held
positions at IBM Almaden and the Max-Planck-Institut für Informatik. His research interests include data
mining, databases, algorithms, and computational geometry. His current research focuses on techniques
for approximating range queries, on applying data mining techniques to
geospatial data, and on database indexing techniques.