Talk: Dr. Mi-Yen Yeh, Institute of Information Science, Academia Sinica, Taiwan

Date
Location: ICCS 304

Random Error Reduction in Similarity Search on Time Series: A Statistical Approach

Abstract: Measurement errors fall into two types: systematic errors, which are predictable, and random errors, which are inherently unpredictable and have zero expected value. Random error is always present in a measurement. Readings in a time series often contain inherent random errors from causes such as dynamic error, drift, noise, hysteresis, digitalization error, and limited sampling frequency. These errors can substantially degrade the quality of time series analysis. Unfortunately, most existing time series mining and analysis methods, including those for similarity search, clustering, and classification, do not address random errors, possibly because random error in a time series, which can be modeled as a random variable of unknown distribution, is hard to handle. In this talk, I will describe how we tackle this challenging problem. Taking similarity search, an essential task in time series analysis, as an example, we develop MISQ, a statistical approach to random error reduction in time series analysis. The key intuition of our method is to reduce random errors using only the readings at different time instants within a time series. MISQ has a highly desirable property: it can ensure that recall stays above a user-specified threshold. An extensive empirical study on 20 real benchmark data sets shows that, in real applications such as classification, our method performs better than the baseline method without random error reduction. Moreover, MISQ achieves good quality in similarity search.

Bio: Mi-Yen Yeh is currently an Assistant Research Fellow at the Institute of Information Science, Academia Sinica, Taiwan. She received her Ph.D. degree in Electrical Engineering from National Taiwan University, Taiwan