Temporal Data Mining

Any information with a time component can be represented in a general way in a temporal database. Our task is to develop a query language that is flexible enough to access this general representation and to generate output that a time series analysis package can process.

A time series is a sequence of observations of a particular variable over time. This representation provides a natural way of storing time-related information in a temporal database.
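
As a rough illustration of how timestamped records can be turned into a time series, the sketch below samples a small temporal table once per year. The table layout (a value with a valid-from and valid-to date), the data, and the function names `value_at` and `yearly_series` are all assumptions made for this example; they are not part of any particular temporal database or query language.

```python
from datetime import date

# Hypothetical temporal table: each row holds a value that is valid
# from valid_from up to (but not including) valid_to.
SALARY_HISTORY = [
    ("alice", 50000, date(2001, 1, 1), date(2003, 1, 1)),
    ("alice", 56000, date(2003, 1, 1), date(2005, 1, 1)),
]


def value_at(rows, key, when):
    """Return the value recorded for `key` that was valid on `when`, or None."""
    for row_key, value, valid_from, valid_to in rows:
        if row_key == key and valid_from <= when < valid_to:
            return value
    return None


def yearly_series(rows, key, first_year, last_year):
    """Sample the table on 1 January of each year to form a time series."""
    return [
        (year, value_at(rows, key, date(year, 1, 1)))
        for year in range(first_year, last_year + 1)
    ]


# A list of (year, salary) observations, ready for time series analysis.
series = yearly_series(SALARY_HISTORY, "alice", 2001, 2004)
```

Years for which no row is valid come back as `None`, so gaps in the recorded history stay visible instead of being filled in silently.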

Uses for temporal databases:

  • Forecasting events in the future
  • Analyzing patterns

We are interested in performing time series analysis on the information held in temporal databases. Examples of applications where such analysis is useful include:

  • Identifying the number of workers in different job categories, in order to plan recruiting and training
  • Identifying the demand for each product line, in order to plan production accurately
  • Calculating patterns (minimum, maximum, etc.) of growth in employees' salaries over different periods of service
  • Calculating patterns of expenses in projects over different periods of time

The Classical Multiplicative Model views a time series as the product of four components. Thinking of a series in terms of these components makes it easier to identify patterns in it:

  1. Trend
    - long-term upward or downward movement (may be linear or exponential) that characterizes the time series over a period of time
  2. Cycle
    - recurring up-and-down movements around trend levels, for example the peaks and troughs of a business cycle: expansion followed by contraction (not necessarily affected by changes in economic factors).
  3. Seasonal
    - patterns that complete themselves within a year (e.g., monthly housing starts related to weather; increased sales during Christmas).
  4. Irregular
    - erratic movement in a time series that follows no regular pattern (e.g., the leftover or unaccountable part after considering trend, cycle, and seasonal variations).
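
Written as a formula, the multiplicative model takes the form below. The symbols are notation chosen here for illustration: Y_t is the observed value at time t, and T_t, C_t, S_t, and I_t are the trend, cycle, seasonal, and irregular components. The second line assumes a moving average M_t that approximates the trend-cycle part, so that dividing the series by it leaves roughly the seasonal and irregular parts.

```latex
Y_t = T_t \times C_t \times S_t \times I_t

\frac{Y_t}{M_t} \approx S_t \times I_t, \qquad M_t \approx T_t \times C_t
```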

We can use these components to:

  1. Calculate moving trend averages
    - to smooth out the series
  2. Obtain the ratio-to-moving-average
    - for derivation of seasonal components (i.e., extracting periodic fluctuations)
  3. Plot the final graph having trend, seasonal, and cyclical components
  4. Extrapolate the final graph to forecast new values for variables
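
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the trend is assumed to be linear and is fitted by ordinary least squares, the cyclical component is not modeled separately, and the sample data is synthetic.

```python
def centered_moving_average(values, period):
    """Centered moving average; None where the window does not fit.

    For an even period the two end points of the window get half weight,
    so that the average stays centered on an actual observation.
    """
    n, half = len(values), period // 2
    out = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        out[i] = total / period
    return out


def seasonal_indices(values, period):
    """Ratio-to-moving-average seasonal indices, normalized to average 1."""
    cma = centered_moving_average(values, period)
    ratios = [[] for _ in range(period)]
    for i, (y, m) in enumerate(zip(values, cma)):
        if m is not None:
            ratios[i % period].append(y / m)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)
    return [x * scale for x in raw]


def fit_linear_trend(values):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def forecast(values, period, steps):
    """Extrapolate the trend and re-apply the seasonal indices."""
    indices = seasonal_indices(values, period)
    deseasonalized = [y / indices[i % period] for i, y in enumerate(values)]
    slope, intercept = fit_linear_trend(deseasonalized)
    n = len(values)
    return [(intercept + slope * t) * indices[t % period]
            for t in range(n, n + steps)]


# Synthetic quarterly data: a linear trend multiplied by a seasonal pattern.
true_seasonal = [0.8, 1.0, 1.3, 0.9]
sales = [(100 + 2 * t) * true_seasonal[t % 4] for t in range(12)]

indices = seasonal_indices(sales, 4)   # estimated seasonal components
next_year = forecast(sales, 4, 4)      # forecasts for the next four quarters
```

The series needs to cover at least two full periods so that every season has a ratio to average. A longer-term cycle would show up in the deseasonalized series as a departure from the fitted trend line and would have to be handled separately.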

The approaches described above can be used for analyzing patterns and for forecasting.