Summarizing Multidimensional Data Streams: A Hierarchy-Graph-Based Approach. Author(s): Yoann Pitarch, Anne Laurent, Pascal Poncelet
When dealing with potentially infinite data streams, storing the whole data stream history is unfeasible and providing a high-quality summary is required. In this paper, we propose a summarization method for multidimensional data streams based on a graph structure and taking advantage of the data hierarchies. The summarization method considers the data distribution and thus overcomes a major drawback of the Tilted Time Window common framework. We adapt this structure for synthesizing frequent itemsets extracted on temporal windows. Thanks to our approach, as users do not analyze any more numerous extraction results, the result processing is improved.
As a text scholar, I would presume that all occurrences are stored.
For high-speed data streams that are too large to store and can only be read in a single pass, that isn’t an option.
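To make the storage problem concrete, here is a minimal sketch of a logarithmic tilted time window, the general kind of framework the abstract says the authors improve on. It is not the paper's hierarchy-graph method. The class name `TiltedTimeWindow`, its methods, and the "at most two aggregates per level" rule are my own assumptions for illustration. Note that it merges old batches on a fixed schedule, whatever the data looks like, which is the drawback the abstract points to.

```python
class TiltedTimeWindow:
    """Illustrative logarithmic tilted time window over per-batch counts.

    Hypothetical sketch: recent batches are kept at fine granularity,
    older batches are merged into progressively coarser aggregates.
    """

    def __init__(self):
        # levels[i] holds at most two aggregates, newest first,
        # each covering 2**i consecutive batches.
        self.levels = []

    def add(self, count):
        """Record the count observed in the newest batch (one pass, no replay)."""
        carry = count
        level = 0
        while carry is not None:
            if level == len(self.levels):
                self.levels.append([])
            slot = self.levels[level]
            slot.insert(0, carry)
            if len(slot) > 2:
                # Merge the two oldest aggregates and push the sum up a level.
                oldest = slot.pop()
                second_oldest = slot.pop()
                carry = oldest + second_oldest
            else:
                carry = None
            level += 1

    def total(self):
        """Total count over the whole stream seen so far."""
        return sum(sum(slot) for slot in self.levels)

    def stored_values(self):
        """Number of aggregates actually kept in memory."""
        return sum(len(slot) for slot in self.levels)
```

Feeding it one count per batch keeps the overall total exact while the number of stored values grows only logarithmically with the number of batches. The price is that detail about older batches is lost.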
If terabytes of high-speed data are on your topic mapping horizon, start here.
****
PS: Posts on temporal modeling with proxies to follow (but not real soon).