Conditional heavy hitters : detecting interesting correlations in data streams

[thumbnail of WRAP_condhh-vldbj.pdf]
Preview
PDF
WRAP_condhh-vldbj.pdf - Accepted Version - Requires a PDF viewer.

Download (4MB) | Preview

Request Changes to record.

Abstract

The notion of heavy hitters—items that make up a large fraction of the population—has been successfully used in a variety of applications across sensor and RFID monitoring, network data analysis, event mining, and more. Yet this notion often fails to capture the semantics we desire when we observe data in the form of correlated pairs. Here, we are interested in items that are conditionally frequent: when a particular item is frequent within the context of its parent item. In this work, we introduce and formalize the notion of conditional heavy hitters to identify such items, with applications in network monitoring and Markov chain modeling. We explore the relationship between conditional heavy hitters and other related notions in the literature, and show analytically and experimentally the usefulness of our approach. We introduce several algorithm variations that allow us to efficiently find conditional heavy hitters for input data with very different characteristics, and provide analytical results for their performance. Finally, we perform experimental evaluations with several synthetic and real datasets to demonstrate the efficacy of our methods and to study the behavior of the proposed algorithms for different types of data.

Item Type: Journal Article
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Online algorithms, Streaming technology (Telecommunications)
Journal or Publication Title: The VLDB Journal
Publisher: Springer Berlin Heidelberg
ISSN: 1066-8888
Official Date: June 2015
Dates:
Date
Event
June 2015
Published
26 February 2015
Available
Volume: 24
Number: 3
Number of Pages: 20
Page Range: pp. 395-414
DOI: 10.1007/s00778-015-0382-5
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Date of first compliant deposit: 31 March 2016
Date of first compliant Open Access: 31 March 2016
Persistent URL: https://wrap.warwick.ac.uk/74443/

Export / Share Citation


Request changes or add full text files to a record

Repository staff actions (login required)

View Item View Item