Sequential random sampling revisited : hidden shuffle method

[thumbnail of WRAP-sequential-random-sampling-revisited-hidden-shuffle-method-2021.pdf]
Preview
PDF
WRAP-sequential-random-sampling-revisited-hidden-shuffle-method-2021.pdf - Published Version - Requires a PDF viewer.

Download (927kB) | Preview

Request Changes to record.

Abstract

Random sampling (without replacement) is ubiquitously employed to obtain a representative subset of the data. Unlike common methods, sequential methods report samples in ascending order of index without keeping track of previous samples. This enables lightweight iterators that can jump directly from one sampled position to the next. Previously, sequential methods focused on drawing from the distribution of gap sizes, which requires intricate algorithms that are difficult to validate and can be slow in the worst-case. This can be avoided by a new method, the Hidden Shuffle. The name mirrors the fact that although the algorithm does not resemble shuffling, its correctness can be proven by conceptualising the sampling process as a random shuffle. The Hidden Shuffle algorithm stores just a handful of values, can be implemented in few lines of code, offers strong worst-case guarantees and is shown to be faster than state-of-the-art methods while using comparably few random variates.

Item Type: Conference Item (Paper)
Subjects: H Social Sciences > HA Statistics
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
Library of Congress Subject Headings (LCSH): Sampling (Statistics), Random access memory, Algorithms
Series Name: Proceedings of Machine Learning Research
Journal or Publication Title: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
ISSN: 2640-3498
Official Date: 2021
Dates:
Date
Event
2021
Published
23 February 2021
Accepted
Volume: 130
Page Range: pp. 3628-3636
Status: Peer Reviewed
Publication Status: Published
Re-use Statement: Copyright 2021 by the author(s).
Access rights to Published version: Open Access (Creative Commons open licence)
Date of first compliant deposit: 16 April 2021
Date of first compliant Open Access: 16 April 2021
RIOXX Funder/Project Grant:
Project/Grant ID
RIOXX Funder Name
Funder ID
ERC-2014-CoG 647557
European Research Council
Conference Paper Type: Paper
Title of Event: The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Type of Event: Conference
Location of Event: Virtual
Date(s) of Event: 13-15 Apr 2021
Related URLs:
Open Access Version:
Persistent URL: https://wrap.warwick.ac.uk/150064/

Export / Share Citation


Request changes or add full text files to a record

Repository staff actions (login required)

View Item View Item