Showing 1–9 of 9 results for author: Tai, K S

  1. arXiv:2311.10873  [pdf, other]

    cs.CV

    Multi-entity Video Transformers for Fine-Grained Video Representation Learning

    Authors: Matthew Walmer, Rose Kanjirathinkal, Kai Sheng Tai, Keyur Muzumdar, Taipeng Tian, Abhinav Shrivastava

    Abstract: The area of temporally fine-grained video representation learning aims to generate frame-by-frame representations for temporally dense tasks. In this work, we advance the state-of-the-art for this area by re-examining the design of transformer architectures for video representation learning. A salient aspect of our self-supervised method is the improved integration of spatial information in the te…

    Submitted 17 November, 2023; originally announced November 2023.

  2. arXiv:2302.14078  [pdf, other]

    cs.LG math.DS

    Analyzing Populations of Neural Networks via Dynamical Model Embedding

    Authors: Jordan Cotler, Kai Sheng Tai, Felipe Hernández, Blake Elias, David Sussillo

    Abstract: A core challenge in the interpretation of deep neural networks is identifying commonalities between the underlying algorithms implemented by distinct networks trained for the same task. Motivated by this problem, we introduce DYNAMO, an algorithm that constructs low-dimensional manifolds where each point corresponds to a neural network model, and two points are nearby if the corresponding neural n…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 12+8 pages, 11 figures

  3. arXiv:2205.14107  [pdf, other]

    cs.LG

    Spartan: Differentiable Sparsity via Regularized Transportation

    Authors: Kai Sheng Tai, Taipeng Tian, Ser-Nam Lim

    Abstract: We present Spartan, a method for training sparse neural network models with a predetermined level of sparsity. Spartan is based on a combination of two techniques: (1) soft top-k masking of low-magnitude parameters via a regularized optimal transportation problem and (2) dual averaging-based parameter updates with hard sparsification in the forward pass. This scheme realizes an exploration-exploit…

    Submitted 17 October, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022 camera ready

  4. arXiv:2102.08622  [pdf, other]

    cs.LG stat.ML

    Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training

    Authors: Kai Sheng Tai, Peter Bailis, Gregory Valiant

    Abstract: Self-training is a standard approach to semi-supervised learning where the learner's own predictions on unlabeled data are used as supervision during training. In this paper, we reinterpret this label assignment process as an optimal transportation problem between examples and classes, wherein the cost of assigning an example to a class is mediated by the current predictions of the classifier. Thi…

    Submitted 11 June, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: ICML 2021 camera ready version

  5. arXiv:1901.11399  [pdf, other]

    cs.CV cs.LG stat.ML

    Equivariant Transformer Networks

    Authors: Kai Sheng Tai, Peter Bailis, Gregory Valiant

    Abstract: How can prior knowledge on the transformation invariances of a domain be incorporated into the architecture of a neural network? We propose Equivariant Transformers (ETs), a family of differentiable image-to-image mappings that improve the robustness of models towards pre-defined continuous transformation groups. Through the use of specially-derived canonical coordinate systems, ETs incorporate fu…

    Submitted 24 May, 2019; v1 submitted 25 January, 2019; originally announced January 2019.

    Comments: ICML 2019

  6. arXiv:1803.01969  [pdf, other]

    cs.DB

    Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries

    Authors: Edward Gan, Jialin Ding, Kai Sheng Tai, Vatsal Sharan, Peter Bailis

    Abstract: Interactive analytics increasingly involves querying for quantiles over sub-populations of high cardinality datasets. Data processing engines such as Druid and Spark use mergeable summaries to estimate quantiles, but summary merge times can be a bottleneck during aggregation. We show how a compact and efficiently mergeable quantile sketch can support aggregation workloads. This data structure, whi…

    Submitted 13 July, 2018; v1 submitted 5 March, 2018; originally announced March 2018.

    Comments: Technical Report for paper to be published in VLDB 2018

  7. arXiv:1711.02305  [pdf, other]

    cs.LG cs.DS stat.ML

    Sketching Linear Classifiers over Data Streams

    Authors: Kai Sheng Tai, Vatsal Sharan, Peter Bailis, Gregory Valiant

    Abstract: We introduce a new sub-linear space sketch, the Weight-Median Sketch, for learning compressed linear classifiers over data streams while supporting the efficient recovery of large-magnitude weights in the model. This enables memory-limited execution of several statistical analyses over streams, including online feature selection, streaming data explanation, relative deltoid detection, and stream…

    Submitted 6 April, 2018; v1 submitted 7 November, 2017; originally announced November 2017.

    Comments: Full version of paper appearing at SIGMOD 2018 with more detailed proofs of theoretical results. Code available at https://github.com/stanford-futuredata/wmsketch

  8. arXiv:1706.08146  [pdf, other]

    cs.LG cs.AI stat.ML

    Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data

    Authors: Vatsal Sharan, Kai Sheng Tai, Peter Bailis, Gregory Valiant

    Abstract: What learning algorithms can be run directly on compressively-sensed data? In this work, we consider the question of accurately and efficiently computing low-rank matrix or tensor factorizations given data compressed via random projections. We examine the approach of first performing factorization in the compressed domain, and then reconstructing the original high-dimensional factors from the reco…

    Submitted 27 May, 2019; v1 submitted 25 June, 2017; originally announced June 2017.

    Comments: Updates for ICML'19 camera-ready

  9. arXiv:1503.00075  [pdf, other]

    cs.CL cs.AI cs.LG

    Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

    Authors: Kai Sheng Tai, Richard Socher, Christopher D. Manning

    Abstract: Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properti…

    Submitted 30 May, 2015; v1 submitted 28 February, 2015; originally announced March 2015.

    Comments: Accepted for publication at ACL 2015