Learning transferable architectures for scalable image recognition

B Zoph, V Vasudevan, J Shlens… - Proceedings of the …, 2018 - openaccess.thecvf.com
Developing neural network image classification models often requires significant
architecture engineering. In this paper, we study a method to learn the model architectures …

Neural architecture search with reinforcement learning

B Zoph, QV Le - arXiv preprint arXiv:1611.01578, 2016 - arxiv.org
Neural networks are powerful and flexible models that work well for many difficult learning
tasks in image, speech and natural language understanding. Despite their success, neural …

EfficientDet: Scalable and efficient object detection

M Tan, R Pang, QV Le - … of the IEEE/CVF conference on …, 2020 - openaccess.thecvf.com
Model efficiency has become increasingly important in computer vision. In this
paper, we systematically study neural network architecture design choices for object …

Self-training with noisy student improves imagenet classification

Q Xie, MT Luong, E Hovy… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
We present a simple self-training method that achieves 88.4% top-1 accuracy on ImageNet,
which is 2.0% better than the state-of-the-art model that requires 3.5B weakly labeled …
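
The self-training loop this entry refers to can be sketched in a few lines. The function names below are illustrative placeholders, not the paper's API, and the noise injection that gives "noisy student" its name is omitted:

```python
def self_train(teacher_predict, labeled, unlabeled, fit_student):
    """One round of self-training: a trained teacher pseudo-labels the
    unlabeled pool, then a student is fit on labeled + pseudo-labeled
    examples. Student noising (dropout, data augmentation) is omitted;
    all names here are illustrative, not the paper's API."""
    pseudo = [(x, teacher_predict(x)) for x in unlabeled]  # pseudo-labels
    return fit_student(labeled + pseudo)
```

In the paper this loop is iterated, with each trained student becoming the next round's teacher.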

Transformer-xl: Attentive language models beyond a fixed-length context

Z Dai, Z Yang, Y Yang, J Carbonell, QV Le… - arXiv preprint arXiv …, 2019 - arxiv.org
Transformer networks have the potential to learn longer-term dependencies, but are limited
by a fixed-length context in the setting of language modeling. As a solution, we propose a …
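
The segment-level recurrence idea behind that solution can be sketched as follows; attention itself is omitted and the helper name is ours, not the paper's:

```python
def segments_with_memory(tokens, seg_len):
    """Split a token stream into fixed-length segments and pair each one
    with the previous segment as reusable 'memory' -- the segment-level
    recurrence Transformer-XL uses to extend context past a fixed
    length. A sketch of the bookkeeping only, not the model."""
    segs = [tokens[i:i + seg_len] for i in range(0, len(tokens), seg_len)]
    # Each segment attends to its own positions plus the cached previous
    # segment; the first segment has no memory.
    return [(segs[i - 1] if i > 0 else [], segs[i]) for i in range(len(segs))]
```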

ELECTRA: Pre-training text encoders as discriminators rather than generators

K Clark, MT Luong, QV Le, CD Manning - arXiv preprint arXiv:2003.10555, 2020 - arxiv.org
Masked language modeling (MLM) pre-training methods such as BERT corrupt the input by
replacing some tokens with [MASK] and then train a model to reconstruct the original tokens …
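
The MLM corruption described here, which ELECTRA argues against, can be sketched as below. The masking rate and names are illustrative, and BERT's actual scheme additionally swaps in random or unchanged tokens for a fraction of the masked positions:

```python
import random

MASK = "[MASK]"

def corrupt_for_mlm(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of tokens with [MASK], BERT-style;
    returns the corrupted sequence plus a map from masked positions to
    the original tokens the model must reconstruct. Simplified sketch,
    not BERT's full 80/10/10 replacement policy."""
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok  # original token to predict here
        else:
            corrupted.append(tok)
    return corrupted, targets
```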

SpecAugment: A simple data augmentation method for automatic speech recognition

DS Park, W Chan, Y Zhang, CC Chiu, B Zoph… - arXiv preprint arXiv …, 2019 - arxiv.org
We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …
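
The masking applied to those feature inputs can be sketched as zeroing contiguous frequency and time bands of a spectrogram. The band widths and single-mask policy below are an illustrative simplification, not the paper's full recipe (which also includes time warping):

```python
import numpy as np

def spec_augment(spec, freq_mask=8, time_mask=10, rng=None):
    """Zero one contiguous frequency band and one contiguous time band
    of a (freq, time) filter-bank spectrogram, SpecAugment-style.
    Mask widths are drawn uniformly up to the given maxima."""
    rng = rng if rng is not None else np.random.default_rng(0)
    out = spec.copy()
    n_freq, n_time = out.shape
    f = int(rng.integers(0, freq_mask + 1))            # width in mel bins
    f0 = int(rng.integers(0, max(1, n_freq - f + 1)))  # band start
    out[f0:f0 + f, :] = 0.0
    t = int(rng.integers(0, time_mask + 1))            # width in frames
    t0 = int(rng.integers(0, max(1, n_time - t + 1)))  # band start
    out[:, t0:t0 + t] = 0.0
    return out
```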

Searching for activation functions

P Ramachandran, B Zoph, QV Le - arXiv preprint arXiv:1710.05941, 2017 - arxiv.org
The choice of activation functions in deep networks has a significant effect on the training
dynamics and task performance. Currently, the most successful and widely-used activation …
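
The search described here ultimately reported Swish, f(x) = x · sigmoid(βx), as its strongest candidate; a minimal definition:

```python
import math

def swish(x, beta=1.0):
    """Swish activation: f(x) = x * sigmoid(beta * x). With beta = 1 it
    behaves like a smooth, non-monotonic relative of ReLU."""
    return x / (1.0 + math.exp(-beta * x))
```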

Unsupervised data augmentation for consistency training

…, Z Dai, E Hovy, T Luong, Q Le - Advances in neural …, 2020 - proceedings.neurips.cc
Semi-supervised learning lately has shown much promise in improving deep learning
models when labeled data is scarce. Common among recent approaches is the use of …
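
The consistency-training objective shared by these approaches can be sketched as a divergence between the model's predictions on an unlabeled example and on its augmented version. This is a simplified, library-free KL term; UDA's full objective also includes a supervised loss and specific augmentation policies:

```python
import math

def consistency_loss(p_orig, p_aug):
    """KL(p_orig || p_aug) between predicted class distributions for an
    example and its augmented version. Assumes p_aug > 0 wherever
    p_orig > 0; zero when the two predictions agree."""
    return sum(p * math.log(p / q) for p, q in zip(p_orig, p_aug) if p > 0)
```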

Distributed representations of sentences and documents

Q Le, T Mikolov - International conference on machine …, 2014 - proceedings.mlr.press
Many machine learning algorithms require the input to be represented as a fixed-length
feature vector. When it comes to texts, one of the most common representations is bag-of…
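
The fixed-length representation this abstract contrasts with can be sketched as a simple count vector over a vocabulary, with word order and context discarded:

```python
from collections import Counter

def bag_of_words(doc_tokens, vocab):
    """Fixed-length bag-of-words vector: one count per vocabulary word,
    in vocabulary order. Word order and context are discarded -- the
    limitation Paragraph Vector was proposed to address."""
    counts = Counter(doc_tokens)
    return [counts.get(w, 0) for w in vocab]
```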