Showing 1–28 of 28 results for author: Shelhamer, E

  1. arXiv:2406.10427  [pdf, other]

    cs.LG cs.CR

    Adaptive Randomized Smoothing: Certifying Multi-Step Defences against Adversarial Examples

    Authors: Saiyue Lyu, Shadab Shaikh, Frederick Shpilevskiy, Evan Shelhamer, Mathias Lécuyer

    Abstract: We propose Adaptive Randomized Smoothing (ARS) to certify the predictions of our test-time adaptive models against adversarial examples. ARS extends the analysis of randomized smoothing using f-Differential Privacy to certify the adaptive composition of multiple steps. For the first time, our theory covers the sound adaptive composition of general and high-dimensional functions of noisy input. We…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2302.10164  [pdf, other]

    cs.LG cs.CV

    Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts

    Authors: Francesco Croce, Sylvestre-Alvise Rebuffi, Evan Shelhamer, Sven Gowal

    Abstract: Adversarial training is widely used to make classifiers robust to a specific threat or adversary, such as $\ell_p$-norm bounded perturbations of a given $p$-norm. However, existing methods for training classifiers robust to multiple threats require knowledge of all attacks during training and remain vulnerable to unseen distribution shifts. In this work, we describe how to obtain adversarially-rob…

    Submitted 20 February, 2023; originally announced February 2023.

  3. arXiv:2209.15589  [pdf, other]

    cs.CV cs.LG

    Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods

    Authors: Skanda Koppula, Yazhe Li, Evan Shelhamer, Andrew Jaegle, Nikhil Parthasarathy, Relja Arandjelovic, João Carreira, Olivier Hénaff

    Abstract: Self-supervised methods have achieved remarkable success in transfer learning, often achieving the same or better accuracy than supervised pre-training. Most prior work has done so by increasing pre-training computation by adding complex data augmentation, multiple views, or lengthy training schedules. In this work, we investigate a related, but orthogonal question: given a fixed FLOP budget, what…

    Submitted 18 October, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: 11 pages. 36th Conference on Neural Information Processing Systems, Workshop on Self-Supervised Learning (2022)

  4. arXiv:2207.03442  [pdf, other]

    cs.LG cs.CV

    Back to the Source: Diffusion-Driven Test-Time Adaptation

    Authors: Jin Gao, Jialing Zhang, Xihui Liu, Trevor Darrell, Evan Shelhamer, Dequan Wang

    Abstract: Test-time adaptation harnesses test inputs to improve the accuracy of a model trained on source data when tested on shifted target data. Existing methods update the source model by (re-)training on each target domain. While effective, re-training is sensitive to the amount and order of the data and the hyperparameters for optimization. We instead update the target data, by projecting all test inpu…

    Submitted 21 June, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

    Comments: published at CVPR 2023

  5. arXiv:2203.08777  [pdf, other]

    cs.CV cs.AI cs.LG

    Object discovery and representation networks

    Authors: Olivier J. Hénaff, Skanda Koppula, Evan Shelhamer, Daniel Zoran, Andrew Jaegle, Andrew Zisserman, João Carreira, Relja Arandjelović

    Abstract: The promise of self-supervised learning (SSL) is to leverage large amounts of unlabeled data to solve complex tasks. While there has been excellent progress with simple, image-level learning, recent methods have shown the advantage of including knowledge of image structure. However, by introducing hand-crafted image segmentations to define regions of interest, or specialized augmentation strategie…

    Submitted 27 July, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: European Conference on Computer Vision (ECCV) 2022

  6. arXiv:2202.13711  [pdf, other]

    cs.LG cs.CR cs.CV

    Evaluating the Adversarial Robustness of Adaptive Test-time Defenses

    Authors: Francesco Croce, Sven Gowal, Thomas Brunner, Evan Shelhamer, Matthias Hein, Taylan Cemgil

    Abstract: Adaptive defenses, which optimize at test time, promise to improve adversarial robustness. We categorize such adaptive test-time defenses, explain their potential benefits and drawbacks, and evaluate a representative variety of the latest adaptive defenses for image classification. Unfortunately, none significantly improve upon static defenses when subjected to our careful case study evaluation. S…

    Submitted 13 July, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: ICML'22

  7. arXiv:2202.10890  [pdf, other]

    cs.CV

    HiP: Hierarchical Perceiver

    Authors: Joao Carreira, Skanda Koppula, Daniel Zoran, Adria Recasens, Catalin Ionescu, Olivier Henaff, Evan Shelhamer, Relja Arandjelovic, Matt Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle

    Abstract: General perception systems such as Perceivers can process arbitrary modalities in any combination and are able to handle up to a few hundred thousand inputs. They achieve this generality by using exclusively global attention operations. This however hinders them from scaling up to the input sizes required to process raw high-resolution images or video. In this paper, we show that some degree of l…

    Submitted 3 November, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

  8. arXiv:2109.01087  [pdf, other]

    cs.CV cs.AI cs.LG

    On-target Adaptation

    Authors: Dequan Wang, Shaoteng Liu, Sayna Ebrahimi, Evan Shelhamer, Trevor Darrell

    Abstract: Domain adaptation seeks to mitigate the shift between training on the source domain and testing on the target domain. Most adaptation methods rely on the source data by joint optimization over source data and target data. Source-free methods replace the source data with a source model by fine-tuning it on target. Either way, the majority of the parameter updates for the model represe…

    Submitted 2 September, 2021; originally announced September 2021.

  9. arXiv:2107.14795  [pdf, other]

    cs.LG cs.CL cs.CV cs.SD eess.AS

    Perceiver IO: A General Architecture for Structured Inputs & Outputs

    Authors: Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira

    Abstract: A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible. Current architectures, however, cannot be applied beyond a small set of stereotyped settings, as they bake in domain & task assumptions or scale poorly to large inputs or outputs. In this work, we propose Perceiver IO, a general-purpose architecture that handles data f…

    Submitted 15 March, 2022; v1 submitted 30 July, 2021; originally announced July 2021.

    Comments: ICLR 2022 camera ready. Code: https://dpmd.ai/perceiver-code

  10. arXiv:2105.08714  [pdf, other]

    cs.LG cs.CR cs.CV

    Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks

    Authors: Dequan Wang, An Ju, Evan Shelhamer, David Wagner, Trevor Darrell

    Abstract: Adversarial attacks optimize against models to defeat defenses. Existing defenses are static, and stay the same once trained, even while attacks change. We argue that models should fight back, and optimize their defenses against attacks at test time. We propose dynamic defenses, to adapt the model and input during testing, by defensive entropy minimization (dent). Dent alters testing, but not trai…

    Submitted 18 May, 2021; originally announced May 2021.

  11. arXiv:2104.00749  [pdf, other]

    cs.CV cs.LG

    Anytime Dense Prediction with Confidence Adaptivity

    Authors: Zhuang Liu, Zhiqiu Xu, Hung-Ju Wang, Trevor Darrell, Evan Shelhamer

    Abstract: Anytime inference requires a model to make a progression of predictions which might be halted at any time. Prior research on anytime visual recognition has mostly focused on image classification. We propose the first unified and end-to-end approach for anytime dense prediction. A cascade of "exits" is attached to the model to make multiple predictions. We redesign the exits to account for the dept…

    Submitted 25 April, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: Published in ICLR 2022

  12. arXiv:2007.06059  [pdf, other]

    cs.LG cs.CV stat.ML

    It Is Likely That Your Loss Should be a Likelihood

    Authors: Mark Hamilton, Evan Shelhamer, William T. Freeman

    Abstract: Many common loss functions such as mean-squared-error, cross-entropy, and reconstruction loss are unnecessarily rigid. Under a probabilistic interpretation, these common losses correspond to distributions with fixed shapes and scales. We instead argue for optimizing full likelihoods that include parameters like the normal variance and softmax temperature. Joint optimization of these "likelihood pa…

    Submitted 2 October, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

  13. arXiv:2006.10726  [pdf, other]

    cs.LG cs.CV stat.ML

    Tent: Fully Test-time Adaptation by Entropy Minimization

    Authors: Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Olshausen, Trevor Darrell

    Abstract: A model must adapt itself to generalize to new and different data during testing. In this setting of fully test-time adaptation the model has only the test data and its own parameters. We propose to adapt by test entropy minimization (tent): we optimize the model for confidence as measured by the entropy of its predictions. Our method estimates normalization statistics and optimizes channel-wise a…

    Submitted 18 March, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: ICLR 2021 Spotlight

  14. arXiv:1910.09185  [pdf, other]

    cs.CV cs.LG

    Exploring Simple and Transferable Recognition-Aware Image Processing

    Authors: Zhuang Liu, Hung-Ju Wang, Tinghui Zhou, Zhiqiang Shen, Bingyi Kang, Evan Shelhamer, Trevor Darrell

    Abstract: Recent progress in image recognition has stimulated the deployment of vision systems at an unprecedented scale. As a result, visual data are now often consumed not only by humans but also by machines. Existing image processing methods only optimize for better human perception, yet the resulting images may not be accurately recognized by machines. This can be undesirable, e.g., the images can be im…

    Submitted 10 September, 2022; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  15. arXiv:1908.03182  [pdf, other]

    cs.CV cs.LG

    Dynamic Scale Inference by Entropy Minimization

    Authors: Dequan Wang, Evan Shelhamer, Bruno Olshausen, Trevor Darrell

    Abstract: Given the variety of the visual world there is not one true scale for recognition: objects may appear at drastically different sizes across the visual field. Rather than enumerate variations across filter channels or pyramid levels, dynamic models locally predict scale and adapt receptive fields accordingly. The degree of variation and diversity of inputs makes this a difficult task. Existing meth…

    Submitted 8 August, 2019; originally announced August 2019.

  16. arXiv:1904.11487  [pdf, other]

    cs.CV

    Blurring the Line Between Structure and Learning to Optimize and Adapt Receptive Fields

    Authors: Evan Shelhamer, Dequan Wang, Trevor Darrell

    Abstract: The visual world is vast and varied, but its variations divide into structured and unstructured factors. We compose free-form filters and structured Gaussian filters, optimized end-to-end, to factorize deep representations and learn both local features and their degree of locality. Our semi-structured composition is strictly more expressive than free-form filtering, and changes in its structured p…

    Submitted 25 April, 2019; originally announced April 2019.

  17. arXiv:1902.04552  [pdf, other]

    cs.LG stat.ML

    Infinite Mixture Prototypes for Few-Shot Learning

    Authors: Kelsey R. Allen, Evan Shelhamer, Hanul Shin, Joshua B. Tenenbaum

    Abstract: We propose infinite mixture prototypes to adaptively represent both simple and complex data distributions for few-shot learning. Our infinite mixture prototypes represent each class by a set of clusters, unlike existing prototypical methods that represent each class by a single cluster. By inferring the number of clusters, infinite mixture prototypes interpolate between nearest neighbor and protot…

    Submitted 12 February, 2019; originally announced February 2019.

  18. arXiv:1806.07373  [pdf, other]

    cs.CV cs.LG stat.ML

    Few-Shot Segmentation Propagation with Guided Networks

    Authors: Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine

    Abstract: Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors. To remedy the rigidity and annotation burden of standard approaches, we address the problem of few-shot segmentation: given few image and few…

    Submitted 25 May, 2018; originally announced June 2018.

  19. arXiv:1804.08606  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Zero-Shot Visual Imitation

    Authors: Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell

    Abstract: The current dominant paradigm for imitation learning relies on strong supervision of expert actions to learn both 'what' and 'how' to imitate. We pursue an alternative paradigm wherein an agent first explores the world without any expert supervision and then distills its experience into a goal-conditioned skill policy with a novel forward consistency loss. In our framework, the role of the expert…

    Submitted 23 April, 2018; originally announced April 2018.

    Comments: Oral presentation at ICLR 2018. Website at https://pathak22.github.io/zeroshot-imitation/

  20. arXiv:1707.06484  [pdf, other]

    cs.CV cs.LG

    Deep Layer Aggregation

    Authors: Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell

    Abstract: Visual recognition requires rich representations that span levels from low to high, scales from small to large, and resolutions from fine to coarse. Even with the depth of features in a convolutional network, a layer in isolation is not enough: compounding and aggregating these representations improves inference of what and where. Architectural efforts are exploring many dimensions for network bac…

    Submitted 4 January, 2019; v1 submitted 20 July, 2017; originally announced July 2017.

    Comments: Published at the Conference on Computer Vision and Pattern Recognition (CVPR) 2018

  21. arXiv:1612.07307  [pdf, other]

    cs.LG

    Loss is its own Reward: Self-Supervision for Reinforcement Learning

    Authors: Evan Shelhamer, Parsa Mahmoudieh, Max Argus, Trevor Darrell

    Abstract: Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for end-to-end optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquitous…

    Submitted 9 March, 2017; v1 submitted 21 December, 2016; originally announced December 2016.

  22. arXiv:1608.03609  [pdf, other]

    cs.CV

    Clockwork Convnets for Video Semantic Segmentation

    Authors: Evan Shelhamer, Kate Rakelly, Judy Hoffman, Trevor Darrell

    Abstract: Recent years have seen tremendous progress in still-image segmentation; however the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video. We propose a video recognition framework that relies on two key observations: 1) while pixels may change rapidly from frame to frame, the semantic cont…

    Submitted 11 August, 2016; originally announced August 2016.

  23. arXiv:1605.06211  [pdf, other]

    cs.CV

    Fully Convolutional Networks for Semantic Segmentation

    Authors: Evan Shelhamer, Jonathan Long, Trevor Darrell

    Abstract: Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and…

    Submitted 20 May, 2016; originally announced May 2016.

    Comments: to appear in PAMI (accepted May, 2016); journal edition of arXiv:1411.4038

  24. arXiv:1511.07063  [pdf, other]

    cs.CV

    Fine-grained pose prediction, normalization, and recognition

    Authors: Ning Zhang, Evan Shelhamer, Yang Gao, Trevor Darrell

    Abstract: Pose variation and subtle differences in appearance are key challenges to fine-grained classification. While deep networks have markedly improved general recognition, many approaches to fine-grained recognition rely on anchoring networks to parts for better accuracy. Identifying parts to find correspondence discounts pose variation so that features can be tuned to appearance. To this end previous…

    Submitted 22 November, 2015; originally announced November 2015.

  25. arXiv:1412.7144  [pdf, other]

    cs.CV cs.LG cs.NE

    Fully Convolutional Multi-Class Multiple Instance Learning

    Authors: Deepak Pathak, Evan Shelhamer, Jonathan Long, Trevor Darrell

    Abstract: Multiple instance learning (MIL) can reduce the need for costly annotation in tasks such as semantic segmentation by weakening the required degree of supervision. We propose a novel MIL formulation of multi-class semantic segmentation learning by a fully convolutional network. In this setting, we seek to learn a semantic segmentation model from just weak image-level labels. The model is trained en…

    Submitted 15 April, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

    Comments: in ICLR 2015

  26. arXiv:1411.4038  [pdf, other]

    cs.CV

    Fully Convolutional Networks for Semantic Segmentation

    Authors: Jonathan Long, Evan Shelhamer, Trevor Darrell

    Abstract: Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning…

    Submitted 8 March, 2015; v1 submitted 14 November, 2014; originally announced November 2014.

    Comments: to appear in CVPR (2015)

  27. arXiv:1410.0759  [pdf, other]

    cs.NE cs.LG cs.MS

    cuDNN: Efficient Primitives for Deep Learning

    Authors: Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, Evan Shelhamer

    Abstract: We present a library of efficient implementations of deep learning primitives. Deep learning workloads are computationally intensive, and optimizing their kernels is difficult and time-consuming. As parallel architectures evolve, kernels must be reoptimized, which makes maintaining codebases difficult over time. Similar issues have long been addressed in the HPC community by libraries such as the…

    Submitted 17 December, 2014; v1 submitted 3 October, 2014; originally announced October 2014.

  28. arXiv:1408.5093  [pdf, other]

    cs.CV cs.LG cs.NE

    Caffe: Convolutional Architecture for Fast Feature Embedding

    Authors: Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell

    Abstract: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits i…

    Submitted 20 June, 2014; originally announced August 2014.

    Comments: Tech report for the Caffe software at http://github.com/BVLC/Caffe/