Showing 1–23 of 23 results for author: Hoffman, M D

  1. arXiv:2402.01915

    cs.CV stat.CO

    Robust Inverse Graphics via Probabilistic Inference

    Authors: Tuan Anh Le, Pavel Sountsov, Matthew D. Hoffman, Ben Lee, Brian Patton, Rif A. Saurous

    Abstract: How do we infer a 3D scene from a single image in the presence of corruptions like rain, snow or fog? Straightforward domain randomization relies on knowing the family of corruptions ahead of time. Here, we propose a Bayesian approach, dubbed robust inverse graphics (RIG), that relies on a strong scene prior and an uninformative uniform corruption prior, making it applicable to a wide range of corru…

    Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML submission. Reworked main body, new appendix figures

  2. arXiv:2307.09607

    cs.LG cs.AI stat.ME stat.ML

    Sequential Monte Carlo Learning for Time Series Structure Discovery

    Authors: Feras A. Saad, Brian J. Patton, Matthew D. Hoffman, Rif A. Saurous, Vikash K. Mansinghka

    Abstract: This paper presents a new approach to automatically discovering accurate models of complex time series data. Working within a Bayesian nonparametric prior over a symbolic space of Gaussian process time series models, we present a novel structure learning algorithm that integrates sequential Monte Carlo (SMC) and involutive MCMC for highly effective posterior inference. Our method can be used both…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: 17 pages, 8 figures, 2 tables. Appearing in ICML 2023

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:29473-29489, 2023

  3. arXiv:2206.08889

    stat.ML cs.IT cs.LG

    Lossy Compression with Gaussian Diffusion

    Authors: Lucas Theis, Tim Salimans, Matthew D. Hoffman, Fabian Mentzer

    Abstract: We consider a novel lossy compression approach based on unconditional diffusion generative models, which we call DiffC. Unlike modern compression schemes which rely on transform coding and quantization to restrict the transmitted information, DiffC relies on the efficient communication of pixels corrupted by Gaussian noise. We implement a proof of concept and find that it works surprisingly well d…

    Submitted 31 December, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

  4. arXiv:2110.13017

    stat.ME

    Nested $\hat R$: Assessing the convergence of Markov chain Monte Carlo when running many short chains

    Authors: Charles C. Margossian, Matthew D. Hoffman, Pavel Sountsov, Lionel Riou-Durand, Aki Vehtari, Andrew Gelman

    Abstract: Recent developments in parallel Markov chain Monte Carlo (MCMC) algorithms allow us to run thousands of chains almost as quickly as a single chain, using hardware accelerators such as GPUs. While each chain still needs to forget its initial point during a warmup phase, the subsequent sampling phase can be shorter than in classical settings, where we run only a few chains. To determine if the resul…

    Submitted 30 May, 2024; v1 submitted 25 October, 2021; originally announced October 2021.

    Journal ref: Bayesian Analysis 2024

  5. arXiv:2110.11576

    stat.CO

    Focusing on Difficult Directions for Learning HMC Trajectory Lengths

    Authors: Pavel Sountsov, Matt D. Hoffman

    Abstract: Hamiltonian Monte Carlo (HMC) is a premier Markov Chain Monte Carlo (MCMC) algorithm for continuous target distributions. Its full potential can only be unleashed when its problem-dependent hyperparameters are tuned well. The adaptation of one such hyperparameter, trajectory length ($τ$), has been closely examined by many research programs with the No-U-Turn Sampler (NUTS) coming out as the prefer…

    Submitted 6 May, 2022; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: Improved exposition. Fixed Figure 2

  6. arXiv:2104.14421

    cs.LG stat.ML

    What Are Bayesian Neural Network Posteriors Really Like?

    Authors: Pavel Izmailov, Sharad Vikram, Matthew D. Hoffman, Andrew Gordon Wilson

    Abstract: The posterior over Bayesian neural network (BNN) parameters is extremely high-dimensional and non-convex. For computational reasons, researchers approximate this posterior using inexpensive mini-batch methods such as mean-field variational inference or stochastic-gradient Markov chain Monte Carlo (SGMCMC). To investigate foundational questions in Bayesian deep learning, we instead use full-batch H…

    Submitted 29 April, 2021; originally announced April 2021.

  7. arXiv:2011.03395

    cs.LG stat.ML

    Underspecification Presents Challenges for Credibility in Modern Machine Learning

    Authors: Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, et al. (15 additional authors not shown)

    Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predict…

    Submitted 24 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Updates: Updated statistical analysis in Section 6; Additional citations

  8. arXiv:2002.01184

    stat.CO cs.PL stat.ML

    tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware

    Authors: Junpeng Lao, Christopher Suter, Ian Langmore, Cyril Chimisov, Ashish Saxena, Pavel Sountsov, Dave Moore, Rif A. Saurous, Matthew D. Hoffman, Joshua V. Dillon

    Abstract: Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations…

    Submitted 4 February, 2020; originally announced February 2020.

    Comments: Based on extended abstract submitted to PROBPROG 2020

  9. arXiv:2001.05033

    stat.CO

    Hamiltonian Monte Carlo Swindles

    Authors: Dan Piponi, Matthew D. Hoffman, Pavel Sountsov

    Abstract: Hamiltonian Monte Carlo (HMC) is a powerful Markov chain Monte Carlo (MCMC) algorithm for estimating expectations with respect to continuous un-normalized probability distributions. MCMC estimators typically have higher variance than classical Monte Carlo with i.i.d. samples due to autocorrelations; most MCMC research tries to reduce these autocorrelations. In this work, we explore a complementary…

    Submitted 2 March, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: To be published in AISTATS 2020

  10. arXiv:1906.03028

    stat.ML cs.LG cs.PL

    Automatic Reparameterisation of Probabilistic Programs

    Authors: Maria I. Gorinova, Dave Moore, Matthew D. Hoffman

    Abstract: Probabilistic programming has emerged as a powerful paradigm in statistics, applied science, and machine learning: by decoupling modelling from inference, it promises to allow modellers to directly reason about the processes generating data. However, the performance of inference algorithms can be dramatically affected by the parameterisation used to express a model, requiring users to transform th…

    Submitted 7 June, 2019; originally announced June 2019.

  11. arXiv:1811.11926

    cs.LG cs.PL stat.ML

    Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language

    Authors: Matthew D. Hoffman, Matthew J. Johnson, Dustin Tran

    Abstract: Deriving conditional and marginal distributions using conjugacy relationships can be time consuming and error prone. In this paper, we propose a strategy for automating such derivations. Unlike previous systems which focus on relationships between pairs of random variables, our system (which we call Autoconj) operates directly on Python functions that compute log-joint distribution functions. Auto…

    Submitted 28 November, 2018; originally announced November 2018.

    Comments: Appears in Neural Information Processing Systems, 2018. Code available at https://github.com/google-research/autoconj

  12. arXiv:1810.06891

    cs.LG stat.ML

    The LORACs prior for VAEs: Letting the Trees Speak for the Data

    Authors: Sharad Vikram, Matthew D. Hoffman, Matthew J. Johnson

    Abstract: In variational autoencoders, the prior on the latent codes $z$ is often treated as an afterthought, but the prior shapes the kind of latent representation that the model learns. If the goal is to learn a representation that is interpretable and useful, then the prior should reflect the ways in which the high-level factors that describe the data vary. The "default" prior is an isotropic normal, but…

    Submitted 16 October, 2018; originally announced October 2018.

  13. arXiv:1809.04281

    cs.LG cs.SD eess.AS stat.ML

    Music Transformer

    Authors: Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck

    Abstract: Music relies heavily on repetition to build structure and meaning. Self-reference occurs on multiple timescales, from motifs to phrases to reusing of entire sections of music, such as in pieces with ABA structure. The Transformer (Vaswani et al., 2017), a sequence model based on self-attention, has achieved compelling results in many generation tasks that require maintaining long-range coherence.…

    Submitted 12 December, 2018; v1 submitted 12 September, 2018; originally announced September 2018.

    Comments: Improved skewing section and accompanying figures. Previous titles are "An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation" and "Music Transformer"

  14. arXiv:1802.05814

    stat.ML cs.IR cs.LG

    Variational Autoencoders for Collaborative Filtering

    Authors: Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, Tony Jebara

    Abstract: We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread…

    Submitted 15 February, 2018; originally announced February 2018.

    Comments: 10 pages, 3 figures. WWW 2018

  15. arXiv:1711.09268

    stat.ML cs.AI cs.LG

    Generalizing Hamiltonian Monte Carlo with Neural Networks

    Authors: Daniel Levy, Matthew D. Hoffman, Jascha Sohl-Dickstein

    Abstract: We present a general-purpose method to train Markov chain Monte Carlo kernels, parameterized by deep neural networks, that converge and mix quickly to their target distribution. Our method generalizes Hamiltonian Monte Carlo and is trained to maximize expected squared jumped distance, a proxy for mixing speed. We demonstrate large empirical gains on a collection of simple but challenging distribut…

    Submitted 2 March, 2018; v1 submitted 25 November, 2017; originally announced November 2017.

    Comments: ICLR 2018

  16. arXiv:1704.04997

    stat.ML cs.LG

    Multimodal Prediction and Personalization of Photo Edits with Deep Generative Models

    Authors: Ardavan Saeedi, Matthew D. Hoffman, Stephen J. DiVerdi, Asma Ghandeharioun, Matthew J. Johnson, Ryan P. Adams

    Abstract: Professional-grade software applications are powerful but complicated: expert users can achieve impressive results, but novices often struggle to complete even basic tasks. Photo editing is a prime example: after loading a photo, the user is confronted with an array of cryptic sliders like "clarity", "temp", and "highlights". An automatically generated suggestion could help, but there is no singl…

    Submitted 17 April, 2017; originally announced April 2017.

  17. arXiv:1704.04289

    stat.ML cs.LG

    Stochastic Gradient Descent as Approximate Bayesian Inference

    Authors: Stephan Mandt, Matthew D. Hoffman, David M. Blei

    Abstract: Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distribution…

    Submitted 19 January, 2018; v1 submitted 13 April, 2017; originally announced April 2017.

    Comments: 35 pages, published version (JMLR 2017)

    Journal ref: Journal of Machine Learning Research 18 (2017) 1-35

  18. arXiv:1701.03757

    stat.ML cs.AI cs.LG cs.PL stat.CO

    Deep Probabilistic Programming

    Authors: Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei

    Abstract: We propose Edward, a Turing-complete probabilistic programming language. Edward defines two compositional representations: random variables and inference. By treating inference as a first class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning. For flexibility, Edward makes it easy to fit the same…

    Submitted 7 March, 2017; v1 submitted 13 January, 2017; originally announced January 2017.

    Comments: Appears in International Conference on Learning Representations, 2017. A companion webpage for this paper is available at http://edwardlib.org/iclr2017

  19. arXiv:1602.02666

    stat.ML cs.LG

    A Variational Analysis of Stochastic Gradient Algorithms

    Authors: Stephan Mandt, Matthew D. Hoffman, David M. Blei

    Abstract: Stochastic Gradient Descent (SGD) is an important algorithm in machine learning. With constant learning rates, it is a stochastic process that, after an initial phase of convergence, generates samples from a stationary distribution. We show that SGD with constant rates can be effectively used as an approximate posterior inference algorithm for probabilistic modeling. Specifically, we show how to a…

    Submitted 8 February, 2016; originally announced February 2016.

    Comments: 8 pages, 3 figures

    Journal ref: International Conference on Machine Learning (ICML 2016), pp. 354-363

  20. arXiv:1505.07649

    stat.ML stat.AP

    A trust-region method for stochastic variational inference with applications to streaming data

    Authors: Lucas Theis, Matthew D. Hoffman

    Abstract: Stochastic variational inference allows for fast posterior inference in complex Bayesian models. However, the algorithm is prone to local optima which can make the quality of the posterior approximation sensitive to the choice of hyperparameters and initialization. We address this problem by replacing the natural gradient step of stochastic variational inference with a trust-region update. We show…

    Submitted 28 May, 2015; originally announced May 2015.

    Comments: in Proceedings of the 32nd International Conference on Machine Learning, 2015

  21. arXiv:1411.1804

    stat.ML cs.LG

    Beta Process Non-negative Matrix Factorization with Stochastic Structured Mean-Field Variational Inference

    Authors: Dawen Liang, Matthew D. Hoffman

    Abstract: The beta process is the standard nonparametric Bayesian prior for latent factor models. In this paper, we derive a structured mean-field variational inference algorithm for a beta process non-negative matrix factorization (NMF) model with Poisson likelihood. Unlike the linear Gaussian model, which is well-studied in the nonparametric Bayesian literature, the NMF model with a beta process prior does not enjoy…

    Submitted 2 December, 2014; v1 submitted 6 November, 2014; originally announced November 2014.

    Comments: 6 pages, 1 figure

  22. arXiv:1312.5857

    stat.ML cs.LG

    A Generative Product-of-Filters Model of Audio

    Authors: Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

    Abstract: We propose the product-of-filters (PoF) model, a generative model that decomposes audio spectra as sparse linear combinations of "filters" in the log-spectral domain. PoF makes similar assumptions to those used in the classic homomorphic filtering approach to signal processing, but replaces hand-designed decompositions built of basic signal processing operations with a learned decomposition based…

    Submitted 25 November, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

    Comments: ICLR 2014 conference-track submission. Added link to the source code

  23. arXiv:1111.4246

    stat.CO cs.LG

    The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo

    Authors: Matthew D. Hoffman, Andrew Gelman

    Abstract: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information. These features allow it to converge to high-dimensional target distributions much more quickly than simpler methods such as random walk Metro…

    Submitted 17 November, 2011; originally announced November 2011.

    Comments: 30 pages, 7 figures