Showing 1–50 of 120 results for author: Shih, D

  1. arXiv:2405.20407

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Convolutional L2LFlows: Generating Accurate Showers in Highly Granular Calorimeters Using Convolutional Normalizing Flows

    Authors: Thorsten Buss, Frank Gaede, Gregor Kasieczka, Claudius Krause, David Shih

    Abstract: In the quest to build generative surrogate models as computationally efficient alternatives to rule-based simulations, the quality of the generated samples remains a crucial frontier. So far, normalizing flows have been among the models with the best fidelity. However, as the latent space in such models is required to have the same dimensionality as the data space, scaling up normalizing flows to…

    Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Report number: HEPHY-ML-24-02

  2. arXiv:2405.12131

    astro-ph.GA hep-ph physics.data-an

    SkyCURTAINs: Model agnostic search for Stellar Streams with Gaia data

    Authors: Debajyoti Sengupta, Stephen Mulligan, David Shih, John Andrew Raine, Tobias Golling

    Abstract: We present SkyCURTAINs, a data driven and model agnostic method to search for stellar streams in the Milky Way galaxy using data from the Gaia telescope. SkyCURTAINs is a weakly supervised machine learning algorithm that builds a background enriched template in the signal region by leveraging the correlation of the source's characterising features with their proper motion in the sky. This allows f…

    Submitted 20 May, 2024; originally announced May 2024.

  3. arXiv:2404.18992

    hep-ph hep-ex physics.data-an physics.ins-det stat.ML

    Unifying Simulation and Inference with Normalizing Flows

    Authors: Haoxing Du, Claudius Krause, Vinicius Mikuni, Benjamin Nachman, Ian Pang, David Shih

    Abstract: There have been many applications of deep neural networks to detector calibrations and a growing number of studies that propose deep generative models as automated fast detector simulators. We show that these two tasks can be unified by using maximum likelihood estimation (MLE) from conditional generative models for energy regression. Unlike direct regression techniques, the MLE approach is prior-…

    Submitted 9 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 12 pages, 7 figures

    Report number: HEPHY-ML-24-01

  4. arXiv:2404.07258

    hep-ph hep-ex physics.data-an

    Complete Optimal Non-Resonant Anomaly Detection

    Authors: Gregor Kasieczka, John Andrew Raine, David Shih, Aman Upadhyay

    Abstract: We propose the first-ever complete, model-agnostic search strategy based on the optimal anomaly score, for new physics on the tails of distributions. Signal sensitivity is achieved via a classifier trained on auxiliary features in a weakly-supervised fashion, and backgrounds are predicted using the ABCD method in the classifier output and the primary tail feature. The independence between the clas…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 9 pages, 9 figures

  5. arXiv:2312.11629

    hep-ph cs.LG hep-ex physics.data-an

    Residual ANODE

    Authors: Ranit Das, Gregor Kasieczka, David Shih

    Abstract: We present R-ANODE, a new method for data-driven, model-agnostic resonant anomaly detection that raises the bar for both performance and interpretability. The key to R-ANODE is to enhance the inductive bias of the anomaly detection task by fitting a normalizing flow directly to the small and unknown signal component, while holding fixed a background model (also a normalizing flow) learned from sid…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 9 pages, 6 figures

  6. arXiv:2312.11618

    hep-ph astro-ph.IM hep-ex physics.data-an physics.ins-det

    Anomaly detection with flow-based fast calorimeter simulators

    Authors: Claudius Krause, Benjamin Nachman, Ian Pang, David Shih, Yunhao Zhu

    Abstract: Recently, several normalizing flow-based deep generative models have been proposed to accelerate the simulation of calorimeter showers. Using CaloFlow as an example, we show that these models can simultaneously perform unsupervised anomaly detection with no additional training cost. As a demonstration, we consider electromagnetic showers initiated by one (background) or multiple (signal) photons.…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 12 pages, 6 figures

  7. arXiv:2312.09290

    hep-ph

    Normalizing Flows for High-Dimensional Detector Simulations

    Authors: Florian Ernst, Luigi Favaro, Claudius Krause, Tilman Plehn, David Shih

    Abstract: Whenever invertible generative networks are needed for LHC physics, normalizing flows show excellent performance. A challenge is their scaling to high-dimensional phase spaces. We investigate their performance for fast calorimeter shower simulations with increasing phase space dimension. In addition to the standard architecture we also employ a VAE to compress the dimensionality. Our study provide…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: 24 pages, 9 figures, 5 tables

  8. arXiv:2312.00123

    hep-ph cs.LG hep-ex physics.data-an

    Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information

    Authors: Joschka Birk, Erik Buhmann, Cedric Ewen, Gregor Kasieczka, David Shih

    Abstract: We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a g…

    Submitted 30 November, 2023; originally announced December 2023.

  9. arXiv:2310.12209

    astro-ph.IM astro-ph.HE cs.LG gr-qc hep-ph

    Fast Parameter Inference on Pulsar Timing Arrays with Normalizing Flows

    Authors: David Shih, Marat Freytsis, Stephen R. Taylor, Jeff A. Dror, Nolan Smyth

    Abstract: Pulsar timing arrays (PTAs) perform Bayesian posterior inference with expensive MCMC methods. Given a dataset of ~10-100 pulsars and O(10^3) timing residuals each, producing a posterior distribution for the stochastic gravitational wave background (SGWB) can take days to a week. The computational bottleneck arises because the likelihood evaluation required for MCMC is extremely costly when conside…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 8 pages, 3 figures

  10. arXiv:2310.06897

    hep-ph hep-ex physics.data-an

    Full Phase Space Resonant Anomaly Detection

    Authors: Erik Buhmann, Cedric Ewen, Gregor Kasieczka, Vinicius Mikuni, Benjamin Nachman, David Shih

    Abstract: Physics beyond the Standard Model that is resonant in one or more dimensions has been a longstanding focus of countless searches at colliders and beyond. Recently, many new strategies for resonant anomaly detection have been developed, where sideband information can be used in conjunction with modern machine learning, in order to generate synthetic datasets representing the Standard Model backgrou…

    Submitted 9 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: 10 pages, 7 figures

    Journal ref: Phys. Rev. D 109, 055015 (2024)

  11. arXiv:2310.00049

    hep-ph cs.LG

    EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion

    Authors: Erik Buhmann, Cedric Ewen, Darius A. Faroughy, Tobias Golling, Gregor Kasieczka, Matthew Leigh, Guillaume Quétant, John Andrew Raine, Debajyoti Sengupta, David Shih

    Abstract: Jets at the LHC, typically consisting of a large number of highly correlated particles, are a fascinating laboratory for deep generative modeling. In this paper, we present two novel methods that generate LHC jets as point clouds efficiently and accurately. We introduce EPiC-JeDi, which combines score-matching diffusion models with the Equivariant Point Cloud (EPiC) architecture based on the deep s…

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: 21 pages, 8 figures

  12. arXiv:2309.13111

    hep-ph hep-ex physics.data-an

    Back To The Roots: Tree-Based Algorithms for Weakly Supervised Anomaly Detection

    Authors: Thorben Finke, Marie Hein, Gregor Kasieczka, Michael Krämer, Alexander Mück, Parada Prangchaikul, Tobias Quadfasel, David Shih, Manuel Sommerhalder

    Abstract: Weakly supervised methods have emerged as a powerful tool for model-agnostic anomaly detection at the Large Hadron Collider (LHC). While these methods have shown remarkable performance on specific signatures such as di-jet resonances, their application in a more model-agnostic manner requires dealing with a larger number of potentially noisy input features. In this paper, we show that using booste…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 11 pages, 9 figures

    Report number: TTK-23-26

  13. Combining Resonant and Tail-based Anomaly Detection

    Authors: Gerrit Bickendorf, Manuel Drees, Gregor Kasieczka, Claudius Krause, David Shih

    Abstract: In many well-motivated models of the electroweak scale, cascade decays of new particles can result in highly boosted hadronic resonances (e.g. $Z/W/h$). This can make these models rich and promising targets for recently developed resonant anomaly detection methods powered by modern machine learning. We demonstrate this using the state-of-the-art CATHODE method applied to supersymmetry scenarios wi…

    Submitted 28 May, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 13 pages, 15 figures

  14. arXiv:2308.11700

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Calorimeter shower superresolution

    Authors: Ian Pang, John Andrew Raine, David Shih

    Abstract: Calorimeter shower simulation is a major bottleneck in the Large Hadron Collider computational pipeline. There have been recent efforts to employ deep-generative surrogate models to overcome this challenge. However, many of the best-performing models have training and generation times that do not scale well to high-dimensional calorimeter showers. In this work, we introduce SuperCalo, a flow-based sup…

    Submitted 15 May, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 16 pages, 13 figures, v3: title changed, matches published version

    Journal ref: Phys. Rev. D 109, 092009 (2024)

  15. arXiv:2307.11157

    hep-ph hep-ex physics.data-an

    The Interplay of Machine Learning--based Resonant Anomaly Detection Methods

    Authors: Tobias Golling, Gregor Kasieczka, Claudius Krause, Radha Mastandrea, Benjamin Nachman, John Andrew Raine, Debajyoti Sengupta, David Shih, Manuel Sommerhalder

    Abstract: Machine learning--based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal…

    Submitted 14 March, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: 27 pages, 21 figures. Updated with revisions for journal acceptance

  16. arXiv:2307.08738

    astro-ph.GA

    Discovery and Characterization of Two Ultra Faint-Dwarfs Outside the Halo of the Milky Way: Leo M and Leo K

    Authors: Kristen B. W. McQuinn, Yao-Yuan Mao, Erik J. Tollerud, Roger E. Cohen, David Shih, Matthew R. Buckley, Andrew E. Dolphin

    Abstract: We report the discovery of two ultra-faint dwarf galaxies, Leo M and Leo K, that lie outside the halo of the Milky Way. Using Hubble Space Telescope imaging of the resolved stars, we create color-magnitude diagrams reaching the old main sequence turn-off of each system and (i) fit for structural parameters of the galaxies; (ii) measure their distances using the luminosity of the Horizontal Branch…

    Submitted 19 May, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 12 pages, 9 figures, 1 table

  17. arXiv:2307.08593

    physics.acc-ph cs.LG hep-ex nucl-ex nucl-th

    Artificial Intelligence for the Electron Ion Collider (AI4EIC)

    Authors: C. Allaire, R. Ammendola, E. -C. Aschenauer, M. Balandat, M. Battaglieri, J. Bernauer, M. Bondì, N. Branson, T. Britton, A. Butter, I. Chahrour, P. Chatagnon, E. Cisbani, E. W. Cline, S. Dash, C. Dean, W. Deconinck, A. Deshpande, M. Diefenthaler, R. Ent, C. Fanelli, M. Finger, M. Finger, Jr., E. Fol, S. Furletov, et al. (70 additional authors not shown)

    Abstract: The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 27 pages, 11 figures, AI4EIC workshop, tutorials and hackathon

  18. How to Understand Limitations of Generative Networks

    Authors: Ranit Das, Luigi Favaro, Theo Heimel, Claudius Krause, Tilman Plehn, David Shih

    Abstract: Well-trained classifiers and their complete weight distributions provide us with a well-motivated and practicable method to test generative networks in particle physics. We illustrate their benefits for distribution-shifted jets, calorimeter showers, and reconstruction-level events. In all cases, the classifier weights make for a powerful test of the generative network, identify potential problems…

    Submitted 7 December, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 32 pages, 19 figures

    Journal ref: SciPost Phys. 16, 031 (2024)

  19. arXiv:2305.13358

    astro-ph.GA hep-ph

    Mapping Dark Matter in the Milky Way using Normalizing Flows and Gaia DR3

    Authors: Sung Hak Lim, Eric Putney, Matthew R. Buckley, David Shih

    Abstract: We present a novel, data-driven analysis of Galactic dynamics, using unsupervised machine learning -- in the form of density estimation with normalizing flows -- to learn the underlying phase space distribution of 6 million nearby stars from the Gaia DR3 catalog. Solving the collisionless Boltzmann equation with the assumption of approximate equilibrium, we calculate -- for the first time ever --…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 19 pages, 13 figures, 3 tables

  20. arXiv:2305.11934

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Inductive Simulation of Calorimeter Showers with Normalizing Flows

    Authors: Matthew R. Buckley, Claudius Krause, Ian Pang, David Shih

    Abstract: Simulating particle detector response is the single most expensive step in the Large Hadron Collider computational pipeline. Recently it was shown that normalizing flows can accelerate this process while achieving unprecedented levels of accuracy, but scaling this approach up to higher resolutions relevant for future detector upgrades leads to prohibitive memory constraints. To overcome this probl…

    Submitted 13 February, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 19 pages, 15 figures; v2: title changed, matches published version

    Journal ref: Phys. Rev. D 109, 033006 (2024)

  21. arXiv:2305.03761

    astro-ph.GA cs.LG hep-ph physics.data-an

    Weakly-Supervised Anomaly Detection in the Milky Way

    Authors: Mariel Pettee, Sowmya Thanvantri, Benjamin Nachman, David Shih, Matthew R. Buckley, Jack H. Collins

    Abstract: Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satelli…

    Submitted 5 May, 2023; originally announced May 2023.

  22. arXiv:2303.01529

    astro-ph.GA hep-ph

    Via Machinae 2.0: Full-Sky, Model-Agnostic Search for Stellar Streams in Gaia DR2

    Authors: David Shih, Matthew R. Buckley, Lina Necib

    Abstract: We present an update to Via Machinae, an automated stellar stream-finding algorithm based on the deep learning anomaly detector ANODE. Via Machinae identifies stellar streams within Gaia, using only angular positions, proper motions, and photometry, without reference to a model of the Milky Way potential for orbit integration or stellar distances. This new version, Via Machinae 2.0, includes many…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 22 pages, 24 figures

  23. arXiv:2302.11594

    physics.ins-det hep-ex hep-ph physics.data-an

    L2LFlows: Generating High-Fidelity 3D Calorimeter Images

    Authors: Sascha Diefenbacher, Engin Eren, Frank Gaede, Gregor Kasieczka, Claudius Krause, Imahn Shekhzadeh, David Shih

    Abstract: We explore the use of normalizing flows to emulate Monte Carlo detector simulations of photon showers in a high-granularity electromagnetic calorimeter prototype for the International Large Detector (ILD). Our proposed method -- which we refer to as "Layer-to-Layer-Flows" (L2LFlows) -- is an evolution of the CaloFlow architecture adapted to a higher-dimensional setting (30 layers of…

    Submitted 20 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: v2: 28 pages, 13 figures; matches version accepted for publication in JINST. Neither SISSA Medialab Srl nor IOP Publishing Ltd is responsible for any errors or omissions in this version of the manuscript or any version derived from it. Published version available via DOI

    Journal ref: 2023 JINST 18 P10017

  24. Pegasus W: An Ultra-Faint Dwarf Galaxy Outside the Halo of M31 Not Quenched by Reionization

    Authors: Kristen B. W. McQuinn, Yao-Yuan Mao, Matthew R. Buckley, David Shih, Roger E. Cohen, Andrew E. Dolphin

    Abstract: We report the discovery of an ultrafaint dwarf (UFD) galaxy, Pegasus W, located on the far side of the Milky Way-M31 system and outside the virial radius of M31. The distance to the galaxy is 915 (+60/-91) kpc, measured using the luminosity of horizontal branch (HB) stars identified in Hubble Space Telescope optical imaging. The galaxy has a half-light radius (r_h) of 100 (+11/-13) pc, M_V = -7.20…

    Submitted 24 January, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

    Comments: 15 pages, 10 figures, 2 tables

  25. arXiv:2212.00046

    hep-ph cs.LG hep-ex physics.data-an

    Feature Selection with Distance Correlation

    Authors: Ranit Das, Gregor Kasieczka, David Shih

    Abstract: Choosing which properties of the data to use as input to multivariate decision algorithms -- a.k.a. feature selection -- is an important step in solving any problem with machine learning. While there is a clear trend towards training sophisticated deep networks on large numbers of relatively unprocessed inputs (so-called automated feature engineering), for many tasks in physics, sets of theoretica…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: 14 pages, 8 figures, 3 tables

  26. arXiv:2211.11765

    astro-ph.GA astro-ph.IM hep-ph

    GalaxyFlow: Upsampling Hydrodynamical Simulations for Realistic Gaia Mock Catalogs

    Authors: Sung Hak Lim, Kailash A. Raman, Matthew R. Buckley, David Shih

    Abstract: Cosmological N-body simulations of galaxies operate at the level of "star particles" with a mass resolution on the scale of thousands of solar masses. Turning these simulations into stellar mock catalogs requires "upsampling" the star particles into individual stars following the same phase-space density. In this paper, we demonstrate that normalizing flows provide a viable upsampling method that…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: 17 pages, 11 figures

  27. arXiv:2210.14924

    hep-ph hep-ex physics.data-an

    Resonant anomaly detection without background sculpting

    Authors: Anna Hallin, Gregor Kasieczka, Tobias Quadfasel, David Shih, Manuel Sommerhalder

    Abstract: We introduce a new technique named Latent CATHODE (LaCATHODE) for performing "enhanced bump hunts", a type of resonant anomaly search that combines conventional one-dimensional bump hunts with a model-agnostic anomaly score in an auxiliary feature space where potential signals could also be localized. The main advantage of LaCATHODE over existing methods is that it provides an anomaly score that i…

    Submitted 10 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: 11 pages, 8 figures; v2 (published version): referencing code and minor style updates

    Journal ref: Phys. Rev. D 107, 114012 (2023)

  28. arXiv:2210.14245

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    CaloFlow for CaloChallenge Dataset 1

    Authors: Claudius Krause, Ian Pang, David Shih

    Abstract: CaloFlow is a new and promising approach to fast calorimeter simulation based on normalizing flows. Applying CaloFlow to the photon and charged pion Geant4 showers of Dataset 1 of the Fast Calorimeter Simulation Challenge 2022, we show how it can produce high-fidelity samples with a sampling time that is several orders of magnitude faster than Geant4. We demonstrate the fidelity of the samples usi…

    Submitted 15 May, 2024; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 36 pages, 21 figures, v3: match published version

    Journal ref: SciPost Phys. 16, 126 (2024)

  29. arXiv:2209.06225

    hep-ph hep-ex physics.data-an

    Anomaly Detection under Coordinate Transformations

    Authors: Gregor Kasieczka, Radha Mastandrea, Vinicius Mikuni, Benjamin Nachman, Mariel Pettee, David Shih

    Abstract: There is a growing need for machine learning-based anomaly detection strategies to broaden the search for Beyond-the-Standard-Model (BSM) physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: 10 pages, 6 figures

  30. arXiv:2209.05518

    hep-ph hep-ex

    VBF vs. GGF Higgs with Full-Event Deep Learning: Towards a Decay-Agnostic Tagger

    Authors: Cheng-Wei Chiang, David Shih, Shang-Fu Wei

    Abstract: We study the benefits of jet- and event-level deep learning methods in distinguishing vector boson fusion (VBF) from gluon-gluon fusion (GGF) Higgs production at the LHC. We show that a variety of classifiers (CNNs, attention-based networks) trained on the complete low-level inputs of the full event achieve significant performance gains over shallow machine learning methods (BDTs) trained on jet k…

    Submitted 4 November, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: 21 pages+appendices, 16 figures; added references, updated Pythia shower scheme for VBF, and added Appendix C for version 2

  31. arXiv:2205.01129

    astro-ph.GA hep-ph

    Measuring Galactic Dark Matter through Unsupervised Machine Learning

    Authors: Matthew R Buckley, Sung Hak Lim, Eric Putney, David Shih

    Abstract: Measuring the density profile of dark matter in the Solar neighborhood has important implications for both dark matter theory and experiment. In this work, we apply autoregressive flows to stars from a realistic simulation of a Milky Way-type galaxy to learn -- in an unsupervised way -- the stellar phase space density and its derivatives. With these as inputs, and under the assumption of dynamic e…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: 23 pages, 9 figures

  32. arXiv:2203.08806

    hep-ph cs.LG hep-ex physics.comp-ph physics.ins-det

    New directions for surrogate models and differentiable programming for High Energy Physics detector simulation

    Authors: Andreas Adelmann, Walter Hopkins, Evangelos Kourlitis, Michael Kagan, Gregor Kasieczka, Claudius Krause, David Shih, Vinicius Mikuni, Benjamin Nachman, Kevin Pedro, Daniel Winklehner

    Abstract: The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the current available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, pr…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: contribution to Snowmass 2021

    Report number: FERMILAB-CONF-22-199-SCD

  33. Machine Learning and LHC Event Generation

    Authors: Anja Butter, Tilman Plehn, Steffen Schumann, Simon Badger, Sascha Caron, Kyle Cranmer, Francesco Armando Di Bello, Etienne Dreyer, Stefano Forte, Sanmay Ganguly, Dorival Gonçalves, Eilam Gross, Theo Heimel, Gudrun Heinrich, Lukas Heinrich, Alexander Held, Stefan Höche, Jessica N. Howard, Philip Ilten, Joshua Isaacson, Timo Janßen, Stephen Jones, Marumi Kado, Michael Kagan, Gregor Kasieczka, et al. (26 additional authors not shown)

    Abstract: First-principle simulations are at the heart of the high-energy physics research program. They link the vast data output of multi-purpose detectors with fundamental theory predictions and interpretation. This review illustrates a wide range of applications of modern machine learning to event generation and simulation-based inference, including conceptual developments driven by the specific requi…

    Submitted 28 December, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Review article based on a Snowmass 2021 contribution

    Journal ref: SciPost Phys. 14, 079 (2023)

  34. arXiv:2202.09375

    hep-ph hep-ex physics.data-an

    Ephemeral Learning -- Augmenting Triggers with Online-Trained Normalizing Flows

    Authors: Anja Butter, Sascha Diefenbacher, Gregor Kasieczka, Benjamin Nachman, Tilman Plehn, David Shih, Ramon Winterhalder

    Abstract: The large data rates at the LHC require an online trigger system to select relevant collisions. Rather than compressing individual events, we propose to compress an entire data set at once. We use a normalizing flow as a deep generative model to learn the probability density of the data online. The events are then represented by the generative neural network and can be inspected offline for anomal…

    Submitted 28 June, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: 17 pages, 9 figures, minor changes to text, addressed referee comments

    Report number: CP3-22-10

    Journal ref: SciPost Phys. 13, 087 (2022)

  35. Dark Photons and Displaced Vertices at the MUonE Experiment

    Authors: Iftah Galon, David Shih, Isaac R. Wang

    Abstract: MUonE is a proposed experiment designed to measure the hadronic vacuum polarization contribution to muon $g-2$ through elastic $μ-e$ scattering. As such it employs an extremely high-resolution tracking apparatus. We point out that this makes MUonE also a very promising experiment to search for displaced vertices from light, weakly-interacting new particles. We demonstrate its potential by showing…

    Submitted 5 May, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: PRD version, 10 pages, 5 figures

  36. Resolving Combinatorial Ambiguities in Dilepton $t \bar t$ Event Topologies with Neural Networks

    Authors: Haider Alhazmi, Zhongtian Dong, Li Huang, Jeong Han Kim, Kyoungchul Kong, David Shih

    Abstract: We study the potential of deep learning to resolve the combinatorial problem in SUSY-like events with two invisible particles at the LHC. As a concrete example, we focus on dileptonic $t \bar t$ events, where the combinatorial problem becomes an issue of binary classification: pairing the correct lepton with each $b$ quark coming from the decays of the tops. We investigate the performance of a num…

    Submitted 27 June, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: 22 pages, 15 figures, 1 table, matches the published version

    Journal ref: Phys.Rev.D 105 (2022) 11, 115011

  37. arXiv:2112.03769

    hep-ph hep-ex physics.data-an stat.ML

    Machine Learning in the Search for New Fundamental Physics

    Authors: Georgia Karagiorgi, Gregor Kasieczka, Scott Kravitz, Benjamin Nachman, David Shih

    Abstract: Machine learning plays a crucial role in enhancing and accelerating the search for new fundamental physics. We review the state of machine learning methods and applications for new physics searches in the context of terrestrial high energy physics experiments, including the Large Hadron Collider, rare event searches, and neutrino experiments. While machine learning has a long history in these fiel…

    Submitted 7 December, 2021; originally announced December 2021.

    Comments: Preprint of article submitted to Nature Reviews Physics, 19 pages, 1 figure

  38. arXiv:2111.06417

    cs.LG hep-ex hep-ph physics.acc-ph physics.data-an

    Online-compatible Unsupervised Non-resonant Anomaly Detection

    Authors: Vinicius Mikuni, Benjamin Nachman, David Shih

    Abstract: There is a growing need for anomaly detection methods that can broaden the search for new particles in a model-agnostic manner. Most proposals for new methods focus exclusively on signal sensitivity. However, it is not enough to select anomalous events - there must also be a strategy to provide context to the selected events. We propose the first complete strategy for unsupervised detection of non…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 9 pages, 3 figures

  39. arXiv:2110.11377

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing Flows

    Authors: Claudius Krause, David Shih

    Abstract: Recently, we introduced CaloFlow, a high-fidelity generative model for GEANT4 calorimeter shower emulation based on normalizing flows. Here, we present CaloFlow v2, an improvement on our original framework that speeds up shower generation by a further factor of 500 relative to the original. The improvement is based on a technique called Probability Density Distillation, originally developed for sp…

    Submitted 5 May, 2023; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 24 pages, 15 figures, 4 tables; v2: matches accepted version

  40. arXiv:2109.00546  [pdf, other]

    hep-ph hep-ex physics.data-an

    Classifying Anomalies THrough Outer Density Estimation (CATHODE)

    Authors: Anna Hallin, Joshua Isaacson, Gregor Kasieczka, Claudius Krause, Benjamin Nachman, Tobias Quadfasel, Matthias Schlaffer, David Shih, Manuel Sommerhalder

    Abstract: We propose a new model-agnostic search strategy for physics beyond the standard model (BSM) at the LHC, based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes the BSM signal is localized in a signal region (defined e.g. using invariant mass). By training a conditional dens…

    Submitted 11 September, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: 17 pages, 12 figures; v2: minor updates; v3 (published version): added study of background sculpting and minor fixes

    Report number: EFI-20-5, FERMILAB-PUB-21-389-T

    Journal ref: Phys. Rev. D 106, 055006 (2022)

  41. arXiv:2107.02821  [pdf, other]

    stat.ML cs.LG hep-ex hep-ph

    New Methods and Datasets for Group Anomaly Detection From Fundamental Physics

    Authors: Gregor Kasieczka, Benjamin Nachman, David Shih

    Abstract: The identification of anomalous overdensities in data - group or collective anomaly detection - is a rich problem with a large number of real world applications. However, it has received relatively little attention in the broader ML community, as compared to point anomalies or other types of single instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: Accepted for ANDEA (Anomaly and Novelty Detection, Explanation and Accommodation) Workshop at KDD 2021

  42. arXiv:2106.05285  [pdf, other]

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    CaloFlow: Fast and Accurate Generation of Calorimeter Showers with Normalizing Flows

    Authors: Claudius Krause, David Shih

    Abstract: We introduce CaloFlow, a fast detector simulation framework based on normalizing flows. For the first time, we demonstrate that normalizing flows can reproduce many-channel calorimeter showers with extremely high fidelity, providing a fresh alternative to computationally expensive GEANT4 simulations, as well as other state-of-the-art fast simulation frameworks based on GANs and VAEs. Besides the u…

    Submitted 5 May, 2023; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 33 pages, 19 figures, 5 tables; v2: improved handling of datasets, conclusions unchanged; v3: matches accepted version

  43. arXiv:2104.12789  [pdf, other]

    astro-ph.GA hep-ph physics.data-an

    Via Machinae: Searching for Stellar Streams using Unsupervised Machine Learning

    Authors: David Shih, Matthew R. Buckley, Lina Necib, John Tamanas

    Abstract: We develop a new machine learning algorithm, Via Machinae, to identify cold stellar streams in data from the Gaia telescope. Via Machinae is based on ANODE, a general method that uses conditional density estimation and sideband interpolation to detect local overdensities in the data in a model agnostic way. By applying ANODE to the positions, proper motions, and photometry of stars observed by Gai…

    Submitted 28 December, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: 17 pages, 17 figures, v2: references added, minor corrections, v3: published version

  44. arXiv:2104.02092  [pdf, other]

    hep-ph hep-ex physics.data-an stat.ML

    Comparing Weak- and Unsupervised Methods for Resonant Anomaly Detection

    Authors: Jack H. Collins, Pablo Martín-Ramiro, Benjamin Nachman, David Shih

    Abstract: Anomaly detection techniques are growing in importance at the Large Hadron Collider (LHC), motivated by the increasing need to search for new physics in a model-agnostic way. In this work, we provide a detailed comparative study between a well-studied unsupervised method called the autoencoder (AE) and a weakly-supervised approach based on the Classification Without Labels (CWoLa) technique. We ex…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: 39 pages, 17 figures

  45. arXiv:2101.08320  [pdf, other]

    hep-ph hep-ex physics.data-an

    The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics

    Authors: Gregor Kasieczka, Benjamin Nachman, David Shih, Oz Amram, Anders Andreassen, Kees Benkendorfer, Blaz Bortolato, Gustaaf Brooijmans, Florencia Canelli, Jack H. Collins, Biwei Dai, Felipe F. De Freitas, Barry M. Dillon, Ioan-Mihail Dinu, Zhongtian Dong, Julien Donini, Javier Duarte, D. A. Faroughy, Julia Gonski, Philip Harris, Alan Kahn, Jernej F. Kamenik, Charanjit K. Khosa, Patrick Komiske, Luc Le Pottier, et al. (22 additional authors not shown)

    Abstract: A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a…

    Submitted 20 January, 2021; originally announced January 2021.

    Comments: 108 pages, 53 figures, 3 tables

  46. arXiv:2009.03796  [pdf, other]

    hep-ph hep-ex physics.data-an physics.ins-det stat.ML

    DCTRGAN: Improving the Precision of Generative Models with Reweighting

    Authors: Sascha Diefenbacher, Engin Eren, Gregor Kasieczka, Anatolii Korol, Benjamin Nachman, David Shih

    Abstract: Significant advances in deep learning have led to more widely used and precise neural network-based generative models such as Generative Adversarial Networks (GANs). We introduce a post-hoc correction to deep generative models to further improve their fidelity, based on the Deep neural networks using the Classification for Tuning and Reweighting (DCTR) protocol. The correction takes the form of a…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 14 pages, 8 figures

  47. arXiv:2007.14400  [pdf, other]

    hep-ph hep-ex physics.data-an

    ABCDisCo: Automating the ABCD Method with Machine Learning

    Authors: Gregor Kasieczka, Benjamin Nachman, Matthew D. Schwartz, David Shih

    Abstract: The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically-independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and p…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: 37 pages, 12 figures

    Journal ref: Phys. Rev. D 103, 035021 (2021)

  48. A Complete Framework for Tau Polarimetry in $B\to D^{(*)}τν$ Decays

    Authors: Pouya Asadi, Anna Hallin, Jorge Martin Camalich, David Shih, Susanne Westhoff

    Abstract: The meson decays $B\to Dτν$ and $B\to D^* τν$ are sensitive probes of the $b\to cτν$ transition. In this work we present a complete framework to obtain the maximum information on the physics of $B\to D^{(*)}τν$ with polarized $τ$ leptons and unpolarized $D^{(*)}$ mesons. Focusing on the hadronic decays $τ\to πν$ and $τ\to ρν$, we show how to extract seven $τ$ asymmetries from a fully differential a…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: 20 pages + appendices, 5 figures

  49. arXiv:2003.09517  [pdf, other]

    hep-ph hep-ex

    Strange Jet Tagging

    Authors: Yuichiro Nakai, David Shih, Scott Thomas

    Abstract: Tagging jets of strongly interacting particles initiated by energetic strange quarks is one of the few largely unexplored Standard Model object classification problems remaining in high energy collider physics. In this paper we investigate the purest version of this classification problem in the form of distinguishing strange-quark jets from down-quark jets. Our strategy relies on the fact that a…

    Submitted 20 March, 2020; originally announced March 2020.

    Comments: 38 pages, 23 figures

  50. arXiv:2001.05310  [pdf, other]

    hep-ph hep-ex physics.data-an

    DisCo Fever: Robust Networks Through Distance Correlation

    Authors: Gregor Kasieczka, David Shih

    Abstract: While deep learning has proven to be extremely successful at supervised classification tasks at the LHC and beyond, for practical applications, raw classification accuracy is often not the only consideration. One crucial issue is the stability of network predictions, either versus changes of individual features of the input data, or against systematic perturbations. We present a new method based o…

    Submitted 30 September, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

    Comments: 9 pages, v2: essentially the journal version (refs added, typos fixed, minor improvements)

    Journal ref: Phys. Rev. Lett. 125, 122001 (2020)