arXiv:2405.20407 [pdf, other]

Convolutional L2LFlows: Generating Accurate Showers in Highly Granular Calorimeters Using Convolutional Normalizing Flows

Authors: Thorsten Buss, Frank Gaede, Gregor Kasieczka, Claudius Krause, David Shih

Abstract: In the quest to build generative surrogate models as computationally efficient alternatives to rule-based simulations, the quality of the generated samples remains a crucial frontier. So far, normalizing flows have been among the models with the best fidelity. However, as the latent space in such models is required to have the same dimensionality as the data space, scaling up normalizing flows to high-dimensional datasets is not straightforward. The prior L2LFlows approach successfully used a series of separate normalizing flows and a sequence of conditioning steps to circumvent this problem. In this work, we extend L2LFlows to simulate showers with a 9-times larger profile in the lateral direction. To achieve this, we introduce convolutional layers and U-Net-type connections, move from masked autoregressive flows to coupling layers, and demonstrate the successful modelling of showers in the ILD Electromagnetic Calorimeter as well as Dataset 3 from the public CaloChallenge dataset.

Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Report number: HEPHY-ML-24-02

arXiv:2405.12131 [pdf, other]

SkyCURTAINs: Model agnostic search for Stellar Streams with Gaia data

Authors: Debajyoti Sengupta, Stephen Mulligan, David Shih, John Andrew Raine, Tobias Golling

Abstract: We present SkyCURTAINs, a data driven and model agnostic method to search for stellar streams in the Milky Way galaxy using data from the Gaia telescope. SkyCURTAINs is a weakly supervised machine learning algorithm that builds a background enriched template in the signal region by leveraging the correlation of the source's characterising features with their proper motion in the sky. This allows for a more representative template of the background in the signal region, and reduces the false positives in the search for stellar streams. The minimal model assumptions in the SkyCURTAINs method allow for a flexible and efficient search for various kinds of anomalies such as streams, globular clusters, or dwarf galaxies directly from the data. We test the performance of SkyCURTAINs on the GD-1 stream and show that it is able to recover the stream with a purity of 75.4% which is an improvement of over 10% over existing machine learning based methods while retaining a signal efficiency of 37.9%.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2404.18992 [pdf, other]

Unifying Simulation and Inference with Normalizing Flows

Authors: Haoxing Du, Claudius Krause, Vinicius Mikuni, Benjamin Nachman, Ian Pang, David Shih

Abstract: There have been many applications of deep neural networks to detector calibrations and a growing number of studies that propose deep generative models as automated fast detector simulators. We show that these two tasks can be unified by using maximum likelihood estimation (MLE) from conditional generative models for energy regression. Unlike direct regression techniques, the MLE approach is prior-independent and non-Gaussian resolutions can be determined from the shape of the likelihood near the maximum. Using an ATLAS-like calorimeter simulation, we demonstrate this concept in the context of calorimeter energy calibration.

Submitted 9 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: 12 pages, 7 figures

Report number: HEPHY-ML-24-01

arXiv:2404.07258 [pdf, other]

Complete Optimal Non-Resonant Anomaly Detection

Authors: Gregor Kasieczka, John Andrew Raine, David Shih, Aman Upadhyay

Abstract: We propose the first-ever complete, model-agnostic search strategy based on the optimal anomaly score, for new physics on the tails of distributions. Signal sensitivity is achieved via a classifier trained on auxiliary features in a weakly-supervised fashion, and backgrounds are predicted using the ABCD method in the classifier output and the primary tail feature. The independence between the classifier output and the tail feature required for ABCD is achieved by first training a conditional normalizing flow that yields a decorrelated version of the auxiliary features; the classifier is then trained on these features. Both the signal sensitivity and background prediction require a sample of events accurately approximating the SM background; we assume this can be furnished by closely related control processes in the data or by accurate simulations, as is the case in countless conventional analyses. The viability of our approach is demonstrated for signatures consisting of (mono)jets and missing transverse energy, where the main SM background is $Z(\nu\nu)+\text{jets}$, and the data-driven control process is $\gamma+\text{jets}$.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 9 pages, 9 figures

arXiv:2312.11629 [pdf, other]

Residual ANODE

Authors: Ranit Das, Gregor Kasieczka, David Shih

Abstract: We present R-ANODE, a new method for data-driven, model-agnostic resonant anomaly detection that raises the bar for both performance and interpretability. The key to R-ANODE is to enhance the inductive bias of the anomaly detection task by fitting a normalizing flow directly to the small and unknown signal component, while holding fixed a background model (also a normalizing flow) learned from sidebands. In doing so, R-ANODE is able to outperform all classifier-based, weakly-supervised approaches, as well as the previous ANODE method which fit a density estimator to all of the data in the signal region instead of just the signal. We show that the method works equally well whether the unknown signal fraction is learned or fixed, and is even robust to signal fraction misspecification. Finally, with the learned signal model we can sample and gain qualitative insights into the underlying anomaly, which greatly enhances the interpretability of resonant anomaly detection and offers the possibility of simultaneously discovering and characterizing the new physics that could be hiding in the data.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 9 pages, 6 figures

arXiv:2312.11618 [pdf, other]

Anomaly detection with flow-based fast calorimeter simulators

Authors: Claudius Krause, Benjamin Nachman, Ian Pang, David Shih, Yunhao Zhu

Abstract: Recently, several normalizing flow-based deep generative models have been proposed to accelerate the simulation of calorimeter showers. Using CaloFlow as an example, we show that these models can simultaneously perform unsupervised anomaly detection with no additional training cost. As a demonstration, we consider electromagnetic showers initiated by one (background) or multiple (signal) photons. The CaloFlow model is designed to generate single photon showers, but it also provides access to the shower likelihood. We use this likelihood as an anomaly score and study the showers tagged as being unlikely. As expected, the tagger struggles when the signal photons are nearly collinear, but is otherwise effective. This approach is complementary to a supervised classifier trained on only specific signal models using the same low-level calorimeter inputs. While the supervised classifier is also highly effective at unseen signal models, the unsupervised method is more sensitive in certain regions and thus we expect that the ultimate performance will require a combination of approaches.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 12 pages, 6 figures

arXiv:2312.09290 [pdf, other]

Normalizing Flows for High-Dimensional Detector Simulations

Authors: Florian Ernst, Luigi Favaro, Claudius Krause, Tilman Plehn, David Shih

Abstract: Whenever invertible generative networks are needed for LHC physics, normalizing flows show excellent performance. A challenge is their scaling to high-dimensional phase spaces. We investigate their performance for fast calorimeter shower simulations with increasing phase space dimension. In addition to the standard architecture we also employ a VAE to compress the dimensionality. Our study provides benchmarks for invertible networks applied to the CaloChallenge.

Submitted 14 December, 2023; originally announced December 2023.

Comments: 24 pages, 9 figures, 5 tables

arXiv:2312.00123 [pdf, other]

Flow Matching Beyond Kinematics: Generating Jets with Particle-ID and Trajectory Displacement Information

Authors: Joschka Birk, Erik Buhmann, Cedric Ewen, Gregor Kasieczka, David Shih

Abstract: We introduce the first generative model trained on the JetClass dataset. Our model generates jets at the constituent level, and it is a permutation-equivariant continuous normalizing flow (CNF) trained with the flow matching technique. It is conditioned on the jet type, so that a single model can be used to generate the ten different jet types of JetClass. For the first time, we also introduce a generative model that goes beyond the kinematic features of jet constituents. The JetClass dataset includes more features, such as particle-ID and track impact parameter, and we demonstrate that our CNF can accurately model all of these additional features as well. Our generative model for JetClass expands on the versatility of existing jet generation techniques, enhancing their potential utility in high-energy physics research, and offering a more comprehensive understanding of the generated jets.

Submitted 30 November, 2023; originally announced December 2023.

arXiv:2310.12209 [pdf, other]

Fast Parameter Inference on Pulsar Timing Arrays with Normalizing Flows

Authors: David Shih, Marat Freytsis, Stephen R. Taylor, Jeff A. Dror, Nolan Smyth

Abstract: Pulsar timing arrays (PTAs) perform Bayesian posterior inference with expensive MCMC methods. Given a dataset of ~10-100 pulsars and O(10^3) timing residuals each, producing a posterior distribution for the stochastic gravitational wave background (SGWB) can take days to a week. The computational bottleneck arises because the likelihood evaluation required for MCMC is extremely costly when considering the dimensionality of the search space. Fortunately, generating simulated data is fast, so modern simulation-based inference techniques can be brought to bear on the problem. In this paper, we demonstrate how conditional normalizing flows trained on simulated data can be used for extremely fast and accurate estimation of the SGWB posteriors, reducing the sampling time from weeks to a matter of seconds.

Submitted 18 October, 2023; originally announced October 2023.

Comments: 8 pages, 3 figures

arXiv:2310.06897 [pdf, other]

Full Phase Space Resonant Anomaly Detection

Authors: Erik Buhmann, Cedric Ewen, Gregor Kasieczka, Vinicius Mikuni, Benjamin Nachman, David Shih

Abstract: Physics beyond the Standard Model that is resonant in one or more dimensions has been a longstanding focus of countless searches at colliders and beyond. Recently, many new strategies for resonant anomaly detection have been developed, where sideband information can be used in conjunction with modern machine learning, in order to generate synthetic datasets representing the Standard Model background. Until now, this approach was only able to accommodate a relatively small number of dimensions, limiting the breadth of the search sensitivity. Using recent innovations in point cloud generative models, we show that this strategy can also be applied to the full phase space, using all relevant particles for the anomaly detection. As a proof of principle, we show that the signal from the R&D dataset from the LHC Olympics is findable with this method, opening up the door to future studies that explore the interplay between depth and breadth in the representation of the data for anomaly detection.

Submitted 9 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: 10 pages, 7 figures

Journal ref: Phys. Rev. D 109, 055015 (2024)

arXiv:2310.00049 [pdf, other]

EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion

Authors: Erik Buhmann, Cedric Ewen, Darius A. Faroughy, Tobias Golling, Gregor Kasieczka, Matthew Leigh, Guillaume Quétant, John Andrew Raine, Debajyoti Sengupta, David Shih

Abstract: Jets at the LHC, typically consisting of a large number of highly correlated particles, are a fascinating laboratory for deep generative modeling. In this paper, we present two novel methods that generate LHC jets as point clouds efficiently and accurately. We introduce EPiC-JeDi, which combines score-matching diffusion models with the Equivariant Point Cloud (EPiC) architecture based on the deep sets framework. This model offers a much faster alternative to previous transformer-based diffusion models without reducing the quality of the generated jets. In addition, we introduce EPiC-FM, the first permutation equivariant continuous normalizing flow (CNF) for particle cloud generation. This model is trained with flow-matching, a scalable and easy-to-train objective based on optimal transport that directly regresses the vector fields connecting the Gaussian noise prior to the data distribution. Our experiments demonstrate that EPiC-JeDi and EPiC-FM both achieve state-of-the-art performance on the top-quark JetNet datasets whilst maintaining fast generation speed. Most notably, we find that the EPiC-FM model consistently outperforms all the other generative models considered here across every metric. Finally, we also introduce two new particle cloud performance metrics: the first based on the Kullback-Leibler divergence between feature distributions, the second is the negative log-posterior of a multi-model ParticleNet classifier.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 21 pages, 8 figures

arXiv:2309.13111 [pdf, other]

Back To The Roots: Tree-Based Algorithms for Weakly Supervised Anomaly Detection

Authors: Thorben Finke, Marie Hein, Gregor Kasieczka, Michael Krämer, Alexander Mück, Parada Prangchaikul, Tobias Quadfasel, David Shih, Manuel Sommerhalder

Abstract: Weakly supervised methods have emerged as a powerful tool for model-agnostic anomaly detection at the Large Hadron Collider (LHC). While these methods have shown remarkable performance on specific signatures such as di-jet resonances, their application in a more model-agnostic manner requires dealing with a larger number of potentially noisy input features. In this paper, we show that using boosted decision trees as classifiers in weakly supervised anomaly detection gives superior performance compared to deep neural networks. Boosted decision trees are well known for their effectiveness in tabular data analysis. Our results show that they not only offer significantly faster training and evaluation times, but they are also robust to a large number of noisy input features. By using advanced gradient boosted decision trees in combination with ensembling techniques and an extended set of features, we significantly improve the performance of weakly supervised methods for anomaly detection at the LHC. This advance is a crucial step towards a more model-agnostic search for new physics.

Submitted 22 September, 2023; originally announced September 2023.

Comments: 11 pages, 9 figures

Report number: TTK-23-26

arXiv:2309.12918 [pdf, other]

doi 10.1103/PhysRevD.109.096031

Combining Resonant and Tail-based Anomaly Detection

Authors: Gerrit Bickendorf, Manuel Drees, Gregor Kasieczka, Claudius Krause, David Shih

Abstract: In many well-motivated models of the electroweak scale, cascade decays of new particles can result in highly boosted hadronic resonances (e.g. $Z/W/h$). This can make these models rich and promising targets for recently developed resonant anomaly detection methods powered by modern machine learning. We demonstrate this using the state-of-the-art CATHODE method applied to supersymmetry scenarios with gluino pair production. We show that CATHODE, despite being model-agnostic, is nevertheless competitive with dedicated cut-based searches, while simultaneously covering a much wider region of parameter space. The gluino events also populate the tails of the missing energy and $H_T$ distributions, making this a novel combination of resonant and tail-based anomaly detection.

Submitted 28 May, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: 13 pages, 15 figures

arXiv:2308.11700 [pdf, other]

doi 10.1103/PhysRevD.109.092009

Calorimeter shower superresolution

Authors: Ian Pang, John Andrew Raine, David Shih

Abstract: Calorimeter shower simulation is a major bottleneck in the Large Hadron Collider computational pipeline. There have been recent efforts to employ deep-generative surrogate models to overcome this challenge. However, many of the best-performing models have training and generation times that do not scale well to high-dimensional calorimeter showers. In this work, we introduce SuperCalo, a flow-based superresolution model, and demonstrate that high-dimensional fine-grained calorimeter showers can be quickly upsampled from coarse-grained showers. This novel approach presents a way to reduce computational cost, memory requirements and generation time associated with fast calorimeter simulation models. Additionally, we show that the showers upsampled by SuperCalo possess a high degree of variation. This allows a large number of high-dimensional calorimeter showers to be upsampled from far fewer coarse showers with high fidelity, which results in an additional reduction in generation time.

Submitted 15 May, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

Comments: 16 pages, 13 figures, v3: title changed, matches published version

Journal ref: Phys. Rev. D 109, 092009 (2024)

arXiv:2307.11157 [pdf, other]

doi 10.1140/epjc/s10052-024-12607-x

The Interplay of Machine Learning--based Resonant Anomaly Detection Methods

Authors: Tobias Golling, Gregor Kasieczka, Claudius Krause, Radha Mastandrea, Benjamin Nachman, John Andrew Raine, Debajyoti Sengupta, David Shih, Manuel Sommerhalder

Abstract: Machine learning--based anomaly detection (AD) methods are promising tools for extending the coverage of searches for physics beyond the Standard Model (BSM). One class of AD methods that has received significant attention is resonant anomaly detection, where the BSM is assumed to be localized in at least one known variable. While there have been many methods proposed to identify such a BSM signal that make use of simulated or detected data in different ways, there has not yet been a study of the methods' complementarity. To this end, we address two questions. First, in the absence of any signal, do different methods pick the same events as signal-like? If not, then we can significantly reduce the false-positive rate by comparing different methods on the same dataset. Second, if there is a signal, are different methods fully correlated? Even if their maximum performance is the same, since we do not know how much signal is present, it may be beneficial to combine approaches. Using the Large Hadron Collider (LHC) Olympics dataset, we provide quantitative answers to these questions. We find that there are significant gains possible by combining multiple methods, which will strengthen the search program at the LHC and beyond.

Submitted 14 March, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

Comments: 27 pages, 21 figures. Updated with revisions for journal acceptance

arXiv:2307.08738 [pdf, other]

Discovery and Characterization of Two Ultra Faint-Dwarfs Outside the Halo of the Milky Way: Leo M and Leo K

Authors: Kristen B. W. McQuinn, Yao-Yuan Mao, Erik J. Tollerud, Roger E. Cohen, David Shih, Matthew R. Buckley, Andrew E. Dolphin

Abstract: We report the discovery of two ultra-faint dwarf galaxies, Leo M and Leo K, that lie outside the halo of the Milky Way. Using Hubble Space Telescope imaging of the resolved stars, we create color-magnitude diagrams reaching the old main sequence turn-off of each system and (i) fit for structural parameters of the galaxies; (ii) measure their distances using the luminosity of the Horizontal Branch stars; (iii) estimate integrated magnitudes and stellar masses; and (iv) reconstruct the star formation histories. Based on their location in the Local Group, neither galaxy is currently a satellite of the Milky Way, although Leo K is located ~26 kpc from the low-mass galaxy Leo T and these two systems may have had a past interaction. Leo M and Leo K have stellar masses of 1.8 (+0.3/-0.2) x 10^4 Msun and 1.2+/-0.2 x 10^4 Msun, and were quenched 10.6 (+2.2/-1.1) Gyr and 12.8 (+0.1/-4.2) Gyr ago, respectively. Given that the galaxies are not satellites of the MW, it is unlikely that they were quenched by environmental processing. Instead, given their low stellar masses, their early quenching timescales are consistent with the scenario that a combination of reionization and stellar feedback shut down star formation at early cosmic times.

Submitted 19 May, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: 12 pages, 9 figures, 1 table

arXiv:2307.08593 [pdf, other]

Artificial Intelligence for the Electron Ion Collider (AI4EIC)

Authors: C. Allaire, R. Ammendola, E. -C. Aschenauer, M. Balandat, M. Battaglieri, J. Bernauer, M. Bondì, N. Branson, T. Britton, A. Butter, I. Chahrour, P. Chatagnon, E. Cisbani, E. W. Cline, S. Dash, C. Dean, W. Deconinck, A. Deshpande, M. Diefenthaler, R. Ent, C. Fanelli, M. Finger, M. Finger, Jr., E. Fol, S. Furletov, et al. (70 additional authors not shown)

Abstract: The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. This paper summarizes the different activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 27 pages, 11 figures, AI4EIC workshop, tutorials and hackathon

arXiv:2305.16774 [pdf, other]

doi 10.21468/SciPostPhys.16.1.031

How to Understand Limitations of Generative Networks

Authors: Ranit Das, Luigi Favaro, Theo Heimel, Claudius Krause, Tilman Plehn, David Shih

Abstract: Well-trained classifiers and their complete weight distributions provide us with a well-motivated and practicable method to test generative networks in particle physics. We illustrate their benefits for distribution-shifted jets, calorimeter showers, and reconstruction-level events. In all cases, the classifier weights make for a powerful test of the generative network, identify potential problems in the density estimation, relate them to the underlying physics, and tie in with a comprehensive precision and uncertainty treatment for generative networks.

Submitted 7 December, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 32 pages, 19 figures

Journal ref: SciPost Phys. 16, 031 (2024)

arXiv:2305.13358 [pdf, other]

Mapping Dark Matter in the Milky Way using Normalizing Flows and Gaia DR3

Authors: Sung Hak Lim, Eric Putney, Matthew R. Buckley, David Shih

Abstract: We present a novel, data-driven analysis of Galactic dynamics, using unsupervised machine learning -- in the form of density estimation with normalizing flows -- to learn the underlying phase space distribution of 6 million nearby stars from the Gaia DR3 catalog. Solving the collisionless Boltzmann equation with the assumption of approximate equilibrium, we calculate -- for the first time ever -- a model-free, unbinned, fully 3D map of the local acceleration and mass density fields within a 3 kpc sphere around the Sun. As our approach makes no assumptions about symmetries, we can test for signs of disequilibrium in our results. We find our results are consistent with equilibrium at the 10% level, limited by the current precision of the normalizing flows. After subtracting the known contribution of stars and gas from the calculated mass density, we find clear evidence for dark matter throughout the analyzed volume. Assuming spherical symmetry and averaging mass density measurements, we find a local dark matter density of $0.47\pm 0.05\;\mathrm{GeV/cm}^3$. We fit our results to a generalized NFW, and find a profile broadly consistent with other recent analyses.

Submitted 22 May, 2023; originally announced May 2023.

Comments: 19 pages, 13 figures, 3 tables

arXiv:2305.11934 [pdf, other]

doi 10.1103/PhysRevD.109.033006

Inductive Simulation of Calorimeter Showers with Normalizing Flows

Authors: Matthew R. Buckley, Claudius Krause, Ian Pang, David Shih

Abstract: Simulating particle detector response is the single most expensive step in the Large Hadron Collider computational pipeline. Recently it was shown that normalizing flows can accelerate this process while achieving unprecedented levels of accuracy, but scaling this approach up to higher resolutions relevant for future detector upgrades leads to prohibitive memory constraints. To overcome this problem, we introduce Inductive CaloFlow (iCaloFlow), a framework for fast detector simulation based on an inductive series of normalizing flows trained on the pattern of energy depositions in pairs of consecutive calorimeter layers. We further use a teacher-student distillation to increase sampling speed without loss of expressivity. As we demonstrate with Datasets 2 and 3 of the CaloChallenge2022, iCaloFlow can realize the potential of normalizing flows in performing fast, high-fidelity simulation on detector geometries that are ~ 10 - 100 times higher granularity than previously considered.

Submitted 13 February, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: 19 pages, 15 figures; v2: title changed, matches published version

Journal ref: Phys. Rev. D 109, 033006 (2024)

arXiv:2305.03761 [pdf, other]

Weakly-Supervised Anomaly Detection in the Milky Way

Authors: Mariel Pettee, Sowmya Thanvantri, Benjamin Nachman, David Shih, Matthew R. Buckley, Jack H. Collins

Abstract: Large-scale astrophysics datasets present an opportunity for new machine learning techniques to identify regions of interest that might otherwise be overlooked by traditional searches. To this end, we use Classification Without Labels (CWoLa), a weakly-supervised anomaly detection method, to identify cold stellar streams within the more than one billion Milky Way stars observed by the Gaia satellite. CWoLa operates without the use of labeled streams or knowledge of astrophysical principles. Instead, we train a classifier to distinguish between mixed samples for which the proportions of signal and background samples are unknown. This computationally lightweight strategy is able to detect both simulated streams and the known stream GD-1 in data. Originally designed for high-energy collider physics, this technique may have broad applicability within astrophysics as well as other domains interested in identifying localized anomalies.

Submitted 5 May, 2023; originally announced May 2023.

arXiv:2303.01529 [pdf, other]

Via Machinae 2.0: Full-Sky, Model-Agnostic Search for Stellar Streams in Gaia DR2

Authors: David Shih, Matthew R. Buckley, Lina Necib

Abstract: We present an update to Via Machinae, an automated stellar stream-finding algorithm based on the deep learning anomaly detector ANODE. Via Machinae identifies stellar streams within Gaia, using only angular positions, proper motions, and photometry, without reference to a model of the Milky Way potential for orbit integration or stellar distances. This new version, Via Machinae 2.0, includes many improvements and refinements to nearly every step of the algorithm, that altogether result in more robust and visually distinct stream candidates than our original formulation. In this work, we also provide a quantitative estimate of the false positive rate of Via Machinae 2.0 by applying it to a simulated Gaia-mock catalog based on Galaxia, a smooth model of the Milky Way that does not contain substructure or stellar streams. Finally, we perform the first full-sky search for stellar streams with Via Machinae 2.0, identifying 102 streams at high significance within the Gaia Data Release 2, of which only 10 have been previously identified. While follow-up observations for further confirmation are required, taking into account the false positive rate presented in this work, we expect approximately 90 of these stream candidates to correspond to real stellar structures.

Submitted 2 March, 2023; originally announced March 2023.

Comments: 22 pages, 24 figures

arXiv:2302.11594 [pdf, other]

doi 10.1088/1748-0221/18/10/P10017

L2LFlows: Generating High-Fidelity 3D Calorimeter Images

Authors: Sascha Diefenbacher, Engin Eren, Frank Gaede, Gregor Kasieczka, Claudius Krause, Imahn Shekhzadeh, David Shih

Abstract: We explore the use of normalizing flows to emulate Monte Carlo detector simulations of photon showers in a high-granularity electromagnetic calorimeter prototype for the International Large Detector (ILD). Our proposed method -- which we refer to as "Layer-to-Layer-Flows" (L$2$LFlows) -- is an evolution of the CaloFlow architecture adapted to a higher-dimensional setting (30 layers of $10\times 10$ voxels each). The main innovation of L$2$LFlows consists of introducing $30$ separate normalizing flows, one for each layer of the calorimeter, where each flow is conditioned on the previous five layers in order to learn the layer-to-layer correlations. We compare our results to the BIB-AE, a state-of-the-art generative network trained on the same dataset and find our model has a significantly improved fidelity.

Submitted 20 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: v2: 28 pages, 13 figures; matches version accepted for publication in JINST. Neither SISSA Medialab Srl nor IOP Publishing Ltd is responsible for any errors or omissions in this version of the manuscript or any version derived from it. Published version available via DOI

Journal ref: 2023 JINST 18 P10017

arXiv:2301.04157 [pdf, other]

doi 10.3847/1538-4357/acaec9

Pegasus W: An Ultra-Faint Dwarf Galaxy Outside the Halo of M31 Not Quenched by Reionization

Authors: Kristen B. W. McQuinn, Yao-Yuan Mao, Matthew R. Buckley, David Shih, Roger E. Cohen, Andrew E. Dolphin

Abstract: We report the discovery of an ultrafaint dwarf (UFD) galaxy, Pegasus W, located on the far side of the Milky Way-M31 system and outside the virial radius of M31. The distance to the galaxy is 915 (+60/-91) kpc, measured using the luminosity of horizontal branch (HB) stars identified in Hubble Space Telescope optical imaging. The galaxy has a half-light radius (r_h) of 100 (+11/-13) pc, M_V = -7.20 (+0.17/-0.16) mag, and a present-day stellar mass of 6.5 (+1.1/-1.4) x 10^4 Msun. We identify sources in the color-magnitude diagram (CMD) that may be younger than ~500 Myr suggesting late-time star formation in the UFD galaxy, although further study is needed to confirm these are bona fide young stars in the galaxy. Based on fitting the CMD with stellar evolution libraries, Pegasus W shows an extended star formation history (SFH). Using the tau_90 metric (defined as the timescale by which the galaxy formed 90% of its stellar mass), the galaxy was quenched only 7.4 (+2.2/-2.6) Gyr ago, which is similar to the quenching timescale of a number of UFD satellites of M31 but significantly more recent than the UFD satellites of the Milky Way. Such late-time quenching is inconsistent with the more rapid timescale expected by reionization and suggests that, while not currently a satellite of M31, Pegasus W was nonetheless slowly quenched by environmental processes.

Submitted 24 January, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

Comments: 15 pages, 10 figures, 2 tables

arXiv:2212.00046 [pdf, other]

Feature Selection with Distance Correlation

Authors: Ranit Das, Gregor Kasieczka, David Shih

Abstract: Choosing which properties of the data to use as input to multivariate decision algorithms -- a.k.a. feature selection -- is an important step in solving any problem with machine learning. While there is a clear trend towards training sophisticated deep networks on large numbers of relatively unprocessed inputs (so-called automated feature engineering), for many tasks in physics, sets of theoretically well-motivated and well-understood features already exist. Working with such features can bring many benefits, including greater interpretability, reduced training and run time, and enhanced stability and robustness. We develop a new feature selection method based on Distance Correlation (DisCo), and demonstrate its effectiveness on the tasks of boosted top- and $W$-tagging. Using our method to select features from a set of over 7,000 energy flow polynomials, we show that we can match the performance of much deeper architectures, by using only ten features and two orders-of-magnitude fewer model parameters.

Submitted 30 November, 2022; originally announced December 2022.

Comments: 14 pages, 8 figures, 3 tables

arXiv:2211.11765 [pdf, other]

GalaxyFlow: Upsampling Hydrodynamical Simulations for Realistic Gaia Mock Catalogs

Authors: Sung Hak Lim, Kailash A. Raman, Matthew R. Buckley, David Shih

Abstract: Cosmological N-body simulations of galaxies operate at the level of "star particles" with a mass resolution on the scale of thousands of solar masses. Turning these simulations into stellar mock catalogs requires "upsampling" the star particles into individual stars following the same phase-space density. In this paper, we demonstrate that normalizing flows provide a viable upsampling method that greatly improves on conventionally-used kernel smoothing algorithms such as EnBiD. We demonstrate our flow-based upsampling technique, dubbed GalaxyFlow, on a neighborhood of the Solar location in two simulated galaxies: Auriga 6 and h277. By eye, GalaxyFlow produces stellar distributions that are smoother than EnBiD-based methods and more closely match the Gaia DR3 catalog. For a quantitative comparison of generative model performance, we introduce a novel multi-model classifier test. Using this classifier test, we show that GalaxyFlow more accurately estimates the density of the underlying star particles than previous methods.

Submitted 21 November, 2022; originally announced November 2022.

Comments: 17 pages, 11 figures

arXiv:2210.14924 [pdf, other]

doi 10.1103/PhysRevD.107.114012

Resonant anomaly detection without background sculpting

Authors: Anna Hallin, Gregor Kasieczka, Tobias Quadfasel, David Shih, Manuel Sommerhalder

Abstract: We introduce a new technique named Latent CATHODE (LaCATHODE) for performing "enhanced bump hunts", a type of resonant anomaly search that combines conventional one-dimensional bump hunts with a model-agnostic anomaly score in an auxiliary feature space where potential signals could also be localized. The main advantage of LaCATHODE over existing methods is that it provides an anomaly score that is well behaved when evaluating it beyond the signal region, which is essential to prevent the sculpting of background distributions in the bump hunt. LaCATHODE accomplishes this by constructing the anomaly score directly in the latent space learned by a conditional normalizing flow trained on sideband regions. We demonstrate the superior stability and comparable performance of LaCATHODE for enhanced bump hunting in an illustrative toy example as well as on the LHC Olympics R&D dataset.

Submitted 10 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: 11 pages, 8 figures; v2 (published version): referencing code and minor style updates

Journal ref: Phys. Rev. D 107, 114012 (2023)

arXiv:2210.14245 [pdf, other]

doi 10.21468/SciPostPhys.16.5.126

CaloFlow for CaloChallenge Dataset 1

Authors: Claudius Krause, Ian Pang, David Shih

Abstract: CaloFlow is a new and promising approach to fast calorimeter simulation based on normalizing flows. Applying CaloFlow to the photon and charged pion Geant4 showers of Dataset 1 of the Fast Calorimeter Simulation Challenge 2022, we show how it can produce high-fidelity samples with a sampling time that is several orders of magnitude faster than Geant4. We demonstrate the fidelity of the samples using calorimeter shower images, histograms of high-level features, and aggregate metrics such as a classifier trained to distinguish CaloFlow from Geant4 samples.

Submitted 15 May, 2024; v1 submitted 25 October, 2022; originally announced October 2022.

Comments: 36 pages, 21 figures, v3: match published version

Journal ref: SciPost Phys. 16, 126 (2024)

arXiv:2209.06225 [pdf, other]

doi 10.1103/PhysRevD.107.015009

Anomaly Detection under Coordinate Transformations

Authors: Gregor Kasieczka, Radha Mastandrea, Vinicius Mikuni, Benjamin Nachman, Mariel Pettee, David Shih

Abstract: There is a growing need for machine learning-based anomaly detection strategies to broaden the search for Beyond-the-Standard-Model (BSM) physics at the Large Hadron Collider (LHC) and elsewhere. The first step of any anomaly detection approach is to specify observables and then use them to decide on a set of anomalous events. One common choice is to select events that have low probability density. It is a well-known fact that probability densities are not invariant under coordinate transformations, so the sensitivity can depend on the initial choice of coordinates. The broader machine learning community has recently connected coordinate sensitivity with anomaly detection and our goal is to bring awareness of this issue to the growing high energy physics literature on anomaly detection. In addition to analytical explanations, we provide numerical examples from simple random variables and from the LHC Olympics Dataset that show how using probability density as an anomaly score can lead to events being classified as anomalous or not depending on the coordinate frame.

Submitted 13 September, 2022; originally announced September 2022.

Comments: 10 pages, 6 figures

arXiv:2209.05518 [pdf, other]

VBF vs. GGF Higgs with Full-Event Deep Learning: Towards a Decay-Agnostic Tagger

Authors: Cheng-Wei Chiang, David Shih, Shang-Fu Wei

Abstract: We study the benefits of jet- and event-level deep learning methods in distinguishing vector boson fusion (VBF) from gluon-gluon fusion (GGF) Higgs production at the LHC. We show that a variety of classifiers (CNNs, attention-based networks) trained on the complete low-level inputs of the full event achieve significant performance gains over shallow machine learning methods (BDTs) trained on jet kinematics and jet shapes, and we elucidate the reasons for these performance gains. Finally, we take initial steps towards the possibility of a VBF vs. GGF tagger that is agnostic to the Higgs decay mode, by demonstrating that the performance of our event-level CNN does not change when the Higgs decay products are removed. These results highlight the potentially powerful benefits of event-level deep learning at the LHC.

Submitted 4 November, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: 21 pages+appendices, 16 figures; added references, updated Pythia shower scheme for VBF, and added Appendix C for version 2

arXiv:2205.01129 [pdf, other]

doi 10.1093/mnras/stad843

Measuring Galactic Dark Matter through Unsupervised Machine Learning

Authors: Matthew R Buckley, Sung Hak Lim, Eric Putney, David Shih

Abstract: Measuring the density profile of dark matter in the Solar neighborhood has important implications for both dark matter theory and experiment. In this work, we apply autoregressive flows to stars from a realistic simulation of a Milky Way-type galaxy to learn -- in an unsupervised way -- the stellar phase space density and its derivatives. With these as inputs, and under the assumption of dynamic equilibrium, the gravitational acceleration field and mass density can be calculated directly from the Boltzmann Equation without the need to assume either cylindrical symmetry or specific functional forms for the galaxy's mass density. We demonstrate our approach can accurately reconstruct the mass density and acceleration profiles of the simulated galaxy, even in the presence of Gaia-like errors in the kinematic measurements.

Submitted 2 May, 2022; originally announced May 2022.

Comments: 23 pages, 9 figures

arXiv:2203.08806 [pdf, other]

New directions for surrogate models and differentiable programming for High Energy Physics detector simulation

Authors: Andreas Adelmann, Walter Hopkins, Evangelos Kourlitis, Michael Kagan, Gregor Kasieczka, Claudius Krause, David Shih, Vinicius Mikuni, Benjamin Nachman, Kevin Pedro, Daniel Winklehner

Abstract: The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the current available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, providing controllable and scalable simulation routines. In this document, new and ongoing efforts for surrogate models and differentiable programming applied to detector simulation are discussed in the context of the 2021 Particle Physics Community Planning Exercise ('Snowmass').

Submitted 15 March, 2022; originally announced March 2022.

Comments: contribution to Snowmass 2021

Report number: FERMILAB-CONF-22-199-SCD

arXiv:2203.07460 [pdf, other]

doi 10.21468/SciPostPhys.14.4.079

Machine Learning and LHC Event Generation

Authors: Anja Butter, Tilman Plehn, Steffen Schumann, Simon Badger, Sascha Caron, Kyle Cranmer, Francesco Armando Di Bello, Etienne Dreyer, Stefano Forte, Sanmay Ganguly, Dorival Gonçalves, Eilam Gross, Theo Heimel, Gudrun Heinrich, Lukas Heinrich, Alexander Held, Stefan Höche, Jessica N. Howard, Philip Ilten, Joshua Isaacson, Timo Janßen, Stephen Jones, Marumi Kado, Michael Kagan, Gregor Kasieczka, et al. (26 additional authors not shown)

Abstract: First-principle simulations are at the heart of the high-energy physics research program. They link the vast data output of multi-purpose detectors with fundamental theory predictions and interpretation. This review illustrates a wide range of applications of modern machine learning to event generation and simulation-based inference, including conceptional developments driven by the specific requirements of particle physics. New ideas and tools developed at the interface of particle physics and machine learning will improve the speed and precision of forward simulations, handle the complexity of collision data, and enhance inference as an inverse simulation problem.

Submitted 28 December, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

Comments: Review article based on a Snowmass 2021 contribution

Journal ref: SciPost Phys. 14, 079 (2023)

arXiv:2202.09375 [pdf, other]

doi 10.21468/SciPostPhys.13.4.087

Ephemeral Learning -- Augmenting Triggers with Online-Trained Normalizing Flows

Authors: Anja Butter, Sascha Diefenbacher, Gregor Kasieczka, Benjamin Nachman, Tilman Plehn, David Shih, Ramon Winterhalder

Abstract: The large data rates at the LHC require an online trigger system to select relevant collisions. Rather than compressing individual events, we propose to compress an entire data set at once. We use a normalizing flow as a deep generative model to learn the probability density of the data online. The events are then represented by the generative neural network and can be inspected offline for anomalies or used for other analysis purposes. We demonstrate our new approach for a toy model and a correlation-enhanced bump hunt.

Submitted 28 June, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

Comments: 17 pages, 9 figures, minor changes to text, addressed referee comments

Report number: CP3-22-10

Journal ref: SciPost Phys. 13, 087 (2022)

arXiv:2202.08843 [pdf, other]

doi 10.1103/PhysRevD.107.095003

Dark Photons and Displaced Vertices at the MUonE Experiment

Authors: Iftah Galon, David Shih, Isaac R. Wang

Abstract: MUonE is a proposed experiment designed to measure the hadronic vacuum polarization contribution to muon $g-2$ through elastic $\mu-e$ scattering. As such it employs an extremely high-resolution tracking apparatus. We point out that this makes MUonE also a very promising experiment to search for displaced vertices from light, weakly-interacting new particles. We demonstrate its potential by showing how it has excellent sensitivity to dark photons in the mass range $10~\mathrm{MeV} \le m_{A'} \le 100~\mathrm{MeV}$ and kinetic mixing parameter $10^{-5} \le \epsilon e \le 10^{-3}$, through the process $\mu^{\pm}\, e^- \to \mu^{\pm}\, e^-\, A'$ followed by $A'\to e^+e^-$.

Submitted 5 May, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

Comments: PRD version, 10 pages, 5 figures

arXiv:2202.05849 [pdf, other]

doi 10.1103/PhysRevD.105.115011

Resolving Combinatorial Ambiguities in Dilepton $t \bar t$ Event Topologies with Neural Networks

Authors: Haider Alhazmi, Zhongtian Dong, Li Huang, Jeong Han Kim, Kyoungchul Kong, David Shih

Abstract: We study the potential of deep learning to resolve the combinatorial problem in SUSY-like events with two invisible particles at the LHC. As a concrete example, we focus on dileptonic $t \bar t$ events, where the combinatorial problem becomes an issue of binary classification: pairing the correct lepton with each $b$ quark coming from the decays of the tops. We investigate the performance of a number of machine learning algorithms, including attention-based networks, which have been used for a similar problem in the fully-hadronic channel of $t\bar t$ production; and the Lorentz Boost Network, which is motivated by physics principles. We then consider the general case when the underlying mass spectrum is unknown, and hence no kinematic endpoint information is available. Compared against existing methods based on kinematic variables, we demonstrate that the efficiency for selecting the correct pairing is greatly improved by utilizing deep learning techniques.

Submitted 27 June, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

Comments: 22 pages, 15 figures, 1 table, matches the published version

Journal ref: Phys. Rev. D 105, 115011 (2022)

arXiv:2112.03769 [pdf, other]

Machine Learning in the Search for New Fundamental Physics

Authors: Georgia Karagiorgi, Gregor Kasieczka, Scott Kravitz, Benjamin Nachman, David Shih

Abstract: Machine learning plays a crucial role in enhancing and accelerating the search for new fundamental physics. We review the state of machine learning methods and applications for new physics searches in the context of terrestrial high energy physics experiments, including the Large Hadron Collider, rare event searches, and neutrino experiments. While machine learning has a long history in these fields, the deep learning revolution (early 2010s) has yielded a qualitative shift in terms of the scope and ambition of research. These modern machine learning developments are the focus of the present review.

Submitted 7 December, 2021; originally announced December 2021.

Comments: Preprint of article submitted to Nature Reviews Physics, 19 pages, 1 figure

arXiv:2111.06417 [pdf, other]

doi 10.1103/PhysRevD.105.055006

Online-compatible Unsupervised Non-resonant Anomaly Detection

Authors: Vinicius Mikuni, Benjamin Nachman, David Shih

Abstract: There is a growing need for anomaly detection methods that can broaden the search for new particles in a model-agnostic manner. Most proposals for new methods focus exclusively on signal sensitivity. However, it is not enough to select anomalous events - there must also be a strategy to provide context to the selected events. We propose the first complete strategy for unsupervised detection of non-resonant anomalies that includes both signal sensitivity and a data-driven method for background estimation. Our technique is built out of two simultaneously-trained autoencoders that are forced to be decorrelated from each other. This method can be deployed offline for non-resonant anomaly detection and is also the first complete online-compatible anomaly detection strategy. We show that our method achieves excellent performance on a variety of signals prepared for the ADC2021 data challenge.

Submitted 11 November, 2021; originally announced November 2021.

Comments: 9 pages, 3 figures

arXiv:2110.11377 [pdf, other]

CaloFlow II: Even Faster and Still Accurate Generation of Calorimeter Showers with Normalizing Flows

Authors: Claudius Krause, David Shih

Abstract: Recently, we introduced CaloFlow, a high-fidelity generative model for GEANT4 calorimeter shower emulation based on normalizing flows. Here, we present CaloFlow v2, an improvement on our original framework that speeds up shower generation by a further factor of 500 relative to the original. The improvement is based on a technique called Probability Density Distillation, originally developed for speech synthesis in the ML literature, and which we develop further by introducing a set of powerful new loss terms. We demonstrate that CaloFlow v2 preserves the same high fidelity of the original using qualitative (average images, histograms of high level features) and quantitative (classifier metric between GEANT4 and generated samples) measures. The result is a generative model for calorimeter showers that matches the state-of-the-art in speed (a factor of $10^4$ faster than GEANT4) and greatly surpasses the previous state-of-the-art in fidelity.

Submitted 5 May, 2023; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: 24 pages, 15 figures, 4 tables; v2: matches accepted version

arXiv:2109.00546 [pdf, other]

doi 10.1103/PhysRevD.106.055006

Classifying Anomalies THrough Outer Density Estimation (CATHODE)

Authors: Anna Hallin, Joshua Isaacson, Gregor Kasieczka, Claudius Krause, Benjamin Nachman, Tobias Quadfasel, Matthias Schlaffer, David Shih, Manuel Sommerhalder

Abstract: We propose a new model-agnostic search strategy for physics beyond the standard model (BSM) at the LHC, based on a novel application of neural density estimation to anomaly detection. Our approach, which we call Classifying Anomalies THrough Outer Density Estimation (CATHODE), assumes the BSM signal is localized in a signal region (defined e.g. using invariant mass). By training a conditional density estimator on a collection of additional features outside the signal region, interpolating it into the signal region, and sampling from it, we produce a collection of events that follow the background model. We can then train a classifier to distinguish the data from the events sampled from the background model, thereby approaching the optimal anomaly detector. Using the LHC Olympics R&D dataset, we demonstrate that CATHODE nearly saturates the best possible performance, and significantly outperforms other approaches that aim to enhance the bump hunt (CWoLa Hunting and ANODE). Finally, we demonstrate that CATHODE is very robust against correlations between the features and maintains nearly-optimal performance even in this more challenging setting.

Submitted 11 September, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

Comments: 17 pages, 12 figures; v2: minor updates; v3 (published version): added study of background sculpting and minor fixes

Report number: EFI-20-5, FERMILAB-PUB-21-389-T

Journal ref: Phys. Rev. D 106, 055006 (2022)

arXiv:2107.02821 [pdf, other]

New Methods and Datasets for Group Anomaly Detection From Fundamental Physics

Authors: Gregor Kasieczka, Benjamin Nachman, David Shih

Abstract: The identification of anomalous overdensities in data - group or collective anomaly detection - is a rich problem with a large number of real world applications. However, it has received relatively little attention in the broader ML community, as compared to point anomalies or other types of single instance outliers. One reason for this is the lack of powerful benchmark datasets. In this paper, we first explain how, after the Nobel-prize winning discovery of the Higgs boson, unsupervised group anomaly detection has become a new frontier of fundamental physics (where the motivation is to find new particles and forces). Then we propose a realistic synthetic benchmark dataset (LHCO2020) for the development of group anomaly detection algorithms. Finally, we compare several existing statistically-sound techniques for unsupervised group anomaly detection, and demonstrate their performance on the LHCO2020 dataset.

Submitted 6 July, 2021; originally announced July 2021.

Comments: Accepted for ANDEA (Anomaly and Novelty Detection, Explanation and Accommodation) Workshop at KDD 2021

arXiv:2106.05285 [pdf, other]

doi 10.1103/PhysRevD.107.113003

CaloFlow: Fast and Accurate Generation of Calorimeter Showers with Normalizing Flows

Authors: Claudius Krause, David Shih

Abstract: We introduce CaloFlow, a fast detector simulation framework based on normalizing flows. For the first time, we demonstrate that normalizing flows can reproduce many-channel calorimeter showers with extremely high fidelity, providing a fresh alternative to computationally expensive GEANT4 simulations, as well as other state-of-the-art fast simulation frameworks based on GANs and VAEs. Besides the usual histograms of physical features and images of calorimeter showers, we introduce a new metric for judging the quality of generative modeling: the performance of a classifier trained to differentiate real from generated images. We show that GAN-generated images can be identified by the classifier with nearly 100% accuracy, while images generated from CaloFlow are better able to fool the classifier. More broadly, normalizing flows offer several advantages compared to other state-of-the-art approaches (GANs and VAEs), including: tractable likelihoods; stable and convergent training; and principled model selection. Normalizing flows also provide a bijective mapping between data and the latent space, which could have other applications beyond simulation, for example, to detector unfolding.

Submitted 5 May, 2023; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: 33 pages, 19 figures, 5 tables; v2: improved handling of datasets, conclusions unchanged; v3: matches accepted version

arXiv:2104.12789 [pdf, other]

doi 10.1093/mnras/stab3372

Via Machinae: Searching for Stellar Streams using Unsupervised Machine Learning

Authors: David Shih, Matthew R. Buckley, Lina Necib, John Tamanas

Abstract: We develop a new machine learning algorithm, Via Machinae, to identify cold stellar streams in data from the Gaia telescope. Via Machinae is based on ANODE, a general method that uses conditional density estimation and sideband interpolation to detect local overdensities in the data in a model agnostic way. By applying ANODE to the positions, proper motions, and photometry of stars observed by Gaia, Via Machinae obtains a collection of those stars deemed most likely to belong to a stellar stream. We further apply an automated line-finding method based on the Hough transform to search for line-like features in patches of the sky. In this paper, we describe the Via Machinae algorithm in detail and demonstrate our approach on the prominent stream GD-1. Though some parts of the algorithm are tuned to increase sensitivity to cold streams, the Via Machinae technique itself does not rely on astrophysical assumptions, such as the potential of the Milky Way or stellar isochrones. This flexibility suggests that it may have further applications in identifying other anomalous structures within the Gaia dataset, for example debris flow and globular clusters.

Submitted 28 December, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: 17 pages, 17 figures, v2: references added, minor corrections, v3: published version

arXiv:2104.02092 [pdf, other]

doi 10.1140/epjc/s10052-021-09389-x

Comparing Weak- and Unsupervised Methods for Resonant Anomaly Detection

Authors: Jack H. Collins, Pablo Martín-Ramiro, Benjamin Nachman, David Shih

Abstract: Anomaly detection techniques are growing in importance at the Large Hadron Collider (LHC), motivated by the increasing need to search for new physics in a model-agnostic way. In this work, we provide a detailed comparative study between a well-studied unsupervised method called the autoencoder (AE) and a weakly-supervised approach based on the Classification Without Labels (CWoLa) technique. We examine the ability of the two methods to identify a new physics signal at different cross sections in a fully hadronic resonance search. By construction, the AE classification performance is independent of the amount of injected signal. In contrast, the CWoLa performance improves with increasing signal abundance. When integrating these approaches with a complete background estimate, we find that the two methods have complementary sensitivity. In particular, CWoLa is effective at finding diverse and moderately rare signals while the AE can provide sensitivity to very rare signals, but only with certain topologies. We therefore demonstrate that both techniques are complementary and can be used together for anomaly detection at the LHC.

Submitted 5 April, 2021; originally announced April 2021.

Comments: 39 pages, 17 figures

arXiv:2101.08320 [pdf, other]

doi 10.1088/1361-6633/ac36b9

The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics

Authors: Gregor Kasieczka, Benjamin Nachman, David Shih, Oz Amram, Anders Andreassen, Kees Benkendorfer, Blaz Bortolato, Gustaaf Brooijmans, Florencia Canelli, Jack H. Collins, Biwei Dai, Felipe F. De Freitas, Barry M. Dillon, Ioan-Mihail Dinu, Zhongtian Dong, Julien Donini, Javier Duarte, D. A. Faroughy, Julia Gonski, Philip Harris, Alan Kahn, Jernej F. Kamenik, Charanjit K. Khosa, Patrick Komiske, Luc Le Pottier, et al. (22 additional authors not shown)

Abstract: A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.

Submitted 20 January, 2021; originally announced January 2021.

Comments: 108 pages, 53 figures, 3 tables

arXiv:2009.03796 [pdf, other]

doi 10.1088/1748-0221/15/11/P11004

DCTRGAN: Improving the Precision of Generative Models with Reweighting

Authors: Sascha Diefenbacher, Engin Eren, Gregor Kasieczka, Anatolii Korol, Benjamin Nachman, David Shih

Abstract: Significant advances in deep learning have led to more widely used and precise neural network-based generative models such as Generative Adversarial Networks (GANs). We introduce a post-hoc correction to deep generative models to further improve their fidelity, based on the Deep neural networks using the Classification for Tuning and Reweighting (DCTR) protocol. The correction takes the form of a reweighting function that can be applied to generated examples when making predictions from the simulation. We illustrate this approach using GANs trained on standard multimodal probability densities as well as calorimeter simulations from high energy physics. We show that the weighted GAN examples significantly improve the accuracy of the generated samples without a large loss in statistical power. This approach could be applied to any generative model and is a promising refinement method for high energy physics applications and beyond.

Submitted 3 September, 2020; originally announced September 2020.

Comments: 14 pages, 8 figures

arXiv:2007.14400 [pdf, other]

doi 10.1103/PhysRevD.103.035021

ABCDisCo: Automating the ABCD Method with Machine Learning

Authors: Gregor Kasieczka, Benjamin Nachman, Matthew D. Schwartz, David Shih

Abstract: The ABCD method is one of the most widely used data-driven background estimation techniques in high energy physics. Cuts on two statistically-independent classifiers separate signal and background into four regions, so that background in the signal region can be estimated simply using the other three control regions. Typically, the independent classifiers are chosen "by hand" to be intuitive and physically motivated variables. Here, we explore the possibility of automating the design of one or both of these classifiers using machine learning. We show how to use state-of-the-art decorrelation methods to construct powerful yet independent discriminators. Along the way, we uncover a previously unappreciated aspect of the ABCD method: its accuracy hinges on having low signal contamination in control regions not just overall, but relative to the signal fraction in the signal region. We demonstrate the method with three examples: a simple model consisting of three-dimensional Gaussians; boosted hadronic top jet tagging; and a recasted search for paired dijet resonances. In all cases, automating the ABCD method with machine learning significantly improves performance in terms of ABCD closure, background rejection and signal contamination.

Submitted 28 July, 2020; originally announced July 2020.

Comments: 37 pages, 12 figures

Journal ref: Phys. Rev. D 103, 035021 (2021)
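
The four-region estimate described in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function names `count_regions` and `abcd_estimate`, the region labelling (A passes both cuts, B passes only the first, C passes only the second, D passes neither) and the toy data are assumptions made for illustration. If the two classifiers are independent for background, then N_A / N_B = N_C / N_D, so the background in A can be predicted as N_B * N_C / N_D.

```python
def count_regions(events, cut_x, cut_y):
    """Count events in the four ABCD regions.

    `events` is an iterable of (x, y) classifier scores. The labelling is an
    assumed convention: A passes both cuts, B passes only the x cut,
    C passes only the y cut, D passes neither.
    """
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    for x, y in events:
        pass_x = x >= cut_x
        pass_y = y >= cut_y
        if pass_x and pass_y:
            counts["A"] += 1
        elif pass_x:
            counts["B"] += 1
        elif pass_y:
            counts["C"] += 1
        else:
            counts["D"] += 1
    return counts


def abcd_estimate(n_b, n_c, n_d):
    """Predict the background yield in region A from the control regions."""
    if n_d <= 0:
        raise ValueError("control region D must be non-empty")
    return n_b * n_c / n_d


# Toy background: every (x, y) pair on a 10 x 10 grid appears exactly once,
# so x and y are exactly independent and the estimate should close.
toy_background = [(x, y) for x in range(10) for y in range(10)]
toy_counts = count_regions(toy_background, cut_x=8, cut_y=7)
toy_prediction = abcd_estimate(toy_counts["B"], toy_counts["C"], toy_counts["D"])
```

In this toy case the predicted and observed counts in A agree exactly because the two variables are independent by construction; correlated classifiers or signal leaking into B, C or D would break that closure, which is the issue the abstract highlights.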

arXiv:2006.16416 [pdf, other]

doi 10.1103/PhysRevD.102.095028

A Complete Framework for Tau Polarimetry in $B\to D^{(*)}τν$ Decays

Authors: Pouya Asadi, Anna Hallin, Jorge Martin Camalich, David Shih, Susanne Westhoff

Abstract: The meson decays $B\to Dτν$ and $B\to D^* τν$ are sensitive probes of the $b\to cτν$ transition. In this work we present a complete framework to obtain the maximum information on the physics of $B\to D^{(*)}τν$ with polarized $τ$ leptons and unpolarized $D^{(*)}$ mesons. Focusing on the hadronic decays $τ\to πν$ and $τ\toρν$, we show how to extract seven $τ$ asymmetries from a fully differential analysis of the final-state kinematics. At Belle II with $50~\text{ab}^{-1}$ of data, these asymmetries could potentially be measured with percent level statistical uncertainty. This would open a new window into possible new physics contributions in $b\to cτν$ and would allow us to decipher its Lorentz and gauge structure.

Submitted 29 June, 2020; originally announced June 2020.

Comments: 20 pages + appendices, 5 figures

arXiv:2003.09517 [pdf, other]

Strange Jet Tagging

Authors: Yuichiro Nakai, David Shih, Scott Thomas

Abstract: Tagging jets of strongly interacting particles initiated by energetic strange quarks is one of the few largely unexplored Standard Model object classification problems remaining in high energy collider physics. In this paper we investigate the purest version of this classification problem in the form of distinguishing strange-quark jets from down-quark jets. Our strategy relies on the fact that a strange-quark jet contains on average a higher ratio of neutral kaon energy to neutral pion energy than does a down-quark jet. Long-lived neutral kaons deposit energy mainly in the hadronic calorimeter of a high energy detector, while neutral pions decay promptly to photons that deposit energy mainly in the electromagnetic calorimeter. In addition, short-lived neutral kaons that decay in flight to charged pion pairs can be identified as a secondary vertex in the inner tracking system. Using these handles we study different approaches to distinguishing strange-quark from down-quark jets, including single variable cut-based methods, a boosted decision tree (BDT) with a small number of simple variables, and a deep learning convolutional neural network (CNN) architecture with jet images. We show that modest gains are possible from the CNN compared with the BDT or a single variable. Starting from jet samples with only strange-quark and down-quark jets, the CNN algorithm can improve the strange to down ratio by a factor of roughly 2 for strange tagging efficiencies below 0.2, and by a factor of 2.5 for strange tagging efficiencies near 0.02.

Submitted 20 March, 2020; originally announced March 2020.

Comments: 38 pages, 23 figures

arXiv:2001.05310 [pdf, other]

doi 10.1103/PhysRevLett.125.122001

DisCo Fever: Robust Networks Through Distance Correlation

Authors: Gregor Kasieczka, David Shih

Abstract: While deep learning has proven to be extremely successful at supervised classification tasks at the LHC and beyond, for practical applications, raw classification accuracy is often not the only consideration. One crucial issue is the stability of network predictions, either versus changes of individual features of the input data, or against systematic perturbations. We present a new method based on a novel application of "distance correlation" (DisCo), a measure quantifying non-linear correlations, that achieves equal performance to state-of-the-art adversarial decorrelation networks but is much simpler and more stable to train. To demonstrate the effectiveness of our method, we carefully recast a recent ATLAS study of decorrelation methods as applied to boosted, hadronic W-tagging. We also show the feasibility of DisCo regularization for more powerful convolutional neural networks, as well as for the problem of hadronic top tagging.

Submitted 30 September, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 9 pages, v2: essentially the journal version (refs added, typos fixed, minor improvements)

Journal ref: Phys. Rev. Lett. 125, 122001 (2020)
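
As a rough illustration of the "distance correlation" measure named in the abstract above, the following sketch computes the standard sample distance correlation between two one-dimensional samples in plain Python. It is not the paper's implementation and does not show how the measure is used as a training regularizer; the function names `_centered_distances` and `distance_correlation` and the example data are assumptions made for illustration.

```python
import math


def _centered_distances(values):
    """Pairwise |v_i - v_j| matrix, double-centered (row, column, grand mean)."""
    n = len(values)
    dist = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1D samples, in [0, 1]."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2_xy = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
dcor_linear = distance_correlation(xs, [2.0 * v + 1.0 for v in xs])
dcor_parabola = distance_correlation(xs, [v * v for v in xs])
```

The parabola case is the interesting one: the Pearson correlation of `xs` with its squares vanishes by symmetry, yet the distance correlation is clearly non-zero, which is the sense in which the measure captures non-linear dependence.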

Showing 1–50 of 120 results for author: Shih, D