-
How Does Quantization Affect Multilingual LLMs?
Authors:
Kelly Marchisio,
Saurabh Dash,
Hongyu Chen,
Dennis Aumiller,
Ahmet Üstün,
Sara Hooker,
Sebastian Ruder
Abstract:
Quantization techniques are widely used to improve inference speed and deployment of large language models. While a wide body of work examines the impact of quantized LLMs on English tasks, none have examined the effect of quantization across languages. We conduct a thorough analysis of quantized multilingual LLMs, focusing on their performance across languages and at varying scales. We use automatic benchmarks, LLM-as-a-Judge methods, and human evaluation, finding that (1) harmful effects of quantization are apparent in human evaluation, and automatic metrics severely underestimate the detriment: a 1.7% average drop in Japanese across automatic tasks corresponds to a 16.0% drop reported by human evaluators on realistic prompts; (2) languages are disparately affected by quantization, with non-Latin script languages impacted worst; and (3) challenging tasks such as mathematical reasoning degrade fastest. As the ability to serve low-compute models is critical for wide global adoption of NLP technologies, our results urge consideration of multilingual performance as a key evaluation criterion for efficient models.
Submitted 3 July, 2024;
originally announced July 2024.
-
Scalable Artificial Intelligence for Science: Perspectives, Methods and Exemplars
Authors:
Wesley Brewer,
Aditya Kashi,
Sajal Dash,
Aristeidis Tsaris,
Junqi Yin,
Mallikarjun Shankar,
Feiyi Wang
Abstract:
In a post-ChatGPT world, this paper explores the potential of leveraging scalable artificial intelligence for scientific discovery. We propose that scaling up artificial intelligence on high-performance computing platforms is essential to address such complex problems. This perspective focuses on scientific use cases like cognitive simulations, large language models for scientific inquiry, medical image analysis, and physics-informed approaches. The study outlines the methodologies needed to address such challenges at scale on supercomputers or the cloud and provides exemplars of such approaches applied to solve a variety of scientific problems.
Submitted 24 June, 2024;
originally announced June 2024.
-
Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs
Authors:
Davide Paglieri,
Saurabh Dash,
Tim Rocktäschel,
Jack Parker-Holder
Abstract:
Post-Training Quantization (PTQ) enhances the efficiency of Large Language Models (LLMs) by enabling faster operation and compatibility with more accessible hardware through reduced memory usage, at the cost of small performance drops. We explore the role of calibration sets in PTQ, specifically their effect on hidden activations in various notable open-source LLMs. Calibration sets are crucial for evaluating activation magnitudes and identifying outliers, which can distort the quantization range and negatively impact performance. Our analysis reveals a marked contrast in quantization effectiveness across models. The older OPT model, upon which much of the quantization literature is based, shows significant performance deterioration and high susceptibility to outliers with varying calibration sets. In contrast, newer models like Llama-2 7B, Llama-3 8B, Command-R 35B, and Mistral 7B demonstrate strong robustness, with Mistral 7B showing near-immunity to outliers and stable activations. These findings suggest a shift in PTQ strategies might be needed. As advancements in pre-training methods reduce the relevance of outliers, there is an emerging need to reassess the fundamentals of current quantization literature. The emphasis should pivot towards optimizing inference speed, rather than primarily focusing on outlier preservation, to align with the evolving characteristics of state-of-the-art LLMs.
Submitted 5 June, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
-
Monte-Carlo Study Of Higher-Order Cumulants of Net-Particle Distributions in $p+p$ Collisions at $\sqrt{s}$ = 13 TeV
Authors:
Abdussamad M,
Rahul Verma,
Nirbhay Kumar Behera,
Sadhana Dash,
Basanta Kumar Nandi
Abstract:
Measurement of higher order cumulants of the distributions of conserved quantities, like net-charge, net-baryon and net-strangeness in heavy-ion collisions, is proposed as a sensitive tool to determine the freeze-out parameters and the nature of phase transitions at the LHC energies. Baseline measurements for heavy-ion collisions are essential to understand the experimental measurements. Recently, several experimental observations have shown some QGP-like scenarios in small systems (pp collisions). We report the first Monte-Carlo study of the measurements of cumulants and their ratios for net-charge, net-hadron, net-kaon, net-baryon, and net-proton distributions in pp collisions at $\sqrt{s}$=13 TeV using pQCD models like Pythia8 and Herwig. We also discuss the effect of different particle production mechanisms on the higher-order cumulants. This simulation study will serve as a baseline for future measurements at the LHC. Furthermore, it will shed more light on the measurement of cumulants and the connection between small systems and heavy-ion collisions at the LHC.
Submitted 26 May, 2024;
originally announced May 2024.
-
Aya 23: Open Weight Releases to Further Multilingual Progress
Authors:
Viraat Aryabumi,
John Dang,
Dwarak Talupuru,
Saurabh Dash,
David Cairuz,
Hangyu Lin,
Bharat Venkitesh,
Madeline Smith,
Jon Ander Campos,
Yi Chern Tan,
Kelly Marchisio,
Max Bartolo,
Sebastian Ruder,
Acyr Locatelli,
Julia Kreutzer,
Nick Frosst,
Aidan Gomez,
Phil Blunsom,
Marzieh Fadaee,
Ahmet Üstün,
Sara Hooker
Abstract:
This technical report introduces Aya 23, a family of multilingual language models. Aya 23 builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a highly performant pre-trained model with the recently released Aya collection (Singh et al., 2024). The result is a powerful multilingual large language model serving 23 languages, expanding state-of-the-art language modeling capabilities to approximately half of the world's population. The Aya model covered 101 languages, whereas Aya 23 is an experiment in depth vs. breadth, exploring the impact of allocating more capacity to fewer languages that are included during pre-training. Aya 23 outperforms both previous massively multilingual models like Aya 101 for the languages it covers, as well as widely used models like Gemma, Mistral and Mixtral on an extensive range of discriminative and generative tasks. We release the open weights for both the 8B and 35B models as part of our continued commitment to expanding access to multilingual progress.
Submitted 31 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Study of transport properties of a hot and dense QCD matter using a novel approximation method
Authors:
Anowar Shaikh,
Shubhalaxmi Rath,
Sadhana Dash,
Binata Panda
Abstract:
We have studied the charge and heat transport properties of a hot and dense QCD matter using a novel approximation method within the quasiparticle model. Utilizing a novel collision integral for both the relaxation time approximation (RTA) and the Bhatnagar-Gross-Krook (BGK) models, we have solved the relativistic Boltzmann transport equation to estimate the electrical conductivity and the thermal conductivity. We have investigated the temperature dependence of these transport coefficients. We have also provided a comparison between our findings and those of the standard RTA and BGK models. Further, we have explored the temperature dependence of the thermal diffusion constant and the Lorenz number using the novel approaches of the aforesaid models.
Submitted 13 April, 2024;
originally announced April 2024.
-
Constraints on atmospheric water abundance and cloud deck pressure in the warm Neptune GJ 3470 b via CARMENES transmission spectroscopy
Authors:
Spandan Dash,
Matteo Brogi,
Siddharth Gandhi,
Marina Lafarga,
Annabella Meech,
Aaron Bello-Arufe,
Peter J. Wheatley
Abstract:
Observations of cooler atmospheres of super-Earths and Neptune-sized objects often show flat transmission spectra. The most likely cause of this trend is the presence of aerosols (i.e. clouds and hazes) in the atmospheres of such objects. High-resolution spectroscopy provides an opportunity to test this hypothesis by targeting molecular species whose spectral line cores extend above the level of such opaque decks. In this work, we analyse high-resolution infrared observations of the warm Neptune GJ 3470 b taken over two transits using CARMENES (R $\sim$ 80,000) and look for signatures of H$_2$O (previously detected using HST WFC3+Spitzer observations) in these transits with a custom pipeline fully accounting for the effects of data cleaning on any potential exoplanet signal. We find that our data are potentially able to weakly detect ($\sim3σ$) an injected signal equivalent to the best-fit model from previous HST WFC3+Spitzer observations. However, we do not make a significant detection using the actual observations. Using a Bayesian framework to simultaneously constrain the H$_2$O Volume Mixing Ratio (VMR) and the cloud top pressure level, we select a family of models compatible with the non-detection. These are either very high VMR, cloud-free models, solar-abundance models with a high cloud deck, or sub-solar abundance models with a moderate cloud deck. This is a broader range compared to published results from low-resolution spectroscopy, but is also compatible with them at the 1$σ$ level.
Submitted 9 April, 2024;
originally announced April 2024.
-
Charged particle multiplicity fluctuation in $A-A$ collisions at RHIC and LHC energies using Angantyr model
Authors:
Pritindra Bhowmick,
Sadhana Dash,
Basanta Nandi,
Claude Pruneau
Abstract:
Event-by-event fluctuations of the charged particle multiplicity are studied for a wide range of centralities for Au$-$Au collisions at $\sqrt{s_{NN}}$ = 200 GeV, and Pb$-$Pb collisions at $\sqrt{s_{NN}}$ = 2.76 TeV and 5.02 TeV using the Pythia 8 Angantyr model. The centrality dependence of the $ω_{ch}$ observable, which quantifies the fluctuations in terms of scaled variance, is studied for different pseudorapidity ranges and compared with that obtained from a simple participant superposition model. The $ω_{ch}$ values were found to be lower than the expectations from the participant model. The estimate would act as a baseline for current and future measurements of event-by-event fluctuations in the charged particle multiplicities in systems at LHC energies where no de-confined medium of quarks and gluons is formed.
Submitted 4 March, 2024;
originally announced March 2024.
-
Analyzing the transport coefficients and observables of a rotating QGP medium in kinetic theory framework with a novel approach to the collision integral
Authors:
Shubhalaxmi Rath,
Sadhana Dash
Abstract:
In the present work, we have studied how the rotation of the QGP medium affects the transport coefficients and observables in heavy ion collisions. For the noncentral collisions, although most of the angular momentum gets carried away by the spectators, there still remains a finite angular momentum with a finite range of angular velocity, which thus incites rotation in the produced matter. As a result, various properties of the QGP medium are likely to be modulated by the rotation. We have calculated the transport coefficients and observables, such as electrical conductivity, thermal conductivity, Knudsen number, elliptic flow, specific heat at constant pressure, specific heat at constant volume, trace anomaly, thermal diffusion constant and isothermal compressibility using the kinetic theory to see the effect of rotation on them. In particular, we have used the novel relaxation time approximation for the collision integral in the relativistic Boltzmann transport equation to derive the transport coefficients and compared them with their values in the relaxation time approximation within the kinetic theory approach in conjunction with the finite angular velocity. We have found that the angular velocity plays an important role and enhances the flow of charge and heat in the medium. Further, as compared to the relaxation time approximation, the electrical and thermal conductivities have smaller values in the novel relaxation time approximation and these differences between the conductivities in the said approximations are more pronounced at high temperature than at low temperature. Furthermore, all the aforesaid observables are found to be sensitive to the rotation of the QGP medium.
Submitted 23 March, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
-
Study of identified particle production as a function of transverse event activity classifier, $S_{T}$ in p$-$p collisions
Authors:
Rahul Verma,
Vishu Saini,
Basanta Nandi,
Sadhana Dash
Abstract:
A new observable, $S_{T}$, is introduced in terms of the sum of the transverse momentum of charged particles ($\sum_{i} p_{T_{i}}$) produced in proton-proton (p$-$p) collisions at LHC energies to probe the underlying events (UE). The UE are defined as those aspects of proton-proton collisions that are not attributed to the primary hard scattering process, but rather to the accompanying interactions of the rest of the proton. Underlying events are conventionally studied by defining topological regions with respect to the leading particle in an event. The transverse region is generally sensitive to UE and various classifiers have been used to discriminate the extent of UE activity regions. The production of identified particles like $π^{\pm}$, $K^{\pm}$, p, $K_{S}^{0}$, and $Λ^{0}$ is studied in different ranges of the transverse activity classifier in p$-$p collisions at $\sqrt{s} = 13$ TeV using the pQCD-inspired PYTHIA 8 event generator. A comparative analysis of the identified particle spectra, mean multiplicity and mean transverse momentum has been carried out with respect to $S_{T}$ and the performance of this new observable is gauged by comparing the results with the previously defined $R_{T}$ observable.
Submitted 2 March, 2024;
originally announced March 2024.
-
Efficiency of neural quantum states in light of the quantum geometric tensor
Authors:
Sidhartha Dash,
Filippo Vicentini,
Michel Ferrero,
Antoine Georges
Abstract:
Neural quantum state (NQS) ansätze have shown promise in variational Monte Carlo algorithms by their theoretical capability of representing any quantum state. However, the reason behind the practical improvement in their performance with an increase in the number of parameters is not fully understood. In this work, we systematically study the efficiency of restricted Boltzmann Machines (RBMs) to represent the ground states in different phases of the spin-1 bilinear-biquadratic model, as the hidden layer density $α$ increases. We train our ansatz by minimizing two different loss functions: 1) energy, and 2) infidelity of the NQS ansatz w.r.t. that of the exact ground state. We observe that the accuracy of our ansatz saturates with $α$ in both cases. We demonstrate that this can be explained by looking at the spectrum of the quantum geometric tensor (QGT). We find that the rank of the QGT saturates beyond a certain $α$, and we emphasize that it corresponds to the \textit{dimension of the relevant manifold} for an optimized NQS. This provides a useful diagnostic for the practical representation power of an NQS ansatz.
Submitted 3 March, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Room temperature nonlocal detection of charge-spin interconversion in a topological insulator
Authors:
Anamul Md. Hoque,
Lars Sjöström,
Dmitrii Khokhriakov,
Bing Zhao,
Saroj P. Dash
Abstract:
Topological insulators (TIs) are emerging materials for next-generation low-power nanoelectronic and spintronic device applications. TIs possess non-trivial spin-momentum locking features in the topological surface states in addition to the spin-Hall effect (SHE), and Rashba states due to high spin-orbit coupling (SOC) properties. These phenomena are vital for observing the charge-spin conversion (CSC) processes for spin-based memory, logic and quantum technologies. Although CSC has been observed in TIs by potentiometric measurements, reliable nonlocal detection has so far been limited to cryogenic temperatures up to T = 15 K. Here, we report nonlocal detection of CSC and its inverse effect in the TI compound Bi1.5Sb0.5Te1.7Se1.3 at room temperature using a van der Waals heterostructure with a graphene spin-valve device. The lateral nonlocal device design with graphene allows observation of both spin-switch and Hanle spin precession signals for generation, injection and detection of spin currents by the TI. Detailed bias- and gate-dependent measurements in different geometries prove the robustness of the CSC effects in the TI. These findings demonstrate the possibility of using topological materials to make all-electrical room-temperature spintronic devices.
Submitted 25 January, 2024;
originally announced January 2024.
-
Optimizing Distributed Training on Frontier for Large Language Models
Authors:
Sajal Dash,
Isaac Lyngaas,
Junqi Yin,
Xiao Wang,
Romain Egele,
Guojing Cong,
Feiyi Wang,
Prasanna Balaprakash
Abstract:
Large language models (LLMs) have demonstrated remarkable success as foundational models, benefiting various downstream applications through fine-tuning. Recent studies on loss scaling have demonstrated the superior performance of larger LLMs compared to their smaller counterparts. Nevertheless, training LLMs with billions of parameters poses significant challenges and requires considerable computational resources. For example, training a one trillion parameter GPT-style model on 20 trillion tokens requires a staggering 120 million exaflops of computation. This research explores efficient distributed training strategies to extract this computation from Frontier, the world's first exascale supercomputer dedicated to open science. We enable and investigate various model and data parallel training techniques, such as tensor parallelism, pipeline parallelism, and sharded data parallelism, to facilitate training a trillion-parameter model on Frontier. We empirically assess these techniques and their associated parameters to determine their impact on memory footprint, communication latency, and GPU's computational efficiency. We analyze the complex interplay among these techniques and find a strategy to combine them to achieve high throughput through hyperparameter tuning. We have identified efficient strategies for training large LLMs of varying sizes through empirical analysis and hyperparameter tuning. For 22 Billion, 175 Billion, and 1 Trillion parameters, we achieved GPU throughputs of $38.38\%$, $36.14\%$, and $31.96\%$, respectively. For the training of the 175 Billion parameter model and the 1 Trillion parameter model, we achieved $100\%$ weak scaling efficiency on 1024 and 3072 MI250X GPUs, respectively. We also achieved strong scaling efficiencies of $89\%$ and $87\%$ for these two models.
Submitted 21 December, 2023; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Signature of pressure-induced topological phase transition in ZrTe$_5$
Authors:
Zoltán Kovács-Krausz,
Dániel Nagy,
Albin Márffy,
Bogdan Karpiak,
Zoltán Tajkov,
László Oroszlány,
János Koltai,
Péter Nemes-Incze,
Saroj P. Dash,
Péter Makk,
Szabolcs Csonka,
Endre Tóvári
Abstract:
The layered van der Waals material ZrTe$_5$ is known as a candidate topological insulator (TI); however, its topological phase and its relation to other properties, such as an apparent Dirac semimetallic state, are still a subject of debate. We employ a semiclassical multicarrier transport (MCT) model to analyze the magnetotransport of ZrTe$_5$ nanodevices at hydrostatic pressures up to 2 GPa. The temperature dependence of the MCT results between 10 and 300 K is assessed in the context of thermal activation, and we obtain the positions of conduction and valence band edges in the vicinity of the chemical potential. We find evidence of the closing and subsequent re-opening of the band gap with increasing pressure, which is consistent with a phase transition from weak to strong TI. This matches expectations from ab initio band structure calculations, as well as previous observations that CVT-grown ZrTe$_5$ is in a weak TI phase in ambient conditions.
Submitted 15 December, 2023;
originally announced December 2023.
-
Large Non-Volatile Frequency Tuning of Spin Hall Nano-Oscillators using Circular Memristive Nano-Gates
Authors:
Maha Khademi,
Akash Kumar,
Mona Rajabali,
Saroj P. Dash,
Johan Åkerman
Abstract:
Spin Hall nano oscillators (SHNOs) are promising candidates for neuromorphic computing due to their miniaturized dimensions, non-linearity, fast dynamics, and ability to synchronize in long chains and arrays. However, tuning the individual SHNOs in large chains/arrays, which is key to implementing synaptic control, has remained a challenge. Here, we demonstrate circular memristive nano-gates, both precisely aligned and shifted with respect to nano-constriction SHNOs of W/CoFeB/HfOx, with increased quality of the device tunability. Gating at the exact center of the nano-constriction region is found to cause irreversible degradation to the oxide layer, resulting in a permanent frequency shift of the auto-oscillating modes. As a remedy, gates shifted outside of the immediate nano-constriction region can tune the frequency dramatically (>200 MHz) without causing any permanent change to the constriction region. Circular memristive nano-gates can, therefore, be used in SHNO chains/arrays to manipulate the synchronization states precisely over large networks of oscillators.
Submitted 18 January, 2024; v1 submitted 6 December, 2023;
originally announced December 2023.
-
Structural transitions in superconducting NbTiN thin films
Authors:
Siddhesh Sanjay Yeram,
Sonam Bhakat,
Subhashree S. Dash,
Avradeep Pal
Abstract:
Superconducting NbTiN thin films have garnered extensive interest due to their use in Superconducting Nanowire Single-Photon Detectors (SNSPDs) and other low-temperature applications for potential use in quantum computing and nanoelectronics. This study examines structural phase transitions observed in NbTiN thin films by analyzing the grazing angle x-ray diffraction patterns of a set of reactive magnetron sputter deposited NbTiN thin films with varying nitrogen partial pressures in the reactive gas mixture. The superconducting transition temperature (T_C) of the NbTiN thin films showed a correlation with the crystal structure, with the highest T_C of 14.26 K obtained for the highly crystalline FCC phase.
Submitted 23 November, 2023;
originally announced November 2023.
-
Ultra-low-current-density single-layer magnetic Weyl semimetal spin Hall nano-oscillators
Authors:
Lakhan Bainsla,
Yuya Sakuraba,
Avinash Kumar Chaurasiya,
Akash Kumar,
Keisuke Masuda,
Ahmad A. Awad,
Nilamani Behera,
Roman Khymyn,
Saroj Prasad Dash,
Johan Åkerman
Abstract:
Topological quantum materials can exhibit unconventional surface states and anomalous transport properties. Still, their applications in spintronic devices are restricted as they require the growth of high-quality thin films with bulk-like properties. Here, we study 10--30 nm thick epitaxial ferromagnetic Co$_{\rm 2}$MnGa films with high structural order and very high values of the anomalous Hall conductivity, $σ_{\rm xy}=1.35\times10^{5}$ $Ω^{-1} m^{-1}$ and the anomalous Hall angle, $θ_{\rm H}=15.8\%$, both comparable to bulk values. We observe a dramatic crystalline orientation dependence of the Gilbert damping constant of a factor of two and a giant intrinsic spin Hall conductivity, $\mathit{σ_{\rm SHC}}=(6.08\pm 0.02)\times 10^{5}$ ($\hbar/2e$) $Ω^{-1} m^{-1}$, an order of magnitude higher than literature values of multilayer Co$_{\rm 2}$MnGa stacks [1-3] and single-layer Ni, Co, Fe [4], and Ni$_{\rm 80}$Fe$_{\rm 20}$~[4,5]. As a consequence, spin-orbit-torque driven auto-oscillations of a 30 nm thick magnetic film are observed for the first time, at an ultralow threshold current density of $J_{th}=6.2\times10^{11}$ $Am^{-2}$. Theoretical calculations of the intrinsic spin Hall conductivity, originating from a strong Berry curvature, corroborate the results and yield values comparable to the experiment. Our results open up for the design of spintronic devices based on single layers of magnetic topological quantum materials.
Submitted 19 April, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
Evidence of electron correlation induced kink in Dirac bands in a non-symmorphic Kondo lattice system, CeAgSb2
Authors:
Sawani Datta,
Khadiza Ali,
Rahul Verma,
Bahadur Singh,
Saroj P. Dash,
A. Thamizhavel,
Kalobaran Maiti
Abstract:
We study the behavior of Dirac fermions in the presence of electron correlation in a nonsymmorphic Kondo lattice system, CeAgSb2, employing high-resolution angle-resolved photoemission spectroscopy and first-principles calculations. Experiments reveal crossings of highly dispersive linear bands at the Brillouin zone boundary due to non-symmorphic symmetry. In addition, anisotropic Dirac cones are observed, constituted by the square-net Sb 5p states forming a diamond-shaped nodal line. The Dirac bands are linear in a wide energy range with an unusually high slope and exhibit a distinct Dirac point in this highly spin-orbit coupled system. Interestingly, the linearity of the bands is preserved even after the hybridization of these states with the local Ce 4f states, which leads to a small reduction of slope via formation of a 'kink'. These results seed the emergence of an area of robust topological fermions even in the presence of strong correlation.
Submitted 9 November, 2023;
originally announced November 2023.
-
Ultra-Long Sequence Distributed Transformer
Authors:
Xiao Wang,
Isaac Lyngaas,
Aristeidis Tsaris,
Peng Chen,
Sajal Dash,
Mayanka Chandra Shekar,
Tao Luo,
Hong-Jun Yoon,
Mohamed Wahib,
John Gouley
Abstract:
Transformer models trained on long sequences often achieve higher accuracy than those trained on short sequences. Unfortunately, conventional transformers struggle with long sequence training due to the overwhelming computation and memory requirements. Existing methods for long sequence training offer limited speedup and memory reduction, and may compromise accuracy. This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformers with long sequences. It distributes a long sequence into segments among GPUs, with each GPU computing a partial self-attention for its segment. Then, it uses a fused communication and a novel double gradient averaging technique to avoid the need to aggregate partial self-attention and minimize communication overhead. We compared the performance of the LSS Transformer with the state-of-the-art Nvidia sequence parallelism on a Wikipedia enwik8 dataset. Results show that our proposed method leads to a 5.6x faster and 10.2x more memory-efficient implementation compared to state-of-the-art sequence parallelism on 144 Nvidia V100 GPUs. Moreover, our algorithm scales to an extreme sequence length of 50,112 at 3,456 GPUs, achieving 161% super-linear parallel efficiency and a throughput of 32 petaflops.
Submitted 8 November, 2023; v1 submitted 4 November, 2023;
originally announced November 2023.
-
Obtaining Explainable Classification Models using Distributionally Robust Optimization
Authors:
Sanjeeb Dash,
Soumyadip Ghosh,
Joao Goncalves,
Mark S. Squillante
Abstract:
Model explainability is crucial for human users to be able to interpret how a proposed classifier assigns labels to data based on its feature values. We study generalized linear models constructed using sets of feature value rules, which can capture nonlinear dependencies and interactions. An inherent trade-off exists between rule set sparsity and its prediction accuracy. It is computationally expensive to find the right choice of sparsity -- e.g., via cross-validation -- with existing methods. We propose a new formulation to learn an ensemble of rule sets that simultaneously addresses these competing factors. Good generalization is ensured while keeping computational costs low by utilizing distributionally robust optimization. The formulation utilizes column generation to efficiently search the space of rule sets and constructs a sparse ensemble of rule sets, in contrast with techniques like random forests or boosting and their variants. We present theoretical results that motivate and justify the use of our distributionally robust formulation. Extensive numerical experiments establish that our method improves over competing methods -- on a large set of publicly available binary classification problem instances -- with respect to one or more of the following metrics: generalization quality, computational cost, and explainability.
Submitted 3 November, 2023;
originally announced November 2023.
-
Strong in-plane magnetic anisotropy (Co0.15Fe0.85)5GeTe2/graphene van der Waals heterostructure spin-valve at room temperature
Authors:
Roselle Ngaloy,
Bing Zhao,
Soheil Ershadrad,
Rahul Gupta,
Masoumeh Davoudiniya,
Lakhan Bainsla,
Lars Sjöström,
Anamul M. Hoque,
Alexei Kalaboukhov,
Peter Svedlindh,
Biplab Sanyal,
Saroj P. Dash
Abstract:
Van der Waals (vdW) magnets are promising owing to their tunable magnetic properties with doping or alloy composition, where the strength of magnetic interactions, their symmetry, and magnetic anisotropy can be tuned according to the desired application. However, most of the vdW magnet based spintronic devices are so far limited to cryogenic temperatures with magnetic anisotropies favouring out-of-plane or canted orientation of the magnetization. Here, we report room-temperature lateral spin-valve devices with strong in-plane magnetic anisotropy of the vdW ferromagnet (Co0.15Fe0.85)5GeTe2 (CFGT) in heterostructures with graphene. Magnetization measurements reveal above room-temperature ferromagnetism in CFGT with a strong in-plane magnetic anisotropy. Density functional theory calculations show that the magnitude of the anisotropy depends on the Co concentration and is caused by the substitution of Co in the outermost Fe layer. Heterostructures consisting of CFGT nanolayers and graphene were used to experimentally realize basic building blocks for spin valve devices such as efficient spin injection and detection. The spin transport and Hanle spin precession measurements prove a strong in-plane and negative spin polarization at the interface with graphene, which is supported by the calculated spin-polarized density of states of CFGT. The in-plane magnetization of CFGT at room temperature proves its usefulness in graphene lateral spin-valve devices, thus opening further opportunities for spintronic technologies.
Submitted 30 October, 2023;
originally announced October 2023.
-
Synergizing Airborne Non-Terrestrial Networks and Reconfigurable Intelligent Surfaces-Aided 6G IoT
Authors:
Muhammad Ali Jamshed,
Aryan Kaushik,
Mesut Toka,
Wonjae Shin,
Muhammad Zeeshan Shakir,
Soumya P. Dash,
Davide Dardari
Abstract:
On the one hand, Reconfigurable Intelligent Surfaces (RISs) emerge as a promising solution to meet the demand for higher data rates, improved coverage, and efficient spectrum utilization. On the other hand, Non-Terrestrial Networks (NTNs) offer unprecedented possibilities for global connectivity. Moreover, the NTN can also support the upsurge in the number of Internet of Things (IoT) devices by providing reliable and ubiquitous connectivity. Although NTNs have shown promising results, there are several challenges associated with their usage, such as signal propagation delays, interference, security, etc. In this article, we have discussed the possibilities of integrating RIS with an NTN platform to overcome the issues associated with NTN. Furthermore, through experimental validation, we have demonstrated that the RIS-assisted NTN can play a pivotal role in improving the performance of the entire communication system.
Submitted 25 October, 2023;
originally announced October 2023.
-
Skyrmions and magnetic bubbles in spin-orbit coupled metallic magnets
Authors:
Deepti Rana,
Soumyaranjan Dash,
Monika Bhakar,
Rajeshwari Roy Chowdhury,
Ravi Prakash Singh,
Sanjeev Kumar,
Goutam Sheet
Abstract:
Motivated by the observation of Skyrmion-like magnetic textures in 2D itinerant ferromagnets Fe$_n$GeTe$_2$ ($n \geq3$), we develop a microscopic model combining itinerant magnetism and spin-orbit coupling on a triangular lattice. The ground state of the model in the absence of magnetic field consists of filamentary magnetic domain walls revealing a striking similarity with our magnetic force microscopy experiments on Fe$_3$GeTe$_2$. In the presence of magnetic field, these filaments were found to break into large size magnetic bubbles in our experiments. We identify uniaxial magnetic anisotropy as an important parameter in the model that interpolates between magnetic Skyrmions and ferromagnetic bubbles. Consequently, our work uncovers new topological magnetic textures that merge properties of Skyrmions and ferromagnetic bubbles.
Submitted 10 October, 2023;
originally announced October 2023.
-
Large out-of-plane spin-orbit torque in topological Weyl semimetal candidate TaIrTe4
Authors:
Lakhan Bainsla,
Bing Zhao,
Anamul Md. Hoque,
Lars Sjöström,
Nilamani Behera,
Mahmoud Abdel-Hafiez,
Johan Åkerman,
Saroj P. Dash
Abstract:
Topological quantum materials, with novel spin textures and broken crystal symmetries, are suitable candidates for spintronic memory technologies. Their unique electronic properties, such as protected surface states and exotic quasiparticles, can provide an out-of-plane spin polarized current needed for external field free magnetization switching of magnets with perpendicular magnetic anisotropy. Conventional spin-orbit torque materials, such as heavy metals and topological insulators, provide only an in-plane spin polarized current, and recently explored materials with lower crystal symmetries provide very low out-of-plane spin polarized current components, which are not suitable for energy-efficient spin-orbit torque (SOT) applications. Here, we demonstrate a large out-of-plane damping-like SOT at room temperature using a topological Weyl semimetal candidate TaIrTe4 with a lower crystal symmetry. We performed spin-orbit torque ferromagnetic resonance (STFMR) experiments in a TaIrTe4/Ni80Fe20 heterostructure and observed a large out-of-plane damping-like SOT efficiency. The out-of-plane spin Hall conductivity is estimated to be an order of magnitude higher than the reported values in other materials. These findings of high spin Hall conductivity and large out-of-plane SOT efficiency are suitable for the development of energy efficient and external field-free spintronic devices.
Submitted 10 October, 2023;
originally announced October 2023.
-
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies
Authors:
Shuaiwen Leon Song,
Bonnie Kruft,
Minjia Zhang,
Conglong Li,
Shiyang Chen,
Chengming Zhang,
Masahiro Tanaka,
Xiaoxia Wu,
Jeff Rasley,
Ammar Ahmad Awan,
Connor Holmes,
Martin Cai,
Adam Ghanem,
Zhongzhu Zhou,
Yuxiong He,
Pete Luferenko,
Divya Kumar,
Jonathan Weyn,
Ruixiong Zhang,
Sylwester Klocek,
Volodymyr Vragov,
Mohammed AlQuraishi,
Gustaf Ahdritz,
Christina Floristean,
Cristina Negri
, et al. (67 additional authors not shown)
Abstract:
In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.
Submitted 11 October, 2023; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Coexistence of non-trivial van der Waals magnetic orders enable field-free spin-orbit torque switching at room temperature
Authors:
Bing Zhao,
Lakhan Bainsla,
Roselle Ngaloy,
Peter Svedlindh,
Saroj P. Dash
Abstract:
The discovery of van der Waals (vdW) materials exhibiting non-trivial and tunable magnetic interactions at room temperature can give rise to exotic magnetic states, which are not readily attainable with conventional materials. Such vdW magnets can provide a unique platform for studying new magnetic phenomena and realising magnetization dynamics for energy-efficient and non-volatile spintronic memory and logic technologies. Recent developments in vdW magnets have revealed their potential to enable spin-orbit torque (SOT) induced magnetization dynamics. However, the deterministic and field-free SOT switching of vdW magnets at room temperature has been lacking, prohibiting their potential applications. Here, we demonstrate magnetic field-free and deterministic SOT switching of a vdW magnet (Co0.5Fe0.5)5GeTe2 (CFGT) at room temperature, capitalizing on its non-trivial intrinsic magnetic ordering. We discover a coexistence of ferromagnetic and antiferromagnetic orders in CFGT at room temperature, inducing an intrinsic exchange bias and canted perpendicular magnetism. The resulting canted perpendicular magnetization of CFGT introduces symmetry breaking, facilitating successful magnetic field-free magnetization switching in the CFGT/Pt heterostructure devices. Furthermore, the SOT-induced magnetization dynamics and their efficiency are evaluated using 2nd harmonic Hall measurements. This advancement opens new avenues for investigating tunable magnetic phenomena in vdW material heterostructures and realizing field-free SOT-based spintronic technologies.
Submitted 25 August, 2023;
originally announced August 2023.
-
Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert
Authors:
Ryan Cory-Wright,
Cristina Cornelio,
Sanjeeb Dash,
Bachir El Khadir,
Lior Horesh
Abstract:
The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor in settings with large amounts of experimental data. Unfortunately, data-driven methods often fail to discover valid laws when data is noisy or scarce. Accordingly, recent works combine regression and reasoning to eliminate formulae inconsistent with background theory. However, the problem of searching over the space of formulae consistent with background theory to find one that best fits the data is not well-solved. We propose a solution to this problem when all axioms and scientific laws are expressible via polynomial equalities and inequalities and argue that our approach is widely applicable. We model notions of minimal complexity using binary variables and logical constraints, solve polynomial optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries in a principled manner using Positivstellensatz certificates. The optimization techniques leveraged in this paper allow our approach to run in polynomial time with fully correct background theory (under an assumption that the complexity of our derivation is bounded), or non-deterministic polynomial (NP) time with partially correct background theory. We demonstrate that some famous scientific laws, including Kepler's Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated Gravitational Wave Power equation, can be derived in a principled manner from axioms and experimental data.
Submitted 29 April, 2024; v1 submitted 18 August, 2023;
originally announced August 2023.
-
RIS-Assisted 6G Wireless Communications: A Novel Statistical Framework in the Presence of Direct Channel
Authors:
Soumya P. Dash,
Aryan Kaushik
Abstract:
A RIS-assisted wireless communication system in the presence of a direct communication path between the transceiver pair is considered in this paper. The transmitter-RIS and the RIS-receiver channels follow independent Nakagami-m distributions, and the direct channel between the transceiver pair follows a Rayleigh distribution. Considering this system model, the statistics of the composite channel for the RIS-assisted communication system are derived in terms of obtaining novel expressions for the probability density functions for the magnitude and the phase of the communication channel. The correctness of the analytical framework is verified via Monte Carlo simulations, and the effects of the shape parameters of the channels and the number of reflecting elements in the RIS on the randomness of the composite channel are studied via numerical results.
Submitted 11 August, 2023;
originally announced August 2023.
-
Nonextensive effects on the viscous properties of hot and magnetized QCD matter
Authors:
Shubhalaxmi Rath,
Sadhana Dash
Abstract:
We have studied the effect of the nonextensive Tsallis mechanism on the viscous properties of hot QCD matter in the presence of a strong magnetic field. The results are compared to the case without a magnetic field. The viscous coefficients, such as the shear viscosity ($η$) and the bulk viscosity ($ζ$), are determined in a similar environment by utilizing the nonextensive Tsallis mechanism within the relaxation time approximation of kinetic theory. We have observed that, when the nonextensive parameter $q$ is just above unity, both shear and bulk viscosities increase as compared to their counterparts at $q=1$. This enhancement in viscosities is more evident in the additional presence of a strong magnetic field. Furthermore, some observables pertaining to the flow characteristic, fluid behavior and conformal symmetry of the medium are also explored.
Submitted 6 February, 2024; v1 submitted 22 July, 2023;
originally announced July 2023.
-
RIS-Aided Index Modulation with Greedy Detection over Rician Fading Channels
Authors:
Aritra Basu,
Soumya P. Dash,
Aryan Kaushik,
Debasish Ghose,
Marco Di Renzo,
Yonina C. Eldar
Abstract:
Index modulation schemes for reconfigurable intelligent surfaces (RIS)-assisted systems are envisioned as promising technologies for fifth-generation-advanced and sixth-generation (6G) wireless communication systems to enhance various system capabilities such as coverage area and network capacity. In this paper, we consider a receive diversity RIS-assisted wireless communication system employing IM schemes, namely, space-shift keying (SSK) for binary modulation and spatial modulation (SM) for M-ary modulation for data transmission. The RIS lies in close proximity to the transmitter, and the transmitted data is subjected to a fading environment with a prominent line-of-sight component modeled by a Rician distribution. A receiver structure based on a greedy detection rule is employed to select the receive diversity branch with the highest received signal energy for demodulation. The performance of the considered system is evaluated by obtaining a series-form expression for the probability of erroneous index detection (PED) of the considered target antenna using a characteristic function approach. In addition, closed-form and asymptotic expressions at high and low signal-to-noise ratios (SNRs) for the bit error rate (BER) for the SSK-based system, and the SM-based system employing M-ary phase-shift keying and M-ary quadrature amplitude modulation schemes, are derived. The dependencies of the system performance on the various parameters are corroborated via numerical results. The asymptotic expressions and results of PED and BER at high and low SNR values lead to the observation of a performance saturation and the presence of an SNR value as a point of inflection, which is attributed to the greedy detector's structure.
Submitted 18 July, 2023;
originally announced July 2023.
-
Artificial Intelligence for the Electron Ion Collider (AI4EIC)
Authors:
C. Allaire,
R. Ammendola,
E. -C. Aschenauer,
M. Balandat,
M. Battaglieri,
J. Bernauer,
M. Bondì,
N. Branson,
T. Britton,
A. Butter,
I. Chahrour,
P. Chatagnon,
E. Cisbani,
E. W. Cline,
S. Dash,
C. Dean,
W. Deconinck,
A. Deshpande,
M. Diefenthaler,
R. Ent,
C. Fanelli,
M. Finger,
M. Finger, Jr.,
E. Fol,
S. Furletov
, et al. (70 additional authors not shown)
Abstract:
The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. This workshop is not only beneficial for the EIC, but also provides valuable insights for the newly established ePIC collaboration at EIC. This paper summarizes the different activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.
Submitted 17 July, 2023;
originally announced July 2023.
-
Localization and interaction of interlayer excitons in MoSe$_2$/WSe$_2$ heterobilayers
Authors:
Hanlin Fang,
Qiaoling Lin,
Yi Zhang,
Joshua Thompson,
Sanshui Xiao,
Zhipei Sun,
Ermin Malic,
Saroj Dash,
Witlef Wieczorek
Abstract:
Transition metal dichalcogenide (TMD) heterobilayers provide a versatile platform to explore unique excitonic physics via properties of the constituent TMDs and external stimuli. Interlayer excitons (IXs) can form in TMD heterobilayers as delocalized or localized states. However, the localization of IX in different types of potential traps, the emergence of biexcitons in the high-excitation regime, and the impact of potential traps on biexciton formation have remained elusive. In our work, we observe two types of potential traps in a MoSe$_2$/WSe$_2$ heterobilayer, which result in significantly different emission behavior of IXs at different temperatures. We identify the origin of these traps as localized defect states and the moiré potential of the TMD heterobilayer. Furthermore, with strong excitation intensity, a superlinear emission behavior indicates the emergence of interlayer biexcitons, whose formation peaks at a specific temperature. Our work elucidates the different excitation and temperature regimes required for the formation of both localized and delocalized IX and biexcitons, and, thus, contributes to a better understanding and application of the rich exciton physics in TMD heterostructures.
Submitted 7 July, 2023;
originally announced July 2023.
-
Multiplicity and Transverse Spherocity dependence of $\langle p_{\rm T} \rangle$ fluctuations of charged particles in p$-$p collisions at $\sqrt{s}$ = 7 and 13 TeV
Authors:
Subhadeep Roy,
Tulika Tripathy,
Sadhana Dash
Abstract:
The multiplicity dependence of event-by-event fluctuations in mean transverse momentum, $\langle p_{\rm T} \rangle$, of charged particles has been studied in p$-$p collisions at $\sqrt{s}$ = 7 TeV and 13 TeV using the PYTHIA 8 event generator. The charged particles were selected in the kinematic range of $0.15 < p_{\rm T}<2$ GeV$/c$ and $|η| < 0.8$. Dynamical fluctuations would point to the correlated emission of particles. Measurements in A$-$A and p$-$p collisions have shown a decrease in the strength of $ \langle p_{\rm T} \rangle$ fluctuations with the average charged particle multiplicity. The effects of various microscopic processes like color reconnection and multi-partonic interactions have been studied. A minimal dependency on the collision energy is also observed. Furthermore, the fluctuation observables are investigated in intervals of transverse spherocity in order to comprehend the relative contributions resulting from hard scattering and underlying events. The present study would act as a baseline for future measurements in A$-$A as well as p$-$p collisions at the LHC.
Submitted 6 June, 2023;
originally announced June 2023.
-
Intriguing Properties of Quantization at Scale
Authors:
Arash Ahmadian,
Saurabh Dash,
Hongyu Chen,
Bharat Venkitesh,
Stephen Gou,
Phil Blunsom,
Ahmet Üstün,
Sara Hooker
Abstract:
Emergent properties have been widely adopted as a term to describe behavior not present in smaller models but observed in larger models. Recent work suggests that the trade-off incurred by quantization is also an emergent property, with sharp drops in performance in models over 6B parameters. In this work, we ask "are quantization cliffs in performance solely a factor of scale?" Against a backdrop of increased research focus on why certain emergent properties surface at scale, this work provides a useful counter-example. We posit that it is possible to optimize for a quantization friendly training recipe that suppresses large activation magnitude outliers. Here, we find that outlier dimensions are not an inherent product of scale, but rather sensitive to the optimization conditions present during pre-training. This both opens up directions for more efficient quantization, and poses the question of whether other emergent properties are inherent or can be altered and conditioned by optimization and architecture design choices. We successfully quantize models ranging in size from 410M to 52B with minimal degradation in performance.
Submitted 30 May, 2023;
originally announced May 2023.
-
On Counterfactual Data Augmentation Under Confounding
Authors:
Abbavaram Gowtham Reddy,
Saketh Bachu,
Saloni Dash,
Charchit Sharma,
Amit Sharma,
Vineeth N Balasubramanian
Abstract:
Counterfactual data augmentation has recently emerged as a method to mitigate confounding biases in the training data. These biases, such as spurious correlations, arise due to various observed and unobserved confounding variables in the data generation process. In this paper, we formally analyze how confounding biases impact downstream classifiers and present a causal viewpoint to the solutions based on counterfactual data augmentation. We explore how removing confounding biases serves as a means to learn invariant features, ultimately aiding in generalization beyond the observed data distribution. Additionally, we present a straightforward yet powerful algorithm for generating counterfactual images, which effectively mitigates the influence of confounding effects on downstream classifiers. Through experiments on MNIST variants and the CelebA datasets, we demonstrate how our simple augmentation method helps existing state-of-the-art methods achieve good results.
Submitted 21 November, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Fairness in Image Search: A Study of Occupational Stereotyping in Image Retrieval and its Debiasing
Authors:
Swagatika Dash
Abstract:
Multi-modal search engines have experienced significant growth and widespread use in recent years, making them the second most common internet use. While search engine systems offer a range of services, the image search field has recently become a focal point in the information retrieval community, as the adage goes, "a picture is worth a thousand words". Although popular search engines like Google excel at image search accuracy and agility, there is an ongoing debate over whether their search results can be biased in terms of gender, language, demographics, socio-cultural aspects, and stereotypes. This potential for bias can have a significant impact on individuals' perceptions and influence their perspectives.
In this paper, we present our study on bias and fairness in web search, with a focus on keyword-based image search. We first discuss several kinds of biases that exist in search systems and why it is important to mitigate them. We narrow down our study to assessing and mitigating occupational stereotypes in image search, which is a prevalent fairness issue in image retrieval. For the assessment of stereotypes, we take gender as an indicator. We explore various open-source and proprietary APIs for gender identification from images. With these, we examine the extent of gender bias in top-ranked image search results obtained for several occupational keywords. To mitigate the bias, we then propose a fairness-aware re-ranking algorithm that optimizes (a) relevance of the search result to the keyword and (b) fairness with respect to the genders identified. We experiment on 100 top-ranked images obtained for 10 occupational keywords and consider random re-ranking and re-ranking based on relevance as baselines. Our experimental results show that the fairness-aware re-ranking algorithm produces rankings with better fairness scores than the baselines and competitive relevance scores.
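As a rough illustration of re-ranking that trades off relevance against group balance, the sketch below uses a simple greedy rule. It is not the authors' algorithm: the scoring formula, the `trade_off` parameter, the uniform target share per group, and the example data are all assumptions made for this example, and relevance scores are assumed to lie in [0, 1].

```python
def fairness_aware_rerank(items, trade_off=0.5):
    """Greedily re-rank (item_id, relevance, group) tuples.

    At each position, pick the item maximizing
    trade_off * relevance + (1 - trade_off) * (1 - imbalance),
    where imbalance is the largest gap between any group's share of the
    ranking so far and a uniform target share (an assumed fairness notion).
    """
    if not items:
        return []
    groups = sorted({group for _, _, group in items})
    target_share = 1.0 / len(groups)
    remaining = list(items)
    ranked = []
    counts = {g: 0 for g in groups}

    def score(item):
        _, relevance, group = item
        total = len(ranked) + 1
        imbalance = max(
            abs((counts[g] + (g == group)) / total - target_share)
            for g in groups
        )
        return trade_off * relevance + (1.0 - trade_off) * (1.0 - imbalance)

    while remaining:
        best = max(remaining, key=score)
        remaining.remove(best)
        ranked.append(best[0])
        counts[best[2]] += 1
    return ranked


# Hypothetical search results: (image id, relevance, identified gender)
results = [("a", 0.9, "m"), ("b", 0.8, "m"), ("c", 0.7, "f"), ("d", 0.6, "f")]
print(fairness_aware_rerank(results, trade_off=1.0))  # relevance only
print(fairness_aware_rerank(results, trade_off=0.5))  # balanced trade-off
```

With `trade_off=1.0` the ranking reduces to the relevance baseline; lowering it interleaves the groups at some cost in relevance order.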
Submitted 22 August, 2023; v1 submitted 5 May, 2023;
originally announced May 2023.
-
Thermally-driven Multilevel Non-volatile Memory with Monolayer MoS2 for Neuro-inspired Artificial Learning
Authors:
Sameer Kumar Mallik,
Roshan Padhan,
Mousam Charan Sahu,
Suman Roy,
Gopal K Pradhan,
Prasana Kumar Sahoo,
Saroj Prasad Dash,
Satyaprakash Sahoo
Abstract:
The demands of modern electronic components require advanced computing platforms for efficient information processing to realize in-memory operations with a high density of data storage capabilities towards developing alternatives to von Neumann architectures. Herein, we demonstrate the multifunctionality of monolayer MoS2 mem-transistors which can be used as a high-geared intrinsic transistor at room temperature; however, at a high temperature (>350 K), they exhibit synaptic multi-level memory operations. The temperature-dependent memory mechanism is governed by interfacial physics, which solely depends on the gate field modulated ion dynamics and charge transfer at the MoS2/dielectric interface. We have proposed a non-volatile memory application using a single FET device where thermal energy can be used to aid the memory functions with multi-level (3-bit) storage capabilities. Furthermore, our devices exhibit linearity and symmetry in conductance weight updates when subjected to electrical potentiation and depression. This feature has enabled us to attain a high classification accuracy while training and testing on the Modified National Institute of Standards and Technology (MNIST) datasets through artificial neural network simulation. This work paves the way for new avenues in 2D semiconductors toward reliable data processing and storage with high-packing density arrays for brain-inspired artificial learning.
Submitted 3 May, 2023;
originally announced May 2023.
-
Impact of nonextensivity on the transport coefficients of a magnetized hot and dense QCD matter
Authors:
Shubhalaxmi Rath,
Sadhana Dash
Abstract:
We have studied the impact of nonextensivity on the transport coefficients related to charge and heat in thermal QCD. For this purpose, the electrical ($σ_{\rm el}$), Hall ($σ_{\rm H}$), thermal ($κ$) and Hall-type thermal ($κ_{\rm H}$) conductivities are determined using the kinetic theory approach in association with the nonextensive Tsallis statistical mechanism. The effect of nonextensivity is encoded in the nonextensive Tsallis distribution function, where the deviation of the parameter $q$ from 1 signifies the degree of nonextensivity in the system concerned. The thermal and electrical conductivities are found to increase with the introduction of nonextensivity, which means that the deviation of the medium from thermal equilibrium enhances both charge and heat transport. With the magnetic field, the deviations of $σ_{\rm el}$, $σ_{\rm H}$, $κ$ and $κ_{\rm H}$ from their respective equilibrated values increase, whereas these deviations decrease with the chemical potential. We have also studied how the extent of the nonextensivity modulates the longevity of the magnetic field. The present work is further extended to the study of some observables associated with the aforesaid transport phenomena, such as the Knudsen number and the elliptic flow within the nonextensive Tsallis framework.
Submitted 25 September, 2023; v1 submitted 6 March, 2023;
originally announced March 2023.
-
Viscous QCD medium effects on the bottom quark transport coefficients
Authors:
Adiba Shaikh,
Sadhana Dash,
Basanta K. Nandi
Abstract:
The bottom quark transport coefficients, i.e., drag and diffusion coefficients, have been studied for the collisional and soft gluon radiative processes within the viscous QCD medium. The thermal medium effects are incorporated using the effective fugacity quasiparticle model (EQPM). Both the shear and bulk viscous effects at leading order are embedded through the near-equilibrium distribution functions of the quark-gluon plasma (QGP) constituent quasiparticles. The transport coefficients' dependence on the bottom quark's initial momentum and QGP temperature has been investigated. The relative dominance of the radiative over the collisional process for the bottom quark seems to occur at a higher initial momentum compared to that of the charm quark. In contrast, the effect of the viscous corrections seems to be more for the charm quark.
Submitted 4 February, 2023;
originally announced February 2023.
-
A ferromagnetic Eu-Pt surface compound grown below hexagonal boron nitride
Authors:
Alaa Mohammed Idris Bakhit,
Khadiza Ali,
Anna A. Makarova,
Igor Píš,
Federica Bondino,
Roberto Sant,
Saroj P. Dash,
Rodrigo Castrillo,
Yuri Hasegawa,
J. Enrique Ortega,
Laura Fernandez,
Frederik Schiller
Abstract:
One of the fundamental applications for monolayer-thick 2D materials is their use as protective layers of metal surfaces and in-situ intercalated reactive materials in ambient conditions. Here we investigate the structural, electronic, and magnetic properties, as well as the chemical stability in air of a very reactive metal, Europium, after intercalation between a hexagonal boron nitride (hBN) layer and a Pt substrate. We demonstrate that Eu intercalation leads to an hBN-covered ferromagnetic EuPt$_2$ surface alloy with divalent Eu$^{2+}$ atoms at the interface. We expose the system to ambient conditions and find a partial conservation of the divalent signal and hence the Eu-Pt interface. The use of a curved Pt substrate allows us to explore the changes in the Eu valence state and the ambient pressure protection at different substrate planes. The interfacial EuPt$_2$ surface alloy formation remains the same, but the resistance of the protecting hBN layer to ambient conditions is reduced, likely due to a rougher surface and a more discontinuous hBN coating.
Submitted 21 June, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Bottom-up growth of monolayer honeycomb SiC
Authors:
C. M. Polley,
H. Fedderwitz,
T. Balasubramanian,
A. A. Zakharov,
R. Yakimova,
O. Bäcke,
J. Ekman,
S. P. Dash,
S. Kubatkin,
S. Lara-Avila
Abstract:
The long theorized two-dimensional allotrope of SiC has remained elusive amid the exploration of graphenelike honeycomb structured monolayers. It is anticipated to possess a large direct band gap (2.5 eV), ambient stability, and chemical versatility. While $sp^{2}$ bonding between silicon and carbon is energetically favorable, only disordered nanoflakes have been reported to date. Here we demonstrate large-area, bottom-up synthesis of monocrystalline, epitaxial monolayer honeycomb SiC atop ultrathin transition metal carbide films on SiC substrates. We find the 2D phase of SiC to be almost planar and stable at high temperatures, up to 1200°C in vacuum. Interactions between the 2D-SiC and the transition metal carbide surface result in a Dirac-like feature in the electronic band structure, which in the case of a TaC substrate is strongly spin-split. Our findings represent the first step towards routine and tailored synthesis of 2D-SiC monolayers, and this novel heteroepitaxial system may find diverse applications ranging from photovoltaics to topological superconductivity.
Submitted 16 January, 2023;
originally announced January 2023.
-
DADAgger: Disagreement-Augmented Dataset Aggregation
Authors:
Akash Haridas,
Karim Hamadeh,
Samarendra Chandan Bindu Dash
Abstract:
DAgger is an imitation algorithm that aggregates its original datasets by querying the expert on all samples encountered during training. In order to reduce the number of samples queried, we propose a modification to DAgger, known as DADAgger, which only queries the expert for state-action pairs that are out of distribution (OOD). OOD states are identified by measuring the variance of the action predictions of an ensemble of models on each state, which we simulate using dropout. Testing on the Car Racing and Half Cheetah environments achieves comparable performance to DAgger but with reduced expert queries, and better performance than a random sampling baseline. We also show that our algorithm may be used to build efficient, well-balanced training datasets by running with no initial data and only querying the expert to resolve uncertainty.
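The querying rule described above can be sketched as follows: run several stochastic forward passes with dropout enabled, treat them as an ensemble, and query the expert only where the action predictions disagree. This is a minimal illustration under stated assumptions, not the authors' implementation; the two-layer policy, the weight shapes, `n_passes`, `p_drop`, and the variance `threshold` are all hypothetical.

```python
import numpy as np


def dropout_action_samples(states, w1, w2, n_passes=20, p_drop=0.2, rng=None):
    """Stochastic forward passes of a tiny two-layer ReLU policy with dropout.

    Returns an array of shape (n_passes, n_states, action_dim).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    samples = []
    for _ in range(n_passes):
        hidden = np.maximum(states @ w1, 0.0)
        # Inverted dropout: drop hidden units, rescale the survivors.
        mask = rng.random(hidden.shape) >= p_drop
        hidden = hidden * mask / (1.0 - p_drop)
        samples.append(hidden @ w2)
    return np.stack(samples)


def select_states_to_query(states, w1, w2, threshold, **kwargs):
    """Flag states whose dropout-ensemble action variance exceeds threshold."""
    samples = dropout_action_samples(states, w1, w2, **kwargs)
    # Variance across passes, averaged over action dimensions.
    disagreement = samples.var(axis=0).mean(axis=-1)
    return disagreement > threshold, disagreement


rng = np.random.default_rng(1)
w1 = rng.normal(size=(4, 16))      # hypothetical policy weights
w2 = rng.normal(size=(16, 2))
states = rng.normal(size=(8, 4))   # hypothetical batch of visited states

query_mask, disagreement = select_states_to_query(states, w1, w2, threshold=1.0)
print("states sent to the expert:", int(query_mask.sum()), "of", len(states))
```

Only the flagged states would be labeled by the expert and added to the aggregated dataset; the rest are skipped, which is where the reduction in expert queries comes from.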
Submitted 3 January, 2023;
originally announced January 2023.
-
Affine Monads and Lazy Structures for Bayesian Programming
Authors:
Swaraj Dash,
Younesse Kaddar,
Hugo Paquet,
Sam Staton
Abstract:
We show that streams and lazy data structures are a natural idiom for programming with infinite-dimensional Bayesian methods such as Poisson processes, Gaussian processes, jump processes, Dirichlet processes, and Beta processes. The crucial semantic idea, inspired by developments in synthetic probability theory, is to work with two separate monads: an affine monad of probability, which supports laziness, and a commutative, non-affine monad of measures, which does not. (Affine means that $T(1)\cong 1$.) We show that the separation is important from a decidability perspective, and that the recent model of quasi-Borel spaces supports these two monads.
To perform Bayesian inference with these examples, we introduce new inference methods that are specially adapted to laziness; they are proven correct by reference to the Metropolis-Hastings-Green method. Our theoretical development is implemented as a Haskell library, LazyPPL.
Submitted 14 December, 2022;
originally announced December 2022.
-
Bayesian Experimental Design for Symbolic Discovery
Authors:
Kenneth L. Clarkson,
Cristina Cornelio,
Sanjeeb Dash,
Joao Goncalves,
Lior Horesh,
Nimrod Megiddo
Abstract:
This study concerns the formulation and application of Bayesian optimal experimental design to symbolic discovery, which is the inference from observational data of predictive models taking general functional forms. We apply constrained first-order methods to optimize an appropriate selection criterion, using Hamiltonian Monte Carlo to sample from the prior. A step for computing the predictive distribution, involving convolution, is computed via either numerical integration, or via fast transform methods.
Submitted 28 November, 2022;
originally announced November 2022.
-
Flow of charge and heat in thermal QCD within the weak magnetic field limit: A BGK model approach
Authors:
Anowar Shaikh,
Shubhalaxmi Rath,
Sadhana Dash,
Binata Panda
Abstract:
We have computed the charge and heat transport coefficients of hot QCD matter by solving the relativistic Boltzmann transport equation using the BGK model approximation with a modified collision integral in the weak magnetic field regime. This modified collision integral enhances both charge and heat transport phenomena, as seen from the larger values of the above-mentioned coefficients in comparison to those obtained with the relaxation collision integral. We have also presented a comparative study of coefficients like the electrical conductivity ($σ_{el}$), Hall conductivity ($σ_{H}$), thermal conductivity ($κ$) and Hall-type thermal conductivity ($κ_{H}$) in weak and strong magnetic fields in the BGK model approximation. The effects of weak magnetic field and finite chemical potential on the transport coefficients have been explored using a quasiparticle model. Moreover, we have also studied the effects of weak magnetic field and finite chemical potential on the Lorenz number, Knudsen number, specific heat, elliptic flow and Wiedemann-Franz law.
Submitted 27 October, 2022;
originally announced October 2022.
-
Counterfactual Generation Under Confounding
Authors:
Abbavaram Gowtham Reddy,
Saloni Dash,
Amit Sharma,
Vineeth N Balasubramanian
Abstract:
A machine learning model, under the influence of observed or unobserved confounders in the training data, can learn spurious correlations and fail to generalize when deployed. For image classifiers, augmenting a training dataset using counterfactual examples has been empirically shown to break spurious correlations. However, the counterfactual generation task itself becomes more difficult as the level of confounding increases. Existing methods for counterfactual generation under confounding consider a fixed set of interventions (e.g., texture, rotation) and are not flexible enough to capture diverse data-generating processes. Given a causal generative process, we formally characterize the adverse effects of confounding on any downstream tasks and show that the correlation between generative factors (attributes) can be used to quantitatively measure confounding between generative factors. To minimize such correlation, we propose a counterfactual generation method that learns to modify the value of any attribute in an image and generate new images given a set of observed attributes, even when the dataset is highly confounded. These counterfactual images are then used to regularize the downstream classifier such that the learned representations are the same across various generative factors conditioned on the class label. Our method is computationally efficient, simple to implement, and works well for any number of generative factors and confounding variables. Our experimental results on both synthetic (MNIST variants) and real-world (CelebA) datasets show the usefulness of our approach.
Submitted 10 December, 2022; v1 submitted 22 October, 2022;
originally announced October 2022.
-
ATHENA Detector Proposal -- A Totally Hermetic Electron Nucleus Apparatus proposed for IP6 at the Electron-Ion Collider
Authors:
ATHENA Collaboration,
J. Adam,
L. Adamczyk,
N. Agrawal,
C. Aidala,
W. Akers,
M. Alekseev,
M. M. Allen,
F. Ameli,
A. Angerami,
P. Antonioli,
N. J. Apadula,
A. Aprahamian,
W. Armstrong,
M. Arratia,
J. R. Arrington,
A. Asaturyan,
E. C. Aschenauer,
K. Augsten,
S. Aune,
K. Bailey,
C. Baldanza,
M. Bansal,
F. Barbosa,
L. Barion
, et al. (415 additional authors not shown)
Abstract:
ATHENA has been designed as a general purpose detector capable of delivering the full scientific scope of the Electron-Ion Collider. Careful technology choices provide fine tracking and momentum resolution, high performance electromagnetic and hadronic calorimetry, hadron identification over a wide kinematic range, and near-complete hermeticity. This article describes the detector design and its expected performance in the most relevant physics channels. It includes an evaluation of detector technology choices, the technical challenges to realizing the detector and the R&D required to meet those challenges.
Submitted 13 October, 2022;
originally announced October 2022.
-
Table Detection in the Wild: A Novel Diverse Table Detection Dataset and Method
Authors:
Mrinal Haloi,
Shashank Shekhar,
Nikhil Fande,
Siddhant Swaroop Dash,
Sanjay G
Abstract:
Recent deep learning approaches to table detection have achieved outstanding performance and proved effective in identifying document layouts. Currently available table detection benchmarks have many limitations, including a lack of sample diversity, simple table structures, too few training cases, and low sample quality. In this paper, we introduce a diverse large-scale dataset for table detection with more than seven thousand samples containing a wide variety of table structures collected from many diverse sources. In addition, we present baseline results using a convolutional neural network-based method to detect table structure in documents. Experimental results show the superiority of convolutional deep learning methods over classical computer vision-based methods. The introduction of this diverse table detection dataset will enable the community to develop high throughput deep learning methods for understanding document layout and tabular data processing. The dataset is available at: 1. https://www.kaggle.com/datasets/mrinalim/stdw-dataset 2. https://huggingface.co/datasets/n3011/STDW
Submitted 30 November, 2023; v1 submitted 31 August, 2022;
originally announced September 2022.
-
Study of multiplicity dependence of heavy flavor production in p-p collisions using rope hadronization mechanism
Authors:
Tulika Tripathy,
Bharati Naik,
Ranjit Nayak,
Nirbhay Behera,
Basanta K. Nandi,
Sadhana Dash
Abstract:
The multiplicity dependence of the production of charm mesons in p$-$p collisions at $\sqrt{s} = 7$ TeV and 13 TeV as measured by the ALICE experiment has been investigated using the Pythia 8 event generator by studying the effect of various processes at the partonic level, such as different modes of color reconnection and rope hadronization. The relative yields ($\rm Yield/\langle Yield \rangle$) of D-mesons and $J/ψ$ as a function of relative charged particle multiplicity for various transverse momentum ($p_{\rm T}$) ranges as measured by the ALICE experiment are in reasonable agreement with the estimations of the Pythia 8 model within the framework of microscopic processes. The relative yields of B mesons for various $p_{\rm T}$ intervals ($1 < p_{\rm T} < 20$ GeV/$c$) have also been predicted in p$-$p collisions at $\sqrt{s} = 7$ TeV and $\sqrt{s} = 13$ TeV.
Submitted 19 September, 2022;
originally announced September 2022.
-
Revealing the band structure of ZrTe$_5$ using Multicarrier Transport
Authors:
Zoltán Kovács-Krausz,
Endre Tóvári,
Dániel Nagy,
Albin Márffy,
Bogdan Karpiak,
Zoltán Tajkov,
László Oroszlány,
János Koltai,
Péter Nemes-Incze,
Saroj Dash,
Péter Makk,
Szabolcs Csonka
Abstract:
The layered material ZrTe$_5$ appears to exhibit several exotic behaviors which resulted in significant interest recently, although the exact properties are still highly debated. Among these we find a Dirac/Weyl semimetallic behavior, nontrivial spin textures revealed by low temperature transport, and a potential weak or strong topological phase. The anomalous behavior of resistivity has been recently elucidated as originating from band shifting in the electronic structure. Our work examines magnetotransport behavior in ZrTe$_5$ samples in the context of multicarrier transport. The results, in conjunction with ab-initio band structure calculations, indicate that many of the transport features of ZrTe$_5$ across the majority of the temperature range can be adequately explained by the semiclassical multicarrier transport model originating from a complex Fermi surface.
Submitted 19 January, 2023; v1 submitted 14 September, 2022;
originally announced September 2022.