Flame spread over thin circular PMMA rods

Abstract: This article presents a series of opposed flow flame spread experiments, conducted using cast cylindrical PMMA (acrylic) rods, 80 mm long and of diameters 1 mm and 0.5 mm, in normal gravity and microgravity environments. The experiments are primarily conducted for molar oxygen levels of 21%, 23% and 40% at 1 atmosphere pressure and opposed flow speed ranging from 0 cm/s to 25 cm/s. Experiments are also conducted in normal gravity for oxygen levels 21% to 60% to study the effect of oxygen level. At near ambient oxygen levels, the flame shape in microgravity resembles a mushroom and there are fluctuations at the leading edge due to sporadic fuel jets emanating from bursting bubbles at the fuel surface. The flame spreads faster in microgravity, which is determined to be due to increased preheat length. Preheat length is measured for flame spreading over 1 mm diameter fuel rod under no flow condition using fine thermocouples and is found to be 0.98 cm in microgravity and 0.34 cm in normal gravity. It is found that for scaling analysis, including Stefan flow velocity in the definition of reference velocity results in a reasonable estimate of the preheat length of a flame spreading in quiescent microgravity environment. At high oxygen levels (> 35%) the flame becomes turbulent and spreads at nearly the same rate in normal gravity and microgravity environments and the flame spread rates are not affected by external flow speed.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.07612 [pdf, other]

Teaching Transformers Causal Reasoning through Axiomatic Training

Authors: Aniket Vashishtha, Abhinav Kumar, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian, Amit Sharma

Abstract: For text-based AI systems to interact in the real world, causal reasoning is an essential skill. Since interventional data is costly to generate, we study to what extent an agent can learn causal reasoning from passive data. Specifically, we consider an axiomatic training setup where an agent learns from multiple demonstrations of a causal axiom (or rule), rather than incorporating the axiom as an inductive bias or inferring it from data values. A key question is whether the agent would learn to generalize from the axiom demonstrations to new scenarios. For example, if a transformer model is trained on demonstrations of the causal transitivity axiom over small graphs, would it generalize to applying the transitivity axiom over large graphs? Our results, based on a novel axiomatic training scheme, indicate that such generalization is possible. We consider the task of inferring whether a variable causes another variable, given a causal graph structure. We find that a 67 million parameter transformer model, when trained on linear causal chains (along with some noisy variations) can generalize well to new kinds of graphs, including longer causal chains, causal chains with reversed order, and graphs with branching; even when it is not explicitly trained for such settings. Our model performs on par with (or even better than) many larger language models such as GPT-4, Gemini Pro, and Phi-3.
Overall, our axiomatic training framework provides a new paradigm of learning causal reasoning from passive data that can be used to learn arbitrary axioms, as long as sufficient demonstrations can be generated.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07480 [pdf, other]

The discovery of a nearby 421~s transient with CHIME/FRB/Pulsar

Authors: Fengqiu Adam Dong, Tracy Clarke, Alice P. Curtin, Ajay Kumar, Ingrid Stairs, Shami Chatterjee, Amanda M. Cook, Emmanuel Fonseca, B. M. Gaensler, Jason W. T. Hessels, Victoria M. Kaspi, Mattias Lazda, Kiyoshi W. Masui, James W. McKee, Bradley W. Meyers, Aaron B. Pearlman, Scott M. Ransom, Paul Scholz, Kaitlyn Shin, Kendrick M. Smith, Chia Min Tan

Abstract: Neutron stars and white dwarfs are both dense remnants of post-main-sequence stars. Pulsars, magnetars and strongly magnetised white dwarfs have all been observed to exhibit coherent, pulsed radio emission in relation to their rotational period. Recently, a new type of radio long period transient (LPT) has been discovered. The bright radio emission of LPTs resembles that of radio pulsars and magnetars. However, they pulse on timescales (minutes) much longer than previously seen. While minute timescales are common rotation periods for white dwarfs, LPTs are much brighter than the known pulsating white dwarfs, and dipolar radiation from isolated (as opposed to binary) magnetic white dwarfs has yet to be observed. Here, we report the discovery of a new $\sim$421~s LPT, CHIME J0630+25, using the CHIME/FRB and CHIME/Pulsar instruments. We used standard pulsar timing techniques and obtained a phase-coherent timing solution which yielded limits on the inferred magnetic field and characteristic age. CHIME J0630+25 is remarkably nearby ($170 \pm 80$~pc), making it the closest LPT discovered to date.

Submitted 10 July, 2024; originally announced July 2024.

Comments: Submitted

arXiv:2407.06893 [pdf]

Measuring Sustainability Intention of ESG Fund Disclosure using Few-Shot Learning

Authors: Mayank Singh, Nazia Nafis, Abhijeet Kumar, Mridul Mishra

Abstract: Global sustainable fund universe encompasses open-end funds and exchange-traded funds (ETF) that, by prospectus or other regulatory filings, claim to focus on Environment, Social and Governance (ESG). Challengingly, the claims can only be confirmed by examining the textual disclosures to check if there is presence of intentionality and ESG focus on its investment strategy. Currently, there is no regulation to enforce sustainability in ESG products space. This paper proposes a unique method and system to classify and score the fund prospectuses in the sustainable universe regarding specificity and transparency of language. We aim to employ few-shot learners to identify specific, ambiguous, and generic sustainable investment-related language. Additionally, we construct a ratio metric to determine language score and rating to rank products and quantify sustainability claims for US sustainable universe. As a by-product, we publish manually annotated quality training dataset on Hugging Face (ESG-Prospectus-Clarity-Category under cc-by-nc-sa-4.0) of more than 1K ESG textual statements. The performance of the few-shot finetuning approach is compared with zero-shot models e.g., Llama-13B, GPT 3.5 Turbo etc. We found that prompting large language models is not accurate for domain specific tasks due to misalignment issues. The few-shot finetuning techniques outperform zero-shot models by large margins of more than ~30% absolute in precision, recall and F1 metrics on completely unseen ESG languages (test set).
Overall, the paper attempts to establish a systematic and scalable approach to measure and rate sustainability intention quantitatively for sustainable funds using texts in prospectus. Regulatory bodies, investors, and advisors may utilize the findings of this research to reduce cognitive load in investigating or screening of ESG funds which accurately reflect the ESG intention.

Submitted 9 July, 2024; originally announced July 2024.

Comments: This paper was presented at 'AI applications in ESG Conference' at IIM Bangalore, India (Nov, 2023)

arXiv:2407.06868 [pdf, other]

Energy Efficient Fair STAR-RIS for Mobile Users

Authors: Ashok S. Kumar, Nancy Nayak, Sheetal Kalyani, Himal A. Suraweera

Abstract: In this work, we propose a method to improve the energy efficiency and fairness of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) for mobile users, ensuring reduced power consumption while maintaining reliable communication. To achieve this, we introduce a new parameter known as the subsurface assignment variable, which determines the number of STAR-RIS elements allocated to each user. We then formulate a novel optimization problem by concurrently optimizing the phase shifts of the STAR-RIS and subsurface assignment variable. We leverage the deep reinforcement learning (DRL) technique to address this optimization problem. The DRL model predicts the phase shifts of the STAR-RIS and efficiently allocates elements of STAR-RIS to the users. Additionally, we incorporate a penalty term in the DRL model to facilitate intelligent deactivation of STAR-RIS elements when not in use to enhance energy efficiency. Through extensive experiments, we show that the proposed method can achieve fairly high and nearly equal data rates for all users in both the transmission and reflection spaces in an energy-efficient manner.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06110 [pdf, other]

FGA: Fourier-Guided Attention Network for Crowd Count Estimation

Authors: Yashwardhan Chaudhuri, Ankit Kumar, Arun Balaji Buduru, Adel Alshamrani

Abstract: Crowd counting is gaining societal relevance, particularly in domains of Urban Planning, Crowd Management, and Public Safety. This paper introduces Fourier-guided attention (FGA), a novel attention mechanism for crowd count estimation designed to address the inefficient full-scale global pattern capture in existing works on convolution-based attention networks. FGA efficiently captures multi-scale information, including full-scale global patterns, by utilizing Fast-Fourier Transformations (FFT) along with spatial attention for global features and convolutions with channel-wise attention for semi-global and local features. The architecture of FGA involves a dual-path approach: (1) a path for processing full-scale global features through FFT, allowing for efficient extraction of information in the frequency domain, and (2) a path for processing remaining feature maps for semi-global and local features using traditional convolutions and channel-wise attention. This dual-path architecture enables FGA to seamlessly integrate frequency and spatial information, enhancing its ability to capture diverse crowd patterns. We apply FGA in the last layers of two popular crowd-counting works, CSRNet and CANNet, to evaluate the module's performance on benchmark datasets such as ShanghaiTech-A, ShanghaiTech-B, UCF-CC-50, and JHU++ crowd. The experiments demonstrate a notable improvement across all datasets based on Mean-Squared-Error (MSE) and Mean-Absolute-Error (MAE) metrics, showing comparable performance to recent state-of-the-art methods.
Additionally, we illustrate the interpretability using qualitative analysis, leveraging Grad-CAM heatmaps, to show the effectiveness of FGA in capturing crowd patterns.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted to IJCNN'24

arXiv:2407.06093 [pdf, other]

Artificial Intuition: Efficient Classification of Scientific Abstracts

Authors: Harsh Sakhrani, Naseela Pervez, Anirudh Ravi Kumar, Fred Morstatter, Alexandra Graddy Reed, Andrea Belz

Abstract: It is desirable to coarsely classify short scientific texts, such as grant or publication abstracts, for strategic insight or research portfolio management. These texts efficiently transmit dense information to experts possessing a rich body of knowledge to aid interpretation. Yet this task is remarkably difficult to automate because of brevity and the absence of context. To address this gap, we have developed a novel approach to generate and appropriately assign coarse domain-specific labels. We show that a Large Language Model (LLM) can provide metadata essential to the task, in a process akin to the augmentation of supplemental knowledge representing human intuition, and propose a workflow. As a pilot study, we use a corpus of award abstracts from the National Aeronautics and Space Administration (NASA). We develop new assessment tools in concert with established performance metrics.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05263 [pdf, ps, other]

Impact of finite volume on kaon, antikaon, and $φ$ meson masses and decay width in asymmetric strange hadronic matter

Authors: Zeeshan Ahmad, Nisha Chahal, Arvind Kumar, Suneel Dutt

Abstract: In the present work, we investigate the impact of finite volume on the in-medium properties of kaons ($K^+$, $K^0$) and antikaons ($K^-$, $\bar{K^0}$), and $φ$ mesons in the isospin asymmetric strange hadronic medium at finite density and temperature. We use the chiral SU(3) hadronic mean-field model, which accounts for the interactions between baryons through the exchange of scalar ($σ, ζ, δ$) and vector ($ω$, $ρ$, $φ$) fields. To investigate the effects of finite volume, we apply the multiple reflection expansion (MRE) technique for calculations of the density of states. The non-strange scalar field $σ$ shows significant variation in an asymmetric medium, while the strange scalar field $ζ$ shows good dependency in the strange medium. We use the medium-modified masses of kaons and antikaons calculated using the chiral SU(3) model to obtain the masses and decay width of $φ$ mesons in finite volume hadronic medium. To obtain the masses and decay widths of $φ$ mesons, an effective Lagrangian approach with $φ$$K$$\bar{K}$ interactions at one-loop level is used in the present work. We obtain the effective masses and decay widths in the finite volume matter, for the spherical geometry of the medium with Neumann and Dirichlet boundary conditions as well as for the cubic geometry. The finite volume effects are found to be appreciable at high baryon densities.

Submitted 7 July, 2024; originally announced July 2024.

Comments: 40 pages and 13 figures

arXiv:2407.04784 [pdf, other]

Cavity QED in a High NA Resonator

Authors: Danial Shadmany, Aishwarya Kumar, Anna Soper, Lukas Palm, Chuan Yin, Henry Ando, Bowen Li, Lavanya Taneja, Matt Jaffe, David Schuster, Jon Simon

Abstract: From fundamental studies of light-matter interaction to applications in quantum networking and sensing, cavity quantum electrodynamics (QED) provides a platform-crossing toolbox to control interactions between atoms and photons. The coherence of such interactions is determined by the product of the single-pass atomic absorption and the number of photon round-trips. Reducing the cavity loss has enabled resonators supporting nearly 1-million optical roundtrips at the expense of severely limited optical material choices and increased alignment sensitivity. The single-pass absorption probability can be increased through the use of near-concentric, fiber or nanophotonic cavities, which reduce the mode waists at the expense of constrained optical access and exposure to surface fields. Here we present a new high numerical-aperture, lens-based resonator that pushes the single-atom-single-photon absorption probability per round trip close to its fundamental limit by reducing the mode size at the atom below a micron while keeping the atom mm-to-cm away from all optics. This resonator provides strong light-matter coupling in a cavity where the light circulates only ~ 10 times. We load a single 87Rb atom into such a cavity, observe strong coupling, demonstrate cavity-enhanced atom detection with imaging fidelity of 99.55(6) percent and survival probability of 99.89(4) percent in 130 microseconds, and leverage this new platform for a time-resolved exploration of cavity cooling.
The resonator's loss-resilience paves the way to coupling of atoms to nonlinear and adaptive optical elements and provides a minimally invasive route to readout of defect centers. Introduction of intra-cavity imaging systems will enable the creation of cavity arrays compatible with Rydberg atom array computing technologies, vastly expanding the applicability of the cavity QED toolbox.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04450 [pdf, other]

Massive Dirac-Pauli physics in lead-halide perovskites

Authors: Abhishek Shiva Kumar, Mikhail Maslov, Mikhail Lemeshko, Artem G. Volosniev, Zhanybek Alpichshev

Abstract: In standard quantum electrodynamics (QED), the so-called non-minimal (Pauli) coupling is suppressed for elementary particles and has no physical implications. Here, we show that the Pauli term naturally appears in a known family of Dirac materials -- the lead-halide perovskites, suggesting a novel playground for the study of analogue QED effects. We outline measurable manifestations of the Pauli term in the phenomena pertaining to (i) the Klein paradox and (ii) relativistic corrections to bound states. In particular, we demonstrate that the binding energy of an electron in the vicinity of a positively charged defect is noticeably decreased due to the polarizability of lead ions and the appearance of a Darwin-like term. Our study adds to understanding of quantum phenomena in lead-halide perovskites, and paves the way for tabletop simulations of analogue Dirac-Pauli equations.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04268 [pdf, other]

NeuFair: Neural Network Fairness Repair with Dropout

Authors: Vishnu Asutosh Dasu, Ashish Kumar, Saeid Tizpaz-Niari, Gang Tan

Abstract: This paper investigates neuron dropout as a post-processing bias mitigation for deep neural networks (DNNs). Neural-driven software solutions are increasingly applied in socially critical domains with significant fairness implications. While neural networks are exceptionally good at finding statistical patterns from data, they may encode and amplify existing biases from the historical data. Existing bias mitigation algorithms often require modifying the input dataset or the learning algorithms. We posit that the prevalent dropout methods that prevent over-fitting during training by randomly dropping neurons may be an effective and less intrusive approach to improve the fairness of pre-trained DNNs. However, finding the ideal set of neurons to drop is a combinatorial problem. We propose NeuFair, a family of post-processing randomized algorithms that mitigate unfairness in pre-trained DNNs via dropouts during inference after training. Our randomized search is guided by an objective to minimize discrimination while maintaining the model's utility. We show that our design of randomized algorithms is effective and efficient in improving fairness (up to 69%) with minimal or no model performance degradation. We provide intuitive explanations of these phenomena and carefully examine the influence of various hyperparameters of search algorithms on the results. Finally, we empirically and conceptually compare NeuFair to different state-of-the-art bias mitigators.

Submitted 12 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

Comments: Paper accepted at ACM ISSTA 2024

arXiv:2407.04039 [pdf, ps, other]

doi 10.2172/2372903

Flexible Stellarator Physics Facility

Authors: F. I. Parra, S. -G. Baek, M. Churchill, D. R. Demers, B. Dudson, N. M. Ferraro, B. Geiger, S. Gerhardt, K. C. Hammond, S. Hudson, R. Jorge, E. Kolemen, D. M. Kriete, S. T. A. Kumar, M. Landreman, C. Lowe, D. A. Maurer, F. Nespoli, N. Pablant, M. J. Pueschel, A. Punjabi, J. A. Schwartz, C. P. S. Swanson, A. M. Wright

Abstract: We propose to build a Flexible Stellarator Physics Facility to explore promising regions of the vast parameter space of disruption-free stellarator solutions for Fusion Pilot Plants (FPPs).

Submitted 4 July, 2024; originally announced July 2024.

Comments: White paper submitted to FESAC subcommittee on Facilities, 8 pages

arXiv:2407.03941 [pdf, other]

Narrow Transformer: Starcoder-Based Java-LM For Desktop

Authors: Kamalkumar Rathinasamy, Balaji A J, Ankush Kumar, Gagan Gayari, Harshini K, Rajab Ali Mondal, Sreenivasa Raghavan K S, Swayam Singh

Abstract: This paper presents NT-Java-1.1B, an open-source specialized code language model built on StarCoderBase-1.1B, designed for coding tasks in Java programming. NT-Java-1.1B achieves state-of-the-art performance, surpassing its base model and the majority of other models of similar size on the MultiPL-E Java code benchmark. While there have been studies on extending large, generic pre-trained models to improve proficiency in specific programming languages like Python, similar investigations on small code models for other programming languages are lacking. Large code models require specialized hardware like GPUs for inference, highlighting the need for research into building small code models that can be deployed on developer desktops. This paper addresses this research gap by focusing on the development of a small Java code model, NT-Java-1.1B, and its quantized versions, which perform comparably to open models around 1.1B on MultiPL-E Java code benchmarks, making them ideal for desktop deployment. This paper establishes the foundation for specialized models across languages and sizes for a family of NT Models.

Submitted 4 July, 2024; originally announced July 2024.

ACM Class: I.2.7

arXiv:2407.03648 [pdf, other]

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching

Authors: Gael Le Lan, Bowen Shi, Zhaoheng Ni, Sidd Srinivasan, Anurag Kumar, Brian Ellis, David Kant, Varun Nagaraja, Ernie Chang, Wei-Ning Hsu, Yangyang Shi, Vikas Chandra

Abstract: We introduce a simple and efficient text-controllable high-fidelity music generation and editing model. It operates on sequences of continuous latent representations from a low frame rate 48 kHz stereo variational auto encoder codec that eliminates the information loss drawback of discrete representations. Based on a diffusion transformer architecture trained on a flow-matching objective, the model can generate and edit diverse high quality stereo samples of variable duration, with simple text descriptions. We also explore a new regularized latent inversion method for zero-shot test-time text-guided editing and demonstrate its superior performance over naive denoising diffusion implicit model (DDIM) inversion for a variety of music editing prompts. Evaluations are conducted on both objective and subjective metrics and demonstrate that the proposed model is not only competitive with the evaluated baselines on a standard text-to-music benchmark - quality and efficiency-wise - but also outperforms previous state of the art for music editing when combined with our proposed latent inversion. Samples are available at https://melodyflow.github.io.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.02413 [pdf]

First-principles investigation of multifaceted properties; lattice dynamic, structural stability, mechanical, electronic, magnetic and thermodynamic response of Alkali metals-based semi Heusler alloys

Authors: Diwaker, Shyam L. Gupta, Anupam, Sumit Kumar, Aadil Fayaz, Ashwani Kumar

Abstract: Taking into consideration the wide compositional stretch of Heusler alloys, first-principles density functional theory based calculations are well suited for estimating the multifaceted properties of alkali metal based LiVSb and NaVSb Heusler alloys. We calculated ground state stability by optimizing the energy in alpha, beta and gamma phase configurations. The materials are dynamically stable in spin polarised phase type alpha. To explore the electronic structure, we successfully employed the generalized gradient approximation potential. The electronic band structures indicate a half-metallic nature featuring a wide indirect band gap of 1.40 eV and 1.45 eV. We computed the second-order elastic parameters at different pressure levels. The Pugh ratio less than 0.25 assessed that both alloys are brittle in nature and mechanically stable. The obtained magnetic moment is consistent with the Slater-Pauling rule. By executing the Quasi-Harmonic Debye model and Boltzmann theory we assessed the various thermodynamic parameters and transport coefficients of both alloys at different temperatures and pressures. All positive frequencies in lattice dynamic study confirmed their stability. Our findings highlight the potential of these alloys in modern semiconductor technology and thermoelectric applications.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01351 [pdf, other]

Probing the connection between IceCube neutrinos and MOJAVE AGN

Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi , et al. (399 additional authors not shown)

Abstract: Active Galactic Nuclei (AGN) are prime candidate sources of the high-energy, astrophysical neutrinos detected by IceCube. This is demonstrated by the real-time multi-messenger detection of the blazar TXS 0506+056 and the recent evidence of neutrino emission from NGC 1068 from a separate time-averaged study. However, the production mechanism of the astrophysical neutrinos in AGN is not well established which can be resolved via correlation studies with photon observations. For neutrinos produced due to photohadronic interactions in AGN, in addition to a correlation of neutrinos with high-energy photons, there would also be a correlation of neutrinos with photons emitted at radio wavelengths. In this work, we perform an in-depth stacking study of the correlation between 15 GHz radio observations of AGN reported in the MOJAVE XV catalog, and ten years of neutrino data from IceCube. We also use a time-dependent approach which improves the statistical power of the stacking analysis. No significant correlation was found for both analyses and upper limits are reported. When compared to the IceCube diffuse flux, at 100 TeV and for a spectral index of 2.5, the upper limits derived are $\sim3\%$ and $\sim9\%$ for the time-averaged and time-dependent case, respectively.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 14 Pages 7 Figures

arXiv:2407.01314 [pdf, other]

Search for a light sterile neutrino with 7.5 years of IceCube DeepCore data

Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi, et al. (399 additional authors not shown)

Abstract: We present a search for an eV-scale sterile neutrino using 7.5 years of data from the IceCube DeepCore detector. The analysis uses a sample of 21,914 events with energies between 5 and 150 GeV to search for sterile neutrinos through atmospheric muon neutrino disappearance. Improvements in event selection and treatment of systematic uncertainties provide greater statistical power compared to previous DeepCore sterile neutrino searches. Our results are compatible with the absence of mixing between active and sterile neutrino states, and we place constraints on the mixing matrix elements $|U_{μ4}|^2 < 0.0534$ and $|U_{τ4}|^2 < 0.0574$ at 90% CL under the assumption that $Δm^2_{41}\geq 1\;\mathrm{eV^2}$. These null results add to the growing tension between anomalous appearance results and constraints from disappearance searches in the 3+1 sterile neutrino landscape.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 11 pages, 5 figures. To be submitted to Physical Review D

arXiv:2407.01306 [pdf, other]

Unveiling the Unseen: Exploring Whitebox Membership Inference through the Lens of Explainability

Authors: Chenxi Li, Abhinav Kumar, Zhen Guo, Jie Hou, Reza Tourani

Abstract: The increasing prominence of deep learning applications and reliance on personalized data underscore the urgent need to address privacy vulnerabilities, particularly Membership Inference Attacks (MIAs). Despite numerous MIA studies, significant knowledge gaps persist, particularly regarding the impact of hidden features (in isolation) on attack efficacy and insufficient justification for the root causes of attacks based on raw data features. In this paper, we aim to address these knowledge gaps by first exploring statistical approaches to identify the most informative neurons and quantifying the significance of the hidden activations from the selected neurons on attack accuracy, in isolation and combination. Additionally, we propose an attack-driven explainable framework by integrating the target and attack models to identify the most influential features of raw data that lead to successful membership inference attacks. Our proposed MIA shows an improvement of up to 26% on state-of-the-art MIA.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 20 pages, 10 figures, 4 tables

arXiv:2407.00866 [pdf, other]

Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning

Authors: Nexhi Sula, Abhinav Kumar, Jie Hou, Han Wang, Reza Tourani

Abstract: With the continued advancement and widespread adoption of machine learning (ML) models across various domains, ensuring user privacy and data security has become a paramount concern. In compliance with data privacy regulations, such as GDPR, a secure machine learning framework should not only grant users the right to request the removal of their contributed data used for model training but also facilitate the elimination of sensitive data fingerprints within machine learning models to mitigate potential attacks, a process referred to as machine unlearning. In this study, we present a novel unlearning mechanism designed to effectively remove the impact of specific data samples from a neural network while considering the performance of the unlearned model on the primary task. In achieving this goal, we crafted a novel loss function tailored to eliminate privacy-sensitive information from weights and activation values of the target model by combining target classification loss and membership inference loss. Our adaptable framework can easily incorporate various privacy leakage approximation mechanisms to guide the unlearning process. We provide empirical evidence of the effectiveness of our unlearning approach with a theoretical upper-bound analysis through a membership inference mechanism as a proof of concept. Our results showcase the superior performance of our approach in terms of unlearning efficacy and latency as well as the fidelity of the primary task, across four datasets and four deep learning architectures.

Submitted 5 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: 17 pages, 14 figures, 6 tables

arXiv:2407.00774 [pdf, other]

Advantages of quantum support vector machine in cross-domain classification of quantum states

Authors: Diksha Sharma, Vivek Balasaheb Sabale, Parvinder Singh, Atul Kumar

Abstract: In this study, we use cross-domain classification using quantum machine learning for quantum advantages to address the entanglement versus separability paradigm. We further demonstrate the efficient classification of Bell diagonal states into zero and non-zero discord classes. The inherited structure of quantum states and its relation with a particular class of quantum states are exploited to intuitively approach the classification of different domain testing states, referred to here as cross-domain classification. In addition, we extend our analysis to evaluate the robustness of our model for the analyzed problem using random unitary transformations. Using numerical analysis, our results clearly demonstrate the potential of QSVM for classifying quantum states across the multidimensional Hilbert space.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00597 [pdf]

Myriad of Terahertz Magnons with All-Optical Magnetoelectric Functionality for Efficient Spin-Wave Computing in Honeycomb Magnet Co4Ta2O9

Authors: Brijesh Singh Mehra, Sanjeev Kumar, Gaurav Dubey, Ayyappan Shyam, Ankit Kumar, K Anirudh, Kiran Singh, Dhanvir Singh Rana

Abstract: Terahertz (THz) magnonics represent the notion of mathematical algebraic operations of magnons such as addition and subtraction in THz regime which is an emergent dissipationless ultrafast alternative to existing data processing technologies. Spin waves on antiferromagnets with a twist in spin order host such magnons in THz regime, which possess advantage of higher processing speeds, additional polarization degree of freedom and longer propagation lengths compared to that of gigahertz magnons in ferromagnets. While interaction among THz magnons is the crux of algebra operations, it requires magnetic orders with closely spaced magnon modes for easier experimental realization of their interactions. Herein, rich wealth of magnons spanning a narrow energy range of 0.4 to 10 meV is unraveled in Co4Ta2O9 using magneto-THz spectroscopy. Rare multitude of ten excitation modes, either of magnons or hybrid magnon-phonon modes is presented. Among other attributes, spin lattice interaction suggests a correlation among spin and local lattice distortion, magnetostriction, and magnetic exchange interaction signifying a THz magnetoelectric effect. This unification of structural, magnetic and dielectric facets, and their magnetic field control in a narrow spectrum unwinds the mechanism underneath the system's complexity while the manifestation of multitude of spin excitation modes is a potential source to design multiple channels in spin-wave computing based devices.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00537 [pdf, other]

Accelerating Longitudinal MRI using Prior Informed Latent Diffusion

Authors: Yonatan Urman, Zachary Shah, Ashwin Kumar, Bruno P. Soares, Kawin Setsompop

Abstract: MRI is a widely used ionization-free soft-tissue imaging modality, often employed repeatedly over a patient's lifetime. However, prolonged scanning durations, among other issues, can limit availability and accessibility. In this work, we aim to substantially reduce scan times by leveraging prior scans of the same patient. These prior scans typically contain considerable shared information with the current scan, thereby enabling higher acceleration rates when appropriately utilized. We propose a prior informed reconstruction method with a trained diffusion model in conjunction with data-consistency steps. Our method can be trained with unlabeled image data, eliminating the need for a dataset of either k-space measurements or paired longitudinal scans as is required of other learning-based methods. We demonstrate superiority of our method over previously suggested approaches in effectively utilizing prior information without over-biasing prior consistency, which we validate on both an open-source dataset of healthy patients as well as several longitudinal cases of clinical interest.

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2407.00071 [pdf, other]

Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization

Authors: Mert Esencan, Tarun Advaith Kumar, Ata Akbari Asanjan, P. Aaron Lott, Masoud Mohseni, Can Unlu, Davide Venturelli, Alan Ho

Abstract: Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks has been subpar and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of the prompting technique to the LLM can alter its performance on a multitude of tasks, including reasoning, the best performing techniques require human-made prompts with the knowledge of the tasks at hand. We introduce a framework for what we call Combinatorial Reasoning (CR), a fully-automated prompting method, where reasons are sampled from an LLM pipeline and mapped into a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework investigates whether QUBO solutions can be profitably used to select a useful subset of the reasons to construct a Chain-of-Thought style prompt. We explore the acceleration of CR with specialized solvers. We also investigate the performance of simpler zero-shot strategies such as linear majority rule or random selection of reasons. Our preliminary study indicates that coupling a combinatorial solver to generative AI pipelines is an interesting avenue for AI reasoning and elucidates design principles for future CR methods.

Submitted 19 June, 2024; originally announced July 2024.

Comments: 13 pages, 3 figures

arXiv:2406.19421 [pdf, other]

The Belle II Detector Upgrades Framework Conceptual Design Report

Authors: H. Aihara, A. Aloisio, D. P. Auguste, M. Aversano, M. Babeluk, S. Bahinipati, Sw. Banerjee, M. Barbero, J. Baudot, A. Beaubien, F. Becherer, T. Bergauer, F. U. Bernlochner, V. Bertacchi, G. Bertolone, C. Bespin, M. Bessner, S. Bettarini, A. J. Bevan, B. Bhuyan, M. Bona, J. F. Bonis, J. Borah, F. Bosi, R. Boudagga, et al. (186 additional authors not shown)

Abstract: We describe the planned near-term and potential longer-term upgrades of the Belle II detector at the SuperKEKB electron-positron collider operating at the KEK laboratory in Tsukuba, Japan. These upgrades will allow increasingly sensitive searches for possible new physics beyond the Standard Model in flavor, tau, electroweak and dark sector physics that are both complementary to and competitive with the LHC and other experiments.

Submitted 4 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

Comments: Editor: F. Forti 170 pages

Report number: KEK-REPORT-2024-1, BELLE2-REPORT-2024-042

arXiv:2406.18290 [pdf, ps, other]

The first Steklov eigenvalue on manifolds with nonnegative Ricci curvature and convex boundary

Authors: Jonah A. J. Duncan, Aditya Kumar

Abstract: We establish a new lower bound for the first non-zero Steklov eigenvalue of a compact Riemannian manifold with non-negative Ricci curvature and (strictly) convex boundary. Related results are also obtained under weaker geometric hypotheses.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.17304 [pdf, other]

Leveraging LLMs for Dialogue Quality Measurement

Authors: Jinghan Jia, Abi Komma, Timothy Leffel, Xujun Peng, Ajay Nagesh, Tamer Soliman, Aram Galstyan, Anoop Kumar

Abstract: In task-oriented conversational AI evaluation, unsupervised methods poorly correlate with human judgments, and supervised approaches lack generalization. Recent advances in large language models (LLMs) show robust zero-shot and few-shot capabilities across NLP tasks. This paper explores using LLMs for automated dialogue quality evaluation, experimenting with various configurations on public and proprietary datasets. Manipulating factors such as model size, in-context examples, and selection techniques, we examine "chain-of-thought" (CoT) reasoning and label extraction procedures. Our results show that (1) larger models yield more accurate dialogue labels; (2) algorithmic selection of in-context examples outperforms random selection; (3) CoT reasoning where an LLM is asked to provide justifications before outputting final labels improves performance; and (4) fine-tuned LLMs outperform out-of-the-box ones. Our results indicate that LLMs that are suitably fine-tuned and have sufficient reasoning capabilities can be leveraged for automated dialogue evaluation.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.16075 [pdf, other]

Odd Dipole Screening in Radial Inflation

Authors: Yang Fu, H. George E. Hentschel, Pawandeep Kaur, Avanish Kumar, Itamar Procaccia

Abstract: The inflation of an inner radial (or spherical) cavity in an amorphous solid confined in a disk (or a sphere) served as a fruitful case model for studying the effects of plastic deformations on the mechanical response. It was shown that when the field associated with Eshelby quadrupolar charges is non-uniform, the displacement field is riddled with dipole charges that screen elasticity, reminiscent of Debye monopoles screening in electrostatics. In this paper we look deeper into the screening phenomenon, taking into account the consequences of irreversibility that are associated with the breaking of chiral symmetry. We consider the equations for the displacement field with the presence of "Odd Dipole Screening", solve them analytically and compare with numerical simulations. Suggestions on how to test the theory in experiments are provided.

Submitted 27 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.16008 [pdf, other]

Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization

Authors: Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister

Abstract: Large language models (LLMs), even when specifically trained to process long input contexts, struggle to capture relevant information located in the middle of their input. This phenomenon has been known as the lost-in-the-middle problem. In this work, we make three contributions. First, we set out to understand the factors that cause this phenomenon. In doing so, we establish a connection between lost-in-the-middle and LLMs' intrinsic attention bias: LLMs exhibit a U-shaped attention bias where the tokens at the beginning and at the end of their input receive higher attention, regardless of their relevance. Second, we mitigate this positional bias through a calibration mechanism, found-in-the-middle, that allows the model to attend to contexts faithfully according to their relevance, even when they are in the middle. Third, we show found-in-the-middle not only achieves better performance in locating relevant information within a long context, but also eventually leads to improved retrieval-augmented generation (RAG) performance across various tasks, outperforming existing methods by up to 15 percentage points. These findings open up future directions in understanding LLM attention bias and its potential consequences.

Submitted 3 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

Comments: ACL Findings 2024

arXiv:2406.15649 [pdf, other]

Efficient Human Pose Estimation: Leveraging Advanced Techniques with MediaPipe

Authors: Sandeep Singh Sengar, Abhishek Kumar, Owen Singh

Abstract: This study presents significant enhancements in human pose estimation using the MediaPipe framework. The research focuses on improving accuracy, computational efficiency, and real-time processing capabilities by comprehensively optimising the underlying algorithms. Novel modifications are introduced that substantially enhance pose estimation accuracy across challenging scenarios, such as dynamic movements and partial occlusions. The improved framework is benchmarked against traditional models, demonstrating considerable precision and computational speed gains. The advancements have wide-ranging applications in augmented reality, sports analytics, and healthcare, enabling more immersive experiences, refined performance analysis, and advanced patient monitoring. The study also explores the integration of these enhancements within mobile and embedded systems, addressing the need for computational efficiency and broader accessibility. The implications of this research set a new benchmark for real-time human pose estimation technologies and pave the way for future innovations in the field. The implementation code for the paper is available at https://github.com/avhixd/Human_pose_estimation.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.15646 [pdf, other]

VigilEye -- Artificial Intelligence-based Real-time Driver Drowsiness Detection

Authors: Sandeep Singh Sengar, Aswin Kumar, Owen Singh

Abstract: This study presents a novel driver drowsiness detection system that combines deep learning techniques with the OpenCV framework. The system utilises facial landmarks extracted from the driver's face as input to Convolutional Neural Networks trained to recognise drowsiness patterns. The integration of OpenCV enables real-time video processing, making the system suitable for practical implementation. Extensive experiments on a diverse dataset demonstrate high accuracy, sensitivity, and specificity in detecting drowsiness. The proposed system has the potential to enhance road safety by providing timely alerts to prevent accidents caused by driver fatigue. This research contributes to advancing real-time driver monitoring systems and has implications for automotive safety and intelligent transportation systems. The successful application of deep learning techniques in this context opens up new avenues for future research in driver monitoring and vehicle safety. The implementation code for the paper is available at https://github.com/LUFFY7001/Driver-s-Drowsiness-Detection.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.15565 [pdf, other]

Unseen Object Reasoning with Shared Appearance Cues

Authors: Paridhi Singh, Arun Kumar

Abstract: This paper introduces an innovative approach to open world recognition (OWR), where we leverage knowledge acquired from known objects to address the recognition of previously unseen objects. The traditional method of object modeling relies on supervised learning with strict closed-set assumptions, presupposing that objects encountered during inference are already known at the training phase. However, this assumption proves inadequate for real-world scenarios due to the impracticality of accounting for the immense diversity of objects. Our hypothesis posits that object appearances can be represented as collections of "shareable" mid-level features, arranged in constellations to form object instances. By adopting this framework, we can efficiently dissect and represent both known and unknown objects in terms of their appearance cues. Our paper introduces a straightforward yet elegant method for modeling novel or unseen objects, utilizing established appearance cues and accounting for inherent uncertainties. This representation not only enables the detection of out-of-distribution objects or novel categories among unseen objects but also facilitates a deeper level of reasoning, empowering the identification of the superclass to which an unknown instance belongs. This novel approach holds promise for advancing open world recognition in diverse applications.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.14532 [pdf, other]

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

Authors: Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

Abstract: Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts. In this paper, we investigate this question for math reasoning via an empirical study, followed by building a conceptual understanding of our observations. First, we find that while the typical approach of finetuning a model on synthetic correct or positive problem-solution pairs generated by capable models offers modest performance gains, sampling more correct solutions from the finetuned learner itself followed by subsequent fine-tuning on this self-generated data $\textbf{doubles}$ the efficiency of the same synthetic problems. At the same time, training on model-generated positives can amplify various spurious correlations, resulting in flat or even inverse scaling trends as the amount of data increases. Surprisingly, we find that several of these issues can be addressed if we also utilize negative responses, i.e., model-generated responses that are deemed incorrect by a final answer verifier. Crucially, these negatives must be constructed such that the training can appropriately recover the utility or advantage of each intermediate step in the negative response. With this per-step scheme, we are able to attain consistent gains over only positive data, attaining performance similar to amplifying the amount of synthetic data by $\mathbf{8 \times}$.
We show that training on per-step negatives can help to unlearn spurious correlations in the positive data, and is equivalent to advantage-weighted reinforcement learning (RL), implying that it inherits robustness benefits of RL over imitating positive data alone.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.13658 [pdf, ps, other]

Generalized Hamming weights and symbolic powers of Stanley-Reisner ideals of matroids

Authors: Michael DiPasquale, Louiza Fouli, Arvind Kumar, Ştefan O. Tohǎneanu

Abstract: It is well-known that the first generalized Hamming weight of a code, more commonly called \textit{the minimum distance} of the code, corresponds to the initial degree of the Stanley-Reisner ideal of the matroid of the dual code. Our starting point in this paper is a generalization of this fact -- namely, the $r$-th generalized Hamming weight of a code is the smallest degree of a squarefree monomial in the $r$-th symbolic power of the Stanley-Reisner ideal of the matroid of the dual code (in the appropriate range for $r$). It turns out that the squarefree monomials in successive symbolic powers of the Stanley-Reisner ideal of a matroid suffice to describe all symbolic powers of the Stanley-Reisner ideal. This implies that generalized Hamming weights -- which can be defined in a natural way for matroids -- are fundamentally tied to the structure of symbolic powers of Stanley-Reisner ideals of matroids. We illustrate this by studying initial degree statistics of symbolic powers of the Stanley-Reisner ideal of a matroid in terms of generalized Hamming weights and working out many examples that are meaningful from a coding-theoretic perspective. Our results also apply to projective varieties known as matroid configurations introduced by Geramita-Harbourne-Migliore-Nagel.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 37 pages. Comments welcome!

MSC Class: 94B05; 05B35; 05E40; 13F55; 51E10

arXiv:2406.13236 [pdf, other]

Data Contamination Can Cross Language Barriers

Authors: Feng Yao, Yufan Zhuang, Zihao Sun, Sunan Xu, Animesh Kumar, Jingbo Shang

Abstract: The opacity in developing large language models (LLMs) is raising growing concerns about the potential contamination of public benchmarks in the pre-training data. Existing contamination detection methods are typically based on the text overlap between training and evaluation data, which can be too superficial to reflect deeper forms of contamination. In this paper, we first present a cross-lingual form of contamination that inflates LLMs' performance while evading current detection methods, deliberately injected by overfitting LLMs on the translated versions of benchmark test sets. Then, we propose generalization-based approaches to unmask such deeply concealed contamination. Specifically, we examine the LLM's performance change after modifying the original benchmark by replacing the false answer choices with correct ones from other questions. Contaminated models can hardly generalize to such easier situations, where the false choices can be \emph{not even wrong}, as all choices are correct in their memorization. Experimental results demonstrate that cross-lingual contamination can easily fool existing detection methods, but not ours. In addition, we discuss the potential utilization of cross-lingual contamination in interpreting LLMs' working mechanisms and in post-training LLMs for enhanced multilingual capabilities. The code and dataset we use can be obtained from \url{https://github.com/ShangDataLab/Deep-Contam}.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 12 pages, 5 figures

arXiv:2406.12804 [pdf, other]

Varying activity and the bursts properties of FRB 20240114A probed with GMRT down to 300 MHz

Authors: Ajay Kumar, Yogesh Maan, Yash Bhusare

Abstract: Repeating fast radio bursts can exhibit a wide range of burst repetition rates, from none to hundreds of bursts per hour. Here, we report the detection and characteristics of 57 bursts from the recently discovered FRB 20240114A, observed with GMRT in the frequency ranges 300-500 MHz and 550-750 MHz. The majority of the bursts show a narrow emission bandwidth, with $Δν/ν \sim 10$%. All of the bursts we detect are faint ($<$10 Jy ms), and thus probe the lower end of the energy distribution. We determine the rate function for FRB 20240114A at 400 MHz, and downward drift rates at 400 and 650 MHz, and discuss our measurements in the context of the repeating FRB population. We observe sudden variations in the burst activity of FRB 20240114A over time. From our data, as well as other publicly available information on observations of FRB 20240114A so far, there is an indication that FRB 20240114A potentially exhibits chromaticity in its burst activity. While the burst properties of FRB 20240114A are similar to those of other repeating FRBs, the frequency-dependent activity, if established, could provide crucial clues to the origin of repeating FRBs. We also place the most stringent 5$σ$ upper limits of 600 $μ$Jy and 89 $μ$Jy on any persistent radio source (PRS) associated with FRB 20240114A at 400 MHz and 650 MHz, respectively, and compare these with the luminosity of the known PRSs associated with FRB121102A and FRB190520B.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 13 Pages, 5 Figures, Submitted to ApJ

arXiv:2406.12644 [pdf, other]

Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models

Authors: Devichand Budagam, Sankalp KJ, Ashutosh Kumar, Vinija Jain, Aman Chadha

Abstract: Assessing the effectiveness of large language models (LLMs) in addressing diverse tasks is essential for comprehending their strengths and weaknesses. Conventional evaluation techniques typically apply a single prompting strategy uniformly across datasets, not considering the varying degrees of task complexity. We introduce the Hierarchical Prompting Taxonomy (HPT), a taxonomy that employs a Hierarchical Prompt Framework (HPF) composed of five unique prompting strategies, arranged from the simplest to the most complex, to assess LLMs more precisely and to offer a clearer perspective. This taxonomy assigns a score, called the Hierarchical Prompting Score (HP-Score), to datasets as well as LLMs based on the rules of the taxonomy, providing a nuanced understanding of their ability to solve diverse tasks and offering a universal measure of task complexity. Additionally, we introduce the Adaptive Hierarchical Prompt framework, which automates the selection of appropriate prompting strategies for each task. This study compares manual and adaptive hierarchical prompt frameworks using four instruction-tuned LLMs, namely Llama 3 8B, Phi 3 3.8B, Mistral 7B, and Gemma 7B, across four datasets: BoolQ, CommonSenseQA (CSQA), IWSLT-2017 en-fr (IWSLT), and SamSum. Experiments demonstrate the effectiveness of HPT, providing a reliable way to compare different tasks and LLM capabilities. This paper leads to the development of a universal evaluation metric that can be used to evaluate both the complexity of the datasets and the capabilities of LLMs. The implementation of both manual HPF and adaptive HPF is publicly available.

Submitted 27 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.11925 [pdf, other]

DocCGen: Document-based Controlled Code Generation

Authors: Sameer Pimparkhede, Mehant Kammakomati, Srikanth Tamilselvam, Prince Kumar, Ashok Pon Kumar, Pushpak Bhattacharyya

Abstract: Recent developments show that Large Language Models (LLMs) produce state-of-the-art performance on natural language (NL) to code generation for resource-rich general-purpose languages like C++, Java, and Python. However, their practical usage for structured domain-specific languages (DSLs) such as YAML and JSON is limited due to domain-specific schema, grammar, and customizations generally unseen by LLMs during pre-training. Efforts have been made to mitigate this challenge via in-context learning through relevant examples or by fine-tuning. However, these approaches suffer from problems such as limited DSL samples and prompt sensitivity, whereas enterprises maintain good documentation of the DSLs. Therefore, we propose DocCGen, a framework that can leverage such rich knowledge by breaking the NL-to-Code generation task for structured code languages into a two-step process. First, it detects the correct libraries using the library documentation that best matches the NL query. Then, it utilizes schema rules extracted from the documentation of these libraries to constrain the decoding. We evaluate our framework for two complex structured languages, Ansible YAML and Bash command, consisting of two settings: Out-of-domain (OOD) and In-domain (ID). Our extensive experiments show that DocCGen consistently improves different-sized language models across all six evaluation metrics, reducing syntactic and semantic errors in structured code. We plan to open-source the datasets and code to motivate research in constrained code generation.

Submitted 3 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11896 [pdf, other]

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Authors: Hao Bai, Yifei Zhou, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar

Abstract: Training corpuses for vision language models (VLMs) typically lack sufficient amounts of decision-centric data. This renders off-the-shelf VLMs sub-optimal for decision-making tasks such as in-the-wild device control through graphical user interfaces (GUIs). While training with static demonstrations has shown some promise, we show that such methods fall short for controlling real GUIs due to their failure to deal with real-world stochasticity and non-stationarity not captured in static observational data. This paper introduces a novel autonomous RL approach, called DigiRL, for training in-the-wild device control agents through fine-tuning a pre-trained VLM in two stages: offline RL to initialize the model, followed by offline-to-online RL. To do this, we build a scalable and parallelizable Android learning environment equipped with a VLM-based evaluator and develop a simple yet effective RL approach for learning in this domain. Our approach runs advantage-weighted RL with advantage estimators enhanced to account for stochasticity along with an automatic curriculum for deriving maximal learning signal. We demonstrate the effectiveness of DigiRL using the Android-in-the-Wild (AitW) dataset, where our 1.3B VLM trained with RL achieves a 49.5% absolute improvement -- from 17.7 to 67.2% success rate -- over supervised fine-tuning with static human demonstration data. These results significantly surpass not only the prior best agents, including AppAgent with GPT-4V (8.3% success rate) and the 17B CogAgent trained with AitW data (38.5%), but also the prior best autonomous RL approach based on filtered behavior cloning (57.8%), thereby establishing a new state-of-the-art for digital agents for in-the-wild device control.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 11 pages of main text, 28 pages in total

arXiv:2406.11619 [pdf, other]

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

Authors: Vahid Ahmadi Kalkhorani, Cheng Yu, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang

Abstract: Adding visual cues to audio-based speech separation can improve separation performance. This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement, target speaker extraction, and multi-talker speaker separation. AV-CrossNet is extended from the CrossNet architecture, which is a recently proposed network that performs complex spectral mapping for speech separation by leveraging global attention and positional encoding. To effectively utilize visual cues, the proposed system incorporates pre-extracted visual embeddings and employs a visual encoder comprising temporal convolutional layers. Audio and visual features are fused in an early fusion layer before feeding to AV-CrossNet blocks. We evaluate AV-CrossNet on multiple datasets, including LRS, VoxCeleb, and COG-MHEAR challenge. Evaluation results demonstrate that AV-CrossNet advances the state-of-the-art performance in all audiovisual tasks, even on untrained and mismatched datasets.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 10 pages, 4 Figures, and 4 Tables

arXiv:2406.10935 [pdf, other]

Pick-or-Mix: Dynamic Channel Sampling for ConvNets

Authors: Ashish Kumar, Daneul Kim, Jaesik Park, Laxmidhar Behera

Abstract: Channel pruning approaches for convolutional neural networks (ConvNets) deactivate the channels, statically or dynamically, and require special implementation. In addition, channel squeezing in representative ConvNets is carried out via 1x1 convolutions, which account for a large portion of computations and network parameters. Given these challenges, we propose an effective multi-purpose module for dynamic channel sampling, namely Pick-or-Mix (PiX), which does not require special implementation. PiX divides a set of channels into subsets and then picks from them, where the picking decision is made dynamically for each pixel based on the input activations. We plug PiX into prominent ConvNet architectures and verify its multi-purpose utilities. After replacing 1x1 channel squeezing layers in ResNet with PiX, the network becomes 25% faster without losing accuracy. We show that PiX allows ConvNets to learn better data representation than widely adopted approaches to enhance networks' representation power (e.g., SE, CBAM, AFF, SKNet, and DWP). We also show that PiX achieves state-of-the-art performance on network downscaling and dynamic channel pruning applications.

Submitted 16 June, 2024; originally announced June 2024.

Comments: Published in Computer Vision and Pattern Recognition (CVPR 2024)

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

arXiv:2406.10764 [pdf, other]

GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges

Authors: Darshan Deshpande, Shambhavi Sinha, Anirudh Ravi Kumar, Debaditya Pal, Jonathan May

Abstract: Language Models have previously shown strong negotiation capabilities in closed domains where the negotiation strategy prediction scope is constrained to a specific setup. In this paper, we first show that these models are not generalizable beyond their original training domain despite their wide-scale pretraining. Following this, we propose an automated framework called GNOME, which processes existing human-annotated, closed-domain datasets using Large Language Models and produces synthetic open-domain dialogues for negotiation. GNOME improves the generalizability of negotiation systems while reducing the expensive and subjective task of manual data curation. Through our experimental setup, we create a benchmark comparing encoder and decoder models trained on existing datasets against datasets created through GNOME. Our results show that models trained on our dataset not only perform better than previous state of the art models on domain specific strategy prediction, but also generalize better to previously unseen domains.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.10024 [pdf, ps, other]

On Fridman invariant, injectivity radius function and squeezing function

Authors: Akhil Kumar, Sanjay Kumar Pant

Abstract: We give a class of domains for which Fridman invariant and injectivity radius function coincide with respect to Carathéodory metric. We give explicit expressions of the squeezing functions for these domains and investigate some of their properties.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 10 pages, work in progress, comments are welcome

MSC Class: 32F45; 32H02

arXiv:2406.09749 [pdf, other]

Substrate$-$bias driven Sputter deposited $β-$phase dominated Tungsten film for Spintronic applications

Authors: Abhay Singh Rajawat, Naim Ahmad, Risvana Nasril, Tasneem Sheikh, Mohammad Muhiuddin, A kumar, Mohammad R Rahman, Waseem Akhtar

Abstract: $β$-Tungsten ($β$-W), an A15 cubic phase of tungsten, exhibits a giant spin Hall angle compared to its bcc phase, $α$-tungsten ($α$-W), making high-quality $β$-W films desirable for spin-based applications. We report on the substrate-bias-driven, on-demand growth of $β$-W films on SiO$_2$-coated silicon (SiO$_2$/Si) using DC sputtering. GIXRD plots and SEM images are used to show a systematic change in the structure and grain size of the deposited films with the application of substrate bias. It is observed that zero-bias films are amorphous in nature, and that the phase changes from $α$ to $β$ or mixed ($α$ + $β$) depending upon the sign and magnitude of the substrate bias. We performed a one-dimensional power spectral density analysis of the AFM images, which revealed that the pure $β$-W film grown at a positive bias of +50 V has the minimum roughness compared to films grown at other substrate biases. We further confirm the metallic surface homogeneity using room-temperature STM. Our results show that the substrate bias, which controls the energy of the deposited atoms, is a crucial parameter for on-demand growth of $β$-W, an important material for spintronic applications.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 5 pages, 4 figures

arXiv:2406.09329 [pdf, other]

Is Value Learning Really the Main Bottleneck in Offline RL?

Authors: Seohong Park, Kevin Frans, Sergey Levine, Aviral Kumar

Abstract: While imitation learning requires access to high-quality data, offline reinforcement learning (RL) should, in principle, perform similarly or better with substantially lower data quality by using a value function. However, current results indicate that offline RL often performs worse than imitation learning, and it is often unclear what holds back the performance of offline RL. Motivated by this observation, we aim to understand the bottlenecks in current offline RL algorithms. While poor performance of offline RL is typically attributed to an imperfect value function, we ask: is the main bottleneck of offline RL indeed in learning the value function, or something else? To answer this question, we perform a systematic empirical study of (1) value learning, (2) policy extraction, and (3) policy generalization in offline RL problems, analyzing how these components affect performance. We make two surprising observations. First, we find that the choice of a policy extraction algorithm significantly affects the performance and scalability of offline RL, often more so than the value learning objective. For instance, we show that common value-weighted behavioral cloning objectives (e.g., AWR) do not fully leverage the learned value function, and switching to behavior-constrained policy gradient objectives (e.g., DDPG+BC) often leads to substantial improvements in performance and scalability. Second, we find that a big barrier to improving offline RL performance is often imperfect policy generalization on test-time states out of the support of the training data, rather than policy learning on in-distribution states. We then show that the use of suboptimal but high-coverage data or test-time policy training techniques can address this generalization issue in practice. Specifically, we propose two simple test-time policy improvement methods and show that these methods lead to better performance.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09230 [pdf, other]

Correlations and Signaling in the Schrödinger-Newton Model

Authors: Jacek Aleksander Gruca, Ankit Kumar, Ray Ganardi, Paramasivan Arumugam, Karolina Kropielnicka, Tomasz Paterek

Abstract: The Schrödinger-Newton model is a semi-classical theory in which, in addition to mutual attraction, massive quantum particles interact with their own gravitational fields. While there are many studies on the phenomenology of single particles, correlation dynamics in multipartite systems is largely unexplored. Here, we show that the Schrödinger-Newton interactions preserve the product form of initial states, yet on average it agrees with classical mechanics of continuous mass distributions. This leads to a simple test of the model, based on verifying bipartite gravitational evolution towards non-product states. We show using standard quantum mechanics that, with currently accessible single-particle parameters, two masses released from harmonic traps get correlated well before any observable entanglement is accumulated. Therefore, the Schrödinger-Newton model can be tested with setups aimed at observation of gravitational entanglement with significantly relaxed requirements on coherence time. We also present a mixed-state extension of the model that avoids superluminal signaling.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08645 [pdf, other]

ODIN: Identifying Protoclusters and Cosmic Filaments Traced by Ly$α$-emitting Galaxies

Authors: Vandana Ramakrishnan, Kyoung-Soo Lee, Maria Celeste Artale, Eric Gawiser, Yujin Yang, Changbom Park, Robin Ciardullo, Lucia Guaita, Sang Hyeok Im, Seongjae Kim, Ankit Kumar, Jaehyun Lee, Seong-Kook Lee, Byeongha Moon, Nelson Padilla, Alexandra Pope, Roxana Popescu, Hyunmi Song, Paulina Troncoso, Francisco Valdes, Ann Zabludoff

Abstract: To understand the formation and evolution of massive cosmic structures, studying them at high redshift, in the epoch when they formed the majority of their mass, is essential. The One-hundred-deg$^2$ DECam Imaging in Narrowbands (ODIN) survey is undertaking the widest-area narrowband program to date, to use Ly$α$-emitting galaxies (LAEs) to trace the large-scale structure (LSS) of the Universe at three cosmic epochs. In this work, we present results at $z$ = 3.1 based on early ODIN data in the COSMOS field. We identify and characterize protoclusters and cosmic filaments using multiple methods and discuss their strengths and weaknesses. We then compare our observations against the IllustrisTNG suite of cosmological hydrodynamical simulations. The two are in excellent agreement, with a similar number and angular size of structures identified above a specified density threshold. We are able to recover the simulated protoclusters with $\log$(M$_{z=0}$/$M_\odot$) $\gtrsim$ 14.4 in $\sim$60% of the cases. With these objects we show that the descendant masses of the protoclusters in our sample can be estimated purely based on our 2D measurements, finding a median $z$ = 0 mass of $\sim10^{14.5}$M$_\odot$. The lack of information on the radial extent of each protocluster introduces a $\sim$0.4 dex uncertainty in its descendant mass. Finally, we show that the recovery of the cosmic web in the vicinity of protoclusters is both efficient and accurate. The similarity of our observations and the simulations implies that our structure selection is likewise robust and efficient, demonstrating that LAEs are reliable tracers of the LSS.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 26 pages, 18 figures; submitted to ApJ

arXiv:2406.08067 [pdf, other]

Synchronous and Asynchronous Updates of Active Ising Spins in One Dimension

Authors: Anish Kumar, Sudipta Pattanayak, R. K. Singh, Shradha Mishra

Abstract: How do update rules affect the dynamical and steady state properties of a flock? In this study, we have explored the active Ising spins (s = ±1) in one dimension, where each spin updates its orientation according to the Metropolis algorithm (based on the neighbors) via two different update rules: (i) Parallel and (ii) Random-sequential. We explore the effect of Parallel and Random-sequential updates on the dynamical properties of flocks in one dimension. Due to the inherent asynchronous nature of the Random-sequential update, the directional switching of the flock is increased compared to the Parallel one. The nature of the phase transition is affected by the difference in the updating mechanism: discontinuous for Parallel and continuous for Random-sequential updates.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 7 pages, 6 figures. arXiv admin note: text overlap with arXiv:1704.04041

arXiv:2406.07601 [pdf, other]

IceCube Search for Neutrino Emission from X-ray Bright Seyfert Galaxies

Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi , et al. (400 additional authors not shown)

Abstract: The recent IceCube detection of TeV neutrino emission from the nearby active galaxy NGC 1068 suggests that active galactic nuclei (AGN) could make a sizable contribution to the diffuse flux of astrophysical neutrinos. The absence of TeV $γ$-rays from NGC 1068 indicates neutrino production in the vicinity of the supermassive black hole, where the high radiation density leads to $γ$-ray attenuation. Therefore, any potential neutrino emission from similar sources is not expected to correlate with high-energy $γ$-rays. Disk-corona models predict neutrino emission from Seyfert galaxies to correlate with keV X-rays, as they are tracers of coronal activity. Using through-going track events from the Northern Sky recorded by IceCube between 2011 and 2021, we report results from a search for individual and aggregated neutrino signals from 27 additional Seyfert galaxies that are contained in the BAT AGN Spectroscopic Survey (BASS). Besides the generic single power-law, we evaluate the spectra predicted by the disk-corona model. Assuming all sources to be intrinsically similar to NGC 1068, our findings constrain the collective neutrino emission from X-ray bright Seyfert galaxies in the Northern Hemisphere, but, at the same time, show excesses of neutrinos that could be associated with the objects NGC 4151 and CGCG 420-015. These excesses result in a 2.7$σ$ significance with respect to background expectations.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 17 pages, 9 figures

arXiv:2406.07470 [pdf, other]

Exploring non-radial oscillation modes in dark matter admixed neutron stars

Authors: Pratik Thakur, Anil Kumar, Vivek Baruah Thapa, Vishal Parmar, Monika Sinha

Abstract: Because of their extreme densities and, consequently, gravitational potential, compact objects such as neutron stars can prove to be excellent captors of dark matter particles. Considering purely gravitational interactions between dark and hadronic matter, we construct dark matter admixed stars composed of two-fluid matter subject to current astrophysical constraints of maximum mass and tidal deformability. We choose a wide range of parameters to construct the dark matter equation of state, and the DDME2 parameterization for the hadronic equation of state. We then examine the effect of dark matter on the stellar structure, tidal deformability and non-radial modes considering the relativistic Cowling approximation. We find the effect on $p$-modes is substantial, with frequencies decreasing up to the typical $f$-mode frequency range for most stars with a dark matter halo. The effects on the $f$-mode frequency are less extreme. Finally, we find the most probable and $1σ$ values of the dark matter parameters used in this study.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 23 pages and 5 figures

arXiv:2406.06684 [pdf, other]

Search for neutrino emission from hard X-ray AGN with IceCube

Authors: R. Abbasi, M. Ackermann, J. Adams, S. K. Agarwalla, J. A. Aguilar, M. Ahlers, J. M. Alameddine, N. M. Amin, K. Andeen, C. Argüelles, Y. Ashida, S. Athanasiadou, L. Ausborm, S. N. Axani, X. Bai, A. Balagopal V., M. Baricevic, S. W. Barwick, S. Bash, V. Basu, R. Bay, J. J. Beatty, J. Becker Tjus, J. Beise, C. Bellenghi , et al. (401 additional authors not shown)

Abstract: Active Galactic Nuclei (AGN) are promising candidate sources of high-energy astrophysical neutrinos since they provide environments rich in matter and photon targets where cosmic ray interactions may lead to the production of gamma rays and neutrinos. We searched for high-energy neutrino emission from AGN using the $\textit{Swift}$-BAT Spectroscopic Survey (BASS) catalog of hard X-ray sources and 12 years of IceCube muon track data. First, upon performing a stacked search, no significant emission was found. Second, we searched for neutrinos from a list of 43 candidate sources and found an excess from the direction of two sources, Seyfert galaxies NGC 1068 and NGC 4151. We observed NGC 1068 at flux $φ_{ν_μ+\barν_μ}$ = $4.02_{-1.52}^{+1.58} \times 10^{-11}$ TeV$^{-1}$ cm$^{-2}$ s$^{-1}$ normalized at 1 TeV, with power-law spectral index, $γ$ = 3.10$^{+0.26}_{-0.22}$, consistent with previous IceCube results. The observation of a neutrino excess from the direction of NGC 4151 is at a post-trial significance of 2.9$σ$. If interpreted as an astrophysical signal, the excess observed from NGC 4151 corresponds to a flux $φ_{ν_μ+\barν_μ}$ = $1.51_{-0.81}^{+0.99} \times 10^{-11}$ TeV$^{-1}$ cm$^{-2}$ s$^{-1}$ normalized at 1 TeV and $γ$ = 2.83$^{+0.35}_{-0.28}$.

Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Showing 1–50 of 2,910 results for author: Kumar, A