
A Comprehensive Convolutional Neural Network Architecture Design using Magnetic Skyrmion and Domain Wall

Authors: Saumya Gupta, Venkatesh Vadde, Bhaskaran Muralidharan, Abhishek Sharma

Abstract: Spintronic-based neuromorphic hardware enables high-density and rapid data processing at nanoscale lengths, leveraging topologically protected spin configurations and the low current densities needed to manipulate magnetic structures such as skyrmions and domain walls. The paper presents a compact, energy-efficient multi-bit skyrmionic synapse and domain wall-based ReLU with max-pooling functionalities for hardware neural network applications. We propose 4-bit, 5-bit, and 6-bit skyrmionic synapses featuring a circular bilayer vortex-based geometry. The 4-bit skyrmionic synapse consumes an ultra-low energy of 0.8724 fJ per weight update. The proposed skyrmionic synapse comprises an ultra-thin ferromagnetic layer with a strong Dzyaloshinskii-Moriya interaction and a polarizer layer with a vortex-like spin configuration. The interaction between perpendicular current flow and the labyrinth maze-like uniaxial anisotropy profiles induces skyrmionic gyration, resulting in long-term potentiation (LTP) and long-term depression (LTD) that modify the synaptic weights. We develop a phenomenology of the synaptic device, implementing 16-state (4-bit), 32-state (5-bit), and 64-state (6-bit) skyrmionic synapses, analyzing them quantitatively using micromagnetics simulations. Furthermore, we design a CMOS hybrid domain wall-based ReLU-max pooled circuit. The activation function works on the variation of the domain wall position, implying variation in the device resistance on encountering uniaxial anisotropy variation along the track.
To demonstrate the practical application of our 4-bit (16-state) skyrmionic synapse with the domain wall-based ReLU-max pooling circuit, we integrate it into an inference-based convolutional neural network (CNN) for pattern recognition, achieving an accuracy of 98.07%, comparable to software-based ideal training.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 15 pages, 10 figures

arXiv:2407.06478 [pdf]

Automated and Continuous Chronotyping from a Calendar using Machine Learning

Authors: Pratiik Kaushik, Koorosh Askari, Saksham Gupta, Rahul Mohan, Kris Skrinak, Royan Kamyar, Benjamin Smarr

Abstract: Objectives: Chronotypes -- comparisons of individuals' circadian phase relative to others -- can contextualize mental health risk assessments, and support detection of social jet lag, which can hamper mental health and cognition. Existing ways of determining chronotypes, such as Dim Light Melatonin Onset (DLMO) or the Morningness-Eveningness Questionnaire (MEQ), are limited by being discrete in time and time-intensive to update, rarely capturing real-world variability over time. Chronotyping users based on living schedules, as in daily planner apps, might augment existing methods by assessing chronotype and social jet lag continuously and at scale. Developing this functionality would require a novel tool to translate between digital schedules and chronotypes. Here we use a supervised binary classifier to assess the feasibility of this approach. Methods: In this study, 1,460 registered users from the Owaves app opted in to fill out the MEQ survey. Of those, 142 met the eligibility criteria for data analysis. We used multimodal app data to assess the classification of individuals identified as morning and evening types from MEQ data, basing the classifier on app time series data. This includes daily timing for 8 main lifestyle activity categories (exercise, sleep, social interactions, meal times, relaxation, work, play, and miscellaneous) as defined in the app. Results: The novel chronotyping tool was able to predict the morningness and eveningness of its users with an ROC AUC of 0.70.
Conclusion: Our findings support the feasibility of chronotype classification from multimodal, real-world app data. We highlight challenges to applying binary labels to complex, multimodal behaviors. Our findings suggest a potential for real-time monitoring to support future, prospective mental health research.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 15 pages, 4 figures, unsubmitted for peer review at date of posting

arXiv:2407.05887 [pdf, other]

Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs

Authors: Sanjeet Singh, Shreya Gupta, Niralee Gupta, Naimish Sharma, Lokesh Srivastava, Vibhu Agarwal, Ashutosh Modi

Abstract: The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the letter and spirit of the law. Computer-based systems for de-identification of personal information are vulnerable to data drift, often rendering them ineffective in cross-institution settings. Therefore, a rigorous assessment of existing de-identification against local health datasets is imperative to support the safe adoption of digital health initiatives in India. Using a small set of de-identified patient discharge summaries provided by an Indian healthcare institution, in this paper, we report the nominal performance of de-identification algorithms (based on language models) trained on publicly available non-Indian datasets, pointing towards a lack of cross-institutional generalization. Similarly, experimentation with off-the-shelf de-identification systems reveals potential risks associated with the approach. To overcome data scarcity, we explore generating synthetic clinical reports (using publicly available and Indian summaries) by performing in-context learning over Large Language Models (LLMs).
Our experiments demonstrate the use of generated reports as an effective strategy for creating high-performing de-identification systems with good generalization capabilities.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted at BioNLP Workshop at ACL 2024; 21 pages (9 pages main content)

arXiv:2407.05607 [pdf, other]

Weakly Supervised Test-Time Domain Adaptation for Object Detection

Authors: Anh-Dzung Doan, Bach Long Nguyen, Terry Lim, Madhuka Jayawardhana, Surabhi Gupta, Christophe Guettier, Ian Reid, Markus Wagner, Tat-Jun Chin

Abstract: Prior to deployment, an object detector is trained on a dataset compiled from a previous data collection campaign. However, the environment in which the object detector is deployed will invariably evolve, particularly in outdoor settings where changes in lighting, weather and seasons will significantly affect the appearance of the scene and target objects. It is almost impossible for all potential scenarios that the object detector may come across to be present in a finite training dataset. This necessitates continuous updates to the object detector to maintain satisfactory performance. Test-time domain adaptation techniques enable machine learning models to self-adapt based on the distributions of the testing data. However, existing methods mainly focus on fully automated adaptation, which makes sense for applications such as self-driving cars. Despite the prevalence of fully automated approaches, in some applications such as surveillance, there is usually a human operator overseeing the system's operation. We propose to involve the operator in test-time domain adaptation to raise the performance of object detection beyond what is achievable by fully automated adaptation. To reduce manual effort, the proposed method only requires the operator to provide weak labels, which are then used to guide the adaptation process. Furthermore, the proposed method can be performed in a streaming setting, where each online sample is observed only once.
We show that the proposed method outperforms existing works, demonstrating a great benefit of human-in-the-loop test-time domain adaptation. Our code is publicly available at https://github.com/dzungdoan6/WSTTA

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05467 [pdf, other]

The infrastructure powering IBM's Gen AI model development

Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba, et al. (121 additional authors not shown)

Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry.
Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.

Submitted 7 July, 2024; originally announced July 2024.

Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla

arXiv:2407.04302 [pdf, other]

Fair Federated Data Clustering through Personalization: Bridging the Gap between Diverse Data Distributions

Authors: Shivam Gupta, Tarushi, Tsering Wangzes, Shweta Jain

Abstract: The rapid growth of data from edge devices has catalyzed the performance of machine learning algorithms. However, the generated data resides on client devices, so traditional machine learning paradigms face two major challenges: first, centralizing the data for training, and second, the class labels are missing for most of the generated data, and clients have very poor incentives to manually label their data owing to high cost and lack of expertise. To overcome these issues, there have been initial attempts to handle unlabelled data in a privacy-preserving distributed manner using unsupervised federated data clustering. The goal is to partition the data available on clients into $k$ partitions (called clusters) without actual exchange of data. Most of the existing algorithms are highly dependent on data distribution patterns across clients or are computationally expensive. Furthermore, due to the skewed nature of data across clients in most practical scenarios, existing models might result in clients suffering a high clustering cost, making them reluctant to participate in the federated process. To this end, we are the first to introduce the idea of personalization in federated clustering. The goal is to strike a balance between achieving a lower clustering cost and, at the same time, achieving a uniform cost across clients. We propose p-FClus, which addresses these goals in a single round of communication between server and clients.
We validate the efficacy of p-FClus against a variety of federated datasets, showcasing its data-independent nature and applicability to any finite $\ell$-norm, while simultaneously achieving lower cost and variance.

Submitted 12 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.03172 [pdf, other]

IMC 2024 Methods & Solutions Review

Authors: Shyam Gupta, Dhanisha Sharma, Songling Huang

Abstract: For the past three years, Kaggle has been hosting the Image Matching Challenge, which focuses on solving a 3D image reconstruction problem using a collection of 2D images. Each year, this competition fosters the development of innovative and effective methodologies by its participants. In this paper, we introduce an advanced ensemble technique that we developed, achieving a score of 0.153449 on the private leaderboard and securing the 160th position out of over 1,000 participants. Additionally, we conduct a comprehensive review of existing methods and techniques employed by top-performing teams in the competition. Our solution, alongside the insights gathered from other leading approaches, contributes to the ongoing advancement in the field of 3D image reconstruction. This research provides valuable knowledge for future participants and researchers aiming to excel in similar image matching and reconstruction challenges.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 8 Pages, 9 figures

arXiv:2407.02841 [pdf, other]

Investigation of the Gamma-Ray Bursts prompt emission under the relativistically expanding fireball scenario

Authors: Soumya Gupta, Sunder Sahayanathan

Abstract: The spectral properties of a composite thermal emission arising from a relativistic expanding fireball can be remarkably different from the Planck function. We perform a detailed study of such a system to explore the features of the prompt emission spectra from the gamma-ray bursts (GRBs). Particularly, we address the effect of optical opacity and its dependence on the density profile between the expanding gas and the observer. This results in a nontrivial shape of the photospheric radius which, in combination with the constraints derived from the equal-arrival-time, can result in a mildly broader spectrum compared to the Planck function. Further, we show that the time-integrated spectrum from the expanding fireball deviates significantly from the instantaneous emission and is capable of explaining the observed broad spectral width of the GRBs. We also show that the demand for a spectral width of the order of unity, obtained through statistical analysis, is consistent with the scenario where the dynamics of the expanding fireball are governed predominantly by the energy content of the matter.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Accepted for publication in ApJL

arXiv:2407.02413 [pdf]

First-principles investigation of multifaceted properties; lattice dynamic, structural stability, mechanical, electronic, magnetic and thermodynamic response of Alkali metals-based semi Heusler alloys

Authors: Diwaker, Shyam L. Gupta, Anupam, Sumit Kumar, Aadil Fayaz, Ashwani Kumar

Abstract: Taking into consideration the wide compositional stretch of Heusler alloys, first-principles density functional theory based calculations are well suited to estimating the multifaceted properties of the alkali metal based LiVSb and NaVSb Heusler alloys. We calculated ground state stability by optimizing the energy in alpha, beta and gamma phase configurations. The materials are dynamically stable in the spin-polarised alpha phase. To explore the electronic structure, we employed the generalized gradient approximation potential. The electronic band structures indicate a half-metallic nature featuring a wide indirect band gap of 1.40 eV and 1.45 eV. We computed the second-order elastic parameters at different pressure levels. A Pugh ratio of less than 0.25 indicates that both alloys are brittle in nature and mechanically stable. The obtained magnetic moment is consistent with the Slater-Pauling rule. Using the Quasi-Harmonic Debye model and Boltzmann theory, we assessed the various thermodynamic parameters and transport coefficients of both alloys at different temperatures and pressures. The absence of negative frequencies in the lattice dynamics study confirmed their stability. Our findings highlight the potential of these alloys in modern semiconductor technology and thermoelectric applications.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.00855 [pdf, other]

K-mouflage at high k: extending the reach of $\texttt{Hi-COLA}$

Authors: Ashim Sen Gupta, Bartolomeo Fiorini, Tessa Baker

Abstract: The $\texttt{Hi-COLA}$ code is an efficient dark matter simulation suite that flexibly handles the Horndeski family of modified gravity models. In this work we extend the scope of $\texttt{Hi-COLA}$ to accommodate Horndeski theories with K-mouflage screening, allowing for the computation of matter power spectra in the non-linear regime in these models. We explore the boost of the dark matter power spectrum relative to GR-$\Lambda$CDM in K-mouflage gravity, and also discuss how large-scale structure computations change between the Einstein and Jordan frames. A dissection of the relative contributions of the modified background, linear growth, fifth force, and the conformal factor (a new inclusion to $\texttt{Hi-COLA}$) to the boost factor is presented. The ability of $\texttt{Hi-COLA}$ to run with general Horndeski models and multiple screening mechanisms makes it an ideal tool for testing gravity with upcoming galaxy survey data.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 31 pages, 10 figures, 1 table. Comments welcome. The Hi-COLA code can be found at https://github.com/Hi-COLACode/Hi-COLA

arXiv:2407.00854 [pdf, ps, other]

Effects of Internal Resonance and Damping on Koopman Modes

Authors: Rahul Das, Anil K. Bajaj, Sayan Gupta

Abstract: This study investigates the nonlinear normal modes (NNMs) of a system comprising two coupled Duffing oscillators, with one oscillator being grounded and with the coupling being both linear and nonlinear. The study utilizes the eigenfunctions of the Koopman operator and validates their connection with the Shaw-Pierre invariant manifold framework for NNMs. Furthermore, the study delves into the impact of internal resonance and dissipation on the accuracy of this framework by defining a continuous quantitative measure for internal resonance. The applicability and robustness of the framework for systems that are qualitatively very similar to an ENO are also examined, and the limitations of the approximation technique are discussed.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00167 [pdf, other]

Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach

Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Wyatt Bellamy, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

Abstract: In recent years, the United States has witnessed a significant surge in the popularity of vaping or e-cigarette use, leading to a notable rise in cases of e-cigarette and vaping use-associated lung injury (EVALI) that caused hospitalizations and fatalities during the EVALI outbreak in 2019, highlighting the urgency to comprehend vaping behaviors and develop effective strategies for cessation. Due to the ubiquity of social media platforms, over 4.7 billion users worldwide use them for connectivity, communications, news, and entertainment with a significant portion of the discourse related to health, thereby establishing social media data as an invaluable organic data resource for public health research. In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users' quit-vaping intentions. Leveraging OpenAI's latest large language model GPT-4 for sentence-level quit vaping intention detection, this study compares the outcomes of this model against layman and clinical expert annotations. Using different prompting strategies such as zero-shot, one-shot, few-shot and chain-of-thought prompting, we developed 8 prompts with varying levels of detail to explain the task to GPT-4 and also evaluated the performance of the strategies against each other. These preliminary findings emphasize the potential of GPT-4 in social media data analysis, especially in identifying users' subtle intentions that may elude human detection.

Submitted 28 June, 2024; originally announced July 2024.

Comments: Accepted for the AI Applications in Public Health and Social Services workshop at the 22nd International Conference on Artificial Intelligence in Medicine (AIME 2024)

arXiv:2406.19299 [pdf, other]

PNeRV: A Polynomial Neural Representation for Videos

Authors: Sonam Gupta, Snehal Singh Tomar, Grigorios G Chrysos, Sukhendu Das, A. N. Rajagopalan

Abstract: Extracting Implicit Neural Representations (INRs) on video data poses unique challenges due to the additional temporal dimension. In the context of videos, INRs have predominantly relied on a frame-only parameterization, which sacrifices the spatiotemporal continuity observed in pixel-level (spatial) representations. To mitigate this, we introduce Polynomial Neural Representation for Videos (PNeRV), a parameter-wise efficient, patch-wise INR for videos that preserves spatiotemporal continuity. PNeRV leverages the modeling capabilities of Polynomial Neural Networks to perform the modulation of a continuous spatial (patch) signal with a continuous time (frame) signal. We further propose a custom Hierarchical Patch-wise Spatial Sampling Scheme that ensures spatial continuity while retaining parameter efficiency. We also employ a carefully designed Positional Embedding methodology to further enhance PNeRV's performance. Our extensive experimentation demonstrates that PNeRV outperforms the baselines in conventional Implicit Neural Representation tasks like compression along with downstream applications that require spatiotemporal continuity in the underlying representation. PNeRV not only addresses the challenges posed by video data in the realm of INRs but also opens new avenues for advanced video processing and analysis.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 25 pages, 17 figures, published at TMLR, Feb 2024

arXiv:2406.19102 [pdf, other]

Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs

Authors: Lokesh Mishra, Sohayl Dhibi, Yusik Kim, Cesar Berrospi Ramis, Shubham Gupta, Michele Dolfi, Peter Staar

Abstract: Environment, Social, and Governance (ESG) KPIs assess an organization's performance on issues such as climate change, greenhouse gas emissions, water consumption, waste management, human rights, diversity, and policies. ESG reports convey this valuable quantitative information through tables. Unfortunately, extracting this information is difficult due to high variability in the table structure as well as content. We propose Statements, a novel domain agnostic data structure for extracting quantitative facts and related information. We propose translating tables to statements as a new supervised deep-learning universal information extraction task. We introduce SemTabNet - a dataset of over 100K annotated tables. Investigating a family of T5-based Statement Extraction Models, our best model generates statements which are 82% similar to the ground-truth (compared to baseline of 21%). We demonstrate the advantages of statements by applying our model to over 2700 tables from ESG reports. The homogeneous nature of statements permits exploratory data analysis on expansive information found in large collections of ESG reports.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Accepted at the NLP4Climate workshop in the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

arXiv:2406.17713 [pdf, other]

Multi-objective Binary Differential Approach with Parameter Tuning for Discovering Business Process Models: MoD-ProM

Authors: Sonia Deshmukh, Shikha Gupta, Naveen Kumar

Abstract: Process discovery approaches analyze the business data to automatically uncover structured information, known as a process model. The quality of a process model is measured using quality dimensions -- completeness (replay fitness), preciseness, simplicity, and generalization. Traditional process discovery algorithms usually output a single process model. A single model may not accurately capture the observed behavior and may overfit the training data. We have formulated the process discovery problem in a multi-objective framework that yields several candidate solutions for the end user, who can pick a suitable model based on the (possibly varying) local environmental constraints. We consider the Binary Differential Evolution approach in a multi-objective framework for the task of process discovery. The proposed method employs dichotomous crossover/mutation operators. The parameters are tuned using Grey relational analysis combined with the Taguchi approach. We have compared the proposed approach with the well-known single-objective algorithms and state-of-the-art multi-objective evolutionary algorithm -- Non-dominated Sorting Genetic Algorithm (NSGA-II). Additional comparison via computing a weighted average of the quality dimensions is also undertaken. Results show that the proposed algorithm is computationally efficient and produces diversified candidate solutions that score high on the fitness functions.
It is shown that the process models generated by the proposed approach are superior to or at least as good as those generated by the state-of-the-art algorithms.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17440 [pdf, ps, other]

Effect of clustering on Turing instability in complex networks

Authors: Samana Pranesh, Devanand Jaiswal, Sayan Gupta

Abstract: Turing instability in complex networks has been shown in the literature to be dominated by the distribution of the nodal degrees. The conditions for Turing instability have been derived with an explicit dependence on the eigenvalues of the Laplacian, which in turn depend on the network topology. This study reveals that apart from the average degree of the network, another global network measure - the nodal clustering - also plays a crucial role. Analytical and numerical results are presented to show the importance of clustering for several network topologies, ranging from the $\mathbb{S}^1$ / $\mathbb{H}^2$ hyperbolic geometric networks that enable modelling the naturally occurring clustering in real world networks, to the random and scale free networks, which are obtained as limiting cases of the $\mathbb{S}^1$ / $\mathbb{H}^2$ model. Analysis of eigenvector localization properties in these networks is shown to reveal distinct signatures that enable identifying the so-called Turing patterns even in complex networks.

Submitted 25 June, 2024; originally announced June 2024.
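
For orientation, the dependence on Laplacian eigenvalues mentioned in the abstract above can be sketched with the textbook linear-stability condition for a two-species reaction-diffusion system on a network. The notation here ($f$, $g$, $D_u$, $D_v$, $\Lambda_\alpha$) is assumed for illustration and is not taken from the paper. For each Laplacian eigenvalue $\Lambda_\alpha \le 0$, the growth rate $\lambda_\alpha$ of the corresponding mode satisfies

$$
\lambda_\alpha^2 - \left[ f_u + g_v + (D_u + D_v)\Lambda_\alpha \right] \lambda_\alpha + (f_u + D_u \Lambda_\alpha)(g_v + D_v \Lambda_\alpha) - f_v g_u = 0 ,
$$

where $f_u, f_v, g_u, g_v$ are the partial derivatives of the reaction terms at the homogeneous steady state. Under these assumptions, a Turing instability occurs when the steady state is stable without diffusion ($f_u + g_v < 0$ and $f_u g_v - f_v g_u > 0$) but $\mathrm{Re}\,\lambda_\alpha > 0$ for at least one $\alpha$.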

arXiv:2406.15993 [pdf, other]

Does the Amati Correlation Exhibit Redshift-Driven Heterogeneity in Long GRBs?

Authors: Darshan Singh, Meghendra Singh, Dinkar Verma, Kanhaiya Lal Pandey, Shashikant Gupta

Abstract: Long gamma-ray bursts (GRBs) offer significant insights into cosmology due to their high energy emissions and the potential to probe the early universe. The Amati relation, which links the intrinsic peak energy to the isotropic energy, is crucial for understanding their cosmological applications. This study investigates the redshift-driven heterogeneity of the Amati correlation in long GRBs. Analyzing 221 long GRBs with redshifts from 0.034 to 8.2, we divided the dataset based on redshift thresholds of 1.5 and 2. Using Bayesian marginalization and Reichart's likelihood approach, we found significant differences in the Amati parameters between low and high redshift subgroups. These variations, differing by approximately $2σ$ at $z = 1.5$ and more than $1σ$ at $z = 2$, suggest an evolution in the GRB population with redshift, possibly reflecting changes in host galaxy properties. However, selection effects and instrumental biases may also contribute. Our results challenge the assumption of the Amati relation's universality and underscore the need for larger datasets and more precise measurements from upcoming missions like THESEUS and eXTP to refine our understanding of GRB physics.

Submitted 22 June, 2024; originally announced June 2024.

Comments: 14 pages, 8 figures
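
As a reminder of what the Amati parameters in the abstract above refer to, the relation is commonly written as a power law in log space. The symbols $a$ and $b$ and the normalization to $10^{52}$ erg are assumptions made here for illustration and may differ from the paper's conventions:

$$
\log_{10}\frac{E_{p,i}}{\mathrm{keV}} = a + b \, \log_{10}\frac{E_{\rm iso}}{10^{52}\,\mathrm{erg}}, \qquad E_{p,i} = E_{p,\rm obs}\,(1+z), \qquad E_{\rm iso} = \frac{4\pi\, d_L^2(z)\, S_{\rm bolo}}{1+z} ,
$$

where $E_{p,\rm obs}$ is the observed spectral peak energy, $S_{\rm bolo}$ is the bolometric fluence, and $d_L(z)$ is the luminosity distance. Comparing the fitted $(a, b)$ between low- and high-redshift subsamples is the kind of test the abstract describes.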

arXiv:2406.15958 [pdf, other]

Bone Fracture Classification using Transfer Learning

Authors: Shyam Gupta, Dhanisha Sharma

Abstract: The manual examination of X-ray images for fractures is a time-consuming process that is prone to human error. In this work, we introduce a robust yet simple training loop for the classification of fractures, which significantly outperforms existing methods. Our method achieves superior performance in less than ten epochs and utilizes the latest dataset to deliver the best-performing model for this task. We emphasize the importance of training deep learning models responsibly and efficiently, as well as the critical role of selecting high-quality datasets.

Submitted 22 June, 2024; originally announced June 2024.

Comments: Code is publicly available at https://github.com/shyamgupta196/Bone-Fracture-Classification

arXiv:2406.14706 [pdf]

SWANN: Shuffling Weights in Crossbar Arrays for Enhanced DNN Accuracy in Deeply Scaled Technologies

Authors: Jeffry Victor, Dong Eun Kim, Chunguang Wang, Kaushik Roy, Sumeet Gupta

Abstract: Deep neural network (DNN) accelerators employing crossbar arrays capable of in-memory computing (IMC) are highly promising for neural computing platforms. However, in deeply scaled technologies, interconnect resistance severely impairs IMC robustness, leading to a drop in the system accuracy. To address this problem, we propose SWANN - a technique based on shuffling weights in crossbar arrays which alleviates the detrimental effect of wire resistance on IMC. For 8T-SRAM-based 128x128 crossbar arrays in 7nm technology, SWANN enhances the accuracy from 47.78% to 83.5% for ResNet-20/CIFAR-10. We also show that SWANN can be used synergistically with Partial-Word-Line Activation, further boosting the accuracy. Moreover, we evaluate the implications of SWANN for compact ferroelectric-transistor-based crossbar arrays. SWANN incurs minimal hardware overhead, with less than a 1% increase in energy consumption. Additionally, the latency and area overheads of SWANN are ~1% and ~16%, respectively, when 1 ADC is utilized per crossbar array.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.14398 [pdf, other]

ATAC-Net: Zoomed view works better for Anomaly Detection

Authors: Shaurya Gupta, Neil Gautam, Anurag Malyala

Abstract: The application of deep learning in visual anomaly detection has gained widespread popularity due to its potential use in quality control and manufacturing. Current standard methods are unsupervised, where a clean dataset is utilised to detect deviations and flag anomalies during testing. However, incorporating a few samples when the type of anomalies is known beforehand can significantly enhance performance. Thus, we propose ATAC-Net, a framework that trains to detect anomalies from a minimal set of known prior anomalies. Furthermore, we introduce attention-guided cropping, which provides a closer view of suspect regions during the training phase. Our framework is a reliable and easy-to-understand system for detecting anomalies, and we substantiate its superiority to some of the current state-of-the-art techniques in a comparable setting.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.14330 [pdf, other]

Promise of Graph Sparsification and Decomposition for Noise Reduction in QAOA: Analysis for Trapped-Ion Compilations

Authors: Jai Moondra, Philip C. Lotshaw, Greg Mohler, Swati Gupta

Abstract: We develop new approximate compilation schemes that significantly reduce the expense of compiling the Quantum Approximate Optimization Algorithm (QAOA) for solving the Max-Cut problem. Our main focus is on compilation with trapped-ion simulators using Pauli-$X$ operations and all-to-all Ising Hamiltonian $H_\text{Ising}$ evolution generated by Molmer-Sorensen or optical dipole force interactions, though some of our results also apply to standard gate-based compilations. Our results are based on principles of graph sparsification and decomposition; the former reduces the number of edges in a graph while maintaining its cut structure, while the latter breaks a weighted graph into a small number of unweighted graphs. Though these techniques have been used as heuristics in various hybrid quantum algorithms, there have been no guarantees on their performance, to the best of our knowledge. This work provides the first provable guarantees using sparsification and decomposition to improve quantum noise resilience and reduce quantum circuit complexity. For quantum hardware that uses edge-by-edge QAOA compilations, sparsification leads to a direct reduction in circuit complexity. For trapped-ion quantum simulators implementing all-to-all $H_\text{Ising}$ pulses, we show that for a $(1-ε)$ factor loss in the Max-Cut approximation ($ε>0$), our compilations improve the (worst-case) number of $H_\text{Ising}$ pulses from $O(n^2)$ to $O(n\log(n/ε))$ and the (worst-case) number of Pauli-$X$ bit flips from $O(n^2)$ to $O\left(\frac{n\log(n/ε)}{ε^2}\right)$ for $n$-node graphs. We demonstrate that significant reductions in noise are obtained with our new compilation approaches, using theory and numerical calculations for trapped-ion hardware. We anticipate these approximate compilation techniques will be useful tools in a variety of future quantum computing experiments.

Submitted 20 June, 2024; originally announced June 2024.

MSC Class: 81P68

arXiv:2406.13755 [pdf, other]

A detailed time-resolved and energy-resolved spectro-polarimetric study of bright GRBs detected by AstroSat CZTI in its first year of operation

Authors: Rahul Gupta, S. B. Pandey, S. Gupta, T. Chattopadhayay, D. Bhattacharya, V. Bhalerao, A. J. Castro-Tirado, A. Valeev, A. K. Ror, V. Sharma, J. Racusin, A. Aryan, S. Iyyani, S. Vadawale

Abstract: The radiation mechanism underlying the prompt emission remains unresolved and can be resolved using a systematic and uniform time-resolved spectro-polarimetric study. In this paper, we investigated the spectral, temporal, and polarimetric characteristics of five bright GRBs using archival data from AstroSat CZTI, Swift BAT, and Fermi GBM. These bright GRBs were detected by CZTI in its first year of operation, and their average polarization characteristics have been published in Chattopadhyay et al. (2022). In the present work, we examined the time-resolved (in 100-600 keV) and energy-resolved polarization measurements of these GRBs with an improved polarimetric technique such as increasing the effective area and bandwidth (by using data from low-gain pixels), using an improved event selection logic to reduce noise in the double events and extend the spectral bandwidth. In addition, we also separately carried out detailed time-resolved spectral analyses of these GRBs using empirical and physical synchrotron models. By these improved time-resolved and energy-resolved spectral and polarimetric studies (not fully coupled spectro-polarimetric fitting), we could pin down the elusive prompt emission mechanism of these GRBs. Our spectro-polarimetric analysis reveals that GRB 160623A, GRB 160703A, and GRB 160821A have Poynting flux-dominated jets. On the other hand, GRB 160325A and GRB 160802A have baryonic-dominated jets with mild magnetization. Furthermore, we observe a rapid change in polarization angle by $\sim$ 90 degrees within the main pulse of very bright GRB 160821A, consistent with our previous results. Our study suggests that the jet composition of GRBs may exhibit a wide range of magnetization, which can be revealed by utilizing spectro-polarimetric investigations of the bright GRBs.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 36 pages, 11 figures, Accepted for publication in ApJ

arXiv:2406.13667 [pdf, other]

Matter Power Spectra in Modified Gravity: A Comparative Study of Approximations and $N$-Body Simulations

Authors: Benjamin Bose, Ashim Sen Gupta, Bartolomeo Fiorini, Guilherme Brando, Farbod Hassani, Tessa Baker, Lucas Lombriser, Baojiu Li, Cheng-Zong Ruan, Cesar Hernandez-Aguayo, Luis Atayde, Noemi Frusciante

Abstract: Testing gravity and the concordance model of cosmology, $Λ$CDM, at large scales is a key goal of this decade's largest galaxy surveys. Here we present a comparative study of dark matter power spectrum predictions from different numerical codes in the context of three popular theories of gravity that induce scale-independent modifications to the linear growth of structure: nDGP, Cubic Galileon and K-mouflage. In particular, we compare the predictions from full $N$-body simulations, two $N$-body codes with approximate time integration schemes, a parametrised modified $N$-body implementation and the analytic halo model reaction approach. We find the modification to the $Λ$CDM spectrum is in $2\%$ agreement for $z\leq1$ and $k\leq 1~h/{\rm Mpc}$ over all gravitational models and codes, in accordance with many previous studies, indicating these modelling approaches are robust enough to be used in forthcoming survey analyses under appropriate scale cuts. We further make public the new code implementations presented, specifically the halo model reaction K-mouflage implementation and the relativistic Cubic Galileon implementation.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 20 pages, 4 figures, 4 tables

arXiv:2406.11784 [pdf, other]

MDCR: A Dataset for Multi-Document Conditional Reasoning

Authors: Peter Baile Chen, Yi Zhang, Chunwei Liu, Sejal Gupta, Yoon Kim, Michael Cafarella

Abstract: The same real-life questions posed to different individuals may lead to different answers based on their unique situations. For instance, whether a student is eligible for a scholarship depends on eligibility conditions, such as major or degree required. ConditionalQA was proposed to evaluate models' capability of reading a document and answering eligibility questions, considering unmentioned conditions. However, it is limited to questions on single documents, neglecting harder cases that may require cross-document reasoning and optimization, for example, "What is the maximum number of scholarships attainable?" Such questions over multiple documents are more challenging not only because there is more context to understand, but also because the model has to (1) explore all possible combinations of unmentioned conditions and (2) understand the relationship between conditions across documents, to reason about the optimal outcome. To evaluate models' capability of answering such questions, we propose a new dataset MDCR, which can reflect real-world challenges and serve as a new test bed for complex conditional reasoning that requires optimization. We evaluate this dataset using the most recent LLMs and demonstrate their limitations in solving this task. We believe this dataset will facilitate future research in answering optimization questions with unknown conditions.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10528 [pdf, other]

Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training

Authors: Akul Malhotra, Sumeet Kumar Gupta

Abstract: Improving the hardware efficiency of deep neural network (DNN) accelerators with techniques such as quantization and sparsity enhancement has shown immense promise. However, their inference accuracy in non-ideal real-world settings (such as in the presence of hardware faults) is yet to be systematically analyzed. In this work, we investigate the impact of memory faults on activation-sparse quantized DNNs (AS QDNNs). We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults, with AS QDNNs exhibiting up to 11.13% lower accuracy than the standard QDNNs. We establish that the degraded accuracy correlates with sharper minima in the loss landscape for AS QDNNs, which makes them more sensitive to perturbations in the weight values due to faults. Based on this observation, we employ sharpness-aware quantization (SAQ) training to mitigate the impact of memory faults. The AS and standard QDNNs trained with SAQ have up to 19.50% and 15.82% higher inference accuracy, respectively, compared to their conventionally trained equivalents. Moreover, we show that SAQ-trained AS QDNNs show higher accuracy in faulty settings than standard QDNNs trained conventionally. Thus, sharpness-aware training can be instrumental in achieving sparsity-related latency benefits without compromising on fault tolerance.

Submitted 15 June, 2024; originally announced June 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2301.00675

arXiv:2406.10422 [pdf, other]

Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice

Authors: Shubham Gupta, Mirco Ravanelli, Pascal Germain, Cem Subakan

Abstract: In this paper, we propose Phoneme Discretized Saliency Maps (PDSM), a discretization algorithm for saliency maps that takes advantage of phoneme boundaries for explainable detection of AI-generated voice. We experimentally show with two different Text-to-Speech systems (i.e., Tacotron2 and Fastspeech2) that the proposed algorithm produces saliency maps that result in more faithful explanations compared to standard posthoc explanation methods. Moreover, by associating the saliency maps to the phoneme representations, this methodology generates explanations that tend to be more understandable than standard saliency maps on magnitude spectrograms.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted to Interspeech 2024

arXiv:2406.10090 [pdf, other]

Over-parameterization and Adversarial Robustness in Neural Networks: An Overview and Empirical Analysis

Authors: Zhang Chen, Luca Demetrio, Srishti Gupta, Xiaoyi Feng, Zhaoqiang Xia, Antonio Emanuele Cinà, Maura Pintor, Luca Oneto, Ambra Demontis, Battista Biggio, Fabio Roli

Abstract: Thanks to their extensive capacity, over-parameterized neural networks exhibit superior predictive capabilities and generalization. However, having a large parameter space is considered one of the main suspects of the neural networks' vulnerability to adversarial examples -- input samples crafted ad-hoc to induce a desired misclassification. The relevant literature has made contradictory claims in support of and against the robustness of over-parameterized networks. These contradictory findings might be due to the failure of the attack employed to evaluate the networks' robustness. Previous research has demonstrated that depending on the considered model, the algorithm employed to generate adversarial examples may not function properly, leading to overestimating the model's robustness. In this work, we empirically study the robustness of over-parameterized networks against adversarial examples. However, unlike the previous works, we also evaluate the considered attack's reliability to support the results' veracity. Our results show that over-parameterized networks are robust against adversarial attacks as opposed to their under-parameterized counterparts.

Submitted 14 June, 2024; originally announced June 2024.

MSC Class: 68T10; ACM Class: I.5

arXiv:2406.09208 [pdf, other]

Python-based DSL for generating Verilog model of Synchronous Digital Circuits

Authors: Mandar Datar, Dhruva S. Hegde, Vendra Durga Prasad, Manish Prajapati, Neralla Manikanta, Devansh Gupta, Janampalli Pavanija, Pratyush Pare, Akash, Shivam Gupta, Sachin B. Patkar

Abstract: We have designed a Python-based Domain Specific Language (DSL) for modeling synchronous digital circuits. In this DSL, hardware is modeled as a collection of transactions -- running in series, parallel, and loops. When the model is executed by a Python interpreter, synthesizable and behavioural Verilog is generated as output, which can be integrated with other RTL designs or directly used for FPGA and ASIC flows. In this paper, we describe: 1) the language (DSL), which allows users to express computation in series/parallel/loop constructs, with explicit cycle boundaries, 2) the internals of a simple Python implementation to produce synthesizable Verilog, and 3) several design examples and case studies for applications in post-quantum cryptography, stereo-vision, digital signal processing and optimization techniques. In the end, we list ideas to extend this framework.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 9 pages, 13 figures

arXiv:2406.07797 [pdf, other]

Real-time Deformation Correction in Additively Printed Flexible Antenna Arrays

Authors: Sreeni Poolakkal, Abdullah Islam, Shrestha Bansal, Arpit Rao, Ted Dabrowski, Kalsi Kwan, Amit Mishra, Quiyan Xu, Erfan Ghaderi, Pradeep Lall, Sudip Shekhar, Julio Navarro, Shenqiang Ren, John Williams, Subhanshu Gupta

Abstract: Conformal phased arrays provide multiple degrees of freedom to the scan angle, which is typically limited by antenna aperture in rigid arrays. Silicon-based RF signal processing offers reliable, reconfigurable, multi-functional, and compact control for conformal phased arrays that can be used for on-the-move communication. While the lightweight, compactness, and shape-changing properties of the conformal phased arrays are attractive, these features result in dynamic deformation of the array during motion, leading to significant dynamic beam pointing errors. We propose a silicon-based, compact, reconfigurable solution to self-correct these dynamic deformation-induced beam pointing errors. Furthermore, additive printing is leveraged to enhance the flexibility of the conformal phased arrays, as the printed conductive ink is more flexible than bulk copper and can be easily deposited on flexible sheets using different printing tools, providing an environmentally-friendly solution for large-scale production. Conventional silver inks are expensive, and copper-based printable inks suffer from spontaneous metal oxidation that alters trace impedance and degrades beamforming performance. This work uses a low-cost molecular copper decomposition ink with reliable RF properties at different temperatures and strains to print the proposed intelligent conformal phased array operating at 2.1 GHz. A proof-of-concept prototype $2\times2$ array self-corrects the deformation-induced beam pointing error with an error $<1.25^\circ$. The silicon-based array processing part occupies only 2.58 mm$^2$ of area and consumes 83 mW of power per tile.

Submitted 21 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.06656 [pdf]

Spin-polarized DFT calculations for physical properties of novel KVSb half-Heusler compound for spintronic and thermodynamic applicability

Authors: Ashwani Kumar, Anupam, Shyam L. Gupta, Sumit Kumar, Vipan Kumar, Diwaker

Abstract: In the reported study we have investigated the robust phase stability, elasto-mechanical, thermophysical and magnetic properties of the KVSb half Heusler compound by implementing density functional theory models in the Wien2k simulation package. The dynamic phase stability is computed in type I, II and III phase configurations by optimising their energy. It is observed that the given compound is more stable in the spin-polarised state of phase type I. To explore the electronic band structure, we apply the generalised gradient approximation. The electronic band profile of the Heusler alloy displays a half-metallic nature. Moreover, the calculated second-order elastic parameters divulge the ductile nature. To understand the thermodynamical and thermoelectric stability of the alloy at various temperature and pressure ranges we have utilised the Quasi-Harmonic Debye model. The computed value of the magnetic moment is found to be in good agreement with the Slater-Pauling rule. Our findings confirm that the predicted half Heusler alloy can be used in various spintronics and thermoelectric applications.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.06639 [pdf]

Investigations of the Effects of Pressure on the Structural and Electronic Properties of Co$_2$VZ (Z = Al, Be) Full Heusler Alloy: A Comparative Study Using DFT

Authors: Sumit Kumar, Diwaker, Karan Singh Vinayak, Shyam L. Gupta

Abstract: This study focuses on the investigation and comparative study of the electronic structure of Co$_2$VZ (Z=Al, Be) Heusler alloys under varying high pressure conditions. The pressure range explored spans from 0.0 GPa to 30.0 GPa, with increments of 0.5 GPa. The WIEN2K simulation program is used to investigate the effect of pressure on the structural, magnetic, and electronic properties of Co$_2$VZ Heusler alloys. The WIEN2K simulation code with WC-GGA and mBJ exchange correlation potentials is used to investigate various features. The results of the WC-GGA exchange correlation potentials are then compared to earlier experimental and theoretical findings that employed different exchange correlation potentials. The stability observed in the P-V plot indicates the absence of any structural phase transition from a cubic symmetry structure to another structural phase. The varying slopes observed in the band gap response to increasing pressure in different pressure ranges for the studied alloys can be attributed to the predominance of either permittivity or quantum confinement effects.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.06608 [pdf, other]

The Prompt Report: A Systematic Survey of Prompting Techniques

Authors: Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, HyoJung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Pham, Gerson Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L. Rogers, Inna Goncearenco, Giuseppe Sarli, Igor Galynker, et al. (6 additional authors not shown)

Abstract: Generative Artificial Intelligence (GenAI) systems are being increasingly deployed across all parts of industry and research settings. Developers and end users interact with these systems through the use of prompting or prompt engineering. While prompting is a widespread and highly researched concept, there exists conflicting terminology and a poor ontological understanding of what constitutes a prompt due to the area's nascency. This paper establishes a structured understanding of prompts, by assembling a taxonomy of prompting techniques and analyzing their use. We present a comprehensive vocabulary of 33 vocabulary terms, a taxonomy of 58 text-only prompting techniques, and 40 techniques for other modalities. We further present a meta-analysis of the entire literature on natural language prefix-prompting.

Submitted 16 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.05538 [pdf]

First-principle screening of structural, electronic and hydrogen storage properties of Vanadium based hydride perovskites XVH$_3$ (X = Li, K)

Authors: Anupam, Shyam Lal Gupta, Vipan Kumar, Sumit Kumar, Sanjay Panwar, Diwaker

Abstract: V-based XVH$_3$ (X = Li, K) hydride perovskites are investigated for their hydrogen storage capacity using the WIEN2K code. To verify the stability of these hydrides, first-principles investigations are employed to examine their structural, electronic and hydrogen storage properties. According to structural studies, these hydrides are stable and part of the cubic space group (221 Pm-3m). We have examined many aspects of these compositions throughout, using the PBE-GGA exchange correlation potential. We obtained the energy versus volume curve and found the stable phase and structural parameters of these hydrides using the Birch-Murnaghan equation of state. These hydrides' thermodynamic stability is expressed in terms of their gravimetric hydrogen storage capacity. The goal of this study is to compute the standard enthalpy of formation and thermal desorption to ascertain the stability of these hydrides. Based on band structure and density of states plots, it is found that these compositions are metallic in nature. The study presents a preliminary theoretical approach for hydrogen storage applications of thermoelectric compositions, revealing their strong thermoelectric responses and potential for green energy sources.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.05537 [pdf]

Effects of metals (X = Pd, Ag, Cd ) on structural, electronic, mechanical, thermoelectric and hydrogen storage properties of LiXH$_3$ perovskites

Authors: Anupam, Shyam Lal Gupta, Sumit Kumar, Samjeet Singh Thakur, Sanjay Panwar, Diwaker

Abstract: Using the WIEN2K code, the hydrogen storage capabilities of lithium compositions like LiXH$_3$ (X = Pd, Ag, Cd) hydrides are examined. Structural, electrical, mechanical, thermoelectric, and hydrogen storage properties of these hydrides are analyzed using first-principles simulations to verify their stability. Structural analysis of these compositions reveals that the hydrides are stable and belong to the cubic space group number (221 Pm-3m). The thermodynamic stability of these hydrides is given in terms of gravimetric hydrogen storage capacities. The purpose of the study is to calculate the heat of formation and breakdown temperature to determine the stability of these hydrides. The metallic nature of all compositions is confirmed by band plots and density of states. The elastic properties such as elastic constant, Pugh's ratio, bulk modulus, Poisson's ratio and anisotropy factor are calculated to check the applicability of these compositions for applications involving hydrogen storage. The present paper represents the initial theoretical approach toward the future exploration of these materials for hydrogen storage applications.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.05530 [pdf]

Effects of metals (X = Zn, Co) on structure, electronic bands and gravimetric capacity of KXH$_3$ hydrides

Authors: Anupam, Shyam Lal Gupta, Sumit Kumar, Samjeet Singh Thakur, Sanjay Panwar, Diwaker

Abstract: Using the WIEN2K code, the hydrogen storage capabilities of lithium-based KXH$_3$ (X = Zn, Co) hydride perovskites are examined. To verify the stability of these hydrides, first-principles simulations are employed to examine their structural, electronic, and hydrogen storage capabilities. These compositions' structural investigation shows that the hydrides are stable and part of the cubic space group (221 Pm-3m). We have examined several aspects of these compositions' features throughout, using the Perdew-Burke-Ernzerhof generalized gradient approximation. The study identifies stable phases and structural parameters of hydrides using B-E equations, assessing thermodynamic stability in terms of hydrogen storage capacities. The metallic nature of these hydrides is confirmed through band structure and density calculations using WIEN2K.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.05527 [pdf]

Ab-initio investigations of novel potential all-d metal Heusler alloys Co$_2$MnNb

Authors: Sumit Kumar, Diwaker, Vivek Kumar, Karan S. Vinayak, Shyam Lal Gupta

Abstract: In this study, we employ the Wien2k code to conduct an ab-initio study of a novel potential all-d-metal Heusler alloy Co$_2$MnNb. The analysis utilizes the comparison of local spin density approximations (LDA) with the Perdew-Burke-Ernzerhof parameterized Generalized Gradient Approximation (PBE-GGA) for structural optimization, while modified Becke-Johnson (mBJ) exchange-correlation potentials are used to examine various characteristic properties of the alloy under study. Employing the Birch-Murnaghan equation of state, we construct the energy-versus-volume curve, facilitating the determination of stable phases and structural parameters of the investigated alloys. Structural optimization in both non-magnetic (NM) and spin-polarized (FM) states reveals the stability of the alloy in the FM state. The compound exhibits metallic behavior in bulk, with notable anisotropic semiconducting behavior for down spin while pure metallic behavior for up spin electrons. Partial density of states of each element of the composition is also analysed to compare their respective contribution towards the observed band structure. The anisotropic behavior of Co$_2$MnNb for a specific spin state could be of importance in future spintronic and other thin films device applications.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2406.04128 [pdf]

Realization of higher coordinated Er in high-pressure cotunnite phase of Er$_2$Ti$_2$O$_7$

Authors: M. Modak, Rahul Kaiwart, Santosh K. Gupta, A. Dwivedi, K. K. Pandey, A. K. Poswal, H. K. Poswal

Abstract: In this article we report the structural stability of Er$_2$Ti$_2$O$_7$ cubic pyrochlore with pressure using x-ray diffraction, Raman spectroscopy, photoluminescence, x-ray absorption and ab-initio calculations. Our studies establish a phase transformation in Er$_2$Ti$_2$O$_7$ from ambient cubic phase to high-pressure orthorhombic (cotunnite) phase, initiated at ~40 GPa. The transformation is sluggish and it does not complete even at the highest measured pressure in our study i.e. ~60.0 GPa. This is further supported by the first principle calculations which reveal that cotunnite phase is energetically more stable than the ambient phase above ~53 GPa. After complete release of pressure, the high-pressure cotunnite phase is retained while the fraction of untransformed pyrochlore phase becomes amorphous. Furthermore, the EXAFS data of the recovered sample at L3 edge of Er3+ ion show an increase in the coordination number of cations from eight at ambient to nine in the high-pressure phase. The mechanism of structural transformation is explained in terms of accumulation of cation antisite defects and subsequent disordering of cations and anions in their respective sublattice. The amorphization of the pyrochlore phase upon release is interpreted as the inability of accommodating the point defects at ambient conditions, which are formed in the pyrochlore lattice under compression.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.03952 [pdf, other]

Coexistence of Topological Dirac and Dirac Nodal line semimetal in SrCaP belonging to Nodal line semimetal family SrCaX(X= Bi, Sb, As, P)

Authors: Shivendra Kumar Gupta, Ashish Kore, Saurabh Kumar Sen, Poorva Singh

Abstract: Nodal line semimetals represent precursor states for various topological phases, exhibiting intrinsic topological characteristics and intriguing properties. These materials host rare and distinctive topological features, which can give rise to exotic phenomena, thereby garnering significant attention in both fundamental research and technological applications. In this study, we conduct ab-initio calculations to explore the properties of SrCaX (X = Bi, Sb, As, P), identifying these as multiple Dirac nodal line semimetals protected by Z2 quantized Berry phases and manifesting multiple drum-head-like surface states. The nodal lines in these compounds are situated at the M point when kz = 0 and at the A point when kz = π. Notably, SrCaX family exhibits a unique characteristic wherein they host both type II Dirac point and topological nodal line semimetal within a single crystal structure, hence providing an excellent platform for studying the interplay between different topological properties. Additionally, in SrCaP topological Dirac semimetal, Type II Dirac point and topological nodal line semimetal features coexist in a single crystal. These special features in this series of materials make them ideal candidates for further investigation by experimental means.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 7+7=14 pages, 6+10=16 figures, 1 table

arXiv:2406.03661 [pdf, other]

Randomness in atomic disorder and consequent squandering of spin-polarization in a ferromagnetically fragile quaternary Heusler alloy FeRuCrSi

Authors: Shuvankar Gupta, Sudip Chakraborty, Vidha Bhasin, Celine Barreteau, Jean-Claude Crivello, Jean-Marc Greneche, S. N. Jha, D. Bhattacharyya, Eric Alleno, Chandan Mazumdar

Abstract: Ru$_{2-x}$Fe$_x$CrSi ( 0 $<$ x $<$1) system is theoretically predicted to be one of the very few known examples of robust half-metallic ferromagnet with 100\% spin polarization. Since Cr is considered to be the main contributor to magnetism, the Fe/Ru substitution is not expected to disturb its magnetic properties any significantly, and hence all Fe-containing members of the series are predicted to follow Slater-Pauling rule with a saturation magnetic moment of 2 ${μ_B}$/f.u. However, contrarily to the theoretical expectations, some experiments rather show a linear variation of the saturation magnetization and Curie temperature with Fe (\textit{x}) substitution. The equiatomic member FeRuCrSi of this family is also considered as a technologically important material, where the band structure calculations suggest the material to be spin gapless semiconductor. Through our in-depth structural analysis of FeRuCrSi using X-ray diffraction, extended X-ray absorption fine structure and $^{57}$Fe Mössbauer spectrometry, we found a random disorder between Fe and Ru sites, while the magnetic moment in this system is actually contributed by Fe atoms, questioning the very basic foundation of the half-metallic character proposed by all theoretical calculations on Ru$_{2-x}$Fe$_x$CrSi series. Our Mössbauer result also envisions a rather rare scenario where the main physical properties are intricately correlated to the chemistry of the material in the form of random atomic disorder on a localised scale.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.03656 [pdf, other]

Restructuring disorder: Transformation from the antiferromagnetic order in Fe2VSi to the ferromagnetic state in FeRuVSi by substitution of a non-magnetic element

Authors: Shuvankar Gupta, Sudip Chakraborty, Celine Barreteau, Jean-Claude Crivello, Jean-Marc Greneche, Eric Alleno, Chandan Mazumdar

Abstract: The delicate nature of the half-metallic ferromagnetic (HMF) property in Heusler alloys is often compromised by inherent structural disorder within the systems. Fe2VSi is a prime example, where such disorder prevents the realization of the theoretically proposed HMF state as the anti-site disorder leads to the formation of two anti-parallel magnetic lattices resulting in antiferromagnetic order. In this study, we propose an innovative and simple strategy to prevent this atomic disorder by replacing 50% of the magnetic element Fe by a large, isoelectronic, non-magnetic element, Ru. In this way, one of the magnetic sublattices of the antiferromagnetic lattice ceases to order while ferromagnetic order is restored, an essential criterion for exhibiting HMF properties. Through various experimental measurements and theoretical calculations, we have shown that such partial replacement of Fe by Ru prevents the cross-site substitution of V/Si sites and the system regains its ferromagnetic order. Our theoretical calculations suggest that a perfect structural arrangement in Fe and Ru would have restored the HMF property in FeRuVSi. However, the local atomic disorder of Fe and Ru was found to decrease the spin polarization value. The present work sheds light on the complex interplay between structural disorder and magnetic properties in Heusler alloys and provides insights for future design strategies in the pursuit of robust half-metallic ferromagnets.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.02810 [pdf, other]

Isolation of individual Er quantum emitters in anatase TiO$_2$ on Si photonics

Authors: Cheng Ji, Robert M. Pettit, Shobhit Gupta, Gregory D. Grant, Ignas Masiulionis, Ananthesh Sundaresh, Skylar Deckoff--Jones, Max Olberding, Manish K. Singh, F. Joseph Heremans, Supratik Guha, Alan M. Dibos, Sean E. Sullivan

Abstract: Defects and dopant atoms in solid state materials are a promising platform for realizing single photon sources and quantum memories, which are the basic building blocks of quantum repeaters needed for long distance quantum networks. In particular, trivalent erbium (Er$^{3+}$) is of interest because it couples C-band telecom optical transitions with a spin-based memory platform. In order to produce quantum repeaters at the scale required for a quantum internet, it is imperative to integrate these necessary building blocks with mature and scalable semiconductor processes. In this work, we demonstrate the optical isolation of single Er$^{3+}$ ions in CMOS-compatible titanium dioxide (TiO$_2$) thin films monolithically integrated on a silicon-on-insulator (SOI) photonics platform. Our results demonstrate a first step toward the realization of a monolithically integrated and scalable quantum photonics package based on Er$^{3+}$ doped thin films.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00924 [pdf, ps, other]

Faster Diffusion-based Sampling with Randomized Midpoints: Sequential and Parallel

Authors: Shivam Gupta, Linda Cai, Sitan Chen

Abstract: In recent years, there has been a surge of interest in proving discretization bounds for diffusion models. These works show that for essentially any data distribution, one can approximately sample in polynomial time given a sufficiently accurate estimate of its score functions at different noise levels. In this work, we propose a new discretization scheme for diffusion models inspired by Shen and Lee's randomized midpoint method for log-concave sampling~\cite{ShenL19}. We prove that this approach achieves the best known dimension dependence for sampling from arbitrary smooth distributions in total variation distance ($\widetilde O(d^{5/12})$ compared to $\widetilde O(\sqrt{d})$ from prior work). We also show that our algorithm can be parallelized to run in only $\widetilde O(\log^2 d)$ parallel rounds, constituting the first provable guarantees for parallel sampling with diffusion models. As a byproduct of our methods, for the well-studied problem of log-concave sampling in total variation distance, we give an algorithm and simple analysis achieving dimension dependence $\widetilde O(d^{5/12})$ compared to $\widetilde O(\sqrt{d})$ from prior work.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.20007 [pdf, ps, other]

On the restriction of some irreducible mod-$p$ representations of $\text{GL}_2(\mathbb{F}_q)$ to $\text{GL}_2(\mathbb{F}_p)$

Authors: Shubhanshi Gupta

Abstract: For a prime $p,$ let $\mathbb{F}_q$ be a finite extension of $\mathbb{F}_p$ and $\mathcal{G}=\text{GL}_2(\mathbb{F}_q).$ Then the irreducible representations of $\mathcal{G}$ are classified as twists of Sym$^{\vec{r}}(\overline{\mathbb F}_p^2).$ The restriction of irreducibles of $\mathcal{G}$ to its subgroup $G=\text{GL}_2(\mathbb{F}_p)$ is same as investigating the behavior of the tensor product of irreducible representations of $G.$ In this paper, we study the restriction of some of these representations of $\mathcal{G}$ to $G,$ for $2$ and $3$ degree extensions of $\mathbb{F}_p.$

Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.18585 [pdf, other]

Transfer Learning for Emulating Ocean Climate Variability across $CO_2$ forcing

Authors: Surya Dheeshjith, Adam Subel, Shubham Gupta, Alistair Adcroft, Carlos Fernandez-Granda, Julius Busecke, Laure Zanna

Abstract: With the success of machine learning (ML) applied to climate reaching further every day, emulators have begun to show promise not only for weather but for multi-year time scales in the atmosphere. Similar work for the ocean remains nascent, with state-of-the-art limited to models running for shorter time scales or only for regions of the globe. In this work, we demonstrate high-skill global emulation for surface ocean fields over 5-8 years of model rollout, accurately representing modes of variability for two different ML architectures (ConvNext and Transformers). In addition, we address the outstanding question of generalization, an essential consideration if the end-use of emulation is to model warming scenarios outside of the model training data. We show that 1) generalization is not an intrinsic feature of a data-driven emulator, 2) fine-tuning the emulator on only small amounts of additional data from a distribution similar to the test set can enable the emulator to perform well in a warmed climate, and 3) the forced emulators are robust to noise in the forcing.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.18193 [pdf, other]

In-Context Symmetries: Self-Supervised Learning through Contextual World Models

Authors: Sharut Gupta, Chenyu Wang, Yifei Wang, Tommi Jaakkola, Stefanie Jegelka

Abstract: At the core of self-supervised learning for vision is the idea of learning invariant or equivariant representations with respect to a set of data transformations. This approach, however, introduces strong inductive biases, which can render the representations fragile in downstream tasks that do not conform to these symmetries. In this work, drawing insights from world models, we propose to instead learn a general representation that can adapt to be invariant or equivariant to different transformations by paying attention to context -- a memory module that tracks task-specific states, actions, and future states. Here, the action is the transformation, while the current and future states respectively represent the input's representation before and after the transformation. Our proposed algorithm, Contextual Self-Supervised Learning (ContextSSL), learns equivariance to all transformations (as opposed to invariance). In this way, the model can learn to encode all relevant features as general representations while having the versatility to tail down to task-wise symmetries when given a few examples as the context. Empirically, we demonstrate significant performance gains over existing methods on equivariance-related tasks, supported by both qualitative and quantitative evaluations.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 32 pages, 24 tables and 11 figures

arXiv:2405.17469 [pdf, other]

A Dataset for Research on Water Sustainability

Authors: Pranjol Sen Gupta, Md Rajib Hossen, Pengfei Li, Shaolei Ren, Mohammad A. Islam

Abstract: Freshwater scarcity is a global problem that requires collective efforts across all industry sectors. Nevertheless, a lack of access to operational water footprint data bars many applications from exploring optimization opportunities hidden within the temporal and spatial variations. To break this barrier into research in water sustainability, we build a dataset for operation direct water usage in the cooling systems and indirect water embedded in electricity generation. Our dataset consists of the hourly water efficiency of major U.S. cities and states from 2019 to 2023. We also offer cooling system models that capture the impact of weather on water efficiency. We present a preliminary analysis of our dataset and discuss three potential applications that can benefit from it. Our dataset is publicly available at Open Science Framework (OSF)

Submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted by ACM e-Energy 2024

arXiv:2405.15378 [pdf, ps, other]

Dominating surface-group representations via Fock-Goncharov coordinates

Authors: Pabitra Barman, Subhojoy Gupta

Abstract: Let $S$ be a punctured surface of negative Euler characteristic. We show that given a generic representation $ρ:π_1(S) \rightarrow \mathrm{PSL}_n(\mathbb{C})$, there exists a positive representation $ρ_0:π_1(S) \rightarrow \mathrm{PSL}_n(\mathbb{R})$ that dominates $ρ$ in the Hilbert length spectrum as well as in the translation length spectrum, for the translation length in the symmetric space $\mathbb{X}_n= \mathrm{PSL}_n(\mathbb{C})/\mathrm{PSU}(n)$. Moreover, the $ρ_0$-lengths of peripheral curves remain unchanged. The dominating representation $ρ_0$ is explicitly described via Fock-Goncharov coordinates. Our methods are linear-algebraic, and involve weight matrices of weighted planar networks.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 39 pages

arXiv:2405.15372 [pdf, other]

When far is better: The Chamberlin-Courant approach to obnoxious committee selection

Authors: Sushmita Gupta, Tanmay Inamdar, Pallavi Jain, Daniel Lokshtanov, Fahad Panolan, Saket Saurabh

Abstract: Classical work on metric space based committee selection problem interprets distance as ``near is better''. In this work, motivated by real-life situations, we interpret distance as ``far is better''. Formally stated, we initiate the study of ``obnoxious'' committee scoring rules when the voters' preferences are expressed via a metric space. To this end, we propose a model where large distances imply high satisfaction and study the egalitarian avatar of the well-known Chamberlin-Courant voting rule and some of its generalizations. For a given integer value $1 \le λ\le k$, the committee size k, a voter derives satisfaction from only the $λ$-th favorite committee member; the goal is to maximize the satisfaction of the least satisfied voter. For the special case of $λ= 1$, this yields the egalitarian Chamberlin-Courant rule. In this paper, we consider general metric space and the special case of a $d$-dimensional Euclidean space. We show that when $λ$ is $1$ and $k$, the problem is polynomial-time solvable in $\mathbb{R}^2$ and general metric space, respectively. However, for $λ= k-1$, it is NP-hard even in $\mathbb{R}^2$. Thus, we have ``double-dichotomy'' in $\mathbb{R}^2$ with respect to the value of λ, where the extreme cases are solvable in polynomial time but an intermediate case is NP-hard. Furthermore, this phenomenon appears to be ``tight'' for $\mathbb{R}^2$ because the problem is NP-hard for general metric space, even for $λ=1$.
Consequently, we are motivated to explore the problem in the realm of (parameterized) approximation algorithms and obtain positive results. Interestingly, we note that this generalization of Chamberlin-Courant rules encodes practical constraints that are relevant to solutions for certain facility locations.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15254 [pdf, other]

Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime

Authors: Alistair Shilton, Sunil Gupta, Santu Rana, Svetha Venkatesh

Abstract: This paper presents two models of neural-networks and their training applicable to neural networks of arbitrary width, depth and topology, assuming only finite-energy neural activations; and a novel representor theory for neural networks in terms of a matrix-valued kernel. The first model is exact (un-approximated) and global, casting the neural network as an element in a reproducing kernel Banach space (RKBS); we use this model to provide tight bounds on Rademacher complexity. The second model is exact and local, casting the change in neural network function resulting from a bounded change in weights and biases (i.e., a training step) in reproducing kernel Hilbert space (RKHS) in terms of a local-intrinsic neural kernel (LiNK). This local model provides insight into model adaptation through tight bounds on Rademacher complexity of network adaptation. We also prove that the neural tangent kernel (NTK) is a first-order approximation of the LiNK kernel. Finally, and noting that the LiNK does not provide a representor theory for technical reasons, we present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK). This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models. Throughout the paper (a) feedforward ReLU networks and (b) residual networks (ResNet) are used as illustrative examples.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.15227 [pdf, other]

Neural Elevation Models for Terrain Mapping and Path Planning

Authors: Adam Dai, Shubh Gupta, Grace Gao

Abstract: This work introduces Neural Elevation Models (NEMos), which adapt Neural Radiance Fields to a 2.5D continuous and differentiable terrain model. In contrast to traditional terrain representations such as digital elevation models, NEMos can be readily generated from imagery, a low-cost data source, and provide a lightweight representation of terrain through an implicit continuous and differentiable height field. We propose a novel method for jointly training a height field and radiance field within a NeRF framework, leveraging quantile regression. Additionally, we introduce a path planning algorithm that performs gradient-based optimization of a continuous cost function for minimizing distance, slope changes, and control effort, enabled by differentiability of the height field. We perform experiments on simulated and real-world terrain imagery, demonstrating NEMos' ability to generate high-quality reconstructions and produce smoother paths compared to discrete path planning methods. Future work will explore the incorporation of features and semantics into the height field, creating a generalized terrain model.

Submitted 24 May, 2024; originally announced May 2024.

Showing 1–50 of 2,036 results for author: Gupta, S