doi 10.1142/S0217732323501092

A description of classical and quantum cosmology for a single scalar field torsion gravity

Authors: Dipankar Laya, Roshni Bhaumik, Sourav Dutta, Subenoy Chakraborty

Abstract: In the background of homogeneous and isotropic flat FLRW space-time, both classical and quantum cosmology have been studied for a teleparallel dark energy (DE) model. Using Noether symmetry analysis, not only the symmetry vector but also the coupling function in the Lagrangian and the potential of the scalar field have been determined. The symmetry analysis also identifies a cyclic variable in the Lagrangian along the symmetry vector, and as a result the Lagrangian simplifies to a great extent so that the classical solution is obtained. Subsequently, in quantum cosmology the Wheeler-DeWitt (WD) equation has been constructed, and the quantum version of the conserved momenta corresponding to the Noether symmetry identifies the periodic part of the wave function of the universe; as a result the Wheeler-DeWitt equation becomes solvable. Finally, the quantum description shows a finite non-zero probability at the classical big-bang singularity.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 16 Pages, 4 figures

Journal ref: Modern Physics Letters A Vol. 38, Nos. 22 & 23 (2023) 2350109 (15 pages)

arXiv:2407.08207 [pdf, ps, other]

doi 10.1142/S0217751X23500641

Classical and Quantum Cosmology in Einstein-aether Scalar-tensor gravity: Noether Symmetry Analysis

Authors: Dipankar Laya, Roshni Bhaumik, Sourav Dutta, Subenoy Chakraborty

Abstract: The present work deals with Einstein-aether scalar-tensor gravity in the background of a homogeneous and isotropic flat FLRW space-time model. The Noether symmetry vector identifies a transformation in the augmented space so that the field equations become solvable. The cosmological solutions are analyzed from the observational point of view. Finally, for quantum cosmology, the Wheeler-DeWitt (WD) equation has been formulated and solutions have been determined by identifying the periodic nature of the wave function using the conserved (Noether) charge.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 15 pages, 4 figures

Journal ref: International Journal of Modern Physics A Vol. 38, Nos. 12 & 13 (2023) 2350064 (14 pages)

arXiv:2407.07982 [pdf, other]

Automating Weak Label Generation for Data Programming with Clinicians in the Loop

Authors: Jean Park, Sydney Pugh, Kaustubh Sridhar, Mengyu Liu, Navish Yarna, Ramneet Kaur, Souradeep Dutta, Elena Bernardis, Oleg Sokolsky, Insup Lee

Abstract: Large Deep Neural Networks (DNNs) are often data hungry and need high-quality labeled data in copious amounts for learning to converge. This is a challenge in the field of medicine since high quality labeled data is often scarce. Data programming has been the ray of hope in this regard, since it allows us to label unlabeled data using multiple weak labeling functions. Such functions are often supplied by a domain expert. Data-programming can combine multiple weak labeling functions and suggest labels better than simple majority voting over the different functions. However, it is not straightforward to express such weak labeling functions, especially in high-dimensional settings such as images and time-series data. What we propose in this paper is a way to bypass this issue, using distance functions. In high-dimensional spaces, it is easier to find meaningful distance metrics which can generalize across different labeling tasks. We propose an algorithm that queries an expert for labels of a few representative samples of the dataset. These samples are carefully chosen by the algorithm to capture the distribution of the dataset. The labels assigned by the expert on the representative subset induce a labeling on the full dataset, thereby generating weak labels to be used in the data programming pipeline. In our medical time series case study, labeling a subset of 50 to 130 out of 3,265 samples showed 17-28% improvement in accuracy and 13-28% improvement in F1 over the baseline using clinician-defined labeling functions. In our medical image case study, labeling a subset of about 50 to 120 images from 6,293 unlabeled medical images using our approach showed significant improvement over the baseline method, Snuba, with an increase of approximately 5-15% in accuracy and 12-19% in F1 score.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.06696 [pdf, ps, other]

doi 10.1142/S0218271823500013

Quantum Cosmology in Coupled Brans-Dicke Gravity: A Noether Symmetry Analysis

Authors: Dipankar Laya, Sourav Dutta, Subenoy Chakraborty

Abstract: The present work deals with a multi-field cosmological model in a spatially flat FLRW space-time geometry. The usual Brans-Dicke (BD) field and another scalar field are minimally coupled to gravity while they interact with each other through the kinetic terms. The main aim of the present work is to examine whether the model is compatible with cosmic observations, so cosmological solutions are obtained using symmetry analysis only. By imposing Noether symmetry on the Lagrangian of the system, the potential of the scalar field as well as the coupling function have been determined. The classical solutions are determined after simplifying the Lagrangian using cyclic variables. Finally, the Wheeler-DeWitt (WD) equation in quantum cosmology has been formulated, and the conserved momenta corresponding to the Noether symmetry show the periodic part of the wave function, which helps to obtain the complete integral for the wave function.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 14 pages, 4 figures

arXiv:2407.04708 [pdf, other]

QMViT: A Mushroom is worth 16x16 Words

Authors: Siddhant Dutta, Hemant Singh, Kalpita Shankhdhar, Sridhar Iyer

Abstract: Consuming poisonous mushrooms can have severe health consequences, even resulting in fatality, and accurately distinguishing edible from toxic mushroom varieties remains a significant challenge in ensuring food safety. So, it's crucial to distinguish between edible and poisonous mushrooms within the existing species. This is essential due to the significant demand for mushrooms in people's daily meals and their potential contributions to medical science. This work presents a novel Quantum Vision Transformer architecture that leverages quantum computing to enhance mushroom classification performance. By implementing specialized quantum self-attention mechanisms using Variational Quantum Circuits, the proposed architecture achieved 92.33% and 99.24% accuracy based on their category and their edibility respectively. This demonstrates the success of the proposed architecture in reducing false negatives for toxic mushrooms, thus ensuring food safety. Our research highlights the potential of QMViT for improving mushroom classification as a whole.

Submitted 10 May, 2024; originally announced July 2024.

arXiv:2407.04173 [pdf, other]

Quantifying Prediction Consistency Under Model Multiplicity in Tabular LLMs

Authors: Faisal Hamman, Pasan Dissanayake, Saumitra Mishra, Freddy Lecue, Sanghamitra Dutta

Abstract: Fine-tuning large language models (LLMs) on limited tabular data for classification tasks can lead to fine-tuning multiplicity, where equally well-performing models make conflicting predictions on the same inputs due to variations in the training process (i.e., seed, random weight initialization, retraining on additional or deleted samples). This raises critical concerns about the robustness and reliability of Tabular LLMs, particularly when deployed for high-stakes decision-making, such as finance, hiring, education, healthcare, etc. This work formalizes the challenge of fine-tuning multiplicity in Tabular LLMs and proposes a novel metric to quantify the robustness of individual predictions without expensive model retraining. Our metric quantifies a prediction's stability by analyzing (sampling) the model's local behavior around the input in the embedding space. Interestingly, we show that sampling in the local neighborhood can be leveraged to provide probabilistic robustness guarantees against a broad class of fine-tuned models. By leveraging Bernstein's Inequality, we show that predictions with sufficiently high robustness (as defined by our measure) will remain consistent with high probability. We also provide empirical evaluation on real-world datasets to support our theoretical results. Our work highlights the importance of addressing fine-tuning instabilities to enable trustworthy deployment of LLMs in high-stakes and safety-critical applications.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.00482 [pdf, other]

Quantifying Spuriousness of Biased Datasets Using Partial Information Decomposition

Authors: Barproda Halder, Faisal Hamman, Pasan Dissanayake, Qiuyi Zhang, Ilia Sucholutsky, Sanghamitra Dutta

Abstract: Spurious patterns refer to a mathematical association between two or more variables in a dataset that are not causally related. However, this notion of spuriousness, which is usually introduced due to sampling biases in the dataset, has classically lacked a formal definition. To address this gap, this work presents the first information-theoretic formalization of spuriousness in a dataset (given a split of spurious and core features) using a mathematical framework called Partial Information Decomposition (PID). Specifically, we disentangle the joint information content that the spurious and core features share about another target variable (e.g., the prediction label) into distinct components, namely unique, redundant, and synergistic information. We propose the use of unique information, with roots in Blackwell Sufficiency, as a novel metric to formally quantify dataset spuriousness and derive its desirable properties. We empirically demonstrate how higher unique information in the spurious features in a dataset could lead a model into choosing the spurious features over the core features for inference, often having low worst-group-accuracy. We also propose a novel autoencoder-based estimator for computing unique information that is able to handle high-dimensional image data. Finally, we also show how this unique information in the spurious feature is reduced across several dataset-based spurious-pattern-mitigation techniques such as data reweighting and varying levels of background mixing, demonstrating a novel tradeoff between unique information (spuriousness) and worst-group-accuracy.

Submitted 29 June, 2024; originally announced July 2024.

Comments: Accepted at ICML 2024 Workshop on Data-centric Machine Learning Research (DMLR): Datasets for Foundation Models

arXiv:2407.00181 [pdf, other]

$W-$mass and Muon $g-2$ in Inert 2HDM Extended by Singlet Complex Scalar

Authors: Hrishabh Bharadwaj, Mamta Dahiya, Sukanta Dutta, Ashok Goyal

Abstract: The deviations of the recent measurements of the muon magnetic moment and the $W$-boson mass from their SM predictions hint at new physics beyond the SM. In this article, we address the observed discrepancies in the $W$-boson mass and muon anomalous magnetic moment in the Inert Two Higgs Doublet Model (I2HDM) extended by a complex scalar field singlet under the SM gauge group. The model is constrained from the existing LEP data and the measurements of partial decay widths to gauge bosons at LHC. It is shown that a large subset of this constrained parameter space of the model can simultaneously accommodate the $W$-boson mass and also explain the muon $g-2$ anomaly.

Submitted 28 June, 2024; originally announced July 2024.

Comments: 15 pages, 5 figures

arXiv:2406.20060 [pdf, other]

Applying RLAIF for Code Generation with API-usage in Lightweight LLMs

Authors: Sujan Dutta, Sayantan Mahinder, Raviteja Anantha, Bortik Bandyopadhyay

Abstract: Reinforcement Learning from AI Feedback (RLAIF) has demonstrated significant potential across various domains, including mitigating harm in LLM outputs, enhancing text summarization, and mathematical reasoning. This paper introduces an RLAIF framework for improving the code generation abilities of lightweight (<1B parameters) LLMs. We specifically focus on code generation tasks that require writing appropriate API calls, which is challenging due to the well-known issue of hallucination in LLMs. Our framework extracts AI feedback from a larger LLM (e.g., GPT-3.5) through a specialized prompting strategy and uses this data to train a reward model towards better alignment from smaller LLMs. We run our experiments on the Gorilla dataset and meticulously assess the quality of the model-generated code across various metrics, including AST, ROUGE, and Code-BLEU, and develop a pipeline to compute its executability rate accurately. Our approach significantly enhances the fine-tuned LLM baseline's performance, achieving a 4.5% improvement in executability rate. Notably, a smaller LLM model (780M parameters) trained with RLAIF surpasses a much larger fine-tuned baseline with 7B parameters, achieving a 1.0% higher code executability rate.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.18545 [pdf, other]

Visual Analysis of Prediction Uncertainty in Neural Networks for Deep Image Synthesis

Authors: Soumya Dutta, Faheem Nizar, Ahmad Amaan, Ayan Acharya

Abstract: Ubiquitous applications of Deep neural networks (DNNs) in different artificial intelligence systems have led to their adoption in solving challenging visualization problems in recent years. While sophisticated DNNs offer an impressive generalization, it is imperative to comprehend the quality, confidence, robustness, and uncertainty associated with their prediction. A thorough understanding of these quantities produces actionable insights that help application scientists make informed decisions. Unfortunately, the intrinsic design principles of the DNNs cannot beget prediction uncertainty, necessitating separate formulations for robust uncertainty-aware models for diverse visualization applications. To that end, this contribution demonstrates how the prediction uncertainty and sensitivity of DNNs can be estimated efficiently using various methods and then interactively compared and contrasted for deep image synthesis tasks. Our inspection suggests that uncertainty-aware deep visualization models generate illustrations of informative and superior quality and diversity. Furthermore, prediction uncertainty improves the robustness and interpretability of deep visualization models, making them practical and convenient for various scientific domains that thrive on visual analyses.

Submitted 22 May, 2024; originally announced June 2024.

arXiv:2406.16807 [pdf, other]

Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation

Authors: Katherine M. Collins, Najoung Kim, Yonatan Bitton, Verena Rieser, Shayegan Omidshafiei, Yushi Hu, Sherol Chen, Senjuti Dutta, Minsuk Chang, Kimin Lee, Youwei Liang, Georgina Evans, Sahil Singla, Gang Li, Adrian Weller, Junfeng He, Deepak Ramachandran, Krishnamurthy Dj Dvijotham

Abstract: Human feedback plays a critical role in learning and refining reward models for text-to-image generation, but the optimal form the feedback should take for learning an accurate reward function has not been conclusively established. This paper investigates the effectiveness of fine-grained feedback which captures nuanced distinctions in image quality and prompt-alignment, compared to traditional coarse-grained feedback (for example, thumbs up/down or ranking between a set of options). While fine-grained feedback holds promise, particularly for systems catering to diverse societal preferences, we show that demonstrating its superiority to coarse-grained feedback is not automatic. Through experiments on real and synthetic preference data, we surface the complexities of building effective models due to the interplay of model choice, feedback type, and the alignment between human judgment and computational interpretation. We identify key challenges in eliciting and utilizing fine-grained feedback, prompting a reassessment of its assumed benefits and practicality. Our findings -- e.g., that fine-grained feedback can lead to worse models for a fixed budget, in some settings; however, in controlled settings with known attributes, fine grained rewards can indeed be more helpful -- call for careful consideration of feedback attributes and potentially beckon novel modeling approaches to appropriately unlock the potential value of fine-grained feedback in-the-wild.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.11937 [pdf, other]

Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola, et al. (550 additional authors not shown)

Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and it makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.

Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JINST

arXiv:2406.09724 [pdf, ps, other]

Chow Groups: A Structure Theorem, RIEMANN-ROCH without denominators and ARTIN approximation

Authors: S. P. Dutta

Abstract: The focus of this note is on the Chow group problem over ramified regular local rings $(R, m)$. Our goal is threefold: i) to introduce a characterization of a ramified regular local ring essentially of finite type over a dvr, ii) to address the question whether $(i-1)!$ $\mathbb{A}^i(U)=0$ for specific open subsets $U$ of Spec$R$ and iii) to establish a constructive relation between Chow groups of the henselization $(R^h, m^h)$ and Chow groups of the completion $(\hat{R}, \hat{m})$ of $(R, m)$.

Submitted 14 June, 2024; originally announced June 2024.

MSC Class: Primary: 13D22; 14C40; Secondary: 13H05

arXiv:2406.08341 [pdf, other]

doi 10.1093/mnras/stae1490

HI Imaging of a Blueberry Galaxy Suggests a Merger Origin

Authors: Saili Dutta, Apurba Bera, Omkar Bait, Chaitra A. Narayan, Biny Sebastian, Sravani Vaddi

Abstract: Blueberry galaxies (BBs) are fainter, less massive, and lower redshift counterparts of the Green pea galaxies. They are thought to be the nearest analogues of the high redshift Lyman Alpha (Ly$α$) emitters. We report the interferometric imaging of HI 21 cm emission from a Blueberry galaxy, J1509+3731, at redshift, z = 0.03259, using the Giant Metrewave Radio Telescope (GMRT). We find that this Blueberry galaxy has an HI mass of $M_{\text{HI}} \approx 3\times 10^8 \, M_{\odot}$ and an HI-to-stellar mass ratio $M_{\text{HI}}/M_* \approx$ 2.4. Using SFR estimates from the H$β$ emission line, we find that it has a short HI depletion time scale of $\approx 0.2$ Gyr, which indicates a significantly higher star-formation efficiency compared to typical star-forming galaxies at the present epoch. Interestingly, we find an offset of $\approx 2$ kpc between the peak of the HI 21 cm emission and the optical centre which suggests a merger event in the past. Our study highlights the important role of mergers in triggering the starburst in BBs and their role in the possible leakage of Lyman-$α$ and Lyman-continuum photons which is consistent with the previous studies on BB galaxies.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 4 figures, accepted for publication in MNRAS

arXiv:2406.06135 [pdf]

Navigating the nexus: a perspective of centrosome-cytoskeleton interactions

Authors: Subarna Dutta, Arnab Barua

Abstract: A structural relationship between the centrosome and cytoskeleton has been recognized for many years. Centrosomes typically reside near the nucleus, establishing and maintaining the nucleus-centrosome axis. This spatial arrangement is critical for determining cell polarity during interphase and ensuring the proper assembly of the spindle apparatus during mitosis. Centrosomes also engage in physical interactions with various components of the cytoskeleton, balancing internal cellular architecture and polarity in a manner specific to tissue type and developmental stage. Numerous crosslinking proteins facilitate these interactions, promoting both cytoskeletal and centrosomal nucleation. This article provides an overview of how cytoskeletal elements and centrosomes coordinate their actions to regulate complex cellular functions such as cell migration, adhesion, and division. The reciprocal influence between cytoskeletal dynamics and centrosomal positioning underscores their integral roles in cellular organization and function.

Submitted 4 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: 15 pages, 1 Figure

arXiv:2406.04701 [pdf, ps, other]

Transition to synchronization in adaptive Sakaguchi-Kuramoto model with higher-order interactions

Authors: Sangita Dutta, Prosenjit Kundu, Pitambar Khanra, Chittaranjan Hens, Pinaki Pal

Abstract: We investigate the phenomenon of transition to synchronization in the Sakaguchi-Kuramoto model in the presence of higher-order interactions and global order parameter adaptation. The investigation is done by performing extensive numerical simulations and low dimensional modeling of the system. Numerical simulations of the full system show both continuous (second order) as well as discontinuous transitions. The discontinuous transitions can either be associated with explosive (first order) or with tiered synchronization states depending on the choice of parameters. To develop an in-depth understanding of the transition scenario in the parameter space we derive a reduced order model (ROM) using the Ott-Antonsen ansatz, the results of which closely match those of the numerical simulations of the full system. The simplicity and analytical accessibility of the ROM help to conveniently unfold the transition scenario in the system having complex dependence on the parameters. Simultaneous analysis of the full system and the ROM clearly identifies the regions of the parameter space exhibiting different types of transitions. It is observed that the second order continuous transition is connected with a supercritical pitchfork bifurcation (PB) of the ROM. On the other hand, the discontinuous tiered transition is associated with multiple saddle-node (SN) bifurcations along with a supercritical PB and the first order explosive transition involves a subcritical PB alongside a SN bifurcation.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 11 pages, 11 figures

arXiv:2406.04562 [pdf, other]

A Unified View of Group Fairness Tradeoffs Using Partial Information Decomposition

Authors: Faisal Hamman, Sanghamitra Dutta

Abstract: This paper introduces a novel information-theoretic perspective on the relationship between prominent group fairness notions in machine learning, namely statistical parity, equalized odds, and predictive parity. It is well known that simultaneous satisfiability of these three fairness notions is usually impossible, motivating practitioners to resort to approximate fairness solutions rather than stringent satisfiability of these definitions. However, a comprehensive analysis of their interrelations, particularly when they are not exactly satisfied, remains largely unexplored. Our main contribution lies in elucidating an exact relationship between these three measures of (un)fairness by leveraging a body of work in information theory called partial information decomposition (PID). In this work, we leverage PID to identify the granular regions where these three measures of (un)fairness overlap and where they disagree with each other leading to potential tradeoffs. We also include numerical simulations to complement our results.

Submitted 6 June, 2024; originally announced June 2024.

Comments: Published as a conference paper at 2024 IEEE International Symposium on Information Theory (ISIT 2024)

arXiv:2406.02444 [pdf, other]

Noise-adapted qudit codes for amplitude-damping noise

Authors: Sourav Dutta, Debjyoti Biswas, Prabha Mandayam

Abstract: Quantum error correction (QEC) plays a critical role in preventing information loss in quantum systems and provides a framework for reliable quantum computation. Identifying quantum codes with nice code parameters for physically motivated noise models remains an interesting challenge. Going beyond qubit codes, here we propose a class of qudit error correcting codes tailored to protect against amplitude-damping noise. Specifically, we construct a class of four-qudit codes that satisfies the error correction conditions for all single-qudit and a few two-qudit damping errors up to the leading order in the damping parameter $γ$. We devise a protocol to extract syndromes that identify this set of errors unambiguously, leading to a noise-adapted recovery scheme that achieves a fidelity loss of $\mathcal{O}(γ^{2})$. For the $d=2$ case, our QEC scheme is identical to the known example of the $4$-qubit code and the associated syndrome-based recovery. We also assess the performance of our class of codes using the Petz recovery map and note some interesting deviations from the qubit case.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.19967 [pdf, other]

Improved Out-of-Scope Intent Classification with Dual Encoding and Threshold-based Re-Classification

Authors: Hossam M. Zawbaa, Wael Rashwan, Sourav Dutta, Haytham Assem

Abstract: Detecting out-of-scope user utterances is essential for task-oriented dialogues and intent classification. Current methodologies face difficulties with the unpredictable distribution of outliers and often rely on assumptions about data distributions. We present the Dual Encoder for Threshold-Based Re-Classification (DETER) to address these challenges. This end-to-end framework efficiently detects out-of-scope intents without requiring assumptions on data distributions or additional post-processing steps. The core of DETER utilizes dual text encoders, the Universal Sentence Encoder (USE) and the Transformer-based Denoising AutoEncoder (TSDAE), to generate user utterance embeddings, which are classified through a branched neural architecture. Further, DETER generates synthetic outliers using self-supervision and incorporates out-of-scope phrases from open-domain datasets. This approach ensures a comprehensive training set for out-of-scope detection. Additionally, a threshold-based re-classification mechanism refines the model's initial predictions. Evaluations on the CLINC-150, Stackoverflow, and Banking77 datasets demonstrate DETER's efficacy. Our model outperforms previous benchmarks, increasing up to 13% and 5% in F1 score for known and unknown intents on CLINC-150 and Stackoverflow, and 16% for known and 24% for unknown intents on Banking77. The source code has been released at https://github.com/Hossam-Mohammed-tech/Intent_Classification_OOS.

Submitted 31 May, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.17238 [pdf, other]

LLM-Assisted Static Analysis for Detecting Security Vulnerabilities

Authors: Ziyang Li, Saikat Dutta, Mayur Naik

Abstract: Software is prone to security vulnerabilities. Program analysis tools to detect them have limited effectiveness in practice. While large language models (or LLMs) have shown impressive code generation capabilities, they cannot do complex reasoning over code to detect such vulnerabilities, especially because this task requires whole-repository analysis. In this work, we propose IRIS, the first approach that systematically combines LLMs with static analysis to perform whole-repository reasoning to detect security vulnerabilities. We curate a new dataset, CWE-Bench-Java, comprising 120 manually validated security vulnerabilities in real-world Java projects. These projects are complex, with an average of 300,000 lines of code and a maximum of up to 7 million. Out of 120 vulnerabilities in CWE-Bench-Java, IRIS detects 69 using GPT-4, while the state-of-the-art static analysis tool only detects 27. Further, IRIS also significantly reduces the number of false alarms (by more than 80% in the best case).

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.15816 [pdf, other]

Riemannian Bilevel Optimization

Authors: Sanchayan Dutta, Xiang Cheng, Suvrit Sra

Abstract: We develop new algorithms for Riemannian bilevel optimization. We focus in particular on batch and stochastic gradient-based methods, with the explicit goal of avoiding second-order information such as Riemannian hyper-gradients. We propose and analyze $\mathrm{RF^2SA}$, a method that leverages first-order gradient information to navigate the complex geometry of Riemannian manifolds efficiently. Notably, $\mathrm{RF^2SA}$ is a single-loop algorithm, and thus easier to implement and use. Under various setups, including stochastic optimization, we provide explicit convergence rates for reaching $ε$-stationary points. We also address the challenge of optimizing over Riemannian manifolds with constraints by adjusting the multiplier in the Lagrangian, ensuring convergence to the desired solution without requiring access to second-order derivatives.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.15657 [pdf, other]

Multiple Emission Regions in Jets of Low Luminosity Active Galactic Nucleus in NGC 4278

Authors: Samik Dutta, Nayantara Gupta

Abstract: The Large High Altitude Air Shower Observatory (LHAASO) has detected very high energy gamma rays from the LINER galaxy NGC 4278, which has a low-luminosity active galactic nucleus and symmetric, mildly relativistic S-shaped twin jets detected by radio observations. Few low-luminosity active galactic nuclei are detected in gamma rays due to their faintness. Earlier, several radio-emitting components were detected in the jets of NGC 4278. We model their radio emission with synchrotron emission of ultra-relativistic electrons to estimate the strength of the magnetic field inside these components within a time-dependent framework after including the ages of the different components. We show that the synchrotron and synchrotron self-Compton emission by these components cannot explain the Swift X-ray data detected from NGC 4278 and the LHAASO gamma-ray data associated with NGC 4278. We suggest that a separate component in one of the jets is responsible for the high energy emission. Its age, size, magnetic field and the spectrum of the ultra-relativistic electrons inside it have been estimated by fitting the multi-wavelength data of NGC 4278 with the sum of the spectral energy distributions from the radio components and the high energy component.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.14683 [pdf]

Development of a Gaussian Approximation Potential to Study Structure and Thermodynamics of Nickel Nanoclusters

Authors: Suvo Banik, Partha Sarathi Dutta, Sukriti Manna, Subramanian KRS Sankaranarayanan

Abstract: Machine Learning (ML) potentials such as Gaussian Approximation Potential (GAP) have demonstrated impressive capabilities in mapping structure to properties across diverse systems. Here, we introduce a GAP model for low-dimensional Ni nanoclusters and demonstrate its flexibility and effectiveness in capturing the energetics, structural diversity and thermodynamic properties of Ni nanoclusters across a broad size range. Through a systematic approach encompassing model development, validation, and application, we evaluate the model's efficacy in representing energetics and configurational features in low-dimensional regimes, while also examining its extrapolative nature to vastly different spatiotemporal regimes. Our analysis and discussion shed light on the data quality required to effectively train such models. Trajectories from large-scale MD simulations using the GAP model, analyzed with data-driven models like Graph Neural Networks (GNN), reveal intriguing insights into the size-dependent phase behavior and thermo-mechanical stability characteristics of porous Ni nanoparticles. Overall, our work underscores the potential of ML models, which, coupled with data-driven approaches, serve as versatile tools for studying low-dimensional systems and complex material dynamics.

Submitted 26 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.12226 [pdf]

Quantum inspired approach for denoising with application to medical imaging

Authors: Amirreza Hashemi, Sayantan Dutta, Bertrand Georgeot, Denis Kouame, Hamid Sabet

Abstract: Background noise in many fields such as medical imaging poses significant challenges for accurate diagnosis, prompting the development of denoising algorithms. Traditional methodologies, however, often struggle to address the complexities of noisy environments in high dimensional imaging systems. This paper introduces a novel quantum-inspired approach for image denoising, drawing upon principles of quantum and condensed matter physics. Our approach views medical images as amorphous structures akin to those found in condensed matter physics and we propose an algorithm that incorporates the concept of mode resolved localization directly into the denoising process. Notably, our approach eliminates the need for hyperparameter tuning. The proposed method is a standalone algorithm with minimal manual intervention, demonstrating its potential to use quantum-based techniques in classical signal denoising. Through numerical validation, we showcase the effectiveness of our approach in addressing noise-related challenges in imaging and especially medical imaging, underscoring its relevance for possible quantum computing applications.

Submitted 17 June, 2024; v1 submitted 22 April, 2024; originally announced May 2024.

arXiv:2405.10548 [pdf, other]

Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks

Authors: Anwoy Chatterjee, Eshaan Tanwar, Subhabrata Dutta, Tanmoy Chakraborty

Abstract: Large Language Models (LLMs) have transformed NLP with their remarkable In-context Learning (ICL) capabilities. Automated assistants based on LLMs are gaining popularity; however, adapting them to novel tasks is still challenging. While colossal models excel in zero-shot performance, their computational demands limit widespread use, and smaller language models struggle without context. This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. Drawing inspiration from biological neurons and the mechanistic interpretation of the Transformer architecture, we explore the potential for information sharing across tasks. We design a cross-task prompting setup with three LLMs and show that LLMs achieve significant performance improvements despite no examples from the target task in the context. Cross-task prompting leads to a remarkable performance boost of 107% for LLaMA-2 7B, 18.6% for LLaMA-2 13B, and 3.2% for GPT 3.5 on average over zero-shot prompting, and performs comparably to standard in-context learning. The effectiveness of generating pseudo-labels for in-task examples is demonstrated, and our analyses reveal a strong correlation between the effect of cross-task examples and model activation similarities in source and target input tokens. This paper offers a first-of-its-kind exploration of LLMs' ability to solve novel tasks based on contextual signals from different task examples.

Submitted 12 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted at ACL 2024 Main

arXiv:2405.09704 [pdf, other]

Neutron and $\boldsymbol{γ}$-ray Discrimination by a Pressurized Helium-4 Based Scintillation Detector

Authors: Shubham Dutta, Sayan Ghosh, Satyajit Saha

Abstract: A pressurized helium-4 based fast neutron scintillation detector offers a useful alternative to an organic liquid-based scintillator due to its relatively low response to $γ$-rays compared to the latter type of scintillator. In the present work, we have investigated the capabilities of a pressurized $^4$He (PHe) detector for the detection of fast neutrons in a mixed radiation field where both neutrons and $γ$-rays are present. Discrimination between neutrons and $γ$-rays is achieved by using the fast-slow charge integration method. We have also conducted systematic studies of the attenuation of fast neutrons and $γ$-rays by high-density polyethylene (HDPE). These studies are further corroborated by simulation analyses conducted using GEANT4, which show qualitative agreement with the experimental results. Additionally, the simulation provides detailed insights into the interactions of the radiation quanta with the PHe detector. Estimates of the scintillation signal yield are made based on our GEANT4 simulation results by considering the scintillation mechanism in the PHe gas.

Submitted 1 July, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.08866 [pdf, other]

On the quantum origin of limit cycles, fixed points, and critical slowing down

Authors: Shovan Dutta, Shu Zhang, Masudul Haque

Abstract: Among the most iconic features of classical dissipative dynamics are persistent limit-cycle oscillations and critical slowing down at the onset of such oscillations, where the system relaxes purely algebraically in time. On the other hand, quantum systems subject to generic Markovian dissipation decohere exponentially in time, approaching a unique steady state. Here we show how coherent limit-cycle oscillations and algebraic decay can emerge in a quantum system governed by a Markovian master equation as one approaches the classical limit, illustrating general mechanisms using a single-spin model and a two-site lossy Bose-Hubbard model. In particular, we demonstrate that the fingerprint of a limit cycle is a slow-decaying branch with vanishing decoherence rates in the Liouville spectrum, while a power-law decay is realized by a spectral collapse at the bifurcation point. We also show how these are distinct from the case of a classical fixed point, for which the quantum spectrum is gapped and can be generated from the linearized classical dynamics.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 4 pages, 6 figures + Supplement

arXiv:2405.05813 [pdf]

Revitalising Stagecraft: NLP-Driven Sentiment Analysis for Traditional Theater Revival

Authors: Saikat Samanta, Saptarshi Karmakar, Satayajay Behuria, Shibam Dutta, Soujit Das, Soumik Saha

Abstract: This paper explores the application of FilmFrenzy, a Python-based ticket booking web application, in the revival of traditional Indian theatres. Additionally, this research paper explores how NLP can be implemented to improve user experience. Through clarifying audience views and pinpointing opportunities for development, FilmFrenzy aims to promote involvement and rejuvenation in India's conventional theatre scene. The platform seeks to maintain the relevance and vitality of conventional theatres by bridging the gap between them and their audiences through the incorporation of contemporary technologies, especially NLP. This research envisions a future in which technology plays a crucial part in maintaining India's rich theatrical traditions, thereby contributing to the preservation and development of cultural heritage. With sentiment analysis and natural language processing (NLP) as essential instruments for improving stagecraft, the research envisions a period when traditional theatre will still be vibrant.

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05655 [pdf, other]

Sheet model description of spatio-temporal evolution of upper-hybrid oscillations in an inhomogeneous magnetic field

Authors: Nidhi Rathee, Someswar Dutta, R. Srinivasan, Sudip Sengupta

Abstract: Spatio-temporal evolution of large amplitude upper hybrid oscillations in a cold homogeneous plasma in the presence of an inhomogeneous magnetic field is studied analytically and numerically using the Dawson sheet model. It is observed that the inhomogeneity in the magnetic field, which causes the upper hybrid frequency to acquire a spatial dependence, results in phase mixing and subsequent breaking of the upper hybrid oscillations at arbitrarily low amplitudes. This result is in sharp contrast to the usual upper hybrid oscillations in a homogeneous magnetic field, where the oscillations break within a fraction of a period when the amplitude exceeds a certain critical value. Our perturbative calculations show that the phase mixing (wave breaking) time scales inversely with the amplitude of magnetic field inhomogeneity ($Δ$) and amplitude of imposed density perturbation ($δ$), and scales directly with the ratio of magnetic field inhomogeneity scale length to imposed density perturbation scale length ($(α/k_L)^{-1}$) as $ω_{pe}τ_{mix} \sim \left( 1+β^2 \right)^{3/2} k_L/(β^2δΔα)$, where $β$ is the ratio of electron cyclotron frequency to electron plasma frequency. Further, the phase mixing time measured in simulations, performed using a 1-1/2 D code based on the Dawson sheet model, shows good agreement with the above-mentioned scaling. This result may be of relevance to plasma-based particle acceleration experiments in the presence of a transverse inhomogeneous magnetic field.

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.05369 [pdf, other]

Model Reconstruction Using Counterfactual Explanations: Mitigating the Decision Boundary Shift

Authors: Pasan Dissanayake, Sanghamitra Dutta

Abstract: Counterfactual explanations find ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be exploited to steal the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we investigate model extraction by specifically leveraging the fact that the counterfactual explanations also lie quite close to the decision boundary. We propose a novel strategy for model extraction that we call Counterfactual Clamping Attack (CCA), which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances. Our approach also alleviates the related problem of decision boundary shift that arises in existing model extraction attacks which treat counterfactuals as ordinary instances. We also derive novel mathematical relationships between the error in model approximation and the number of queries using polytope theory. Experimental results demonstrate that our strategy provides improved fidelity between the target and surrogate model predictions on several real-world datasets.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.02056 [pdf, ps, other]

Some properties of the Zero-set Intersection graph of $C(X)$ and its Line graph

Authors: Yangersenba T Jamir, S Dutta

Abstract: Let $C(X)$ be the ring of all continuous real-valued functions defined on a completely regular Hausdorff topological space $X$. The zero-set intersection graph $Γ(C(X))$ of $C(X)$ is a simple graph whose vertex set is the set of all non-units of $C(X)$, and two vertices are adjacent if the intersection of the zero sets of the functions is non-empty. In this paper, we study the zero-set intersection graph of $C(X)$ and its line graph. We show that if $X$ has more than two points, then these graphs are connected with diameter and radius 2. We show that the girth of the graph is 3 and the graphs are both triangulated and hypertriangulated. We find the domination number of these graphs and prove that $C(X)$ is a von Neumann regular ring if and only if $C(X)$ is an almost regular ring and for all $f \in V(Γ(C(X)))$ there exists $g \in V(Γ(C(X)))$ such that $Z(f) \cap Z(g) = \emptyset$ and $\{f, g\}$ dominates $Γ(C(X))$. Finally, we derive some properties of the line graph of $Γ(C(X))$.

Submitted 3 May, 2024; originally announced May 2024.

MSC Class: 54C40; 05C69

arXiv:2405.01040 [pdf, other]

Few Shot Class Incremental Learning using Vision-Language models

Authors: Anurag Kumar, Chinmay Bharti, Saikat Dutta, Srikrishna Karanam, Biplab Banerjee

Abstract: Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The challenge emerges in seamlessly integrating new classes with few samples into the training data, demanding the model to adeptly accommodate these additions without compromising its performance on base classes. To address this exigency, the research community has introduced several solutions under the realm of few-shot class incremental learning (FSCIL). In this study, we introduce an innovative FSCIL framework that utilizes a language regularizer and a subspace regularizer. During base training, the language regularizer helps incorporate semantic information extracted from a Vision-Language model. The subspace regularizer helps the model acquire the nuanced connections between image and text semantics inherent to base classes during incremental training. Our proposed framework not only empowers the model to embrace novel classes with limited data, but also ensures the preservation of performance on base classes. To substantiate the efficacy of our approach, we conduct comprehensive experiments on three distinct FSCIL benchmarks, where our framework attains state-of-the-art performance.

Submitted 2 May, 2024; originally announced May 2024.

Comments: under review at Pattern Recognition Letters

arXiv:2404.14770 [pdf, other]

Discrete-Time Open Quantum Walks for Vertex Ranking in Graphs

Authors: Supriyo Dutta

Abstract: This article draws on the idea of applying the Weyl operators to produce the Kraus operators, which are crucial in the discrete-time open quantum walk. This helps us extend the idea of the discrete-time open quantum walk to arbitrary directed and undirected graphs. We use the new model of quantum walk to build a quantum PageRank algorithm. In classical computation, Google's PageRank is a significant algorithm for arranging web pages on the World Wide Web. In general, it is also a fundamental measure for quantifying the importance of vertices in a network. Similarly, the new quantum PageRank also represents the importance of the vertices of a network. We can compute the new quantum PageRank in polynomial time using a classical computer. We compare the classical PageRank and the newly defined quantum PageRank for different types of complex networks, such as the scale-free network, Erdos-Renyi random network, Watts-Strogatz network, spatial network, Zachary Karate club network, random-k-out graph, binary tree graph, GNC network, Barabasi and Albert network, etc.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.11421 [pdf, ps, other]

Excitonic circular dichroism in boron-nitrogen clusters decorated graphene

Authors: Shneha Biswas, Souren Adhikary, Sudipta Dutta

Abstract: Using first-principles calculations, we propose a boron and nitrogen cluster incorporated graphene system for efficient valley polarization. The broken spatial inversion symmetry results in high Berry curvature at the K and K' valleys of the hexagonal Brillouin zone in this semiconducting system. The consideration of excitonic quasiparticles within the GW approximation, along with their scattering processes within the many-body Bethe-Salpeter equation, gives rise to an optical gap of 1.72 eV with an excitonic binding energy of 0.65 eV. Owing to the negligible intervalley scattering, the electrons in opposite valleys are selectively excited by left- and right-handed circularly polarized light, as evident from the oscillator strength calculations. Therefore, this system can exhibit the circular-dichroism valley Hall effect in the presence of an in-plane electric field. Moreover, such excitonic qubits can be exploited for information processing.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 15 pages, 4 figures

arXiv:2404.05581 [pdf, other]

Design and Simulation of Time-energy Optimal Anti-swing Trajectory Planner for Autonomous Tower Cranes

Authors: Souravik Dutta, Yiyu Cai

Abstract: For autonomous crane lifting, optimal trajectories of the crane are required as reference inputs to the crane controller to facilitate feedforward control. Reducing the unactuated payload motion is a crucial issue for under-actuated tower cranes with spherical pendulum dynamics. The planned trajectory should be optimal in terms of both operating time and energy consumption, so that optimum output is achieved with optimum effort. This article proposes an anti-swing tower crane trajectory planner that can provide time-energy optimal solutions for the Computer-Aided Lift Planning (CALP) system developed at Nanyang Technological University, which facilitates collision-free lifting path planning of robotized tower cranes in autonomous construction sites. The current work introduces a trajectory planning module to the system that utilizes the geometric outputs from the path planning module and optimally scales them with time information. Firstly, by analyzing the non-linear dynamics of the crane operations, the tower crane is established as differentially flat. Subsequently, the multi-objective trajectory optimization problems for all the crane operations are formulated in the flat output space through consideration of the mechanical and safety constraints. Two multi-objective evolutionary algorithms, namely Non-dominated Sorting Genetic Algorithm (NSGA-II) and Generalized Differential Evolution 3 (GDE3), are extensively compared via statistical measures based on the closeness of solutions to the Pareto front, distribution of solutions in the solution space and the runtime, to select the optimization engine of the planner. Finally, the crane operation trajectories are obtained via the corresponding planned flat output trajectories. Studies simulating real-world lifting scenarios are conducted to verify the effectiveness and reliability of the proposed module of the lift planning system.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 18 pages, 12 figures, 9 tables

arXiv:2404.05159 [pdf]

Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods

Authors: Roopkatha Dey, Aivy Debnath, Sayak Kumar Dutta, Kaustav Ghosh, Arijit Mitra, Arghya Roy Chowdhury, Jaydip Sen

Abstract: In various real-world applications such as machine translation, sentiment analysis, and question answering, a pivotal role is played by NLP models, facilitating efficient communication and decision-making processes in domains ranging from healthcare to finance. However, a significant challenge is posed to the robustness of these natural language processing models by text adversarial attacks. These attacks involve the deliberate manipulation of input text to mislead the predictions of the model while maintaining human interpretability. Despite the remarkable performance achieved by state-of-the-art models like BERT in various natural language processing tasks, they are found to remain vulnerable to adversarial perturbations in the input text. In addressing the vulnerability of text classifiers to adversarial attacks, three distinct attack mechanisms are explored in this paper using the victim model BERT: BERT-on-BERT attack, PWWS attack, and Fraud Bargain's Attack (FBA). Leveraging the IMDB, AG News, and SST2 datasets, a thorough comparative analysis is conducted to assess the effectiveness of these attacks on the BERT classifier model. It is revealed by the analysis that PWWS emerges as the most potent adversary, consistently outperforming other methods across multiple evaluation scenarios, thereby emphasizing its efficacy in generating adversarial examples for text classification.
Through comprehensive experimentation, the performance of these attacks is assessed and the findings indicate that the PWWS attack outperforms others, demonstrating lower runtime, higher accuracy, and favorable semantic similarity scores. The key insight of this paper lies in the assessment of the relative performances of three prevalent state-of-the-art attack mechanisms.

Submitted 7 April, 2024; originally announced April 2024.

Comments: This report pertains to the Capstone Project done by Group 2 of the Fall batch of 2023 students at Praxis Tech School, Kolkata, India. The report consists of 28 pages and includes 10 tables. This is the preprint that will be submitted to IEEE CONIT 2024 for review.

arXiv:2404.03877 [pdf, other]

Beyond the Bridge: Contention-Based Covert and Side Channel Attacks on Multi-GPU Interconnect

Authors: Yicheng Zhang, Ravan Nazaraliyev, Sankha Baran Dutta, Nael Abu-Ghazaleh, Andres Marquez, Kevin Barker

Abstract: High-speed interconnects, such as NVLink, are integral to modern multi-GPU systems, acting as a vital link between CPUs and GPUs. This study highlights the vulnerability of multi-GPU systems to covert and side channel attacks due to congestion on interconnects. An adversary can infer private information about a victim's activities by monitoring NVLink congestion without needing special permissions. Leveraging this insight, we develop a covert channel attack across two GPUs with a bandwidth of 45.5 kbps and a low error rate, and introduce a side channel attack enabling attackers to fingerprint applications through the shared NVLink interconnect.

Submitted 2 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Accepted to SEED 2024

arXiv:2404.03252 [pdf, other]

Precision tests of bulk entanglement entropy

Authors: Barsha G. Chowdhury, Justin R. David, Semanti Dutta, Jyotirmoy Mukherjee

Abstract: We consider linear superpositions of single particle excitations in a scalar field theory on $AdS_3$ and evaluate their contribution to the bulk entanglement entropy across the Ryu-Takayanagi surface. We compare the entanglement entropy of these excitations obtained using the Faulkner-Lewkowycz-Maldacena formula to the entanglement entropy of linear superpositions of global descendants of a conformal primary in a large $c$ CFT obtained using the replica trick. We show that the closed-form expressions for the entanglement entropy in the small interval expansion, both in gravity and in the CFT, precisely agree. The agreement serves as a non-trivial check of the FLM formula for the quantum corrections to holographic entropy, which also involves a contribution from the back-reacted minimal area. Our checks include an example in which the state is time dependent and spatially inhomogeneous, as well as another example involving a coherent state with a Bañados geometry as its holographic dual.

Submitted 1 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 85 pages, 9 figures, 2 tables, typos corrected, references added

arXiv:2404.02255 [pdf, other]

$\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning

Authors: Gurusha Juneja, Subhabrata Dutta, Tanmoy Chakraborty

Abstract: Despite demonstrating emergent reasoning abilities, Large Language Models (LLMs) often lose track of complex, multi-step reasoning. Existing studies show that providing guidance via decomposing the original question into multiple subproblems elicits more robustness in LLM reasoning -- a decomposer generates the subproblems, and a solver solves each of these subproblems. However, these techniques fail to accommodate coordination between the decomposer and the solver modules (either in a single model or different specialized ones) -- the decomposer does not keep track of the ability of the solver to follow the decomposed reasoning. In this paper, we propose LM2 to address these challenges. LM2 modularizes the decomposition, solution, and verification into three different language models. The decomposer module identifies the key concepts necessary to solve the problem and generates step-by-step subquestions according to the reasoning requirement. The solver model generates the solution to the subproblems that are then checked by the verifier module; depending upon the feedback from the verifier, the reasoning context is constructed using the subproblems and the solutions. These models are trained to coordinate using policy learning. Exhaustive experimentation suggests the superiority of LM2 over existing methods on in- and out-domain reasoning problems, outperforming the best baselines by $8.1\%$ on MATH, $7.71\%$ on JEEBench, and $9.7\%$ on MedQA problems (code available at https://github.com/LCS2-IIITD/Language_Model_Multiplex).

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.16328 [pdf, other]

Uniform-over-dimension convergence with application to location tests for high-dimensional data

Authors: Joydeep Chowdhury, Subhajit Dutta, Marc G. Genton

Abstract: Asymptotic methods for hypothesis testing in high-dimensional data usually require the dimension of the observations to increase to infinity, often with an additional condition on its rate of increase compared to the sample size. On the other hand, multivariate asymptotic methods are valid for fixed dimension only, and their practical implementations in hypothesis testing methodology typically require the sample size to be large compared to the dimension for yielding desirable results. However, in practical scenarios, it is usually not possible to determine whether the dimension of the data at hand conforms to the conditions required for the validity of the high-dimensional asymptotic methods, or whether the sample size is large enough compared to the dimension of the data. In this work, a theory of asymptotic convergence is proposed, which holds uniformly over the dimension of the random vectors. This theory attempts to unify the asymptotic results for fixed-dimensional multivariate data and high-dimensional data, and accounts for the effect of the dimension of the data on the performance of the hypothesis testing procedures. The methodology developed based on this asymptotic theory can be applied to data of any dimension. An application of this theory is demonstrated in the two-sample test for the equality of locations. The test statistic proposed is unscaled by the sample covariance, similar to usual tests for high-dimensional data.
Using simulated examples, it is demonstrated that the proposed test exhibits better performance compared to several popular tests in the literature for high-dimensional data. Further, it is demonstrated in simulated models that the proposed unscaled test performs better than the usual scaled two-sample tests for multivariate data, including Hotelling's $T^2$ test for multivariate Gaussian data.

Submitted 24 March, 2024; originally announced March 2024.

MSC Class: Primary 62E20; secondary 62H15

arXiv:2403.15731 [pdf, ps, other]

Spiral, Core-defect and Wave break in a modified Oregonator Model

Authors: Parvej Khan, Sumana Dutta

Abstract: Target waves and spiral waves were discovered in the Belousov-Zhabotinsky (BZ) reaction around 50 years ago. Inwardly rotating spiral waves, also called anti-spirals, were found in a BZ-AOT system about two decades ago. Many biological systems demonstrate both spirals and anti-spirals. In the glycolytic activity of yeast, anti-spirals are observed, whereas spiral waves are widely studied in the context of broken cardiac waves. In the cardiac system, only outwardly rotating normal spirals are reported. In the context of the cardiac waves, the spirals and scrolls are widely studied in BZ reactions for experimental study. On the other hand, the Oregonator model is considered for the numerical simulation in correlation to the BZ reaction system. In this study, we slightly modified the Oregonator model using the same FKN mechanism, and we present a four-variable model that can be called a modified Oregonator system, in which we observe the presence of the spiral, spiral defect, and spiral breakup by tuning the parameters.

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2403.14117 [pdf, other]

doi 10.1145/3613904.3642697

A Design Space for Intelligent and Interactive Writing Assistants

Authors: Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Simon Buckingham Shum, Vipul Raheja, Hua Shen, Subhashini Venugopalan, Thiemo Wambsganss, David Zhou, Emad A. Alghamdi, Tal August, Avinash Bhat, Madiha Zahrah Choksi, Senjuti Dutta, Jin L. C. Guo, Md Naimul Hoque, Yewon Kim, Simon Knight, Seyed Parsa Neshaei, Agnia Sergeyuk, Antonette Shibani, Disha Shrivastava, Lila Shroff, Jessi Stark, Sarah Sterman, et al. (11 additional authors not shown)

Abstract: In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.

Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: Published as a conference paper at CHI 2024

arXiv:2403.09116 [pdf, other]

Frustrated Quantum Magnetism on Complex Networks: What Sets the Total Spin

Authors: G Preethi, Shovan Dutta

Abstract: Consider equal antiferromagnetic Heisenberg interactions between qubits sitting at the nodes of a complex, nonbipartite network. We ask the question: How does the network topology determine the net magnetization of the ground state and to what extent is it tunable? By examining various network families with tunable properties, we demonstrate that (i) graph heterogeneity, i.e., spread in the number of neighbors, is essential for a nonzero total spin, and (ii) other than the average number of neighbors, the key structure governing the total spin is the presence of (disassortative) hubs, as opposed to the level of frustration. We also show how to construct simple networks where the magnetization can be tuned over its entire range across both abrupt and continuous transitions, which may be realizable on existing platforms. Our findings pose a number of fundamental questions and strongly motivate wider exploration of quantum many-body phenomena beyond regular lattices.

Submitted 21 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: 5 pages, 6 figures + supplement + notebook with data for figures

arXiv:2403.08880 [pdf, other]

doi 10.1145/3600211.3604706

REFRESH: Responsible and Efficient Feature Reselection Guided by SHAP Values

Authors: Shubham Sharma, Sanghamitra Dutta, Emanuele Albini, Freddy Lecue, Daniele Magazzeni, Manuela Veloso

Abstract: Feature selection is a crucial step in building machine learning models. This process is often achieved with accuracy as an objective, and can be cumbersome and computationally expensive for large-scale datasets. Several additional model performance characteristics such as fairness and robustness are of importance for model development. As regulations are driving the need for more trustworthy models, deployed models need to be corrected for model characteristics associated with responsible artificial intelligence. When feature selection is done with respect to one model performance characteristic (e.g., accuracy), feature selection with secondary model performance characteristics (e.g., fairness and robustness) as objectives would require going through the computationally expensive selection process from scratch. In this paper, we introduce the problem of feature \emph{reselection}, so that features can be selected with respect to secondary model performance characteristics efficiently even after a feature selection process has been done with respect to a primary objective. To address this problem, we propose REFRESH, a method to reselect features so that additional constraints that are desirable towards model performance can be achieved without having to train several new models. REFRESH's underlying algorithm is a novel technique using SHAP values and correlation analysis that can approximate the predictions of a model without having to train these models.
Empirical evaluations on three datasets, including a large-scale loan defaulting dataset, show that REFRESH can help find alternate models with better model characteristics efficiently. We also discuss the need for reselection and REFRESH based on regulation desiderata.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.05576 [pdf]

Understanding Subjectivity through the Lens of Motivational Context in Model-Generated Image Satisfaction

Authors: Senjuti Dutta, Sherol Chen, Sunny Mak, Amnah Ahmad, Katherine Collins, Alena Butryna, Deepak Ramachandran, Krishnamurthy Dvijotham, Ellie Pavlick, Ravi Rajakumar

Abstract: Image generation models are poised to become ubiquitous in a range of applications. These models are often fine-tuned and evaluated using human quality judgments that assume a universal standard, failing to consider the subjectivity of such tasks. To investigate how to quantify subjectivity, and the scale of its impact, we measure how assessments differ among human annotators across different use cases. Simulating the effects of ordinarily latent elements of annotators' subjectivity, we contrive a set of motivations (t-shirt graphics, presentation visuals, and phone background images) to contextualize a set of crowdsourcing tasks. Our results show that human evaluations of images vary within individual contexts and across combinations of contexts. Three key factors affecting this subjectivity are image appearance, image alignment with text, and representation of objects mentioned in the text. Our study highlights the importance of taking individual users and contexts into account, both when building and evaluating generative models.

Submitted 26 February, 2024; originally announced March 2024.

arXiv:2403.03036 [pdf, ps, other]

Superconductivity in Ca-intercalated bilayer silicene

Authors: Jisvin Sam, Sasmita Mohakud, Katsunori Wakabayashi, Sudipta Dutta

Abstract: Within first-principles calculations, we explore superconductivity in the Ca-intercalated bilayer silicene compound, Si2CaSi2. This arises from the coupling of the interlayer flower-like Γ-centered Fermi surface, formed by the hybridization of Ca-3d and Si-3pz orbitals, with low-energy out-of-plane vibrations enabled by silicene's buckling. The consequent large electron-phonon coupling, as evident from the Eliashberg spectral function, leads to superconductivity below 5.4 K in this two-dimensional covalent system. Our results reveal the key control parameters to achieve superconductivity in experimentally synthesizable silicon-based thin materials that can find diverse applications.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 6 pages, 3 figures

arXiv:2402.19180 [pdf, other]

ModZoo: A Large-Scale Study of Modded Android Apps and their Markets

Authors: Luis A. Saavedra, Hridoy S. Dutta, Alastair R. Beresford, Alice Hutchings

Abstract: We present the results of the first large-scale study into Android markets that offer modified or modded apps: apps whose features and functionality have been altered by a third party. We analyse over 146k (thousand) apps obtained from 13 of the most popular modded app markets. Around 90% of apps we collect are altered in some way when compared to the official counterparts on Google Play. Modifications include game cheats, such as infinite coins or lives; mainstream apps with premium features provided for free; and apps with modified advertising identifiers or excluded ads. We find the original app developers lose significant potential revenue due to: the provision of paid-for apps for free (around 5% of the apps across all markets); the free availability of premium features that require payment in the official app; and modified advertising identifiers. While some modded apps have all trackers and ads removed (3%), in general, the installation of these apps is significantly more risky for the user than the official version: modded apps are ten times more likely to be marked as malicious and often request additional permissions.

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.18312 [pdf, other]

How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning

Authors: Subhabrata Dutta, Joykirat Singh, Soumen Chakrabarti, Tanmoy Chakraborty

Abstract: Despite superior reasoning prowess demonstrated by Large Language Models (LLMs) with Chain-of-Thought (CoT) prompting, a lack of understanding prevails around the internal mechanisms of the models that facilitate CoT generation. This work investigates the neural sub-structures within LLMs that manifest CoT reasoning from a mechanistic point of view. From an analysis of Llama-2 7B applied to multistep reasoning over fictional ontologies, we demonstrate that LLMs deploy multiple parallel pathways of answer generation for step-by-step reasoning. These parallel pathways provide sequential answers from the input question context as well as the generated CoT. We observe a functional rift in the middle layers of the LLM. Token representations in the initial half remain strongly biased towards the pretraining prior, with the in-context prior taking over in the later half. This internal phase shift manifests in different functional components: attention heads that write the answer token appear in the later half, attention heads that move information along ontological relationships appear in the initial half, and so on. To the best of our knowledge, this is the first attempt towards mechanistic investigation of CoT reasoning in LLMs.

Submitted 6 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.16512 [pdf, other]

Periodically driven thermodynamic systems under vanishingly small viscous drives

Authors: Shakul Awasthi, Sreedhar B. Dutta

Abstract: Periodically driven thermodynamic systems support stable non-equilibrium oscillating states with properties drastically different from equilibrium. They exhibit even more exotic features for low viscous drives, which is a regime that is hard to probe due to the singular behavior of the underlying Langevin dynamics near vanishing viscosity. We propose a method, based on singular perturbation and Floquet theories, that allows us to obtain oscillating states in this limit. We then find two distinct classes of distributions, each exhibiting interesting features that can be exploited for a range of practical applications, including cooling a system and triggering chemical reactions through weakly interacting driven environments.

Submitted 26 February, 2024; originally announced February 2024.

Comments: 6+13 pages, 2+4 figures

arXiv:2402.00689 [pdf, other]

Ocassionally Secure: A Comparative Analysis of Code Generation Assistants

Authors: Ran Elgedawy, John Sadik, Senjuti Dutta, Anuj Gautam, Konstantinos Georgiou, Farzin Gholamrezae, Fujiao Ji, Kyungchan Lim, Qian Liu, Scott Ruoti

Abstract: Large Language Models (LLMs) are being increasingly utilized in various applications, with code generation being a notable example. While previous research has shown that LLMs have the capability to generate both secure and insecure code, the literature does not take into account what factors help generate secure and effective code. Therefore, in this paper, we focus on identifying and understanding the conditions and contexts in which LLMs can be effectively and safely deployed in real-world scenarios to generate quality code. We conducted a comparative analysis of four advanced LLMs--GPT-3.5 and GPT-4 using ChatGPT and Bard and Gemini from Google--using 9 separate tasks to assess each model's code generation capabilities. We contextualized our study to represent the typical use cases of a real-life developer employing LLMs for everyday tasks at work. Additionally, we place an emphasis on security awareness, which is represented through the use of two distinct versions of our developer persona. In total, we collected 61 code outputs and analyzed them across several aspects: functionality, security, performance, complexity, and reliability. These insights are crucial for understanding the models' capabilities and limitations, guiding future development and practical applications in the field of automated code generation.

Submitted 1 February, 2024; originally announced February 2024.

Comments: 12 pages, 2 figures

Showing 1–50 of 823 results for author: Dutta, S