-
Subordination Involving Gauss Hypergeometric Function
Authors:
Anish Kumar,
Sourav Das
Abstract:
The primary objective of this work is to obtain some sufficient conditions so that the normalized Gauss hypergeometric function satisfies exponential starlikeness and convexity in the unit disk. Moreover, conditions on the parameters of this function have been derived for Janowski convexity and starlikeness. Furthermore, some sufficient conditions are also obtained so that the Gauss hypergeometric function possesses lemniscate starlikeness and convexity. Results established in this work are presumably new and their significance is illustrated by several consequences, graphical representations and examples.
Submitted 3 May, 2024;
originally announced May 2024.
-
Sharp Nonuniqueness in the Transport Equation with Sobolev Velocity Field
Authors:
Elia Bruè,
Maria Colombo,
Anuj Kumar
Abstract:
Given a divergence-free vector field ${\bf u} \in L^\infty_t W^{1,p}_x(\mathbb R^d)$ and a nonnegative initial datum $\rho_0 \in L^r$, the celebrated DiPerna--Lions theory established the uniqueness of the weak solution in the class of $L^\infty_t L^r_x$ densities for $\frac{1}{p} + \frac{1}{r} \leq 1$. This range was later improved in [BCDL21] to $\frac{1}{p} + \frac{d-1}{dr} \leq 1$. We prove that this range is sharp by providing a counterexample to uniqueness when $\frac{1}{p} + \frac{d-1}{dr} > 1$.
To this end, we introduce a novel flow mechanism. It is not based on convex integration, which has provided a non-optimal result in this context, nor on purely self-similar techniques, but shares features of both, such as a local (discrete) self-similar nature and an intermittent space-frequency localization.
Submitted 2 May, 2024;
originally announced May 2024.
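The three exponent ranges quoted in the abstract above can be compared with a small helper. This is only an illustrative sketch: the function name `uniqueness_regime` and its return labels are hypothetical, and it checks nothing beyond the two inequalities stated in the abstract (any further hypotheses of the theorems are ignored).

```python
from fractions import Fraction


def uniqueness_regime(p, r, d):
    """Classify (p, r) in dimension d using only the inequalities in the abstract.

    Hypothetical helper: p and r are integrability exponents, d is the
    space dimension. Exact rational arithmetic avoids rounding at equality.
    """
    p, r = Fraction(p), Fraction(r)
    # Classical DiPerna--Lions range: 1/p + 1/r <= 1.
    if 1 / p + 1 / r <= 1:
        return "unique (DiPerna-Lions range)"
    # Improved range attributed to [BCDL21]: 1/p + (d-1)/(d r) <= 1.
    if 1 / p + Fraction(d - 1, d) / r <= 1:
        return "unique (extended range of [BCDL21])"
    # Complementary range, where the abstract announces a counterexample.
    return "nonunique (counterexample range)"
```

For instance, with d = 2 and p = 2, the exponent r = 3/2 falls outside the classical range but inside the extended one.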
-
A Semi-Formal Verification Methodology for Efficient Configuration Coverage of Highly Configurable Digital Designs
Authors:
Aman Kumar,
Sebastian Simon
Abstract:
Nowadays, a majority of System-on-Chips (SoCs) make use of Intellectual Property (IP) in order to shorten development cycles. When such IPs are developed, one of the main focuses lies in the high configurability of the design. This flexibility on the design side introduces the challenge of covering a huge state space of IP configurations on the verification side to ensure the functional correctness under every possible parameter setting. The vast number of possibilities does not allow a brute-force approach, and therefore, only a selected number of settings based on typical and extreme assumptions are usually verified. Especially in automotive applications, which need to follow the ISO 26262 functional safety standard, the requirement of covering all significant variants needs to be fulfilled in any case. State-of-the-art verification techniques such as simulation-based verification and formal verification face challenges such as time-space explosion and state-space explosion, respectively, and therefore lag behind in verifying highly configurable digital designs efficiently. This paper is focused on a semi-formal verification methodology for efficient configuration coverage of highly configurable digital designs. The methodology focuses on reduced runtime based on simulative and formal methods that allow high configuration coverage. The paper also presents the results when the developed methodology was applied to a highly configurable microprocessor IP and discusses the gained benefits.
Submitted 20 April, 2024;
originally announced May 2024.
-
Few Shot Class Incremental Learning using Vision-Language models
Authors:
Anurag Kumar,
Chinmay Bharti,
Saikat Dutta,
Srikrishna Karanam,
Biplab Banerjee
Abstract:
Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The challenge emerges in seamlessly integrating new classes with few samples into the training data, demanding the model to adeptly accommodate these additions without compromising its performance on base classes. To address this exigency, the research community has introduced several solutions under the realm of few-shot class incremental learning (FSCIL).
In this study, we introduce an innovative FSCIL framework that utilizes a language regularizer and a subspace regularizer. During base training, the language regularizer helps incorporate semantic information extracted from a Vision-Language model. The subspace regularizer helps the model acquire the nuanced connections between image and text semantics inherent to base classes during incremental training. Our proposed framework not only empowers the model to embrace novel classes with limited data, but also ensures the preservation of performance on base classes. To substantiate the efficacy of our approach, we conduct comprehensive experiments on three distinct FSCIL benchmarks, where our framework attains state-of-the-art performance.
Submitted 2 May, 2024;
originally announced May 2024.
-
A Flexible 2.5D Medical Image Segmentation Approach with In-Slice and Cross-Slice Attention
Authors:
Amarjeet Kumar,
Hongxu Jiang,
Muhammad Imran,
Cyndi Valdes,
Gabriela Leon,
Dahyun Kang,
Parvathi Nataraj,
Yuyin Zhou,
Michael D. Weiss,
Wei Shao
Abstract:
Deep learning has become the de facto method for medical image segmentation, with 3D segmentation models excelling in capturing complex 3D structures and 2D models offering high computational efficiency. However, segmenting 2.5D images, which have high in-plane but low through-plane resolution, is a relatively unexplored challenge. While applying 2D models to individual slices of a 2.5D image is feasible, it fails to capture the spatial relationships between slices. On the other hand, 3D models face challenges such as resolution inconsistencies in 2.5D images, along with computational complexity and susceptibility to overfitting when trained with limited data. In this context, 2.5D models, which capture inter-slice correlations using only 2D neural networks, emerge as a promising solution due to their reduced computational demand and simplicity in implementation. In this paper, we introduce CSA-Net, a flexible 2.5D segmentation model capable of processing 2.5D images with an arbitrary number of slices through an innovative Cross-Slice Attention (CSA) module. This module uses the cross-slice attention mechanism to effectively capture 3D spatial information by learning long-range dependencies between the center slice (for segmentation) and its neighboring slices. Moreover, CSA-Net utilizes the self-attention mechanism to understand correlations among pixels within the center slice. We evaluated CSA-Net on three 2.5D segmentation tasks: (1) multi-class brain MRI segmentation, (2) binary prostate MRI segmentation, and (3) multi-class prostate MRI segmentation. CSA-Net outperformed leading 2D and 2.5D segmentation methods across all three tasks, demonstrating its efficacy and superiority. Our code is publicly available at https://github.com/mirthAI/CSA-Net.
Submitted 30 April, 2024;
originally announced May 2024.
-
Acceptance Tests of more than 10 000 Photomultiplier Tubes for the multi-PMT Digital Optical Modules of the IceCube Upgrade
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
S. K. Agarwalla,
J. A. Aguilar,
M. Ahlers,
J. M. Alameddine,
N. M. Amin,
K. Andeen,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
L. Ausborm,
S. N. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
S. Bash,
V. Basu,
R. Bay,
J. J. Beatty,
J. Becker Tjus,
J. Beise,
C. Bellenghi
, et al. (399 additional authors not shown)
Abstract:
More than 10,000 photomultiplier tubes (PMTs) with a diameter of 80 mm will be installed in multi-PMT Digital Optical Modules (mDOMs) of the IceCube Upgrade. These have been tested and pre-calibrated at two sites. A throughput of more than 1000 PMTs per week with both sites was achieved with a modular design of the testing facilities and highly automated testing procedures. The testing facilities can easily be adapted to other PMTs, such that they can, e.g., be re-used for testing the PMTs for IceCube-Gen2. Single photoelectron response, high voltage dependence, time resolution, prepulse, late pulse, afterpulse probabilities, and dark rates were measured for each PMT. We describe the design of the testing facilities, the testing procedures, and the results of the acceptance tests.
Submitted 20 June, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
Pragmatic Formal Verification of Sequential Error Detection and Correction Codes (ECCs) used in Safety-Critical Design
Authors:
Aman Kumar
Abstract:
Error Detection and Correction Codes (ECCs) are often used in digital designs to protect data integrity. Especially in safety-critical systems such as automotive electronics, ECCs are widely used and the verification of such complex logic becomes more critical considering the ISO 26262 safety standards. Exhaustive verification of ECC using formal methods has been a challenge given the high number of data bits to protect. As an example, for an ECC of 128 data bits with a possibility to detect up to four-bit errors, the number of bit-error combinations is given by $\binom{128}{1} + \binom{128}{2} + \binom{128}{3} + \binom{128}{4} \approx 1.1 \times 10^7$. This vast analysis space often leads to bounded proof results. Moreover, the complexity and state-space increase further if the ECC has sequential encoding and decoding stages. To overcome such problems and sign off the design with confidence within reasonable proof time, we present a pragmatic formal verification approach of complex ECC cores with several complexity reduction techniques and know-how that were learnt during the course of verification. We discuss using the linearity of the syndrome generator as a helper assertion, using the abstract model as glue logic to compare the RTL with the sequential version of the circuit, k-induction-based model checking and using mathematical relations captured as properties to simplify the verification in order to get an unbounded proof result within 24 hours of proof runtime.
Submitted 28 April, 2024;
originally announced April 2024.
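The error-combination count quoted in the abstract above (128 data bits, up to four simultaneous bit errors) can be reproduced with a few lines of Python. The function name `error_patterns` is a hypothetical helper introduced here for illustration and is not taken from the paper.

```python
from math import comb


def error_patterns(data_bits, max_errors):
    """Count the ways to place 1..max_errors bit errors among data_bits bits."""
    return sum(comb(data_bits, k) for k in range(1, max_errors + 1))


total = error_patterns(128, 4)
print(total)           # 11017632
print(f"{total:.1e}")  # 1.1e+07, matching the figure in the abstract
```

The four-bit term alone contributes 10,668,000 of the total, which is why the count grows so quickly with the number of detectable errors.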
-
A Neural-Network-Based Approach for Loose-Fitting Clothing
Authors:
Yongxu Jin,
Dalton Omens,
Zhenglin Geng,
Joseph Teran,
Abishek Kumar,
Kenji Tashiro,
Ronald Fedkiw
Abstract:
Since loose-fitting clothing contains dynamic modes that have proven to be difficult to predict via neural networks, we first illustrate how to coarsely approximate these modes with a real-time numerical algorithm specifically designed to mimic the most important ballistic features of a classical numerical simulation. Although there is some flexibility in the choice of the numerical algorithm used as a proxy for full simulation, it is essential that the stability and accuracy be independent from any time step restriction or similar requirements in order to facilitate real-time performance. In order to reduce the number of degrees of freedom that require approximations to their dynamics, we simulate rigid frames and use skinning to reconstruct a rough approximation to a desirable mesh; as one might expect, neural-network-based skinning seems to perform better than linear blend skinning in this scenario. Improved high frequency deformations are subsequently added to the skinned mesh via a quasistatic neural network (QNN). In contrast to recurrent neural networks that require a plethora of training data in order to adequately generalize to new examples, QNNs perform well with significantly less training data.
Submitted 25 April, 2024;
originally announced April 2024.
-
Automatic AI controller that can drive with confidence: steering vehicle with uncertainty knowledge
Authors:
Neha Kumari,
Sumit Kumar,
Sneha Priya,
Ayush Kumar,
Akash Fogla
Abstract:
In safety-critical systems that interface with the real world, the role of uncertainty in decision-making is pivotal, particularly in the context of machine learning models. For the secure functioning of Cyber-Physical Systems (CPS), it is imperative to manage such uncertainty adeptly. In this research, we focus on the development of a vehicle's lateral control system using a machine learning framework. Specifically, we employ a Bayesian Neural Network (BNN), a probabilistic learning model, to address uncertainty quantification. This capability allows us to gauge the level of confidence or uncertainty in the model's predictions. The BNN-based controller is trained using simulated data gathered from the vehicle traversing a single track and subsequently tested on various other tracks. We report two significant results. First, the trained model demonstrates the ability to adapt and effectively control the vehicle on multiple similar tracks. Second, the quantification of prediction confidence integrated into the controller serves as an early-warning system, signaling when the algorithm lacks confidence in its predictions and is therefore susceptible to failure. By establishing a confidence threshold, we can trigger manual intervention, ensuring that control is relinquished from the algorithm when it operates outside of safe parameters.
Submitted 24 April, 2024;
originally announced April 2024.
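The confidence-threshold idea described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' controller: it assumes uncertainty is measured as the standard deviation of steering angles drawn from repeated stochastic predictions, and the names `steering_decision`, `samples`, and `max_std` are hypothetical.

```python
from statistics import mean, stdev


def steering_decision(samples, max_std):
    """Return (steering_angle, spread); steering_angle is None on handover.

    samples: steering angles from repeated stochastic predictions (assumed).
    max_std: hypothetical threshold on the spread of those predictions.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate the spread")
    spread = stdev(samples)
    if spread > max_std:
        # Predictions disagree too much: signal manual intervention.
        return None, spread
    # Predictions agree: use their mean as the steering command.
    return mean(samples), spread
```

A caller would treat a `None` steering angle as the early-warning signal and hand control back to the driver.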
-
Pocket Schlieren: a background oriented schlieren imaging platform on a smartphone
Authors:
Diganta Rabha,
Vimod Kumar,
Akshay Kumar,
Dinesh Saini,
Manish Kumar
Abstract:
Background-oriented schlieren (BOS) is a powerful technique for flow visualization. Nevertheless, the widespread dissemination of BOS is impeded by its dependence on scientific cameras, computing hardware, and dedicated analysis software. In this work, we aim to democratize BOS by providing a smartphone-based scientific tool called "Pocket Schlieren". Pocket Schlieren enables users to directly capture, process, and visualize flow phenomena on their smartphones. The underlying algorithm incorporates consecutive frame subtraction (CFS) and optical flow (OF) techniques to compute the density gradients inside a flow. It works with both engineered and natural background patterns. Using Pocket Schlieren, we successfully visualized the flow produced by a burning candle flame, butane lighter, hot soldering iron, room heater, water immersion heating rod, and a large outdoor butane flame. Pocket Schlieren promises to serve as a frugal yet potent instrument for scientific and educational purposes. We have made it publicly available at doi: 10.5281/zenodo.10949271.
Submitted 15 April, 2024;
originally announced April 2024.
-
Efficient Verification of a RADAR SoC Using Formal and Simulation-Based Methods
Authors:
Aman Kumar,
Mark Litterick,
Samuele Candido
Abstract:
As the demand for Internet of Things (IoT) and Human-to-Machine Interaction (HMI) increases, modern System-on-Chips (SoCs) offering such solutions are becoming increasingly complex. This intricate design poses significant challenges for verification, particularly when time-to-market is a crucial factor for consumer electronics products. This paper presents a case study based on our work to verify a complex Radio Detection And Ranging (RADAR) based SoC that performs on-chip sensing of human motion with millimetre accuracy. We leverage both formal and simulation-based methods to complement each other and achieve verification sign-off with high confidence. While employing a requirements-driven flow approach, we demonstrate the use of different verification methods to cater to multiple requirements and highlight our know-how from the project. Additionally, we used Machine Learning (ML) based methods, specifically the Xcelium ML tool from Cadence, to improve verification throughput.
Submitted 20 April, 2024;
originally announced April 2024.
-
Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph
Authors:
Xiaochen Kev Gao,
Feng Yao,
Kewen Zhao,
Beilei He,
Animesh Kumar,
Vish Krishnan,
Jingbo Shang
Abstract:
Model scaling is becoming the default choice for many language tasks due to the success of large language models (LLMs). However, it can fall short in specific scenarios where simple customized methods excel. In this paper, we delve into the patent approval prediction task and unveil that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within the patent data. Specifically, we first extend the embedding-based state-of-the-art (SOTA) by scaling up its backbone model with various sizes of open-source LLMs, then explore prompt-based methods to harness proprietary LLMs' potential, but find the best results close to random guessing, underlining the ineffectiveness of model scaling-up. Hence, we propose a novel Fine-grained cLAim depeNdency (FLAN) Graph through meticulous patent data analyses, capturing the inherent dependencies across segments of the patent text. As it is model-agnostic, we apply cost-effective graph models to our FLAN Graph to obtain representations for approval prediction. Extensive experiments and detailed analyses prove that incorporating FLAN Graph via various graph models consistently outperforms all LLM baselines significantly. We hope that our observations and analyses in this paper can bring more attention to this challenging task and prompt further research into the limitations of LLMs. Our source code and dataset can be obtained from http://github.com/ShangDataLab/FLAN-Graph.
Submitted 22 April, 2024;
originally announced April 2024.
-
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Authors:
Fahim Tajwar,
Anikait Singh,
Archit Sharma,
Rafael Rafailov,
Jeff Schneider,
Tengyang Xie,
Stefano Ermon,
Chelsea Finn,
Aviral Kumar
Abstract:
Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning. Different methods come with different implementation tradeoffs and performance differences, and existing empirical findings present different conclusions: for instance, some results show that online RL is quite important to attain good fine-tuning results, while others find (offline) contrastive or even purely supervised methods sufficient. This raises a natural question: what kind of approaches are important for fine-tuning with preference data and why? In this paper, we answer this question by performing a rigorous analysis of a number of fine-tuning techniques on didactic and full-scale LLM problems. Our main finding is that, in general, approaches that use on-policy sampling or attempt to push down the likelihood on certain responses (i.e., employ a "negative gradient") outperform offline and maximum likelihood objectives. We conceptualize our insights and unify methods that use on-policy sampling or negative gradient under a notion of mode-seeking objectives for categorical distributions. Mode-seeking objectives are able to alter probability mass on specific bins of a categorical distribution at a fast rate compared to maximum likelihood, allowing them to relocate masses across bins more effectively. Our analysis prescribes actionable insights for preference fine-tuning of LLMs and informs how data should be collected for maximal improvement.
Submitted 2 June, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Visualizing Intelligent Tutor Interactions for Responsive Pedagogy
Authors:
Grace Guo,
Aishwarya Mudgal Sunil Kumar,
Adit Gupta,
Adam Coscia,
Chris MacLellan,
Alex Endert
Abstract:
Intelligent tutoring systems leverage AI models of expert learning and student knowledge to deliver personalized tutoring to students. While these intelligent tutors have demonstrated improved student learning outcomes, it is still unclear how teachers might integrate them into curriculum and course planning to support responsive pedagogy. In this paper, we conducted a design study with five teachers who have deployed Apprentice Tutors, an intelligent tutoring platform, in their classes. We characterized their challenges around analyzing student interaction data from intelligent tutoring systems and built VisTA (Visualizations for Tutor Analytics), a visual analytics system that shows detailed provenance data across multiple coordinated views. We evaluated VisTA with the same five teachers, and found that the visualizations helped them better interpret intelligent tutor data, gain insights into student problem-solving provenance, and decide on necessary follow-up actions - such as providing students with further support or reviewing skills in the classroom. Finally, we discuss potential extensions of VisTA into sequence query and detection, as well as the potential for the visualizations to be useful for encouraging self-directed learning in students.
Submitted 19 April, 2024;
originally announced April 2024.
-
Spin Hall Nano-Oscillator Empirical Electrical Model for Optimal On-chip Detector Design
Authors:
Rafaella Fiorelli,
Mona Rajabali,
Roberto Méndez-Romero,
Akash Kumar,
Artem Litvinenko,
Teresa Serrano-Gotarredona,
Farshad Moradi,
Johan Åkerman,
Bernabé Linares-Barranco,
Eduardo Peralías
Abstract:
As nascent nonlinear oscillators, nano-constriction spin Hall nano-oscillators (SHNOs) show promising potential for integration into more complicated systems such as neural networks, magnetic field sensors, and radio frequency (RF) signal classification. Their tunable high-frequency operating regime, easy synchronization, and CMOS compatibility can streamline the process. To implement SHNOs in any of these networks, the electrical features of a single device are needed before designing the signal detection CMOS circuitry. This study centers on presenting an empirical electrical model of the SHNO based on a comprehensive characterization of the output impedance of a single SHNO and its available output power in the range of 2-10 GHz at various bias currents.
Submitted 16 April, 2024;
originally announced April 2024.
-
Manipulating Large Language Models to Increase Product Visibility
Authors:
Aounon Kumar,
Himabindu Lakkaraju
Abstract:
Large language models (LLMs) are increasingly being integrated into search engines to provide natural language responses tailored to user queries. Customers and end-users are also becoming more dependent on these models for quick and easy purchase decisions. In this work, we investigate whether recommendations from LLMs can be manipulated to enhance a product's visibility. We demonstrate that adding a strategic text sequence (STS) -- a carefully crafted message -- to a product's information page can significantly increase its likelihood of being listed as the LLM's top recommendation. To understand the impact of STS, we use a catalog of fictitious coffee machines and analyze its effect on two target products: one that seldom appears in the LLM's recommendations and another that usually ranks second. We observe that the strategic text sequence significantly enhances the visibility of both products by increasing their chances of appearing as the top recommendation. This ability to manipulate LLM-generated search responses provides vendors with a considerable competitive advantage and has the potential to disrupt fair market competition. Just as search engine optimization (SEO) revolutionized how webpages are customized to rank higher in search engine results, influencing LLM recommendations could profoundly impact content optimization for AI-driven search services. Code for our experiments is available at https://github.com/aounon/llm-rank-optimizer.
Submitted 11 April, 2024;
originally announced April 2024.
-
Unveiling the Impact of Macroeconomic Policies: A Double Machine Learning Approach to Analyzing Interest Rate Effects on Financial Markets
Authors:
Anoop Kumar,
Suresh Dodda,
Navin Kamuni,
Rajeev Kumar Arora
Abstract:
This study examines the effects of macroeconomic policies on financial markets using a novel approach that combines Machine Learning (ML) techniques and causal inference. It focuses on the effect of interest rate changes made by the US Federal Reserve System (FRS) on the returns of fixed income and equity funds between January 1986 and December 2021. The analysis makes a distinction between actively and passively managed funds, hypothesizing that the latter are less susceptible to changes in interest rates. The study contrasts gradient boosting and linear regression models using the Double Machine Learning (DML) framework, which supports a variety of statistical learning techniques. Results indicate that gradient boosting is a useful tool for predicting fund returns; for example, a 1% increase in interest rates causes an actively managed fund's return to decrease by 11.97%. This understanding of the relationship between interest rates and fund performance provides opportunities for additional research and insightful, data-driven advice for fund managers and investors.
Submitted 30 March, 2024;
originally announced April 2024.
-
Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data
Authors:
Aakash Kumar,
Chen Chen,
Ajmal Mian,
Neils Lobo,
Mubarak Shah
Abstract:
3D detection is a critical task that enables machines to identify and locate objects in three-dimensional space. It has a broad range of applications in several fields, including autonomous driving, robotics and augmented reality. Monocular 3D detection is attractive as it requires only a single camera; however, it lacks the accuracy and robustness required for real-world applications. High-resolution LiDAR, on the other hand, can be expensive and can lead to interference problems in heavy traffic because of its active transmissions. We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection. Our method requires only a small number of 3D points, which can be obtained from a low-cost, low-resolution sensor. Specifically, we use only 512 points, which is just 1% of a full LiDAR frame in the KITTI dataset. Our method reconstructs a complete 3D point cloud from this limited 3D information combined with a single image. The reconstructed 3D point cloud and corresponding image can be used by any multi-modal off-the-shelf detector for 3D object detection. By using the proposed network architecture with an off-the-shelf multi-modal 3D detector, the accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods and by 6% to 9% compared to the baseline multi-modal methods on the KITTI and JackRabbot datasets.
Submitted 9 April, 2024;
originally announced April 2024.
-
Increased LLM Vulnerabilities from Fine-tuning and Quantization
Authors:
Divyanshu Kumar,
Anurakt Kumar,
Sahil Agarwal,
Prashanth Harshangi
Abstract:
Large Language Models (LLMs) have become very popular and have found use cases in many domains, such as chatbots, auto-task completion agents, and much more. However, LLMs are vulnerable to different types of attacks, such as jailbreaking, prompt injection attacks, and privacy leakage attacks. Foundational LLMs undergo adversarial and alignment training to learn not to generate malicious and toxic content. For specialized use cases, these foundational LLMs are subjected to fine-tuning or quantization for better performance and efficiency. We examine the impact of downstream tasks such as fine-tuning and quantization on LLM vulnerability. We test foundation models like Mistral, Llama, MosaicML, and their fine-tuned versions. Our research shows that fine-tuning and quantization reduce jailbreak resistance significantly, leading to increased LLM vulnerabilities. Finally, we demonstrate the utility of external guardrails in reducing LLM vulnerabilities.
Submitted 5 April, 2024;
originally announced April 2024.
-
Balancing Progress and Responsibility: A Synthesis of Sustainability Trade-Offs of AI-Based Systems
Authors:
Apoorva Nalini Pradeep Kumar,
Justus Bogner,
Markus Funke,
Patricia Lago
Abstract:
Recent advances in artificial intelligence (AI) capabilities have increased the eagerness of companies to integrate AI into software systems. While AI can be used to have a positive impact on several dimensions of sustainability, this is often overshadowed by its potential negative influence. While many studies have explored sustainability factors in isolation, there is insufficient holistic coverage of potential sustainability benefits or costs that practitioners need to consider during decision-making for AI adoption. We therefore aim to synthesize trade-offs related to sustainability in the context of integrating AI into software systems. We want to make the sustainability benefits and costs of integrating AI more transparent and accessible for practitioners.
The study was conducted in collaboration with a Dutch financial organization. We first performed a rapid review that led to the inclusion of 151 research papers. Afterward, we conducted six semi-structured interviews to enrich the data with industry perspectives. The combined results showcase the potential sustainability benefits and costs of integrating AI. The labels synthesized from the review regarding potential sustainability benefits were clustered into 16 themes, with "energy management" being the most frequently mentioned one. 11 themes were identified in the interviews, with the top mentioned theme being "employee wellbeing". Regarding sustainability costs, the review discovered seven themes, with "deployment issues" being the most popular one, followed by "ethics & society". "Environmental issues" was the top theme from the interviews. Our results provide valuable insights to organizations and practitioners for understanding the potential sustainability implications of adopting AI.
Submitted 5 April, 2024;
originally announced April 2024.
-
Investigation of reaction and $α$ production cross sections with $^9$Be projectile
Authors:
Satbir Kaur,
V. V. Parkar,
S. K. Pandit,
A. Shrivastava,
K. Mahata,
K. Ramachandran,
Sangeeta Dhuri,
P. C. Rout,
A. Kumar,
Shilpi Gupta
Abstract:
In order to investigate the contribution of $α$ production to the reaction cross sections, measurements of elastic scattering and inclusive $α$ particle angular distributions have been carried out with the $^9$Be projectile on $^{89}$Y, $^{124}$Sn, $^{159}$Tb, $^{198}$Pt, and $^{209}$Bi targets over a wide angular range at energies near the Coulomb barrier. The measured elastic scattering angular distributions were fitted with optical model calculations, and reaction cross sections were extracted. The same data were also analysed using both global optical model potentials (Global OMP) and microscopic São Paulo potentials (SPP) to obtain the reaction cross sections. Data available in the literature for the $^9$Be projectile, including elastic scattering angular distributions, $α$ production cross sections, and complete fusion cross sections on these and other targets at several energies, are also utilised for comparative studies. The reaction cross sections extracted from the three potentials (Best Fit, Global OMP and SPP) are in reasonable agreement for all the targets except at energies below the barrier, where the results from SPP deviate by 30-50\%. Inclusive $α$ particle production cross sections were also extracted by integrating the $α$ particle angular distributions. The present data and data available in the literature on reaction and $α$-particle production cross sections were utilised to make systematic studies. Systematics of reaction and $α$-particle production cross sections revealed their universal behaviour.
Submitted 2 April, 2024;
originally announced April 2024.
-
Non-radial oscillations in newly born compact star considering effects of phase transition
Authors:
Anil Kumar,
Pratik Thakur,
Monika Sinha
Abstract:
Massive stars end their lives in supernova explosions, leaving central compact objects that may evolve into neutron stars. Initially, after birth, the star remains hot and gradually cools down. We explore the matter and star properties during this initial stage of compact stars, considering the possibility of the appearance of deconfined quark matter in the core of the star. At the initial stage after the supernova explosion, the occurrence of non-radial oscillations in the newly born compact object is highly possible. Non-radial oscillations are an important source of gravitational waves (GWs). There is a high chance for GWs from these oscillations, especially the nodeless fundamental (f-) mode, to be detected by next-generation GW detectors. We study the evolution of the frequencies of non-radial oscillations after birth, considering a phase transition, and predict the possible signatures for different theoretical compact star models.
Submitted 1 April, 2024;
originally announced April 2024.
-
Numerical modelling of flame spread over thin circular ducts
Authors:
Vipin Kumar,
Kambam Naresh,
Amit Kumar
Abstract:
This paper presents a numerical investigation into the phenomenon of flame spread over thin circular ducts in normal gravity and microgravity environments. Flame spread over such geometry is of significant interest due to its relevance in various practical applications, including flow tubes in medical systems, fire safety in spacecraft, ducts, and wiring tubes. This study comprises a comprehensive investigation of key parameters affecting the flame spread rate, including fuel radius and opposed flow speed, in normal gravity and microgravity environments. A 2-D axisymmetric flame spread model that accounts for char was used, and the numerical simulations revealed valuable insights into the underlying mechanisms governing flame spread over such geometry. The results computed from the numerical model are compared with the experimentally observed flame spread rate to validate the model, which can then be used to gain a comprehensive understanding of the underlying physical phenomena. As the radius of the circular duct increases, the flame spread rate increases in both normal gravity and microgravity environments. The conduction heat feedback and the radiation heat gain coming from hot char through the gas phase in the inner core region are the two major mechanisms that control flame spread over circular duct fuels. The flame spread rate is also evaluated at 21% oxygen for opposed flow speeds ranging from quiescent (0 cm/s) to 30 cm/s, and is found to follow a non-monotonic trend, first increasing and then decreasing with opposed flow speed, in both normal gravity and microgravity environments.
Submitted 1 April, 2024;
originally announced April 2024.
-
The Emotional Impact of Game Duration: A Framework for Understanding Player Emotions in Extended Gameplay Sessions
Authors:
Anoop Kumar,
Suresh Dodda,
Navin Kamuni,
Venkata Sai Mahesh Vuppalapati
Abstract:
Video games have played a crucial role in entertainment since their development in the 1970s, becoming even more prominent during the lockdown period when people were looking for ways to entertain themselves. However, at that time, players were unaware of the significant impact that playtime could have on their feelings. This has made it challenging for designers and developers to create new games, since they have to control the emotional impact that these games will have on players. Thus, the purpose of this study is to look at how a player's emotions are affected by the duration of the game. To achieve this goal, a framework for emotion detection is created. According to the experiment's results, the volunteers' general ability to express emotions increased from 20 to 60 minutes. In comparison to shorter gameplay sessions, the experiment found that extended gameplay sessions did significantly affect the player's emotions. Based on these results, it is recommended that game producers consider creating shorter, entertaining games in order to lessen the potential emotional impact that playing computer and video games may have in the future.
Submitted 30 March, 2024;
originally announced April 2024.
-
SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects
Authors:
Abhinav Kumar,
Yuliang Guo,
Xinyu Huang,
Liu Ren,
Xiaoming Liu
Abstract:
Monocular 3D detectors achieve remarkable performance on cars and smaller objects. However, their performance drops on larger objects, leading to fatal accidents. Some attribute the failures to training data scarcity or their receptive field requirements of large objects. In this paper, we highlight this understudied problem of generalization to large objects. We find that modern frontal detectors struggle to generalize to large objects even on nearly balanced datasets. We argue that the cause of failure is the sensitivity of depth regression losses to noise of larger objects. To bridge this gap, we comprehensively investigate regression and dice losses, examining their robustness under varying error levels and object sizes. We mathematically prove that the dice loss leads to superior noise-robustness and model convergence for large objects compared to regression losses for a simplified case. Leveraging our theoretical insights, we propose SeaBird (Segmentation in Bird's View) as the first step towards generalizing to large objects. SeaBird effectively integrates BEV segmentation on foreground objects for 3D detection, with the segmentation head trained with the dice loss. SeaBird achieves SoTA results on the KITTI-360 leaderboard and improves existing detectors on the nuScenes leaderboard, particularly for large objects. Code and models at https://github.com/abhi1kumar/SeaBird
Submitted 29 March, 2024;
originally announced March 2024.
-
Field tuning Kitaev systems for spin fractionalization and topological order
Authors:
Jagannath Das,
Sarbajaya Kundu,
Aman Kumar,
Vikram Tripathi
Abstract:
The honeycomb Kitaev model describes a $Z_2$ spin liquid with topological order and fractionalized excitations consisting of gapped $π$-fluxes and free Majorana fermions. Competing interactions, even when not very strong, are known to destabilize the Kitaev spin liquid. Magnetic fields are a convenient parameter for tuning between different phases of the Kitaev systems, and have even been investigated for potentially counteracting the effects of other destabilizing interactions leading to a revival of the topological phase. Here we review the progress in understanding the effects of magnetic fields on some of the perturbed Kitaev systems, particularly on fractionalization and topological order.
Submitted 29 March, 2024;
originally announced March 2024.
-
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Authors:
Ziyang Chen,
Israel D. Gebru,
Christian Richardt,
Anurag Kumar,
William Laney,
Andrew Owens,
Alexander Richard
Abstract:
We present a new dataset called Real Acoustic Fields (RAF) that captures real acoustic room data from multiple modalities. The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms. We used this dataset to evaluate existing methods for novel-view acoustic synthesis and impulse response generation which previously relied on synthetic data. In our evaluation, we thoroughly assessed existing audio and audio-visual models against multiple criteria and proposed settings to enhance their performance on real-world data. We also conducted experiments to investigate the impact of incorporating visual data (i.e., images and depth) into neural acoustic field models. Additionally, we demonstrated the effectiveness of a simple sim2real approach, where a model is pre-trained with simulated data and fine-tuned with sparse real-world data, resulting in significant improvements in the few-shot learning approach. RAF is the first dataset to provide densely captured room acoustic data, making it an ideal resource for researchers working on audio and audio-visual neural acoustic field modeling techniques. Demos and datasets are available on our project page: https://facebookresearch.github.io/real-acoustic-fields/
Submitted 27 March, 2024;
originally announced March 2024.
-
Magnetars as Powering Sources of Gamma-Ray Burst Associated Supernovae, and Unsupervised Clustering of Cosmic Explosions
Authors:
Amit Kumar,
Kaushal Sharma,
Jozsef Vinkó,
Danny Steeghs,
Benjamin Gompertz,
Joseph Lyman,
Raya Dastidar,
Avinash Singh,
Kendall Ackley,
Miika Pursiainen
Abstract:
We present the semi-analytical light curve modelling of 13 supernovae associated with gamma-ray bursts (GRB-SNe) along with two relativistic broad-lined (Ic-BL) SNe without GRB association (SNe 2009bb and 2012ap), considering millisecond magnetars as central-engine-based power sources for these events. The bolometric light curves of all 15 SNe in our sample are well reproduced utilising a $χ^2$-minimisation code, $\texttt{MINIM}$, and numerous parameters are constrained. The median values of ejecta mass ($M_{\textrm{ej}}$), magnetar's initial spin period ($P_\textrm{i}$) and magnetic field ($B$) for GRB-SNe are determined to be $\approx$ 5.2 M$_\odot$, 20.5 ms and 20.1 $\times$ 10$^{14}$ G, respectively. We leverage machine learning (ML) algorithms to comprehensively compare the 3-dimensional parameter space encompassing $M_{\textrm{ej}}$, $P_\textrm{i}$, and $B$ for GRB-SNe determined herein to those of H-deficient superluminous SNe (SLSNe-I), fast blue optical transients (FBOTs), long GRBs (LGRBs), and short GRBs (SGRBs) obtained from the literature. The application of unsupervised ML clustering algorithms on the parameters $M_{\textrm{ej}}$, $P_\textrm{i}$, and $B$ for GRB-SNe, SLSNe-I, and FBOTs yields a classification accuracy of $\sim$95%. Extending these methods to classify GRB-SNe, SLSNe-I, LGRBs, and SGRBs based on $P_\textrm{i}$ and $B$ values results in an accuracy of $\sim$84%. Our investigations show that GRB-SNe and relativistic Ic-BL SNe presented in this study occupy different parameter spaces for $M_{\textrm{ej}}$, $P_\textrm{i}$, and $B$ than those of SLSNe-I, FBOTs, LGRBs and SGRBs. This indicates that magnetars with different $P_\textrm{i}$ and $B$ can give birth to distinct types of transients.
Submitted 26 March, 2024;
originally announced March 2024.
-
Behind the Counter: Exploring the Motivations and Barriers of Online Counterspeech Writing
Authors:
Kaike Ping,
Anisha Kumar,
Xiaohan Ding,
Eugenia Rho
Abstract:
Current research mainly explores the attributes and impact of online counterspeech, leaving a gap in understanding of who engages in online counterspeech or what motivates or deters users from participating. To investigate this, we surveyed 458 English-speaking U.S. participants, analyzing key motivations and barriers underlying online counterspeech engagement. We presented each participant with three hate speech examples from a set of 900, spanning race, gender, religion, sexual orientation, and disability, and requested counterspeech responses. Subsequent questions assessed their satisfaction, perceived difficulty, and the effectiveness of their counterspeech. Our findings show that having been a target of online hate is a key driver of frequent online counterspeech engagement. People differ in their motivations and barriers towards engaging in online counterspeech across different demographic groups. Younger individuals, women, those with higher education levels, and regular witnesses to online hate are more reluctant to engage in online counterspeech due to concerns around public exposure, retaliation, and third-party harassment. Varying motivation and barriers in counterspeech engagement also shape how individuals view their own self-authored counterspeech and the difficulty experienced writing it. Additionally, our work explores people's willingness to use AI technologies like ChatGPT for counterspeech writing. Through this work we introduce a multi-item scale for understanding counterspeech motivation and barriers and a more nuanced understanding of the factors shaping online counterspeech engagement.
Submitted 25 March, 2024;
originally announced March 2024.
-
All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification
Authors:
Deepak Narayan Gadde,
Aman Kumar,
Thomas Nalapat,
Evgenii Rezunov,
Fabio Cappellini
Abstract:
Modern hardware designs have grown increasingly efficient and complex. However, they are often susceptible to Common Weakness Enumerations (CWEs). This paper is focused on the formal verification of CWEs in a dataset of hardware designs written in SystemVerilog from Regenerative Artificial Intelligence (AI) powered by Large Language Models (LLMs). We applied formal verification to categorize each hardware design as vulnerable or CWE-free. This dataset was generated by 4 different LLMs and features a unique set of designs for each of the 10 CWEs we target in our paper. We have associated the identified vulnerabilities with CWE numbers for a dataset of 60,000 generated SystemVerilog Register Transfer Level (RTL) code. It was also found that most LLMs are not aware of any hardware CWEs; hence they are usually not considered when generating the hardware code. Our study reveals that approximately 60% of the hardware designs generated by LLMs are prone to CWEs, posing potential safety and security risks. The dataset could be ideal for training LLMs and Machine Learning (ML) algorithms to abstain from generating CWE-prone hardware designs.
Submitted 25 March, 2024;
originally announced March 2024.
-
Linguistically Differentiating Acts and Recalls of Racial Microaggressions on Social Media
Authors:
Uma Sushmitha Gunturi,
Anisha Kumar,
Xiaohan Ding,
Eugenia H. Rho
Abstract:
In this work, we examine the linguistic signature of online racial microaggressions (acts) and how it differs from that of personal narratives recalling experiences of such aggressions (recalls) by Black social media users. We manually curate and annotate a corpus of acts and recalls from in-the-wild social media discussions, and verify labels with Black workshop participants. We leverage Natural Language Processing (NLP) and qualitative analysis on this data to classify (RQ1), interpret (RQ2), and characterize (RQ3) the language underlying acts and recalls of racial microaggressions in the context of racism in the U.S. Our findings show that neural language models (LMs) can classify acts and recalls with high accuracy (RQ1) with contextual words revealing themes that associate Blacks with objects that reify negative stereotypes (RQ2). Furthermore, overlapping linguistic signatures between acts and recalls serve functionally different purposes (RQ3), providing broader implications to the current challenges in content moderation systems on social media.
Submitted 25 March, 2024;
originally announced March 2024.
-
Symmetry breaker governs synchrony patterns in neuronal inspired networks
Authors:
Anil Kumar,
Edmilson Roque dos Santos,
Paul J. Laurienti,
Erik Bollt
Abstract:
Experiments in the human brain reveal switching between different activity patterns and functional network organization over time. Recently, multilayer modeling has been employed across multiple neurobiological levels (from spiking networks to brain regions) to unveil novel insights into the emergence and time evolution of synchrony patterns. We consider two layers with the top layer directly coupled to the bottom layer. When isolated, the bottom layer would remain in a specific stable pattern. However, in the presence of the top layer, the network exhibits spatiotemporal switching. The top layer in combination with the inter-layer coupling acts as a symmetry breaker, governing the bottom layer and restricting the number of allowed symmetry-induced patterns. This structure allows us to demonstrate the existence and stability of pattern states on the bottom layer, but most remarkably, it enables a simple mechanism for switching between patterns based on the unique symmetry-breaking role of the governing layer. We demonstrate that the symmetry breaker prevents complete synchronization in the bottom layer, a situation that would not be desirable in a normal functioning brain. We illustrate our findings using two layers of Hindmarsh-Rose (HR) oscillators, employing the Master Stability function approach in small networks to investigate the switching between patterns.
Submitted 24 March, 2024;
originally announced March 2024.
-
AB5 type multicomponent TiVCoNiMn2 high-entropy alloy
Authors:
Abhishek Kumar,
M. A. Shaz,
N. K. Mukhopadhyay,
Thakur Prasad Yadav
Abstract:
Recent theoretical and practical research has focused on multi-component High Entropy Alloys (HEAs), which have mechanical and functional properties superior to those of standard alloys based on a single major element, thereby establishing a new field. A multi-component HEA contains five or more primary elements at concentrations ranging from 5 to 35 atomic percent. We examined the microstructure and mechanical properties of the TiVCoNiMn2 HEA. The mixing enthalpy and other thermodynamic parameters were determined using Miedema's model. TiVCoNiMn2 exhibits a mixing enthalpy of -15.6 kJ/mol and an atomic radius mismatch of approximately 10.03%. The HEA is derived from both hydride-forming and non-hydride-forming elements and could therefore be a useful hydrogen storage material. The hydrogen absorption/desorption capabilities of these HEAs are promising.
△ Less
Submitted 24 March, 2024;
originally announced March 2024.
-
UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation
Authors:
Yuliang Guo,
Abhinav Kumar,
Cheng Zhao,
Ruoyu Wang,
Xinyu Huang,
Liu Ren
Abstract:
Monocular 3D reconstruction for categorical objects heavily relies on accurately perceiving each object's pose. While gradient-based optimization within a NeRF framework updates initially given poses, this paper highlights that such a scheme fails when the initial pose even moderately deviates from the true pose. Consequently, existing methods often depend on a third-party 3D object to provide an initial object pose, leading to increased complexity and generalization issues. To address these challenges, we present UPNeRF, a Unified framework integrating Pose estimation and NeRF-based reconstruction, bringing us closer to real-time monocular 3D object reconstruction. UPNeRF decouples the object's dimension estimation and pose refinement to resolve the scale-depth ambiguity, and introduces an effective projected-box representation that generalizes well across different domains. While using a dedicated pose estimator that smoothly integrates into an object-centric NeRF, UPNeRF is free from external 3D detectors. UPNeRF achieves state-of-the-art results in both reconstruction and pose estimation tasks on the nuScenes dataset. Furthermore, UPNeRF exhibits exceptional cross-dataset generalization on the KITTI and Waymo datasets, surpassing prior methods with up to 50% reduction in rotation and translation error.
Submitted 22 March, 2024;
originally announced March 2024.
-
Saturation and fluctuations in the proton wavefunction at large momentum transfers in exclusive diffraction at HERA
Authors:
Arjun Kumar,
Tobias Toll
Abstract:
We present a model of proton geometry where the number and size of gluon density hotspots in the proton's thickness function evolve with the resolution scale of the event given by the Mandelstam $t$ variable in exclusive diffractive $ep$ collisions. We use the impact-parameter dependent saturation dipole model bSat/IPSat, as well as its linearised (non-saturated) version bNonSat. In the latter the proton thickness has a clear interpretation as a thickness and in the former it is directly related to the saturation scale. The resulting phenomenological model for the splitting of hotspots, making full use of earlier experimental and phenomenological studies, is able to describe the entire incoherent $t$-spectrum for $|t|>1.1~$GeV$^2$ with a single phenomenological parameter. We use the previously suggested hotspot model as an initial condition for our evolution. The resulting model is a resolution scale-evolution in the same vein as a parton shower. The incoherent cross section is directly proportional to geometrical fluctuations in the proton's initial state. The hotspot evolution gives rise to several kinds of event-by-event fluctuations, such as in the hotspot number, width and normalization, and saturation scale fluctuations are a direct effect of these. A natural consequence of our resolution-based evolution is that the hotspots obtain an effective repulsion. We use our hotspot evolution model to investigate saturation scale effects in the $t$-spectrum, and find that HERA data are not sensitive to this physics.
Submitted 20 March, 2024;
originally announced March 2024.
-
Spintronic devices as next-generation computation accelerators
Authors:
Victor H. González,
Artem Litvinenko,
Akash Kumar,
Roman Khymyn,
Johan Åkerman
Abstract:
The ever-increasing demand for computational power, combined with the predicted plateau for the miniaturization of existing silicon-based technologies, has made the search for low-power alternatives an industrially and scientifically engaging problem. In this work, we explore spintronics-based Ising machines as hardware computation accelerators. We start by presenting the physical platforms on which this emerging field is being developed, the different control schemes and the types of algorithms and problems on which these machines outperform conventional computers. We then benchmark these technologies and provide an outlook for future developments and use-cases that can help them get a running start for integration into the next generation of computing devices.
Submitted 20 March, 2024;
originally announced March 2024.
-
Intelligent fault diagnosis of worm gearbox based on adaptive CNN using amended gorilla troop optimization with quantum gate mutation strategy
Authors:
Govind Vashishtha,
Sumika Chauhan,
Surinder Kumar,
Rajesh Kumar,
Radoslaw Zimroz,
Anil Kumar
Abstract:
The worm gearbox is a high-speed transmission system that plays a vital role in various industries. Therefore, it becomes necessary to develop a robust fault diagnosis scheme for worm gearboxes. Due to advancements in sensor technology, researchers from academia and industry prefer deep learning models for fault diagnosis purposes. The optimal selection of hyperparameters (HPs) of deep learning models plays a significant role in stable performance. Existing methods mainly focus on manual tuning of these parameters, which is a troublesome process and sometimes leads to inaccurate results. Thus, exploring more sophisticated methods to optimize the HPs automatically is important. In this work, a novel optimization algorithm, i.e., amended gorilla troop optimization (AGTO), has been proposed to make the convolutional neural network (CNN) adaptive for extracting the features to identify the worm gearbox defects. Initially, the vibration and acoustic signals are converted into 2D images by the Morlet wavelet function. Then, the initial model of the CNN is developed by setting hyperparameters. Further, the search space of each HP is identified and optimized by the developed AGTO algorithm. The classification accuracy has been evaluated by AGTO-CNN, which is further validated by the confusion matrix. The performance of the developed model has also been compared with other models. The AGTO algorithm is examined on twenty-three classical benchmark functions and with the Wilcoxon test, which demonstrate the effectiveness and dominance of the developed optimization algorithm. The results obtained suggest that the AGTO-CNN has the highest diagnostic accuracy and is stable and robust while diagnosing the worm gearbox.
Submitted 19 March, 2024;
originally announced March 2024.
-
Intent-conditioned and Non-toxic Counterspeech Generation using Multi-Task Instruction Tuning with RLAIF
Authors:
Amey Hengle,
Aswini Kumar,
Sahajpreet Singh,
Anil Bandhakavi,
Md Shad Akhtar,
Tanmoy Chakroborty
Abstract:
Counterspeech, defined as a response to mitigate online hate speech, is increasingly used as a non-censorial solution. Addressing hate speech effectively involves dispelling the stereotypes, prejudices, and biases often subtly implied in brief, single-sentence statements or abuses. These implicit expressions challenge language models, especially in seq2seq tasks, as model performance typically excels with longer contexts. Our study introduces CoARL, a novel framework enhancing counterspeech generation by modeling the pragmatic implications underlying social biases in hateful statements. CoARL's first two phases involve sequential multi-instruction tuning, teaching the model to understand intents, reactions, and harms of offensive statements, and then learning task-specific low-rank adapter weights for generating intent-conditioned counterspeech. The final phase uses reinforcement learning to fine-tune outputs for effectiveness and non-toxicity. CoARL outperforms existing benchmarks in intent-conditioned counterspeech generation, showing an average improvement of 3 points in intent-conformity and 4 points in argument-quality metrics. Extensive human evaluation supports CoARL's efficacy in generating superior and more context-appropriate responses compared to existing systems, including prominent LLMs like ChatGPT.
Submitted 15 March, 2024;
originally announced March 2024.
-
Komodo: A Linguistic Expedition into Indonesia's Regional Languages
Authors:
Louis Owen,
Vishesh Tripathi,
Abhay Kumar,
Biddwan Ahmed
Abstract:
The recent breakthroughs in Large Language Models (LLMs) have mostly focused on languages with easily available and sufficient resources, such as English. However, there remains a significant gap for languages that lack sufficient linguistic resources in the public domain. Our work introduces Komodo-7B, a family of 7-billion-parameter Large Language Models designed to address this gap by seamlessly operating across Indonesian, English, and 11 regional languages in Indonesia. The Komodo-7B family consists of Komodo-7B-Base and Komodo-7B-Instruct. Komodo-7B-Instruct stands out by achieving state-of-the-art performance in various tasks and languages, outperforming the benchmarks set by OpenAI's GPT-3.5, Cohere's Aya-101, Llama-2-Chat-13B, Mixtral-8x7B-Instruct-v0.1, Gemma-7B-it, and many more. This model not only demonstrates superior performance in both language-specific and overall assessments but also highlights its capability to excel in linguistic diversity. Our commitment to advancing language models extends beyond well-resourced languages, aiming to bridge the gap for those with limited linguistic assets. Additionally, Komodo-7B-Instruct's better cross-language understanding contributes to addressing educational disparities in Indonesia, offering direct translations from English to 11 regional languages, a significant improvement compared to existing language translation services. Komodo-7B represents a crucial step towards inclusivity and effectiveness in language models, catering to the linguistic needs of diverse communities.
Submitted 19 March, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Antiferromagnetic ordering and glassy nature in NASICON type NaFe$_2$PO$_4$(SO$_4$)$_2$
Authors:
Manish Kr. Singh,
A. K. Bera,
Ajay Kumar,
S. M. Yusuf,
R. S. Dhaka
Abstract:
We investigate the crystal structure and magnetic properties, including spin relaxation and the magnetocaloric effect, in a NASICON type NaFe$_2$PO$_4$(SO$_4$)$_2$ sample. The Rietveld refinement of x-ray and neutron diffraction patterns shows a rhombohedral crystal structure with the R$\bar{3}$c space group. The core-level spectra confirm the desired oxidation state of constituent elements. The {\it dc}--magnetic susceptibility ($χ$) behavior in zero field-cooled (ZFC) and field-cooled (FC) modes shows the ordering temperature $\approx$50~K. Interestingly, the analysis of temperature dependent neutron diffraction patterns reveals an A-type antiferromagnetic (AFM) structure with the ordered moment of 3.8 $μ_{B}$/Fe$^{3+}$ at 5~K, and a magnetostriction below $T_{\rm N}=$ 50~K. Further, the peak position in the {\it ac}--$χ$ is found to be invariant with the excitation frequency, supporting the notion of a dominating AFM transition. Also, the unsaturated isothermal magnetization curve supports the AFM ordering of the moments; however, the observed coercivity suggests the presence of weak ferromagnetic (FM) correlations at 5~K. On the other hand, a clear bifurcation between ZFC and FC curves of {\it dc}--$χ$ and the observed decrease in peak height of {\it ac}--$χ$ with frequency suggest complex magnetic interactions. The spin relaxation behavior in thermo-remanent magnetization and aging measurements indicates glassy states at 5~K. Moreover, the Arrott plots and magnetocaloric analysis reveal the AFM--FM interactions in the sample at lower temperatures.
Submitted 13 March, 2024;
originally announced March 2024.
-
AADNet: Attention aware Demoiréing Network
Authors:
M Rakesh Reddy,
Shubham Mandloi,
Aman Kumar
Abstract:
Moiré patterns frequently appear in photographs captured with mobile devices and digital cameras, potentially degrading image quality. Despite recent advancements in computer vision, image demoiréing remains a challenging task due to the dynamic textures and variations in colour, shape, and frequency of moiré patterns. Most existing methods struggle to generalize to unseen datasets, limiting their effectiveness in removing moiré patterns from real-world scenarios. In this paper, we propose a novel lightweight architecture, AADNet (Attention Aware Demoiréing Network), for high-resolution image demoiréing that effectively works across different frequency bands and generalizes well to unseen datasets. Extensive experiments conducted on the UHDM dataset validate the effectiveness of our approach, resulting in high-fidelity images.
Submitted 6 May, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
CoroNetGAN: Controlled Pruning of GANs via Hypernetworks
Authors:
Aman Kumar,
Khushboo Anand,
Shubham Mandloi,
Ashutosh Mishra,
Avinash Thakur,
Neeraj Kasera,
Prathosh A P
Abstract:
Generative Adversarial Networks (GANs) have proven to exhibit remarkable performance and are widely used across many generative computer vision applications. However, the unprecedented demand for the deployment of GANs on resource-constrained edge devices still poses a challenge due to the huge number of parameters involved in the generation process. This has led to focused attention on the area of compressing GANs. Most of the existing works use knowledge distillation with the overhead of teacher dependency. Moreover, there is no ability to control the degree of compression in these methods. Hence, we propose CoroNet-GAN for compressing GANs using the combined strength of differentiable pruning via hypernetworks. The proposed method provides the advantage of performing controllable compression while training, along with reducing training time by a substantial factor. Experiments have been done on various conditional GAN architectures (Pix2Pix and CycleGAN) to demonstrate the effectiveness of our approach on multiple benchmark datasets such as Edges-to-Shoes, Horse-to-Zebra and Summer-to-Winter. The results obtained illustrate that our approach outperforms the baselines on Zebra-to-Horse and Summer-to-Winter, achieving the best FID scores of 32.3 and 72.3, respectively, and yielding high-fidelity images across all the datasets. Additionally, our approach outperforms the state-of-the-art methods in achieving better inference time on various smartphone chipsets and data types, making it a feasible solution for deployment on edge devices.
Submitted 13 March, 2024;
originally announced March 2024.
-
Sentiment-aware Enhancements of PageRank-based Citation Metric, Impact Factor, and H-index for Ranking the Authors of Scholarly Articles
Authors:
Shikha Gupta,
Animesh Kumar
Abstract:
Heretofore, the only way to evaluate an author has been frequency-based citation metrics that assume citations to be of a neutral sentiment. However, considering the sentiment behind citations aids in a better understanding of the viewpoints of fellow researchers on the scholarly output of an author.
Submitted 12 March, 2024;
originally announced March 2024.
-
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Authors:
Akshay Kumar,
Jarvis Haupt
Abstract:
This paper studies the gradient flow dynamics that arise when training deep homogeneous neural networks, starting with small initializations. The present work considers neural networks that are assumed to have locally Lipschitz gradients and an order of homogeneity strictly greater than two. This paper demonstrates that for sufficiently small initializations, during the early stages of training, the weights of the neural network remain small in norm and approximately converge in direction along the Karush-Kuhn-Tucker (KKT) points of the neural correlation function introduced in [1]. Additionally, for square loss and under a separability assumption on the weights of neural networks, a similar directional convergence of gradient flow dynamics is shown near certain saddle points of the loss function.
△ Less
Submitted 12 March, 2024;
originally announced March 2024.
-
Existence and uniqueness of weak solutions for the generalized stochastic Navier-Stokes-Voigt equations
Authors:
Ankit Kumar,
Hermenegildo Borges de Oliveira,
Manil T. Mohan
Abstract:
In this work, we consider the incompressible generalized Navier-Stokes-Voigt equations in a bounded domain $\mathcal{O}\subset\mathbb{R}^d$, $d\geq 2$, driven by a multiplicative Gaussian noise. The considered momentum equation is given by:
\begin{align*}
\mathrm{d}\left(\boldsymbol{u} - κΔ\boldsymbol{u}\right) = \left[\boldsymbol{f} +\operatorname{div} \left(-π\mathbf{I}+ν|\mathbf{D}(\boldsymbol{u})|^{p-2}\mathbf{D}(\boldsymbol{u})-\boldsymbol{u}\otimes \boldsymbol{u}\right)\right]\mathrm{d} t + Φ(\boldsymbol{u})\mathrm{d} \mathrm{W}(t).
\end{align*}
In the case of $d=2,3$, $\boldsymbol{u}$ accounts for the velocity field, $π$ is the pressure, $\boldsymbol{f}$ is a body force and the final term stands for the stochastic forces. Here, $κ$ and $ν$ are given positive constants that account for the relaxation time and kinematic viscosity, respectively, and the power-law index $p$ is another constant (assumed $p>1$) that characterizes the flow. We use the usual notation $\mathbf{I}$ for the unit tensor and $\mathbf{D}(\boldsymbol{u}):=\frac{1}{2}\left(\nabla \boldsymbol{u} + (\nabla \boldsymbol{u})^{\top}\right)$ for the symmetric part of the velocity gradient. For $p\in\big(\frac{2d}{d+2},\infty\big)$, we first prove the existence of a martingale solution. Then we show the pathwise uniqueness of solutions. We employ the classical Yamada-Watanabe theorem to ensure the existence of a unique probabilistic strong solution.
Submitted 12 March, 2024;
originally announced March 2024.
-
Temporal Decisions: Leveraging Temporal Correlation for Efficient Decisions in Early Exit Neural Networks
Authors:
Max Sponner,
Lorenzo Servadei,
Bernd Waschneck,
Robert Wille,
Akash Kumar
Abstract:
Deep Learning is becoming increasingly relevant in Embedded and Internet-of-Things applications. However, deploying models on embedded devices poses a challenge due to their resource limitations. This can impact the model's inference accuracy and latency. One potential solution is Early Exit Neural Networks, which adjust model depth dynamically through additional classifiers attached between their hidden layers. However, the real-time termination decision mechanism is critical for the system's efficiency, latency, and sustained accuracy.
This paper introduces Difference Detection and Temporal Patience as decision mechanisms for Early Exit Neural Networks. They leverage the temporal correlation present in sensor data streams to efficiently terminate the inference. We evaluate their effectiveness in health monitoring, image classification, and wake-word detection tasks. Our novel contributions were able to significantly reduce the computational footprint compared to established decision mechanisms while maintaining higher accuracy scores. We achieved a reduction of mean operations per inference by up to 80% while maintaining accuracy levels within 5% of the original model.
These findings highlight the importance of considering temporal correlation in sensor data to improve the termination decision.
Submitted 12 March, 2024;
originally announced March 2024.
-
Efficient Post-Training Augmentation for Adaptive Inference in Heterogeneous and Distributed IoT Environments
Authors:
Max Sponner,
Lorenzo Servadei,
Bernd Waschneck,
Robert Wille,
Akash Kumar
Abstract:
Early Exit Neural Networks (EENNs) present a solution to enhance the efficiency of neural network deployments. However, creating EENNs is challenging and requires specialized domain knowledge, due to the large number of additional design choices. To address this issue, we propose an automated augmentation flow that focuses on converting an existing model into an EENN. It makes all required design decisions for the deployment to heterogeneous or distributed hardware targets: our framework constructs the EENN architecture, maps its subgraphs to the hardware targets, and configures its decision mechanism. To the best of our knowledge, it is the first framework that is able to perform all of these steps.
We evaluated our approach on a collection of Internet-of-Things and standard image classification use cases. For a speech command detection task, our solution was able to reduce the mean operations per inference by 59.67%. For an ECG classification task, it was able to terminate all samples early, reducing the mean inference energy by 74.9% and computations by 78.3%. On CIFAR-10, our solution was able to achieve up to a 58.75% reduction in computations.
The search on a ResNet-152 base model for CIFAR-10 took less than nine hours on a laptop CPU. Our proposed approach enables the creation of EENNs optimized for IoT environments and can reduce the inference cost of Deep Learning applications on embedded and fog platforms, while also significantly reducing the search cost, making it more accessible for scientists and engineers in industry and research. The low search cost improves the accessibility of EENNs, with the potential to improve the efficiency of neural networks in a wide range of practical applications.
Submitted 12 March, 2024;
originally announced March 2024.
-
Calibration of VELC detectors on-board Aditya-L1 mission
Authors:
Shalabh Mishra,
K. Sasikumar Raja,
Sanal Krishnan V U,
Venkata Suresh Narra,
Bhavana Hegde S,
Utkarsha D.,
Muthu Priyal V,
Pawan Kumar S,
Natarajan V,
Raghavendra Prasad B,
Jagdev Singh,
Umesh Kamath P,
Kathiravan S,
Vishnu T,
Suresha,
Savarimuthu P,
Jalshri H Desai,
Rajiv Kumaran,
Shiv Sagar,
Sumit Kumar,
Inderjeet Singh Bamrah,
Amit Kumar
Abstract:
Aditya-L1 is the first Indian space mission to explore the Sun and solar atmosphere with seven multi-wavelength payloads, with the Visible Emission Line Coronagraph (VELC) being the prime payload. It is an internally occulted coronagraph with four channels: one to image the Sun at 5000 Å in the field of view (FOV) 1.05 - 3 $R_\odot$, and three to pursue spectroscopy at 5303 Å, 7892 Å and 10747 Å in the FOV 1.05 - 1.5 $R_\odot$. In addition, spectropolarimetry is planned in the 10747 Å channel. Therefore, VELC has three sCMOS detectors and one InGaAs detector. In this article, we aim to describe the technical details and specifications of the detectors achieved by way of thermo-vacuum calibration at the CREST campus of the Indian Institute of Astrophysics, Bangalore, India. Furthermore, we report the estimated conversion gain, full-well capacity, and readout noise at different temperatures. Based on these numbers, we conclude that it is essential to operate the sCMOS detectors and the InGaAs detector at $-5^{\circ}$C and $-17^{\circ}$C, respectively, at the spacecraft level.
Submitted 12 March, 2024;
originally announced March 2024.
-
Search for CP-violating Neutrino Non-Standard Interactions with the NOvA Experiment
Authors:
NOvA Collaboration,
M. A. Acero,
B. Acharya,
P. Adamson,
L. Aliaga,
N. Anfimov,
A. Antoshkin,
E. Arrieta-Diaz,
L. Asquith,
A. Aurisano,
A. Back,
N. Balashov,
P. Baldi,
B. A. Bambah,
A. Bat,
K. Bays,
R. Bernstein,
T. J. C. Bezerra,
V. Bhatnagar,
D. Bhattarai,
B. Bhuyan,
J. Bian,
A. C. Booth,
R. Bowles,
B. Brahma
, et al. (182 additional authors not shown)
Abstract:
This Letter reports a search for charge-parity (CP) symmetry violating non-standard interactions (NSI) of neutrinos with matter using the NOvA Experiment, and examines their effects on the determination of the standard oscillation parameters. Data from $ν_μ(\barν_μ)\rightarrowν_μ(\barν_μ)$ and $ν_μ(\barν_μ)\rightarrowν_{e}(\barν_{e})$ oscillation channels are used to measure the effect of the NSI parameters $\varepsilon_{eμ}$ and $\varepsilon_{eτ}$. With 90% C.L. the magnitudes of the NSI couplings are constrained to be $|\varepsilon_{eμ}| \, \lesssim 0.3$ and $|\varepsilon_{eτ}| \, \lesssim 0.4$. A degeneracy at $|\varepsilon_{eτ}| \, \approx 1.8$ is reported, and we observe that the presence of NSI limits sensitivity to the standard CP phase $δ_{\tiny\text{CP}}$.
Submitted 11 March, 2024;
originally announced March 2024.
-
Ising model with non-reciprocal interactions
Authors:
Agney K. Rajeev,
A. V. Anil Kumar
Abstract:
Effective interactions that violate Newton's third law of action-reaction symmetry are common in systems where interactions are mediated by a non-equilibrium environment. Extensive Monte Carlo simulations are carried out on a two-dimensional Ising model, where the interactions are modified non-reciprocally. We demonstrate that the critical temperature decreases as the non-reciprocity increases and this decrease depends only on the magnitude of non-reciprocity. Further, travelling spin waves due to the local fluctuations in magnetisation are observed and these spin waves travel opposite to the non-reciprocity vector.
Submitted 11 March, 2024;
originally announced March 2024.