Unveiling Glitches: A Deep Dive into Image Encoding Bugs within CLIP

Authors: Ayush Ranjan, Daniel Wen, Karthik Bhat

Abstract: Understanding the limitations and weaknesses of state-of-the-art models in artificial intelligence is crucial for their improvement and responsible application. In this research, we focus on CLIP, a model renowned for its integration of vision and language processing. Our objective is to uncover recurring problems and blind spots in CLIP's image comprehension. By delving into both the commonalities and disparities between CLIP and human image understanding, we augment our comprehension of these models' capabilities. Through our analysis, we reveal significant discrepancies in CLIP's interpretation of images compared to human perception, shedding light on areas requiring improvement. Our methodologies, the Discrepancy Analysis Framework (DAF) and the Transformative Caption Analysis for CLIP (TCAC), enable a comprehensive evaluation of CLIP's performance. We identify 14 systemic faults, including Action vs. Stillness confusion, Failure to identify the direction of movement or positioning of objects in the image, Hallucination of Water-like Features, Misattribution of Geographic Context, among others. By addressing these limitations, we lay the groundwork for the development of more accurate and nuanced image embedding models, contributing to advancements in artificial intelligence.

Submitted 30 June, 2024; originally announced July 2024.

ACM Class: F.2.2; I.2.7

arXiv:2406.01000 [pdf, other]

doi 10.1016/j.asr.2024.06.040

Seasonal variation in nighttime NO radiative cooling as observed by TIMED/SABER in lower thermosphere during solar maximum and solar minimum

Authors: Alok Kumar Ranjan, MV Sunil Krishna, Akash Kumar, Dayakrishna Nailwal, Sumanta Sarkhel

Abstract: Both composition and temperature play a crucial role in determining the NO radiative cooling in lower thermosphere as observed by TIMED/SABER. In this work, we present a detailed investigation of seasonal variation in thermospheric NO radiative cooling. We have carried forward the investigation of \cite{li2018} regarding the variations in local nighttime peak NO radiative cooling and its altitude during solar maximum and solar minimum conditions. By analyzing latitudinal changes over quiet times for each month in year 2018, it is evident that both the investigative parameters exhibit summer-winter variability. The qualitative contribution of different species (i.e., NO, and O), and temperatures in determining the vertical profile of NO radiative cooling for different latitudes is investigated by utilizing the NRLMSISE-00 estimated parameters, and SNOE observed NO density. The temperature, NO density, meridional wind, and associated compositional variations due to asymmetrical solar heating in both the hemispheres during solar minimum conditions seem to be the dominating factor in controlling the NO radiative cooling during different seasons. The altitudes at which maximum cooling by NO occurs exhibits an inverse correlation with the amount of radiative cooling. The region of enhanced NO densities (polar and summer hemispheric low-mid latitude regions) have larger NO radiative cooling with lower peak altitudes in comparison to other regions (equatorial to winter hemispheric low-mid latitude regions), where NO radiative cooling is low with higher peak altitude values.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 19 pages, 10 figures

arXiv:2405.19801 [pdf, other]

Modeling of Nitric Oxide Infrared radiative flux in lower thermosphere: a machine learning perspective

Authors: Dayakrishna Nailwal, MV Sunil Krishna, Alok Kumar Ranjan, Jia Yue

Abstract: Nitric Oxide (NO) significantly impacts energy distribution and chemical processes in the mesosphere and lower thermosphere (MLT). During geomagnetic storms, a substantial influx of energy in the thermosphere leads to an increase in NO infrared emissions. Accurately predicting the radiative flux of Nitric Oxide is crucial for understanding the thermospheric energy budget, particularly during extreme space weather events. With advancements in computational techniques, machine learning (ML) has become a highly effective tool for space weather forecasting. This effort becomes even more worthwhile considering the availability of two decades of continuous NO infrared emissions measurement by TIMED/SABER along with several other key thermospheric variables. We present the scheme of development of an ML-based predictive model for Nitric Oxide Infrared Radiative Flux (NOIRF). Various ML algorithms have been tested for better predictive ability, and an optimized model (NOEMLM) has been developed for the study of NOIRF. This model is able to extract the underlying relationships between the input features and effectively predict the NOIRF. The NOEMLM predictions have very good agreements with SABER observation during quiet time as well as geomagnetic storms. In comparison with the existing TIEGCM model, NOEMLM has very good performance, especially during extreme space weather conditions. The results of this study suggest that utilizing geomagnetic and space weather indices with ML/AI can serve as superior parameters for studying the upper atmosphere, as compared to focusing on specific species having complex chemical processes and associated uncertainties in constituents. ML techniques can effectively carry out the analysis with greater ease than traditional chemical studies.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 18 pages, 7 figures

Journal ref: Under review in Advances in Space Research 2024

arXiv:2405.10183 [pdf, other]

A Guide to Tracking Phylogenies in Parallel and Distributed Agent-based Evolution Models

Authors: Matthew Andres Moreno, Anika Ranjan, Emily Dolson, Luis Zaman

Abstract: Computer simulations are an important tool for studying the mechanics of biological evolution. In particular, in silico work with agent-based models provides an opportunity to collect high-quality records of ancestry relationships among simulated agents. Such phylogenies can provide insight into evolutionary dynamics within these simulations. Existing work generally tracks lineages directly, yielding an exact phylogenetic record of evolutionary history. However, direct tracking can be inefficient for large-scale, many-processor evolutionary simulations. An alternate approach to extracting phylogenetic information from simulation that scales more favorably is post hoc estimation, akin to how bioinformaticians build phylogenies by assessing genetic similarities between organisms. Recently introduced ``hereditary stratigraphy'' algorithms provide means for efficient inference of phylogenetic history from non-coding annotations on simulated organisms' genomes. A number of options exist in configuring hereditary stratigraphy methodology, but no work has yet tested how they impact reconstruction quality. To address this question, we surveyed reconstruction accuracy under alternate configurations across a matrix of evolutionary conditions varying in selection pressure, spatial structure, and ecological dynamics. We synthesize results from these experiments to suggest a prescriptive system of best practices for work with hereditary stratigraphy, ultimately guiding researchers in choosing appropriate instrumentation for large-scale simulation studies.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2404.16401 [pdf, other]

Model independent approach for calculating galaxy rotation curves for low $S/N$ MaNGA galaxies

Authors: Sangwoo Park, Arman Shafieloo, Satadru Bag, Mikhail Denissenya, Eric V. Linder, Adarsh Ranjan

Abstract: Internal kinematics of galaxies, traced through the stellar rotation curve or two dimensional velocity map, carry important information on galactic structure and dark matter. With upcoming surveys, the velocity map may play a key role in the development of kinematic lensing as an astrophysical probe. We improve techniques for extracting velocity information from integral field spectroscopy at low signal-to-noise ($S/N$), without a template, and demonstrate substantial advantages over the standard Penalized PiXel-Fitting method (pPXF) approach. We note that robust rotation curves can be derived down to $S/N\approx 2$ using our method.

Submitted 25 April, 2024; originally announced April 2024.

Comments: 13 pages, 8 figures, 1 table

arXiv:2403.16247 [pdf, other]

Improving Sequence-to-Sequence Models for Abstractive Text Summarization Using Meta Heuristic Approaches

Authors: Aditya Saxena, Ashutosh Ranjan

Abstract: As human society transitions into the information age, reduction in our attention span is a contingency, and people who spend time reading lengthy news articles are decreasing rapidly and the need for succinct information is higher than ever before. Therefore, it is essential to provide a quick overview of important news by concisely summarizing the top news article and the most intuitive headline. When humans try to make summaries, they extract the essential information from the source and add useful phrases and grammatical annotations from the original extract. Humans have a unique ability to create abstractions. However, automatic summarization is a complicated problem to solve. The use of sequence-to-sequence (seq2seq) models for neural abstractive text summarization has been ascending as far as prevalence. Numerous innovative strategies have been proposed to develop the current seq2seq models further, permitting them to handle different issues like saliency, familiarity, and human lucidness and create excellent synopses. In this article, we aimed toward enhancing the present architectures and models for abstractive text summarization. The modifications have been aimed at fine-tuning hyper-parameters, attempting specific encoder-decoder combinations. We examined many experiments on an extensively used CNN/DailyMail dataset to check the effectiveness of various models.

Submitted 24 March, 2024; originally announced March 2024.

arXiv:2403.06437 [pdf, other]

doi 10.3847/1538-4357/ad32ce

Distribution of merging and post-merging galaxies in nearby galaxy clusters

Authors: Duho Kim, Yun-Kyeong Sheen, Yara L. Jaffé, Kshitija Kelkar, Adarsh Ranjan, Franco Piraino-Cerda, Jacob P. Crossett, Ana Carolina Costa Lourenço, Garreth Martin, Julie B. Nantais, Ricardo Demarco, Ezequiel Treister, Sukyoung K. Yi

Abstract: We study the incidence and spatial distribution of galaxies that are currently undergoing gravitational merging (M) or that have signs of a post merger (PM) in six galaxy clusters (A754, A2399, A2670, A3558, A3562, and A3716) within the redshift range, 0.05$\lesssim$$z$$\lesssim$0.08. To this aim, we obtained Dark Energy Camera (DECam) mosaics in $u^{\prime}$, $g^{\prime}$, and $r^{\prime}$-bands covering up to $3\times R_{200}$ of the clusters, reaching 28 mag/arcsec$^2$ surface brightness limits. We visually inspect $u^{\prime}$$g^{\prime}$$r^{\prime}$ color-composite images of volume-limited ($M_r < -20$) cluster-member galaxies to identify whether galaxies are of M or PM types. We find 4% M-type and 7% PM-type galaxies in the galaxy clusters studied. By adding spectroscopic data and studying the projected phase space diagram (PPSD) of the projected clustocentric radius and the line-of-sight velocity, we find that PM-type galaxies are more virialized than M-type galaxies, having 1--5% point higher fraction within the escape-velocity region, while the fraction of M-type was $\sim$10% point higher than PM-type in the intermediate environment. Similarly, on a substructure analysis, M types were found in the outskirt groups, while PM types populated groups in ubiquitous regions of the PPSD. Adopting literature-derived dynamical state indicator values, we observed a higher abundance of M types in dynamically relaxed clusters. This finding suggests that galaxies displaying post-merging features within clusters likely merged in low-velocity environments, including cluster outskirts and dynamically relaxed clusters.

Submitted 3 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 23 pages, 15 figures, 4 tables, Published in ApJ. For photometric catalogs and associated information, see https://data.kasi.re.kr/vo/DECam_catalogs/

Journal ref: ApJ 966 124 (2024)

arXiv:2403.01410 [pdf, other]

Barrier Functions Inspired Reward Shaping for Reinforcement Learning

Authors: Nilaksh Nilaksh, Abhishek Ranjan, Shreenabh Agrawal, Aayush Jain, Pushpak Jagtap, Shishir Kolathaya

Abstract: Reinforcement Learning (RL) has progressed from simple control tasks to complex real-world challenges with large state spaces. While RL excels in these tasks, training time remains a limitation. Reward shaping is a popular solution, but existing methods often rely on value functions, which face scalability issues. This paper presents a novel safety-oriented reward-shaping framework inspired by barrier functions, offering simplicity and ease of implementation across various environments and tasks. To evaluate the effectiveness of the proposed reward formulations, we conduct simulation experiments on CartPole, Ant, and Humanoid environments, along with real-world deployment on the Unitree Go1 quadruped robot. Our results demonstrate that our method leads to 1.4-2.8 times faster convergence and as low as 50-60% actuation effort compared to the vanilla reward. In a sim-to-real experiment with the Go1 robot, we demonstrated better control and dynamics of the bot with our reward framework.

Submitted 1 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

Comments: 7 pages, 10 figures, Accepted as contributed paper at ICRA 2024

ACM Class: I.2.9

arXiv:2402.07655 [pdf, other]

Inhomogeneous spin momentum induced orbital motion of birefringent particles in tight focusing of vector beams in optical tweezers

Authors: Ram Nandan Kumar, Sauvik Roy, Anand Dev Ranjan, Subhasish Dutta Gupta, Nirmalya Ghosh, Ayan Banerjee

Abstract: Spin orbit interaction (SOI) due to tight focusing of light in optical tweezers has led to exciting and exotic avenues towards inducing rotation in microscopic particles. However, instances where the back action of the particles influences and modifies SOI effects so as to induce rotational motion are rarely known. Here, we tightly focus a vector beam having radial/azimuthal polarization carrying no intrinsic angular momentum, into a refractive index stratified medium, and observe orbital rotation of birefringent particles around the beam propagation axis. In order to validate our experimental findings, we perform numerical simulations of the underlying equations. Our simulations reveal that the interaction of light with a birefringent particle gives rise to inhomogeneous spin currents near the focus, resulting in a finite spin momentum. This spin momentum combines with the canonical momentum to finally generate an origin-dependent orbital angular momentum which is manifested in the rotation of the birefringent particles around the beam axis. Our study describes a unique modulation of the SOI of light due to interaction with anisotropic particles that can be used to identify new avenues for exotic and complex particle manipulation in optical tweezers.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 7 pages, 6 figures. arXiv admin note: substantial text overlap with arXiv:2302.14443

arXiv:2401.03732 [pdf, other]

doi 10.1142/S0217732324500081

Heavy Hexaquarks in the Flux Tube Model

Authors: Sindhu D G, Akhilesh Ranjan, Hemwati Nandan, V. Sharma

Abstract: Hexaquarks are one of the currently emerging topics in both experimental and theoretical high energy physics. Hexaquarks have been examined in relation to particle physics, however, there are still some research and theoretical conjectures surrounding their relationship to dark matter. Due to some experimental discoveries, it has attracted much interest and also resulted in new theoretical models to study the properties of these states. In the present work, Regge trajectories of some hexaquark states are compared with tetraquark and pentaquark states. The study is mainly concentrated on fully heavy hexaquark states. The mass spectra of these hexaquark states have also been investigated and the results are compared with other theoretical works. Our findings agree well with those of other researchers.

Submitted 8 January, 2024; originally announced January 2024.

Comments: Preprint of an article submitted for consideration in [Modern Physics Letters A] © [2023] [copyright World Scientific Publishing Company] [https://www.worldscientific.com/worldscinet/mpla]

Journal ref: Mod. Phys. Lett. A, (2024) 2450008

arXiv:2312.11537 [pdf, other]

FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline

Authors: Chien-Yu Lin, Qichen Fu, Thomas Merth, Karren Yang, Anurag Ranjan

Abstract: Super-resolution (SR) techniques have recently been proposed to upscale the outputs of neural radiance fields (NeRF) and generate high-quality images with enhanced inference speeds. However, existing NeRF+SR methods increase training overhead by using extra input features, loss functions, and/or expensive training procedures such as knowledge distillation. In this paper, we aim to leverage SR for efficiency gains without costly training or architectural changes. Specifically, we build a simple NeRF+SR pipeline that directly combines existing modules, and we propose a lightweight augmentation technique, random patch sampling, for training. Compared to existing NeRF+SR methods, our pipeline mitigates the SR computing overhead and can be trained up to 23x faster, making it feasible to run on consumer devices such as the Apple MacBook. Experiments show our pipeline can upscale NeRF outputs by 2-4x while maintaining high quality, increasing inference speeds by up to 18x on an NVIDIA V100 GPU and 12.8x on an M1 Pro chip. We conclude that SR can be a simple but effective technique for improving the efficiency of NeRF models for consumer devices.

Submitted 20 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: WACV 2024 (Oral)

arXiv:2311.18168 [pdf, other]

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

Authors: Karren D. Yang, Anurag Ranjan, Jen-Hao Rick Chang, Raviteja Vemulapalli, Oncel Tuzel

Abstract: We consider the task of animating 3D facial geometry from speech signal. Existing works are primarily deterministic, focusing on learning a one-to-one mapping from speech signal to 3D face meshes on small datasets with limited speakers. While these models can achieve high-quality lip articulation for speakers in the training set, they are unable to capture the full and diverse distribution of 3D facial motions that accompany speech in the real world. Importantly, the relationship between speech and facial motion is one-to-many, containing both inter-speaker and intra-speaker variations and necessitating a probabilistic approach. In this paper, we identify and address key challenges that have so far limited the development of probabilistic models: lack of datasets and metrics that are suitable for training and evaluating them, as well as the difficulty of designing a model that generates diverse results while remaining faithful to a strong conditioning signal as speech. We first propose large-scale benchmark datasets and metrics suitable for probabilistic modeling. Then, we demonstrate a probabilistic model that achieves both diversity and fidelity to speech, outperforming other methods across the proposed benchmarks. Finally, we showcase useful applications of probabilistic models trained on these large-scale datasets: we can generate diverse speech-driven 3D facial motion that matches unseen speaker styles extracted from reference clips; and our synthetic meshes can be used to improve the performance of downstream audio-visual models.

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.17910 [pdf, other]

HUGS: Human Gaussian Splats

Authors: Muhammed Kocabas, Jen-Hao Rick Chang, James Gabriel, Oncel Tuzel, Anurag Ranjan

Abstract: Recent advances in neural rendering have improved both training and rendering times by orders of magnitude. While these methods demonstrate state-of-the-art quality and speed, they are designed for photogrammetry of static scenes and do not generalize well to freely moving humans in the environment. In this work, we introduce Human Gaussian Splats (HUGS) that represents an animatable human together with the scene using 3D Gaussian Splatting (3DGS). Our method takes only a monocular video with a small number of (50-100) frames, and it automatically learns to disentangle the static scene and a fully animatable human avatar within 30 minutes. We utilize the SMPL body model to initialize the human Gaussians. To capture details that are not modeled by SMPL (e.g. cloth, hairs), we allow the 3D Gaussians to deviate from the human body model. Utilizing 3D Gaussians for animated humans brings new challenges, including the artifacts created when articulating the Gaussians. We propose to jointly optimize the linear blend skinning weights to coordinate the movements of individual Gaussians during animation. Our approach enables novel-pose synthesis of human and novel view synthesis of both the human and the scene. We achieve state-of-the-art rendering quality with a rendering speed of 60 FPS while being ~100x faster to train over previous work. Our code will be announced here: https://github.com/apple/ml-hugs

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.07984 [pdf]

Intrinsic defect engineering of CVD grown monolayer MoS$_2$ for tuneable functional nanodevices

Authors: Irfan H. Abidi, Sindhu Priya Giridhar, Jonathan O. Tollerud, Jake Limb, Aishani Mazumder, Edwin LH Mayes, Billy J. Murdoch, Chenglong Xu, Ankit Bhoriya, Abhishek Ranjan, Taimur Ahmed, Yongxiang Li, Jeffrey A. Davis, Cameron L. Bentley, Salvy P. Russo, Enrico Della Gaspera, Sumeet Walia

Abstract: Defects in atomically thin materials can drive new functionalities and expand applications to multifunctional systems that are monolithically integrated. An ability to control formation of defects during the synthesis process is an important capability to create practical deployment opportunities. Molybdenum disulfide (MoS$_2$), a two-dimensional (2D) semiconducting material harbors intrinsic defects that can be harnessed to achieve tuneable electronic, optoelectronic, and electrochemical devices. However, achieving precise control over defect formation within monolayer MoS$_2$, while maintaining the structural integrity of the crystals remains a notable challenge. Here, we present a one-step, in-situ defect engineering approach for monolayer MoS$_2$ using a pressure dependent chemical vapour deposition (CVD) process. Monolayer MoS$_2$ grown in low-pressure CVD conditions (LP-MoS$_2$) produces sulfur vacancy (Vs) induced defect rich crystals primarily attributed to the kinetics of the growth conditions. Conversely, atmospheric pressure CVD grown MoS$_2$ (AP-MoS$_2$) passivates these Vs defects with oxygen. This disparity in defect profiles profoundly impacts crucial functional properties and device performance. AP-MoS$_2$ shows a drastically enhanced photoluminescence, which is significantly quenched in LP-MoS$_2$ attributed to in-gap electron donor states induced by the Vs defects. However, the n-doping induced by the Vs defects in LP-MoS$_2$ generates enhanced photoresponsivity and detectivity in our fabricated photodetectors compared to the AP-MoS$_2$ based devices. Defect-rich LP-MoS$_2$ outperforms AP-MoS$_2$ as channel layers of field-effect transistors (FETs), as well as electrocatalytic material for hydrogen evolution reaction (HER). This work presents a single-step CVD approach for in-situ defect engineering in monolayer MoS$_2$ and presents a pathway to control defects in other monolayer material systems.

Submitted 14 November, 2023; originally announced November 2023.

Comments: 29 pages, 5 figures

arXiv:2310.15130 [pdf, other]

Novel-View Acoustic Synthesis from 3D Reconstructed Rooms

Authors: Byeongjoo Ahn, Karren Yang, Brian Hamilton, Jonathan Sheaffer, Anurag Ranjan, Miguel Sarabia, Oncel Tuzel, Jen-Hao Rick Chang

Abstract: We investigate the benefit of combining blind audio recordings with 3D scene information for novel-view acoustic synthesis. Given audio recordings from 2-4 microphones and the 3D geometry and material of a scene containing multiple unknown sound sources, we estimate the sound anywhere in the scene. We identify the main challenges of novel-view acoustic synthesis as sound source localization, separation, and dereverberation. While naively training an end-to-end network fails to produce high-quality results, we show that incorporating room impulse responses (RIRs) derived from 3D reconstructed rooms enables the same network to jointly tackle these tasks. Our method outperforms existing methods designed for the individual tasks, demonstrating its effectiveness at utilizing 3D visual information. In a simulated study on the Matterport3D-NVAS dataset, our model achieves near-perfect accuracy on source localization, a PSNR of 26.44 dB and a SDR of 14.23 dB for source separation and dereverberation, resulting in a PSNR of 25.55 dB and a SDR of 14.20 dB on novel-view acoustic synthesis. Code, pretrained model, and video results are available on the project webpage (https://github.com/apple/ml-nvas3d).

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.00831 [pdf, other]

Action Recognition Utilizing YGAR Dataset

Authors: Shuo Wang, Amiya Ranjan, Lawrence Jiang

Abstract: The scarcity of high-quality action video data is a bottleneck in the research and application of action recognition. Although significant effort has been made in this area, there still exist gaps in the range of available data types that a more flexible and comprehensive data set could help bridge. In this paper, we present a new 3D action data simulation engine and generate 3 sets of sample data to demonstrate its current functionalities. With the new data generation process, we demonstrate its applications to image classification and action recognition, and its potential to evolve into a system that would allow the exploration of much more complex action recognition tasks. In order to show off these capabilities, we also train and test a list of commonly used models for image recognition to demonstrate the potential applications and capabilities of the data sets and their generation process.

Submitted 1 October, 2023; originally announced October 2023.

Comments: 10 pages, 18 figures

arXiv:2309.15259 [pdf, other]

doi 10.1609/aaai.v37i8.26175

SLIQ: Quantum Image Similarity Networks on Noisy Quantum Computers

Authors: Daniel Silver, Tirthak Patel, Aditya Ranjan, Harshitta Gandhi, William Cutler, Devesh Tiwari

Abstract: Exploration into quantum machine learning has grown tremendously in recent years due to the ability of quantum computers to speed up classical programs. However, these efforts have yet to solve unsupervised similarity detection tasks due to the challenge of porting them to run on quantum computers. To overcome this challenge, we propose SLIQ, the first open-sourced work for resource-efficient quantum similarity detection networks, built with practical and effective quantum learning and variance-reducing algorithms.

Submitted 26 September, 2023; originally announced September 2023.

Journal ref: Vol. 37 No. 8: AAAI-2023 Technical Tracks 8

arXiv:2309.07164 [pdf, other]

Hybrid ASR for Resource-Constrained Robots: HMM - Deep Learning Fusion

Authors: Anshul Ranjan, Kaushik Jegadeesan

Abstract: This paper presents a novel hybrid Automatic Speech Recognition (ASR) system designed specifically for resource-constrained robots. The proposed approach combines Hidden Markov Models (HMMs) with deep learning models and leverages socket programming to distribute processing tasks effectively. In this architecture, the HMM-based processing takes place within the robot, while a separate PC handles the deep learning model. This synergy between HMMs and deep learning enhances speech recognition accuracy significantly. We conducted experiments across various robotic platforms, demonstrating real-time and precise speech recognition capabilities. Notably, the system exhibits adaptability to changing acoustic conditions and compatibility with low-power hardware, making it highly effective in environments with limited computational resources. This hybrid ASR paradigm opens up promising possibilities for seamless human-robot interaction. In conclusion, our research introduces a pioneering dimension to ASR techniques tailored for robotics. By employing socket programming to distribute processing tasks across distinct devices and strategically combining HMMs with deep learning models, our hybrid ASR system showcases its potential to enable robots to comprehend and respond to spoken language adeptly, even in environments with restricted computational resources. This paradigm sets an innovative course for enhancing human-robot interaction across a wide range of real-world scenarios.

Submitted 11 September, 2023; originally announced September 2023.

Comments: To be published in IEEE Access, 9 pages, 14 figures, Received valuable support from CCBD PESU, for associated code, see https://github.com/AnshulRanjan2004/PyHMM

MSC Class: 62M09 (Primary) 62F10; 62F12 (Secondary)

ACM Class: I.2.7; I.2.9

arXiv:2308.11096 [pdf, other]

MosaiQ: Quantum Generative Adversarial Networks for Image Generation on NISQ Computers

Authors: Daniel Silver, Tirthak Patel, William Cutler, Aditya Ranjan, Harshitta Gandhi, Devesh Tiwari

Abstract: Quantum machine learning and vision have come to the fore recently, with hardware advances enabling rapid advancement in the capabilities of quantum machines. Recently, quantum image generation has been explored with many potential advantages over non-quantum techniques; however, previous techniques have suffered from poor quality and robustness. To address these problems, we introduce MosaiQ, a high-quality quantum image generation GAN framework that can be executed on today's Near-term Intermediate Scale Quantum (NISQ) computers.

Submitted 21 August, 2023; originally announced August 2023.

Comments: Accepted to appear at ICCV'23

arXiv:2307.16799 [pdf, other]

Toward Privacy in Quantum Program Execution On Untrusted Quantum Cloud Computing Machines for Business-sensitive Quantum Needs

Authors: Tirthak Patel, Daniel Silver, Aditya Ranjan, Harshitta Gandhi, William Cutler, Devesh Tiwari

Abstract: Quantum computing is an emerging paradigm that has shown great promise in accelerating large-scale scientific, optimization, and machine-learning workloads. With most quantum computing solutions being offered over the cloud, it has become imperative to protect confidential and proprietary quantum code from being accessed by untrusted and/or adversarial agents. In response to this challenge, we propose SPYCE, which is the first known solution to obfuscate quantum code and output to prevent the leaking of any confidential information over the cloud. SPYCE implements a lightweight, scalable, and effective solution based on the unique principles of quantum computing to achieve this task.

Submitted 31 July, 2023; originally announced July 2023.

arXiv:2307.14231 [pdf, other]

Giant conductance of PSS:PEDOT micro-surfaces induced by microbubble lithography

Authors: Anand Dev Ranjan, Rakesh Sen, Sumeet Kumar, Rahul Vaippully, Soumya Dutta, Soumyajit Roy, Basudev Roy, Ayan Banerjee

Abstract: We provide direct evidence of the effects of interface engineering of various substrates by Microbubble lithography (MBL). We choose a model organic plastic (or polymer) poly(3,4-ethylenedioxythiophene) polystyrene sulfonate (PEDOT:PSS), with conductivity of 140 S/cm, as a representative organic system to showcase our technique. Thus, we fabricate permanent patterns of PEDOT:PSS on glass, followed by a flexible PDMS substrate, and observe conductivity enhancement of 5 times on the former (694 S/cm), and 20 times (2844 S/cm) on the latter, without the use of external doping agents or invasive chemical treatment. Probing the patterned interface, we observe that MBL is able to tune the conformational states of PEDOT:PSS from coils in the pristine form, to extended coils on glass, and almost linear structures in PDMS due to its more malleable liquid-like interface. This results in higher ordering and vanishing grain boundaries leading to the highest conductivity of PEDOT:PSS on PDMS substrates.

Submitted 26 July, 2023; originally announced July 2023.

arXiv:2307.13284 [pdf, other]

doi 10.1142/S0217751X23500446

Regge Trajectories of Tetraquarks and Pentaquarks with Massive Quarks in the Flux Tube Model

Authors: Sindhu D G, Akhilesh Ranjan, Hemwati Nandan

Abstract: In recent years, many tetraquarks and pentaquarks have been discovered by various experimental groups; X(3872), $Z_c(3900)$, X(4430), $P_c^+(4312)$, and $P_c^+(4457)$ are some of the interesting observed tetraquark and pentaquark states. The Regge trajectories of some such states are studied in view of the flux tube model of hadrons with finite quark masses. The effect of flux tube (or string) length variation on the Regge trajectories of these states is analysed in detail. It is observed that for a fixed angular momentum, the string length has a constant value. Some other states are also proposed and the results obtained are then compared with the studies by others. Our findings correspond rather well with those of other researchers and with those of the experiment.

Submitted 25 July, 2023; originally announced July 2023.

Comments: 10 pages, 4 figures, Published in IJMPA

Journal ref: IJMPA 38, 06n07 2350044 (2023)

arXiv:2307.13278 [pdf, other]

doi 10.1088/1361-6471/acd1a3

Models and Potentials in Hadron Spectroscopy

Authors: Sreelakshmi M, Akhilesh Ranjan

Abstract: In the past twenty years, hadron spectroscopy has made immense progress. Experimental facilities have observed different multiquark states during these years. There are different models and phenomenological potentials to study the nature of interquark interaction. In this work, we have reviewed different quark potentials and models used in hadron spectroscopy.

Submitted 25 July, 2023; originally announced July 2023.

Comments: 35 pages, 1 figure

Journal ref: 2023 J. Phys. G: Nucl. Part. Phys. 50 073001

arXiv:2306.11177 [pdf, other]

Pipit: Scripting the analysis of parallel execution traces

Authors: Abhinav Bhatele, Rakrish Dhakal, Alexander Movsesyan, Aditya K. Ranjan, Onur Cankur

Abstract: Performance analysis is a critical step in the oft-repeated, iterative process of performance tuning of parallel programs. Per-process, per-thread traces (detailed logs of events with timestamps) enable in-depth analysis of parallel program execution to identify different kinds of performance issues. Oftentimes, trace collection tools provide a graphical tool to analyze the trace output. However, these GUI-based tools only support specific file formats, are challenging to scale to large trace sizes, limit data exploration to the implemented graphical views, and do not support automated comparisons of two or more datasets. In this paper, we present a programmatic approach to analyzing parallel execution traces by leveraging pandas, a powerful Python-based data analysis library. We have developed a Python library, Pipit, on top of pandas that can read traces in different file formats (OTF2, HPCToolkit, Projections, Nsight Systems, etc.) and provides a uniform data structure in the form of a pandas DataFrame. Pipit provides operations to aggregate, filter, and transform the events in a trace to present the data in different ways. We also provide several functions to quickly and easily identify performance issues in parallel executions. More importantly, the API is easily extensible to support custom analyses by different end users.

Submitted 14 May, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

arXiv:2305.13525 [pdf, other]

A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs

Authors: Siddharth Singh, Prajwal Singhania, Aditya K. Ranjan, Zack Sating, Abhinav Bhatele

Abstract: Heavy communication, in particular, collective operations, can become a critical performance bottleneck in scaling the training of billion-parameter neural networks to large-scale parallel systems. This paper introduces a four-dimensional (4D) approach to optimize communication in parallel training. This 4D approach is a hybrid of 3D tensor and data parallelism, and is implemented in the AxoNN framework. In addition, we employ two key strategies to further minimize communication overheads. First, we aggressively overlap expensive collective operations (reduce-scatter, all-gather, and all-reduce) with computation. Second, we develop an analytical model to identify high-performing configurations within the large search space defined by our 4D algorithm. This model empowers practitioners by simplifying the tuning process for their specific training workloads. When training an 80-billion parameter GPT on 1024 GPUs of Perlmutter, AxoNN surpasses Megatron-LM, a state-of-the-art framework, by a significant 26%. Additionally, it achieves a significantly high 57% of the theoretical peak FLOP/s or 182 PFLOP/s in total.

Submitted 14 May, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2305.05687 [pdf, other]

doi 10.3847/1538-4357/accc89

Coronal Heating as Determined by the Solar Flare Frequency Distribution Obtained by Aggregating Case Studies

Authors: James Paul Mason, Alexandra Werth, Colin G. West, Allison A. Youngblood, Donald L. Woodraska, Courtney Peck, Kevin Lacjak, Florian G. Frick, Moutamen Gabir, Reema A. Alsinan, Thomas Jacobsen, Mohammad Alrubaie, Kayla M. Chizmar, Benjamin P. Lau, Lizbeth Montoya Dominguez, David Price, Dylan R. Butler, Connor J. Biron, Nikita Feoktistov, Kai Dewey, N. E. Loomis, Michal Bodzianowski, Connor Kuybus, Henry Dietrick, Aubrey M. Wolfe, et al. (977 additional authors not shown)

Abstract: Flare frequency distributions represent a key approach to addressing one of the largest problems in solar and stellar physics: determining the mechanism that counter-intuitively heats coronae to temperatures that are orders of magnitude hotter than the corresponding photospheres. It is widely accepted that the magnetic field is responsible for the heating, but there are two competing mechanisms that could explain it: nanoflares or Alfvén waves. To date, neither can be directly observed. Nanoflares are, by definition, extremely small, but their aggregate energy release could represent a substantial heating mechanism, presuming they are sufficiently abundant. One way to test this presumption is via the flare frequency distribution, which describes how often flares of various energies occur. If the slope of the power law fitting the flare frequency distribution is above a critical threshold, $α=2$ as established in prior literature, then there should be a sufficient abundance of nanoflares to explain coronal heating. We performed $>$600 case studies of solar flares, made possible by an unprecedented number of data analysts via three semesters of an undergraduate physics laboratory course. This allowed us to include two crucial, but nontrivial, analysis methods: pre-flare baseline subtraction and computation of the flare energy, which requires determining flare start and stop times. We aggregated the results of these analyses into a statistical study to determine that $α= 1.63 \pm 0.03$. This is below the critical threshold, suggesting that Alfvén waves are an important driver of coronal heating.
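
For context on the $α=2$ threshold, a standard worked derivation may help; it is not part of the abstract, and the symbols $A$, $E_{\min}$, and $E_{\max}$ are introduced here purely for illustration. Assuming the flare frequency distribution is a pure power law, $dN/dE = A E^{-α}$, the total energy released by flares with energies between $E_{\min}$ and $E_{\max}$ is

$$W = \int_{E_{\min}}^{E_{\max}} E \, \frac{dN}{dE} \, dE = \frac{A}{2-α}\left(E_{\max}^{2-α} - E_{\min}^{2-α}\right), \qquad α \neq 2.$$

For $α > 2$ the exponent $2-α$ is negative, so the $E_{\min}^{2-α}$ term dominates and the smallest events carry most of the energy. For $α < 2$ the $E_{\max}^{2-α}$ term dominates and the largest flares carry most of it. Under this simplified model, the reported $α = 1.63 \pm 0.03$ falls in the second regime.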

Submitted 9 May, 2023; originally announced May 2023.

Comments: 1,002 authors, 14 pages, 4 figures, 3 tables, published by The Astrophysical Journal on 2023-05-09, volume 948, page 71

arXiv:2304.12390 [pdf, other]

Pointersect: Neural Rendering with Cloud-Ray Intersection

Authors: Jen-Hao Rick Chang, Wei-Yu Chen, Anurag Ranjan, Kwang Moo Yi, Oncel Tuzel

Abstract: We propose a novel method that renders point clouds as if they are surfaces. The proposed method is differentiable and requires no scene-specific optimization. This unique capability enables, out-of-the-box, surface normal estimation, rendering room-scale point clouds, inverse rendering, and ray tracing with global illumination. Unlike existing work that focuses on converting point clouds to other representations--e.g., surfaces or implicit functions--our key idea is to directly infer the intersection of a light ray with the underlying surface represented by the given point cloud. Specifically, we train a set transformer that, given a small number of local neighbor points along a light ray, provides the intersection point, the surface normal, and the material blending weights, which are used to render the outcome of this light ray. Localizing the problem into small neighborhoods enables us to train a model with only 48 meshes and apply it to unseen point clouds. Our model achieves higher estimation accuracy than state-of-the-art surface reconstruction and point-cloud rendering methods on three test sets. When applied to room-scale point clouds, without any scene-specific optimization, the model achieves competitive quality with the state-of-the-art novel-view rendering methods. Moreover, we demonstrate the ability to render and manipulate Lidar-scanned point clouds such as lighting control and object insertion.

Submitted 24 April, 2023; originally announced April 2023.

Comments: CVPR 2023

arXiv:2304.01480 [pdf, other]

FineRecon: Depth-aware Feed-forward Network for Detailed 3D Reconstruction

Authors: Noah Stier, Anurag Ranjan, Alex Colburn, Yajie Yan, Liang Yang, Fangchang Ma, Baptiste Angles

Abstract: Recent works on 3D reconstruction from posed images have demonstrated that direct inference of scene-level 3D geometry without test-time optimization is feasible using deep neural networks, showing remarkable promise and high efficiency. However, the reconstructed geometry, typically represented as a 3D truncated signed distance function (TSDF), is often coarse without fine geometric details. To address this problem, we propose three effective solutions for improving the fidelity of inference-based 3D reconstructions. We first present a resolution-agnostic TSDF supervision strategy to provide the network with a more accurate learning signal during training, avoiding the pitfalls of TSDF interpolation seen in previous work. We then introduce a depth guidance strategy using multi-view depth estimates to enhance the scene representation and recover more accurate surfaces. Finally, we develop a novel architecture for the final layers of the network, conditioning the output TSDF prediction on high-resolution image features in addition to coarse voxel features, enabling sharper reconstruction of fine details. Our method, FineRecon, produces smooth and highly accurate reconstructions, showing significant improvements across multiple depth and 3D reconstruction metrics.

Submitted 18 August, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

Comments: ICCV 2023

arXiv:2303.15437 [pdf, other]

FaceLit: Neural 3D Relightable Faces

Authors: Anurag Ranjan, Kwang Moo Yi, Jen-Hao Rick Chang, Oncel Tuzel

Abstract: We propose a generative framework, FaceLit, capable of generating a 3D face that can be rendered at various user-defined lighting conditions and views, learned purely from 2D images in-the-wild without any manual annotation. Unlike existing works that require careful capture setup or human labor, we rely on off-the-shelf pose and illumination estimators. With these estimates, we incorporate the Phong reflectance model in the neural volume rendering framework. Our model learns to generate shape and material properties of a face such that, when rendered according to the natural statistics of pose and illumination, produces photorealistic face images with multiview 3D and illumination consistency. Our method enables photorealistic generation of faces with explicit illumination and view controls on multiple datasets - FFHQ, MetFaces and CelebA-HQ. We show state-of-the-art photorealism among 3D aware GANs on FFHQ dataset achieving an FID score of 3.5.

Submitted 27 March, 2023; originally announced March 2023.

Comments: CVPR 2023

arXiv:2303.14189 [pdf, other]

FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

Authors: Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, Anurag Ranjan

Abstract: The recent amalgamation of transformer and convolutional designs has led to steady improvements in accuracy and efficiency of the models. In this work, we introduce FastViT, a hybrid vision transformer architecture that obtains the state-of-the-art latency-accuracy trade-off. To this end, we introduce a novel token mixing operator, RepMixer, a building block of FastViT, that uses structural reparameterization to lower the memory access cost by removing skip-connections in the network. We further apply train-time overparametrization and large kernel convolutions to boost accuracy and empirically show that these choices have minimal effect on latency. We show that our model is 3.5x faster than CMT, a recent state-of-the-art hybrid transformer architecture, 4.9x faster than EfficientNet, and 1.9x faster than ConvNeXt on a mobile device for the same accuracy on the ImageNet dataset. At similar latency, our model obtains 4.2% better Top-1 accuracy on ImageNet than MobileOne. Our model consistently outperforms competing architectures across several tasks -- image classification, detection, segmentation and 3D mesh regression with significant improvement in latency on both a mobile device and a desktop GPU. Furthermore, our model is highly robust to out-of-distribution samples and corruptions, improving over competing robust models. Code and models are available at https://github.com/apple/ml-fastvit.

Submitted 17 August, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: ICCV 2023

arXiv:2302.14443 [pdf, other]

Probing inhomogeneous and dual asymmetric angular momentum exploiting spin-orbit interaction in tightly focused vector beams in optical tweezers

Authors: Ram Nandan Kumar, Jeeban Kumar Nayak, Anand Dev Ranjan, Subhasish Dutta Gupta, Nirmalya Ghosh, Ayan Banerjee

Abstract: The spin-orbit interaction (SOI) of light generated by tight focusing in optical tweezers has been regularly employed in generating angular momentum - both spin and orbital - in trapped mesoscopic particles. Specifically, the transverse spin angular momentum (TSAM), which arises due to the longitudinal component of the electromagnetic field generated by tight focusing, is of special interest, both in terms of fundamental studies and associated applications. We provide an effective and optimal strategy for generating TSAM in optical tweezers by tightly focusing radially and azimuthally polarized first-order Laguerre Gaussian beams with no intrinsic angular momentum, into a refractive index stratified medium. Our choice of such input fields ensures that the longitudinal spin angular momentum (LSAM) arising from the electric (magnetic) field for the radial (azimuthal) component is zero, which leads to the separate and exclusive effects of the electric and magnetic TSAM in the case of input radially and azimuthally polarized beams on single birefringent particles. We also observe the emergence of origin-dependent intrinsic orbital angular momentum causing the rotation of birefringent particles around the beam axis for both input beam types, which opens up new and simple avenues for exotic and complex particle manipulation in optical tweezers.

Submitted 28 February, 2023; originally announced February 2023.

Comments: 10 pages, 5 figures

arXiv:2210.14800 [pdf, other]

Naturalistic Head Motion Generation from Speech

Authors: Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald

Abstract: Synthesizing natural head motion to accompany speech for an embodied conversational agent is necessary for providing a rich interactive experience. Most prior works assess the quality of generated head motion by comparing them against a single ground-truth using an objective metric. Yet there are many plausible head motion sequences to accompany a speech utterance. In this work, we study the variation in the perceptual quality of head motions sampled from a generative model. We show that, despite providing more diverse head motions, the generative model produces motions with varying degrees of perceptual quality. We finally show that objective metrics commonly used in previous research do not accurately reflect the perceptual quality of generated head motions. These results open an interesting avenue for future work to investigate better objective metrics that correlate with human perception of quality.

Submitted 26 October, 2022; originally announced October 2022.

Comments: Submitted to ICASSP 2023

arXiv:2207.10237 [pdf, other]

SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks

Authors: Chien-Yu Lin, Anish Prabhu, Thomas Merth, Sachin Mehta, Anurag Ranjan, Maxwell Horton, Mohammad Rastegari

Abstract: Recent isotropic networks, such as ConvMixer and vision transformers, have found significant success across visual recognition tasks, matching or outperforming non-isotropic convolutional neural networks (CNNs). Isotropic architectures are particularly well-suited to cross-layer weight sharing, an effective neural network compression technique. In this paper, we perform an empirical evaluation on methods for sharing parameters in isotropic networks (SPIN). We present a framework to formalize major weight sharing design decisions and perform a comprehensive empirical evaluation of this design space. Guided by our experimental results, we propose a weight sharing strategy to generate a family of models with better overall efficiency, in terms of FLOPs and parameters versus accuracy, compared to traditional scaling methods alone, for example compressing ConvMixer by 1.9x while improving accuracy on ImageNet. Finally, we perform a qualitative study to further understand the behavior of weight sharing in isotropic architectures. The code is available at https://github.com/apple/ml-spin.

Submitted 20 July, 2022; originally announced July 2022.

Comments: Accepted at ECCV 2022

arXiv:2206.04040 [pdf, other]

MobileOne: An Improved One millisecond Mobile Backbone

Authors: Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, Anurag Ranjan

Abstract: Efficient neural network backbones for mobile devices are often optimized for metrics such as FLOPs or parameter count. However, these metrics may not correlate well with latency of the network when deployed on a mobile device. Therefore, we perform extensive analysis of different metrics by deploying several mobile-friendly networks on a mobile device. We identify and analyze architectural and optimization bottlenecks in recent efficient neural networks and provide ways to mitigate these bottlenecks. To this end, we design an efficient backbone MobileOne, with variants achieving an inference time under 1 ms on an iPhone12 with 75.9% top-1 accuracy on ImageNet. We show that MobileOne achieves state-of-the-art performance within the efficient architectures while being many times faster on mobile. Our best model obtains similar performance on ImageNet as MobileFormer while being 38x faster. Our model obtains 2.3% better top-1 accuracy on ImageNet than EfficientNet at similar latency. Furthermore, we show that our model generalizes to multiple tasks - image classification, object detection, and semantic segmentation with significant improvements in latency and accuracy as compared to existing efficient architectures when deployed on a mobile device. Code and models are available at https://github.com/apple/ml-mobileone

Submitted 28 March, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

Comments: Accepted at CVPR 2023

arXiv:2205.03088 [pdf, other]

doi 10.1093/mnras/stac1290

Detecting PAHs in high-z galaxies in proxy: Modelling physical conditions in an extremely strong damped Lyman-alpha absorber towards QSO SDSS J1143+1420 at z=2.323

Authors: Gargi Shaw, A. Ranjan

Abstract: We explore indirect methods to detect Polycyclic Aromatic Hydrocarbons (PAHs) in gas-rich, absorption-selected galaxies at high redshift. We look at the optical VLT/X-shooter observations of an intervening, extremely strong damped Lyman-alpha absorber (or ESDLA, with log(N(HI))>~21.7) towards QSO SDSS J1143+1420 at redshift z(ESDLA)=2.323. Literature studies have shown that this ESDLA contains signatures of dust and diffuse molecular hydrogen, and it was specifically chosen for our study due to its close spatial proximity (impact parameter, rho=0.6+/-0.3 kpc) to its associated galaxy. There is no direct detection of PAH emission in the limited observations of infrared (IR) spectra along this sight-line. Hence, we use CLOUDY numerical simulation modelling to indirectly probe the presence of PAHs in the ESDLA. We note that PAHs need to be included in the models to reproduce the observed column densities of warm H2 and CI. Thus, we infer the presence of PAHs indirectly in our ESDLA, with an abundance of PAH/H = 10^(-7.046). We also measure a low 2175 A bump strength (E(bump)~0.03-0.19 mag) relative to star-forming galaxies by modelling extinction of QSO spectra by dust at the absorber rest-frame. This is consistent with the low PAH abundance obtained indirectly using CLOUDY modelling. Our study highlights the usage of CLOUDY modelling to indirectly detect PAHs in high-redshift gas-rich absorption-selected galaxies.

Submitted 6 May, 2022; originally announced May 2022.

Comments: 6 pages, 3 figures, 3 tables

arXiv:2204.03618 [pdf]

Pneumonia Detection in Chest X-Rays using Neural Networks

Authors: Narayana Darapaneni, Ashish Ranjan, Dany Bright, Devendra Trivedi, Ketul Kumar, Vivek Kumar, Anwesh Reddy Paduri

Abstract: With the advancement in AI, deep learning techniques are widely used to design robust classification models in several areas, such as medical diagnosis tasks, in which they achieve good performance. In this paper, we propose a CNN (Convolutional Neural Network) model for the classification of chest X-ray images from the Radiological Society of North America (RSNA) Pneumonia datasets. The proposed method is based on a non-complex CNN and the use of transfer learning algorithms like Xception, InceptionV3/V4, EfficientNetB7. Along with this, the study also tries to achieve the same RSNA benchmark results using limited computational resources by trying out various approaches to the methodologies that have been implemented in recent years. The RSNA benchmark MAP score is 0.25, but using the Mask RCNN model on a stratified sample of 3017 along with image augmentation gave a MAP score of 0.15. Meanwhile, YoloV3 without any hyperparameter tuning gave a MAP score of 0.32, and the loss was still decreasing. Running the model for a greater number of iterations can give better results.

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.12575 [pdf, other]

NeuMan: Neural Human Radiance Field from a Single Video

Authors: Wei Jiang, Kwang Moo Yi, Golnoosh Samei, Oncel Tuzel, Anurag Ranjan

Abstract: Photorealistic rendering and reposing of humans is important for enabling augmented reality experiences. We propose a novel framework to reconstruct the human and the scene that can be rendered with novel human poses and views from just a single in-the-wild video. Given a video captured by a moving camera, we train two NeRF models: a human NeRF model and a scene NeRF model. To train these models, we rely on existing methods to estimate the rough geometry of the human and the scene. Those rough geometry estimates allow us to create a warping field from the observation space to the canonical pose-independent space, in which we train the human model. Our method is able to learn subject-specific details, including cloth wrinkles and accessories, from just a 10-second video clip, and to provide high quality renderings of the human under novel poses, from novel views, together with the background.

Submitted 21 September, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

arXiv:2202.13938 [pdf, other]

Nonlinear Model Predictive Control and System Identification for a Dual-hormone Artificial Pancreas

Authors: Asbjørn Thode Reenberg, Tobias K. S. Ritschel, Emilie B. Lindkvist, Christian Laugesen, Jannet Svensson, Ajenthen G. Ranjan, Kirsten Nørgaard, John Bagterp Jørgensen

Abstract: In this work, we present a switching nonlinear model predictive control (NMPC) algorithm for a dual-hormone artificial pancreas (AP), and we use maximum likelihood estimation (MLE) to identify model parameters. A dual-hormone AP consists of a continuous glucose monitor (CGM), a control algorithm, an insulin pump, and a glucagon pump. The AP is designed with a heuristic to switch between insulin and glucagon as well as state-dependent constraints. We extend an existing glucoregulatory model with glucagon and exercise for simulation, and we use a simpler model for control. We test the AP (NMPC and MLE) using in silico numerical simulations on 50 virtual people with type 1 diabetes. The system is identified for each virtual person based on data generated with the simulation model. The simulations show a mean of 89.3% time in range (3.9-10 mmol/L) and no hypoglycemic events.

Submitted 28 February, 2022; originally announced February 2022.

Comments: In submission, 7 pages, 6 figures

arXiv:2202.13338 [pdf, other]

A one-size-fits-all artificial pancreas for people with type 1 diabetes based on physiological insight and feedback control

Authors: Tobias K. S. Ritschel, Asbjørn Thode Reenberg, Emilie B. Lindkvist, Christian Laugesen, Jannet Svensson, Ajenthen G. Ranjan, Kirsten Nørgaard, Bernd Dammann, John Bagterp Jørgensen

Abstract: We propose a model-free artificial pancreas (AP) for people with type 1 diabetes. The algorithmic parameters are tuned to a virtual population of 1,000,000 individuals, and the AP repeatedly estimates the basal and bolus insulin requirements necessary for maintaining normal blood glucose levels. Therefore, the AP can be used without healthcare personnel or engineers customizing the algorithm to each user. The estimates are based on bodyweight, measurements from a continuous glucose monitor (CGM), and estimates of the meal carbohydrate contents. In a virtual clinical trial with all 1,000,000 individuals (i.e., a Monte Carlo closed-loop simulation), the AP achieves a mean time in range of more than 87% and almost 89% of the participants satisfy several glycemic targets.

Submitted 27 February, 2022; originally announced February 2022.

Comments: In submission, 6 pages, 7 figures, and 3 tables

arXiv:2202.01175

Surface acoustic waves inside polystyrene microparticles through photoacoustic microscopy

Authors: Abhishek Ranjan, Anowarul Habib, Azeem Ahmad, Balpreet Singh Ahluwalia, Frank Melandsø

Abstract: We demonstrate surface acoustic waves inside polystyrene microspheres of different sizes experimentally through photoacoustic microscopy and validate the experimental result with simulation. A novel method for sample preparation of a lifted sample is also presented, where the microparticles are suspended in agarose above the surface of the Petri dish. Another objective of this study was to investigate the results for different laser foci on the microparticles and their impact on photoacoustic images and photoacoustic signals. An absorbing microsphere is excited with a pulsed laser of wavelength 532 nm and the photoacoustic signal is detected using a 40 MHz transducer. On analyzing the photoacoustic signals from microspheres, we find the signature of surface acoustic waves.

Submitted 20 March, 2022; v1 submitted 31 January, 2022; originally announced February 2022.

Comments: The paper contains mistakes in figures, information, and results, so I want to withdraw it

arXiv:2201.06413 [pdf, other]

doi 10.1051/0004-6361/202140604

Multi-phase gas properties of extremely strong intervening DLAs towards quasars

Authors: A. Ranjan, R. Srianand, P. Petitjean, G. Shaw, Y. -K. Sheen, S. A. Balashev, N. Gupta, C. Ledoux, K. N. Telikova

Abstract: We present the results of a spectroscopic analysis of extremely strong damped Lyman-α absorbers (ESDLAs, log N(H I)>=21.7) observed with VLT-XShooter. ESDLAs probe gas from within the star-forming disk of the associated galaxies and thus ESDLAs provide a unique opportunity to study the interstellar medium of galaxies at high-redshift. We report column densities (N), equivalent widths (w), and the kinematic spread (Δ v90) of species from neutral, singly ionised, and higher ionisation species. We find that, using the dust correction prescription, the measured metallicities are consistent for singly ionised gas species such as P II, S II, Si II, Mn II, Cr II, and Zn II in all ESDLAs within 3-sigma uncertainty. We find that the distributions of N(Ar I)/N(H I) ratio in DLAs and ESDLAs are similar. We further report that ESDLAs do not show a strong deficiency of Ar I relative to other α-capture elements as is seen in DLAs. This supports the idea that the mentioned under-abundance of Ar I in DLAs is possibly caused by the presence of background UV photons that penetrate the low N(H I) clouds to ionise Ar I, but they cannot penetrate deep enough in the high N(H I) ESDLA environment. The w(Mg II lambda2796) distribution in ESDLAs is found to be similar to that of metal-rich C I-selected absorbers, but the velocity spread of their Mg II profile is different. For higher ionisation species (such as C IV and Si IV), Δ v90 is similar in the two populations, while the Δ v90 of singly ionised species is smaller for ESDLAs. This suggests that the ESDLAs sample a different H I region of their associated galaxy compared to the general DLA population. We further study the N(Cl I) distribution in high-redshift DLA and ESDLA sightlines, as Cl I is a good tracer of H2 gas. The N(Cl I)-N(H2) correlation is followed by all the clouds (ESDLAs and otherwise) having log N(H2)<22.

Submitted 6 May, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

Comments: 16 pages, 12 figures, 16 page appendix, 14 figures in appendix

Journal ref: A&A 661, A134 (2022)

arXiv:2201.02912 [pdf, other]

doi 10.1016/j.ins.2022.07.127

λ-Scaled-Attention: A Novel Fast Attention Mechanism for Efficient Modeling of Protein Sequences

Authors: Ashish Ranjan, Md Shah Fahad, Akshay Deepak

Abstract: Attention-based deep networks have been successfully applied on textual data in the field of NLP. However, their application on protein sequences poses additional challenges due to the weak semantics of the protein words, unlike the plain text words. These unexplored challenges faced by the standard attention technique include (i) vanishing attention score problem and (ii) high variations in the attention distribution. In this regard, we introduce a novel λ-scaled attention technique for fast and efficient modeling of the protein sequences that addresses both the above problems. This is used to develop the λ-scaled attention network and is evaluated for the task of protein function prediction implemented at the protein sub-sequence level. Experiments on the datasets for biological process (BP) and molecular function (MF) showed significant improvements in the F1 score values for the proposed λ-scaled attention technique over its counterpart approach based on the standard attention technique (+2.01% for BP and +4.67% for MF) and state-of-the-art ProtVecGen-Plus approach (+2.61% for BP and +4.20% for MF). Further, fast convergence (converging in half the number of epochs) and efficient learning (in terms of very low difference between the training and validation losses) were also observed during the training process.

Submitted 8 January, 2022; originally announced January 2022.

Journal ref: Information Sciences, 2022

arXiv:2201.00245 [pdf, other]

doi 10.1093/mnras/stab3800

Extremely strong DLAs at high redshift: Gas cooling and H$_2$ formation

Authors: K. N. Telikova, S. A. Balashev, P. Noterdaeme, J. -K. Krogager, A. Ranjan

Abstract: We present a spectroscopic investigation with VLT/X-shooter of seven candidate extremely strong damped Lyman-$α$ absorption systems (ESDLAs, $N(\text{HI})\ge 5\times 10^{21}$ cm$^{-2}$) observed along quasar sightlines. We confirm the extremely high column densities, albeit slightly (0.1~dex) lower than the original ESDLA definition for four systems. We measured low-ionisation metal abundances and dust extinction for all systems. For two systems we also found strong associated H$_2$ absorption ($\log N(\text{H$_2$)[cm$^{-2}$]}=18.16\pm0.03$ and $19.28\pm0.06$ at $z=3.26$ and $2.25$ towards J2205+1021 and J2359+1354, respectively), while for the remaining five we measured conservative upper limits on the H$_2$ column densities of typically $\log N(\text{H$_2$)[cm$^{-2}$]}<17.3$. The increased H$_2$ detection rate ($10-55$% at 68% confidence level) at high HI column density compared to the overall damped Lyman-$α$ population ($\sim 5-10$%) confirms previous works. We find that these seven ESDLAs have similar observed properties as those previously studied towards quasars and gamma-ray burst afterglows, suggesting they probe inner regions of galaxies. We use the abundance of ionised carbon in excited fine-structure level to calculate the cooling rates through the CII $λ$158$μ$m emission, and compare them with the cooling rates from damped Lyman-$α$ systems in the literature. We find that the cooling rates distribution of ESDLAs also presents the same bimodality as previously observed for the general (mostly lower HI column density) damped Lyman-$α$ population.

Submitted 1 January, 2022; originally announced January 2022.

Comments: 34 pages, 30 figures, 11 tables; accepted for publication in MNRAS

arXiv:2110.04252 [pdf, other]

LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time

Authors: Elvis Nunez, Maxwell Horton, Anish Prabhu, Anurag Ranjan, Ali Farhadi, Mohammad Rastegari

Abstract: When deploying deep learning models to a device, it is traditionally assumed that available computational resources (compute, memory, and power) remain static. However, real-world computing systems do not always provide stable resource guarantees. Computational resources need to be conserved when load from other processes is high or battery power is low. Inspired by recent works on neural network subspaces, we propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models that range from highly efficient to highly accurate. Our models require no retraining, thus our subspace of models can be deployed entirely on-device to allow adaptive network compression at inference time. We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity. We achieve accuracies on-par with standard models when testing our uncompressed models, and maintain high accuracy for sparsity rates above 90% when testing our compressed models. We also demonstrate that our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.

Submitted 8 October, 2021; originally announced October 2021.

arXiv:2110.03860 [pdf, other]

Token Pooling in Vision Transformers

Authors: Dmitrii Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish Prabhu, Mohammad Rastegari, Oncel Tuzel

Abstract: Despite the recent success in many applications, the high computational requirements of vision transformers limit their use in resource-constrained settings. While many existing methods improve the quadratic complexity of attention, in most vision transformers, self-attention is not the major computation bottleneck, e.g., more than 80% of the computation is spent on fully-connected layers. To improve the computational complexity of all layers, we propose a novel token downsampling method, called Token Pooling, efficiently exploiting redundancies in the images and intermediate token representations. We show that, under mild assumptions, softmax-attention acts as a high-dimensional low-pass (smoothing) filter. Thus, its output contains redundancy that can be pruned to achieve a better trade-off between the computational cost and accuracy. Our new technique accurately approximates a set of tokens by minimizing the reconstruction error caused by downsampling. We solve this optimization problem via cost-efficient clustering. We rigorously analyze and compare to prior downsampling methods. Our experiments show that Token Pooling significantly improves the cost-accuracy trade-off over the state-of-the-art downsampling. Token Pooling is a simple and effective operator that can benefit many architectures. Applied to DeiT, it achieves the same ImageNet top-1 accuracy using 42% fewer computations.

Submitted 11 October, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2023

arXiv:2108.00218 [pdf, other]

Study of tetraquarks in dipole-dipole interaction potential

Authors: Sindhu D G, Akhilesh Ranjan

Abstract: In recent years tetraquark and pentaquark states have received much attention due to the significant experimental findings. In this work masses of some heavy tetraquarks are estimated by considering the spin-spin interaction, the dipole-dipole interaction, and the meson-meson interaction. It is found that such interactions give significant mass contributions. The Regge trajectory of X(3872) state is also studied and is found to be non-linear. Masses of some tetraquark states are also proposed.

Submitted 31 July, 2021; originally announced August 2021.

Comments: 11 pages, 4 figures, Submitted to European Physical Journal Plus (EPJP)

arXiv:2106.02457 [pdf]

Thomson and Collisional Regimes of In-Phase Coherent Microwave Scattering Off Gaseous Microplasmas

Authors: Adam R. Patel, Apoorv Ranjan, Xingxing Wang, Mikhail N. Slipchenko, Mikhail N. Shneider, Alexey Shashurin

Abstract: The total number of electrons in a classical microplasma can be non-intrusively measured through elastic in-phase coherent microwave scattering (CMS). Here, we establish a theoretical basis for the CMS diagnostic technique with an emphasis on Thomson and collisional scattering in short, thin unmagnetized plasma media. Experimental validation of the diagnostic is subsequently performed via linearly polarized, variable frequency microwave scattering off laser induced air-based microplasmas with diverse ionization and collisional features. Namely, conducted studies include a verification of short-dipole-like radiation behavior, plasma volume imaging via intensified charge-coupled device (ICCD) photography, and measurements of relative phases, total scattering cross sections, and total number of electrons $N_e$ in the generated plasma filaments following absolute calibration using a dielectric scattering sample. Findings of the paper suggest an ideality of the diagnostic in the Thomson "free-electron" regime - where a detailed knowledge of plasma and collisional properties (which are often difficult to accurately characterize due to the potential influence of inhomogeneities, local temperatures and densities, present species, and so on) is unnecessary to extract $N_e$ from the scattered signal.

Submitted 21 September, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

Comments: 27 Pages, 5 Figures

arXiv:2104.06938 [pdf, other]

doi 10.1103/PhysRevA.104.022437

On the state space structure of tripartite quantum systems

Authors: Hari Krishnan S V, Ashish Ranjan, Manik Banik

Abstract: State space structure of tripartite quantum systems is analyzed. In particular, it has been shown that the set of states separable across all the three bipartitions [say $\mathcal{B}^{int}(ABC)$] is a strict subset of the set of states having positive partial transposition (PPT) across the three bipartite cuts [say $\mathcal{P}^{int}(ABC)$] for all the tripartite Hilbert spaces $\mathbb{C}_A^{d_1}\otimes\mathbb{C}_B^{d_2}\otimes\mathbb{C}_C^{d_3}$ with $\min\{d_1,d_2,d_3\}\ge2$. The claim is proved by constructing a state belonging to the set $\mathcal{P}^{int}(ABC)$ but not belonging to $\mathcal{B}^{int}(ABC)$. For $(\mathbb{C}^{d})^{\otimes3}$ with $d\ge3$, the construction follows from a specific type of multipartite unextendible product bases. However, such a construction is not possible for $(\mathbb{C}^{2})^{\otimes3}$ since for any $n$ the bipartite system $\mathbb{C}^2\otimes\mathbb{C}^n$ cannot have any unextendible product bases [Phys. Rev. Lett. 82, 5385 (1999)]. For the $3$-qubit system we, therefore, come up with a different construction.

Submitted 20 April, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: Comments are welcome

Journal ref: Phys. Rev. A 104, 022437 (2021)

arXiv:2012.05225 [pdf, other]

MorphGAN: One-Shot Face Synthesis GAN for Detecting Recognition Bias

Authors: Nataniel Ruiz, Barry-John Theobald, Anurag Ranjan, Ahmed Hussein Abdelaziz, Nicholas Apostoloff

Abstract: To detect bias in face recognition networks, it can be useful to probe a network under test using samples in which only specific attributes vary in some controlled way. However, capturing a sufficiently large dataset with specific control over the attributes of interest is difficult. In this work, we describe a simulator that applies specific head pose and facial expression adjustments to images of previously unseen people. The simulator first fits a 3D morphable model to a provided image, applies the desired head pose and facial expression controls, then renders the model into an image. Next, a conditional Generative Adversarial Network (GAN) conditioned on the original image and the rendered morphable model is used to produce the image of the original person with the new facial expression and head pose. We call this conditional GAN -- MorphGAN. Images generated using MorphGAN conserve the identity of the person in the original image, and the provided control over head pose and facial expression allows test sets to be created to identify robustness issues of a facial recognition deep network with respect to pose and expression. Images generated by MorphGAN can also serve as data augmentation when training data are scarce. We show that augmenting small datasets of faces with new poses and expressions improves the recognition performance by up to 9% depending on the augmentation and data scarcity.

Submitted 10 December, 2020; v1 submitted 9 December, 2020; originally announced December 2020.

arXiv:2012.03880 [pdf, other]

doi 10.1093/mnras/staa4008

The properties and environment of very young galaxies in the local Universe

Authors: M. Trevisan, G. A. Mamon, T. X. Thuan, F. Ferrari, L. S. Pilyugin, A. Ranjan

Abstract: In the local Universe, there is a handful of dwarf compact star-forming galaxies with extremely low oxygen abundances. It has been proposed that they are young, having formed a large fraction of their stellar mass during their last few hundred Myr. However, little is known about the fraction of young stellar populations in more massive galaxies. In a previous article, we analyzed 280 000 SDSS spectra to identify a surprisingly large sample of more massive Very Young Galaxies (VYGs), defined to have formed at least $50\%$ of their stellar mass within the last 1 Gyr. Here, we investigate in detail the properties of a subsample of 207 galaxies that are VYGs according to all three of our spectral models. We compare their properties with those of control sample galaxies (CSGs). We find that VYGs tend to have higher surface brightness and to be more compact, dusty, asymmetric, and clumpy than CSGs. Analysis of a subsample with HI detections reveals that VYGs are more gas-rich than CSGs. VYGs tend to reside more in the inner parts of low-mass groups and are twice as likely to be interacting with a neighbour galaxy than CSGs. On the other hand, VYGs and CSGs have similar gas metallicities and large scale environments (relative to filaments and voids). These results suggest that gas-rich interactions and mergers are the main mechanisms responsible for the recent triggering of star formation in low-redshift VYGs, except for the lowest mass VYGs, where the starbursts may arise from a mixture of mergers and gas infall.

Submitted 7 December, 2020; originally announced December 2020.

Comments: 33 pages, 24 figures, to be published in MNRAS
