-
AgentInstruct: Toward Generative Teaching with Agentic Flows
Authors:
Arindam Mitra,
Luciano Del Corro,
Guoqing Zheng,
Shweti Mahajan,
Dany Rouhana,
Andres Codas,
Yadong Lu,
Wei-ge Chen,
Olga Vrousgos,
Corby Rosset,
Fillipe Silva,
Hamed Khanpour,
Yash Lara,
Ahmed Awadallah
Abstract:
Synthetic data is becoming increasingly important for accelerating the development of language models, both large and small. Despite several successful use cases, researchers have also raised concerns around model collapse and the drawbacks of imitating other models. This discrepancy can be attributed to the fact that synthetic data varies in quality and diversity. Effective use of synthetic data usually requires significant human effort in curating the data. We focus on using synthetic data for post-training, specifically creating data with powerful models to teach a new skill or behavior to another model; we refer to this setting as Generative Teaching. We introduce AgentInstruct, an extensible agentic framework for automatically creating large amounts of diverse and high-quality synthetic data. AgentInstruct can create both the prompts and responses, using only raw data sources like text documents and code files as seeds. We demonstrate the utility of AgentInstruct by creating a post-training dataset of 25M pairs to teach language models different skills, such as text editing, creative writing, tool usage, coding, reading comprehension, etc. The dataset can be used for instruction tuning of any base model. We post-train Mistral-7b with the data. When comparing the resulting model Orca-3 to Mistral-7b-Instruct (which uses the same base model), we observe significant improvements across many benchmarks: for example, a 40% improvement on AGIEval, a 19% improvement on MMLU, a 54% improvement on GSM8K, a 38% improvement on BBH and a 45% improvement on AlpacaEval. Additionally, it consistently outperforms other models such as LLAMA-8B-instruct and GPT-3.5-turbo.
Submitted 3 July, 2024;
originally announced July 2024.
-
NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries
Authors:
Ewa M. Nowara,
Pedro O. Pinheiro,
Sai Pooja Mahajan,
Omar Mahmood,
Andrew Martin Watkins,
Saeed Saremi,
Michael Maser
Abstract:
We present NEBULA, the first latent 3D generative model for scalable generation of large molecular libraries around a seed compound of interest. Such libraries are crucial for scientific discovery, but it remains challenging to generate large numbers of high quality samples efficiently. 3D-voxel-based methods have recently shown great promise for generating high quality samples de novo from random noise (Pinheiro et al., 2023). However, sampling in 3D-voxel space is computationally expensive and use in library generation is prohibitively slow. Here, we instead perform neural empirical Bayes sampling (Saremi & Hyvarinen, 2019) in the learned latent space of a vector-quantized variational autoencoder. NEBULA generates large molecular libraries nearly an order of magnitude faster than existing methods without sacrificing sample quality. Moreover, NEBULA generalizes better to unseen drug-like molecules, as demonstrated on two public datasets and multiple recently released drugs. We expect the approach herein to be highly enabling for machine learning-based drug discovery. The code is available at https://github.com/prescient-design/nebula
Submitted 3 July, 2024;
originally announced July 2024.
-
Learning Disentangled Representation in Object-Centric Models for Visual Dynamics Prediction via Transformers
Authors:
Sanket Gandhi,
Atul,
Samanyu Mahajan,
Vishal Sharma,
Rushil Gupta,
Arnab Kumar Mondal,
Parag Singla
Abstract:
Recent work has shown that object-centric representations can greatly help improve the accuracy of learning dynamics while also bringing interpretability. In this work, we take this idea one step further and ask the following question: "can learning disentangled representation further improve the accuracy of visual dynamics prediction in object-centric models?" While there have been some attempts to learn such disentangled representations for the case of static images \citep{nsb}, to the best of our knowledge, ours is the first work which tries to do this in a general setting for video, without making any specific assumptions about the kind of attributes that an object might have. The key building block of our architecture is the notion of a "block", where several blocks together constitute an object. Each block is represented as a linear combination of a given number of learnable concept vectors, which is iteratively refined during the learning process. The blocks in our model are discovered in an unsupervised manner, by attending over object masks, in a style similar to the discovery of slots \citep{slot_attention}, for learning a dense object-centric representation. We employ self-attention via transformers over the discovered blocks to predict the next state, resulting in the discovery of visual dynamics. We perform a series of experiments on several benchmark 2-D and 3-D datasets, demonstrating that our architecture (1) can discover semantically meaningful blocks, (2) helps improve the accuracy of dynamics prediction compared to SOTA object-centric models, and (3) performs significantly better in the OOD setting where the specific attribute combinations are not seen earlier during training. Our experiments highlight the importance of discovering disentangled representations for visual dynamics prediction.
Submitted 3 July, 2024;
originally announced July 2024.
-
Upshifted frequency of electromagnetic plasma waves due to reflecting gravitational waves acting as almost-luminal mirrors
Authors:
Felipe A. Asenjo,
Swadesh M. Mahajan
Abstract:
We show that dispersive gravitational waves, as a background spacetime, can reflect electromagnetic waves in a plasma. This reflection upshifts the frequency of the reflected wave, with the upshift being larger for low-frequency incident waves. This effect takes place when the gravitational wave background propagates almost at the speed of light, allowing it to behave similarly to a luminal mirror for electromagnetic plasma waves.
Submitted 26 June, 2024;
originally announced June 2024.
-
The effect of separatrix density and PFC material on H-mode confinement in the ITPA global H-mode database
Authors:
M. Kotschenreuther,
X. Liu,
D. R. Hatch,
S. M. Mahajan
Abstract:
Recent data, added to the ITPA global H-mode database [1] for ASDEX-U [2] and JET-ILW, reveals that the separatrix density $n_{sep}$ has a correlation with the H20 factor (Confinement time relative to the ITPA20-IL scaling)[1]. These trends are analyzed in detail. They are not a result of proximity to the density limit. The normalized $n_{sepN} = n_{sep} / \bar{n}$ is introduced, motivated by theory ($\bar{n}$ is the average density). The trends in $n_{sepN}$ can be understood in terms of the two main mechanisms of pedestal characteristics -- MHD stability and recently developed theories of gyrokinetic transport. Careful analysis shows these mechanisms can be distinguished in the data. The most dramatic improvement in confinement time arises primarily from reductions in pedestal transport. A new definition of density peaking that includes core peaking is found to best explain H20 when advanced H-modes are included: $n_{sepN0} = n_{sep}/n(0)$, the inverse of the total density peaking from the separatrix to the axis. The highest H-factors are reached by the confluence of relatively low normalized $n_{sepN0}$ plus high Shafranov shift or poloidal beta. The importance of these two variables is also theoretically predicted from recent analysis of the gyrokinetic system, where a constraint can limit the access of ITG/TEM modes to free energy in equilibrium gradients. The Plasma Facing Component (PFC) material also shows a strong influence in the data. This is likely due to the importance of $n(0)/n_{sep}$ to attaining high H20, in conjunction with the known tendency for tungsten (W) to accumulate with density peaking and low transport. Preliminary results indicate that $n_{sepN0}$ might also be important with core ITBs.
Submitted 21 June, 2024;
originally announced June 2024.
-
DePIN: A Framework for Token-Incentivized Participatory Sensing
Authors:
Michael T. C. Chiu,
Sachit Mahajan,
Mark C. Ballandies,
Uroš V. Kalabić
Abstract:
There is always demand for integrating data into microeconomic decision making. Participatory sensing deals with how real-world data may be extracted with stakeholder participation and resolves a problem of Big Data, which is concerned with monetizing data extracted from individuals without their participation. We present how Decentralized Physical Infrastructure Networks (DePINs) extend participatory sensing. We discuss the threat models of these networks and how DePIN cryptoeconomics can advance participatory sensing.
Submitted 26 May, 2024;
originally announced May 2024.
-
Unlocking Adaptive User Experience with Generative AI
Authors:
Yutan Huang,
Tanjila Kanij,
Anuradha Madugalla,
Shruti Mahajan,
Chetan Arora,
John Grundy
Abstract:
Developing user-centred applications that address diverse user needs requires rigorous user research. This is time, effort and cost-consuming. With the recent rise of generative AI techniques based on Large Language Models (LLMs), there is a possibility that these powerful tools can be used to develop adaptive interfaces. This paper presents a novel approach to develop user personas and adaptive interface candidates for a specific domain using ChatGPT. We develop user personas and adaptive interfaces using both ChatGPT and a traditional manual process and compare these outcomes. To obtain data for the personas we collected data from 37 survey participants and 4 interviews in collaboration with a not-for-profit organisation. The comparison of ChatGPT generated content and manual content indicates promising results that encourage using LLMs in the adaptive interfaces design process.
Submitted 8 April, 2024;
originally announced April 2024.
-
A posteriori error estimates for the Generalized Burgers-Huxley equation with weakly singular kernels
Authors:
Sumit Mahajan,
Arbaz Khan
Abstract:
This paper explores the residual based a posteriori error estimations for the generalized Burgers-Huxley equation (GBHE) featuring weakly singular kernels. Initially, we present a reliable and efficient error estimator for both the stationary GBHE and the semi-discrete GBHE with memory, utilizing the discontinuous Galerkin finite element method (DGFEM) in spatial dimensions. Additionally, employing backward Euler and Crank Nicolson discretization in the temporal domain and DGFEM in spatial dimensions, we introduce an estimator for the fully discrete GBHE, taking into account the influence of past history. The paper also establishes optimal $L^2$ error estimates for both the stationary GBHE and GBHE. Ultimately, we validate the effectiveness of the proposed error estimator through numerical results, demonstrating its efficacy in an adaptive refinement strategy.
Submitted 13 March, 2024;
originally announced March 2024.
-
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Authors:
Tanzila Rahman,
Shweta Mahajan,
Hsin-Ying Lee,
Jian Ren,
Sergey Tulyakov,
Leonid Sigal
Abstract:
Text-to-image (TTI) diffusion models have demonstrated impressive results in generating high-resolution images of complex and imaginative scenes. Recent approaches have further extended these methods with personalization techniques that allow them to integrate user-illustrated concepts (e.g., the user him/herself) using a few sample image illustrations. However, the ability to generate images with multiple interacting concepts, such as human subjects, as well as concepts that may be entangled in one, or across multiple, image illustrations remains elusive. In this work, we propose a concept-driven TTI personalization framework that addresses these core challenges. We build on existing works that learn custom tokens for user-illustrated concepts, allowing those to interact with existing text tokens in the TTI model. However, importantly, to disentangle and better learn the concepts in question, we jointly learn (latent) segmentation masks that disentangle these concepts in user-provided image illustrations. We do so by introducing an Expectation Maximization (EM)-like optimization procedure where we alternate between learning the custom tokens and estimating masks encompassing corresponding concepts in user-supplied images. We obtain these masks based on cross-attention, from within the U-Net parameterized latent diffusion model and subsequent Dense CRF optimization. We illustrate that such joint alternating refinement leads to the learning of better tokens for concepts and, as a by-product, latent masks. We illustrate the benefits of the proposed approach qualitatively and quantitatively (through user studies) with a number of examples and use cases that can combine up to three entangled concepts.
Submitted 18 February, 2024;
originally announced February 2024.
-
Using JWST transits and occultations to determine $\sim1\%$ stellar radii and temperatures of low-mass stars
Authors:
Alexandra S. Mahajan,
Jason D. Eastman,
James Kirk
Abstract:
Using JWST observations of a primary transit and two secondary eclipses for GJ 1214b, we determine an eccentricity that is more precise than a decade of HARPS data, which enables us to measure the stellar density to 2.62%. Coupled with a prior on the stellar mass from a dynamically calibrated K-$M_*$ relation, we determine $R_*$ to 1.13% -- 3 times more precise than any other published analysis of this system. Then, using the bolometric flux from a spectral energy distribution model, we determine $T_{\rm eff}$ to 1.39% -- 40% more precise than systematic floors from spectroscopy. Within the global model, these also improve the planetary radius and insolation. This is a proof of concept for a new method to determine accurate $R_*$ and $T_{\rm eff}$ to a precision currently achieved for only a small number of low-mass stars. By applying our method to all high signal-to-noise ratio planetary transits and occultations, we can expand the sample of precisely measured stars without assuming tidal circularization and calibrate new relations to improve our understanding of all low-mass stars.
Submitted 3 April, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Heterogeneous attenuation of sound waves in three-dimensional amorphous solids
Authors:
Shivam Mahajan,
Massimo Pica Ciamarra
Abstract:
Sound waves are attenuated as they propagate in amorphous materials. We investigate the mechanism driving sound attenuation in the Rayleigh scattering regime by resolving the dynamics of an excited phonon in time and space via numerical simulations. We find that sound attenuation is spatiotemporally heterogeneous. It starts in localised regions, which identify soft regions within the material and correlate with low-frequency vibrational modes. As time progresses, the regions where sound is primarily attenuated invade the system via an apparent diffusive process.
Submitted 16 January, 2024;
originally announced January 2024.
-
Impact of maternal high fat on neurovascular unit of adult offspring
Authors:
Cheryl A. Hawkes,
Victoria Goss,
Elina Zotova,
Tual Monfort,
Anthony Postle,
Sumeet Mahajan,
James A. R. Nicoll,
Roy O. Weller,
Roxana O. Carare
Abstract:
Maternal obesity is associated with increased risk of diabetes, cardiovascular disease and hypertension in adult offspring. Midlife hypercholesterolemia and hypertension are risk factors for Alzheimer's disease, suggesting that the ageing brain may be impacted by early life environment. We found that exposure to a high fat diet during gestation and lactation induced changes in multiple components of the neurovascular unit, including a downregulation in apolipoprotein E and fibronectin, an upregulation in markers of astrocytes and perivascular macrophages and altered blood vessel morphology in the brains of adult mice. Feeding of high fat diet after weaning increased lipid droplets in the brain and influenced the fatty acid composition of phosphatidylcholine and phosphatidylethanolamine species, but did not affect the neurovascular unit. Sustained high fat diet over the entire lifespan resulted in additional decreases in levels of pericytes and collagen IV, changes in phospholipid composition and impaired perivascular clearance of Beta-amyloid (A-Beta) from the brain. In humans, vascular A-Beta load was significantly increased in the brains of aged individuals with a history of hypercholesterolemia. These results support a critical role for early dietary influence on the brain vasculature across the lifespan, with consequences for the development of age-related cerebrovascular and neurodegenerative diseases.
Submitted 9 January, 2024;
originally announced January 2024.
-
An AstroSat/UVIT study of galaxies in the cluster Abell 2199
Authors:
Smriti Mahajan,
Kulinderpal Singh,
Somak Raychaudhury
Abstract:
(abridged) We present the newly acquired data for an AstroSat/UVIT field centered on a face-on spiral starburst galaxy UGC 10420, located in the cluster Abell 2199. We have analysed the FUV data for this field along with the archival data from the GALEX mission, optical photometric data from the SDSS, and low-frequency radio data from the LoTSS survey. The stars were separated from the galaxies using the SDSS pipeline classification, while the spectroscopic redshifts available for 35% of the detected UVIT sources were used to identify member galaxies of the cluster Abell 2199. We find that (a) the non-cluster galaxies are on average fainter than the cluster galaxies at fixed magnitude, (b) stars and galaxies are indistinguishable in the r vs NUV-r plane, and (c) bright stars are ~1.5 mag bluer than the galaxies in the FUV-r vs NUV-r colour-colour plane. Besides UGC 10420, which is the only known cluster galaxy with an extended-UV disk, we identify five more galaxies with asymmetric FUV morphology and extended radio emission in this field. All the asymmetric member galaxies of Abell 2199 lie within the virial boundaries of the cluster. This observation, together with the fact that these asymmetric cluster galaxies have low-frequency radio tails or FUV emission pointing away from the cluster centre, leads us to hypothesise that these galaxies are likely undergoing ram-pressure stripping (RPS) under the influence of cluster-environment related mechanisms. A comparison of optical and FUV star formation rates of UVIT detected galaxies shows enhanced star formation in half of the RPS candidates, suggesting that environment-related mechanisms may lead to a burst of star formation in RPS galaxies. Our analysis indicates the presence of at least two more groups or clusters at z~0.077 and 0.260, coincident with Abell 2199 along the line of sight of the field of view studied here.
Submitted 25 December, 2023;
originally announced December 2023.
-
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Authors:
Shweta Mahajan,
Tanzila Rahman,
Kwang Moo Yi,
Leonid Sigal
Abstract:
The quality of the prompts provided to text-to-image diffusion models determines how faithful the generated content is to the user's intent, often requiring 'prompt engineering'. To harness visual concepts from target images without prompt engineering, current approaches largely rely on embedding inversion by optimizing and then mapping them to pseudo-tokens. However, working with such high-dimensional vector representations is challenging because they lack semantics and interpretability, and only allow simple vector operations when using them. Instead, this work focuses on inverting the diffusion model to obtain interpretable language prompts directly. The challenge of doing this lies in the fact that the resulting optimization problem is fundamentally discrete and the space of prompts is exponentially large; this makes using standard optimization techniques, such as stochastic gradient descent, difficult. To this end, we utilize a delayed projection scheme to optimize for prompts representative of the vocabulary space in the model. Further, we leverage the findings that different timesteps of the diffusion process cater to different levels of detail in an image. The later, noisy, timesteps of the forward diffusion process correspond to the semantic information, and therefore, prompt inversion in this range provides tokens representative of the image semantics. We show that our approach can identify semantically interpretable and meaningful prompts for a target image which can be used to synthesize diverse images with similar content. We further illustrate the application of the optimized prompts in evolutionary image generation and concept removal.
Submitted 19 December, 2023;
originally announced December 2023.
-
ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models
Authors:
Jeong-gi Kwak,
Erqun Dong,
Yuhe Jin,
Hanseok Ko,
Shweta Mahajan,
Kwang Moo Yi
Abstract:
Generating novel views of an object from a single image is a challenging task. It requires an understanding of the underlying 3D structure of the object from an image and rendering high-quality, spatially consistent new views. While recent methods for view synthesis based on diffusion have shown great progress, achieving consistency among various view estimates and at the same time abiding by the desired camera pose remains a critical problem yet to be solved. In this work, we demonstrate a strikingly simple method, where we utilize a pre-trained video diffusion model to solve this problem. Our key idea is that synthesizing a novel view could be reformulated as synthesizing a video of a camera going around the object of interest -- a scanning video -- which then allows us to leverage the powerful priors that a video diffusion model would have learned. Thus, to perform novel-view synthesis, we create a smooth camera trajectory to the target view that we wish to render, and denoise using both a view-conditioned diffusion model and a video diffusion model. By doing so, we obtain a highly consistent novel view synthesis, outperforming the state of the art.
Submitted 3 December, 2023;
originally announced December 2023.
-
Unsupervised Keypoints from Pretrained Diffusion Models
Authors:
Eric Hedlin,
Gopal Sharma,
Shweta Mahajan,
Xingzhe He,
Hossam Isack,
Abhishek Kar,
Helge Rhodin,
Andrea Tagliasacchi,
Kwang Moo Yi
Abstract:
Unsupervised learning of keypoints and landmarks has seen significant progress with the help of modern neural network architectures, but performance is yet to match the supervised counterpart, making their practicability questionable. We leverage the emergent knowledge within text-to-image diffusion models, towards more robust unsupervised keypoints. Our core idea is to find text embeddings that would cause the generative model to consistently attend to compact regions in images (i.e. keypoints). To do so, we simply optimize the text embedding such that the cross-attention maps within the denoising network are localized as Gaussians with small standard deviations. We validate our performance on multiple datasets: the CelebA, CUB-200-2011, Tai-Chi-HD, DeepFashion, and Human3.6m datasets. We achieve significantly improved accuracy, sometimes even outperforming supervised ones, particularly for data that is non-aligned and less curated. Our code is publicly available and can be found through our project page: https://ubc-vision.github.io/StableKeypoints/
Submitted 21 May, 2024; v1 submitted 29 November, 2023;
originally announced December 2023.
-
Stability of near surface nitrogen vacancy centers using dielectric surface passivation
Authors:
Ravi Kumar,
Saksham Mahajan,
Felix Donaldson,
Siddharth Dhomkar,
Hector J. Lancaster,
Curran Kalha,
Aysha A. Riaz,
Yujiang Zhu,
Christopher A. Howard,
Anna Regoutz,
John J. L. Morton
Abstract:
We study the photo-physical stability of ensemble near-surface nitrogen vacancy (NV) centers in diamond under vacuum and air. The optically detected magnetic resonance contrast of the NV centers was measured following exposure to laser illumination, showing opposing trends in air compared to vacuum (increasing by up to 9% and dropping by up to 25%, respectively). Characterization using Raman and X-ray photoelectron spectroscopies suggests a surface reconstruction: in air, atmospheric oxygen adsorption on the surface leads to an increase in the NV- fraction, whereas in vacuum, net oxygen desorption increases the NV0 fraction. NV charge state switching is confirmed by photoluminescence spectroscopy. Deposition of ~2 nm of alumina (Al2O3) over the diamond surface was shown to stabilize the NV charge state under illumination in either environment, attributed to a more stable surface electronegativity. The use of alumina coating on diamond is therefore a promising approach to improve the resilience of NV sensors.
Submitted 23 November, 2023;
originally announced November 2023.
-
Supermassive black holes in a mass-limited galaxy sample
Authors:
Zachary Byrne,
Michael J. Drinkwater,
Holger Baumgardt,
David Blyth,
Patrick Côté,
Nora Lüetzgendorf,
Chelsea Spengler,
Laura Ferrarese,
Smriti Mahajan,
Joel Pfeffer,
Sarah Sweet
Abstract:
The observed scaling relations between supermassive black hole masses and their host galaxy properties indicate that supermassive black holes influence the evolution of galaxies. However, the scaling relations may be affected by selection biases. We propose to measure black hole masses in a mass-limited galaxy sample including all non-detections to improve constraints on galaxy mass - black hole mass scaling relations and test for selection bias. We use high spatial resolution spectroscopy from the Keck and Gemini telescopes, and the Jeans Anisotropic Modelling method to measure black hole masses in early type galaxies from the Virgo Cluster. We present four new black hole masses and one upper limit in our mass-selected sample of galaxies of galaxy mass (1.0-3.2) x $10^{10} M_\odot$. This brings the total measured to 11 galaxies out of a full sample of 18 galaxies, allowing us to constrain scaling relations. We calculate a lower limit for the average black hole mass in our sample of 3.7 x $10^{7} M_\odot$. This is at an average galaxy stellar mass of (1.81 +/- 0.14) x $10^{10} M_\odot$ and an average bulge mass of (1.31 +/- 0.15) x $10^{10} M_\odot$. This lower limit shows that black hole masses in early type galaxies are not strongly affected by selection biases.
Submitted 22 November, 2023;
originally announced November 2023.
-
Orca 2: Teaching Small Language Models How to Reason
Authors:
Arindam Mitra,
Luciano Del Corro,
Shweti Mahajan,
Andres Codas,
Clarisse Simoes,
Sahaj Agarwal,
Xuxi Chen,
Anastasia Razdaibiedina,
Erik Jones,
Kriti Aggarwal,
Hamid Palangi,
Guoqing Zheng,
Corby Rosset,
Hamed Khanpour,
Ahmed Awadallah
Abstract:
Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We contend that excessive emphasis on imitation may restrict the potential of smaller models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. We evaluate Orca 2 using a comprehensive set of 15 diverse benchmarks (corresponding to approximately 100 tasks and over 36,000 unique prompts). Orca 2 significantly surpasses models of similar size and attains performance levels similar to or better than those of models 5-10x larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings. We make Orca 2 weights publicly available at aka.ms/orca-lm to support research on the development, evaluation, and alignment of smaller LMs.
Submitted 21 November, 2023; v1 submitted 18 November, 2023;
originally announced November 2023.
-
Double Clad Antiresonant Hollow Core Fiber and Its Comparison with other Fibres for Multiphoton Micro-Endoscopy
Authors:
Marzanna Szwaj,
Ian A Davidson,
Peter B Johnson,
Greg Jasion,
Yongmin Jung,
Seyed Reza Sandoghchi,
Krzysztof P Herdzik,
Konstantinos N Bourdakos,
Natalie V Wheeler,
Hans Christian Mulvad,
David J Richardson,
Francesco Poletti,
Sumeet Mahajan
Abstract:
In this work, we study a new hollow-core (air-filled) double-clad anti-resonant fiber (DC-ARF) as a potent candidate for multiphoton micro-endoscopy. We compare the fiber characteristics with a single-clad anti-resonant fiber (SC-ARF) and a solid core fiber (SCF). While the DC-ARF and the SC-ARF enable low-loss (<0.2 dB m$^{-1}$), close to dispersion-free excitation pulse delivery (<10% pulse width increase at 900 nm per 1 m fiber) without any induced non-linearities, the SCF resulted in spectral broadening and pulse-stretching (>2000% of pulse width increase at 900 nm per 1 m fiber). An ideal optical fiber endoscope needs to be several meters long and should enable both excitation and collection through the fiber. Therefore, we performed multiphoton imaging on endoscopy-compatible 1 m and 3 m lengths of fiber in the back-scattered geometry, wherein the signals were collected either directly (non-descanned detection) or through the fiber (descanned detection). Second harmonic images were collected from barium titanate crystals as well as from biological samples (rat tail tendon). In non-descanned detection conditions, the ARFs outperformed the SCF by up to 10 times in terms of signal-to-noise ratio of images. Significantly, only the DC-ARF, due to its high numerical aperture (0.45) and wide collection bandwidth (>1 μm), could provide images in the descanned detection configuration desirable for endoscopy. Thus, our systematic characterization and comparison of different optical fibres under different image collection configurations confirms and establishes the utility of DC-ARFs for high-performing label-free multiphoton imaging based micro-endoscopy.
Submitted 6 November, 2023;
originally announced November 2023.
-
Transport Barriers in Magnetized Plasmas -- General Theory with Dynamical Constraints
Authors:
Mike Kotschenreuther,
Xing Liu,
Swadesh M. Mahajan,
David R. Hatch,
Gabriele Merlo
Abstract:
A fundamental dynamical constraint -- that fluctuation induced charge-weighted particle flux must vanish -- can prevent instabilities from accessing the free energy in the strong gradients characteristic of Transport Barriers (TBs). Density gradients, when larger than a certain threshold, lead to a violation of the constraint and emerge as a stabilizing force. This mechanism, then, broadens the class of configurations (in magnetized plasmas) where these high confinement states can be formed and sustained. The need for velocity shear, the conventional agent for TB formation, is obviated. The most important ramification of the constraint is to permit a charting out of the domains conducive to TB formation and hence to optimally confined fusion worthy states; the detailed investigation is conducted through new analytic methods and extensive gyrokinetic simulations.
Submitted 25 October, 2023;
originally announced October 2023.
-
The free energy balance equation applied to gyrokinetic instabilities, the effect of the charge flux constraint, and application to simplified kinetic models
Authors:
M. Kotschenreuther,
X. Liu,
S. M. Mahajan,
D. R. Hatch
Abstract:
The free energy balance equation for gyrokinetic fluctuations is derived and applied to instabilities. An additional term due to electromagnetic sources is included. This can provide a simpler way to compute the free energy balance in practical applications, and is also conceptually clarifying. The free energy balance, by itself, is not sufficient to determine an eigenfrequency. The preceding results are derived in general geometry. The charge flux constraint in gyrokinetics can provide a necessary additional relation, and the combination of these two can be equivalent to a dispersion relation. The charge flux constraint can prevent the appearance of an unstable eigenmode even though the free energy balance would allow strongly growing fluctuations. The application of these concepts to simplified kinetic models in simplified geometry is also indicated.
Submitted 17 October, 2023;
originally announced October 2023.
-
Finite element approximation for the delayed generalized Burgers-Huxley equation with weakly singular kernel: Part II Non-Conforming and DG approximation
Authors:
Sumit Mahajan,
Arbaz Khan
Abstract:
In this paper, the numerical approximation of the generalized Burgers'-Huxley equation (GBHE) with weakly singular kernels using non-conforming methods will be presented. Specifically, we discuss two new formulations. The first formulation is based on the non-conforming finite element method (NCFEM). The other formulation is based on discontinuous Galerkin finite element methods (DGFEM). The wellposedness results for both formulations are proved.
Then, a priori error estimates for both the semi-discrete and fully-discrete schemes are derived.
Specific numerical examples, including some applications for the GBHE with weakly singular model, are discussed to validate the theoretical results.
Submitted 1 November, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Radio pulsars resonantly accelerating electrons
Authors:
Zaza N. Osmanov,
Swadesh M. Mahajan
Abstract:
Based on the recently demonstrated resonant wave-wave process, it is shown that electrons can be accelerated to ultra-relativistic energies in the magnetospheres of radio pulsars. The energization occurs via the resonant interaction of the electron wave (described by a Klein-Gordon (KG) equation) moving in unison with an intense electromagnetic (EM) wave; the KG wave/particle continuously draws energy from EM. In a brief recapitulation of the general theory, the high energy (resonantly enhanced) electron states are investigated by solving the KG equation, minimally coupled to the EM field. The restricted class of solutions, that propagate in phase with EM radiation (functions only of $ζ=ωt-kz$), are explored to serve as a possible basis for the proposed electron energization in the radio pulsars. We show that the wave-wave resonant energization mechanism could be operative in a broad class of radio pulsars with periods ranging from milliseconds to the normal values ($\sim 1$ sec); it could drive the magnetospheric electrons to acquire energies from $100$s of TeVs (millisecond pulsars) to $10$ ZeVs (normal pulsars).
Submitted 9 October, 2023;
originally announced October 2023.
-
Unsupervised deep learning framework for temperature-compensated damage assessment using ultrasonic guided waves on edge device
Authors:
Pankhi Kashyap,
Kajal Shivgan,
Sheetal Patil,
Ramana Raja B,
Sagar Mahajan,
Sauvik Banerjee,
Siddharth Tallur
Abstract:
Fueled by the rapid development of machine learning (ML) and greater access to cloud computing and graphics processing units (GPUs), various deep learning based models have been proposed for improving performance of ultrasonic guided wave structural health monitoring (GW-SHM) systems, especially to counter complexity and heterogeneity in data due to varying environmental factors (e.g., temperature) and types of damage. Such models typically comprise millions of trainable parameters, and therefore add to the cost of deployment due to requirements of cloud connectivity and processing, thus limiting the scale of deployment of GW-SHM. In this work, we propose an alternative solution that leverages the TinyML framework for development of light-weight ML models that could be directly deployed on embedded edge devices. The utility of our solution is illustrated by presenting an unsupervised learning framework for damage detection in a honeycomb composite sandwich structure (HCSS) with disbond and delamination types of damage, validated using data generated by finite element (FE) simulations and experiments performed at various temperatures in the range 0°C to 90°C. We demonstrate a fully-integrated solution using a Xilinx Artix-7 FPGA for data acquisition and control, and edge-inference of damage.
Submitted 8 October, 2023;
originally announced October 2023.
-
Automatic Pair Construction for Contrastive Post-training
Authors:
Canwen Xu,
Corby Rosset,
Ethan C. Chau,
Luciano Del Corro,
Shweti Mahajan,
Julian McAuley,
Jennifer Neville,
Ahmed Hassan Awadallah,
Nikhil Rao
Abstract:
Alignment serves as an important step to steer large language models (LLMs) towards human preferences. In this paper, we propose an automatic way to construct contrastive data for LLMs, using preference pairs from multiple models of varying strengths (e.g., InstructGPT, ChatGPT and GPT-4). We compare the contrastive techniques of SLiC and DPO to SFT baselines and find that DPO provides a step-function improvement even after continuing SFT saturates. We also explore a data curriculum learning scheme for contrastive post-training, which starts by learning from "easier" pairs and transitions to "harder" ones, which further improves alignment. Finally, we scale up our experiments to train with more data and larger models like Orca. Remarkably, our automatic contrastive post-training further improves the performance of Orca, already a state-of-the-art instruction learning model tuned with GPT-4 outputs, to outperform ChatGPT.
Submitted 2 April, 2024; v1 submitted 3 October, 2023;
originally announced October 2023.
-
Modeling electron temperature profiles in the pedestal with simple formulas for ETG transport
Authors:
D. R. Hatch,
M. T. Kotschenreuther,
P. -Y. Li,
B. Chapman-Oplopoiou,
J. Parisi,
S. M. Mahajan,
R. Groebner
Abstract:
This paper reports on the refinement (building on Ref.~\cite{hatch_22}) and application of simple formulas for electron heat transport from electron temperature gradient (ETG) driven turbulence in the pedestal. The formulas are improved by (1) improving the parameterization for certain key parameters and (2) carefully accounting for the impact of geometry and shaping in the underlying gyrokinetic simulation database. Comparisons with nonlinear gyrokinetic simulations of ETG transport in the MAST pedestal demonstrate the model's applicability to spherical tokamaks in addition to standard aspect ratio tokamaks. We identify bounds for model applicability: the model is accurate in the steep gradient region, where the ETG turbulence is largely slab-like, but accuracy decreases as the temperature gradient becomes weaker in the pedestal top and the instabilities become increasingly toroidal in nature. We use the formula to model the electron temperature profile in the pedestal for four experimental scenarios while extensively varying input parameters to represent uncertainties. In all cases, the predicted electron temperature pedestal exhibits extreme sensitivity to separatrix temperature and density, which has implications for core-edge integration. The model reproduces the electron temperature profile for high $η_e = L_{ne}/L_{Te}$ scenarios but not for low $η_e$ scenarios in which microtearing modes have been identified. We develop a proof-of-concept model for MTM transport and explore the relative roles of ETG and MTM in setting the electron temperature profile. We propose that pedestal scenarios predicted for future devices should be tested for compatibility with ETG transport.
Submitted 19 September, 2023;
originally announced September 2023.
-
Finite element approximation for a delayed generalized Burgers-Huxley equation with weakly singular kernels: Part I Well-posedness, Regularity and Conforming approximation
Authors:
Sumit Mahajan,
Arbaz Khan,
Manil T. Mohan
Abstract:
The analysis of a delayed generalized Burgers-Huxley equation (a non-linear advection-diffusion-reaction problem) with weakly singular kernels is carried out in this work. Moreover, numerical approximations are performed using the conforming finite element method (CFEM). The existence, uniqueness and regularity results for the continuous problem have been discussed in detail using the Faedo-Galerkin approximation technique. For the numerical studies, we first propose a semi-discrete conforming finite element scheme for space discretization and discuss its error estimates under minimal regularity assumptions. We then employ a backward Euler discretization in time and CFEM in space to obtain a fully-discrete approximation. Additionally, we derive a priori error estimates for the fully-discrete approximated solution. Finally, we present computational results that support the derived theoretical results.
Submitted 4 September, 2023;
originally announced September 2023.
-
Exact particle-enhanced point-spread function unlocks 3D super-resolution localization microscopy on nanoparticles
Authors:
Teun A. P. M. Huijben,
Sarojini Mahajan,
Peter Zijlstra,
Rodolphe Marie,
Kim I. Mortensen
Abstract:
Nanoparticles (NPs) have proven their applicability in biosensing, drug delivery, and photo-thermal therapy, but their performance depends critically on the distribution and number of functional groups on their surface. When studying surface functionalization using super-resolution microscopy, the NP modifies the fluorophore's point-spread function (PSF). This leads to systematic mislocalizations in conventional analyses employing Gaussian PSFs. Here, we address this shortcoming by deriving the first-ever analytical PSF model for a fluorophore near a spherical NP. Its calculation is four orders of magnitude faster than numerical approaches and thus feasible for direct use in localization algorithms. We fit this model to individual 2D images from DNA-PAINT experiments on DNA-coated gold NPs and demonstrate extraction of the 3D positions of functional groups with <5 nm precision, revealing inhomogeneous surface coverage. Our method is exact, fast, accessible, and poised to become the standard in super-resolution imaging of NPs for biosensing and drug delivery applications.
Submitted 13 June, 2023;
originally announced June 2023.
-
Excitation Properties of Photopigments and Their Possible Dependence on the Host Star
Authors:
Manasvi Lingam,
Amedeo Balbi,
Swadesh M. Mahajan
Abstract:
Photosynthesis is a plausible pathway for the sustenance of a substantial biosphere on an exoplanet. In fact, it is also anticipated to create distinctive biosignatures detectable by next-generation telescopes. In this work, we explore the excitation features of photopigments that harvest electromagnetic radiation by constructing a simple quantum-mechanical model. Our analysis suggests that the primary Earth-based photopigments for photosynthesis may not function efficiently at wavelengths $> 1.1$ $μ$m. In the context of (hypothetical) extrasolar photopigments, we calculate the potential number of conjugated $π$-electrons ($N_\star$) in the relevant molecules, which can participate in the absorption of photons. By hypothesizing that the absorption maxima of photopigments are close to the peak spectral photon flux of the host star, we utilize the model to estimate $N_\star$. As per our formalism, $N_\star$ is modulated by the stellar temperature, and is conceivably higher (lower) for planets orbiting stars cooler (hotter) than the sun; exoplanets around late-type M-dwarfs might require an $N_\star$ twice that of the Earth. We conclude the analysis with a brief exposition of how our model could be empirically tested by future observations.
Submitted 6 June, 2023;
originally announced June 2023.
-
Unsupervised Semantic Correspondence Using Stable Diffusion
Authors:
Eric Hedlin,
Gopal Sharma,
Shweta Mahajan,
Hossam Isack,
Abhishek Kar,
Andrea Tagliasacchi,
Kwang Moo Yi
Abstract:
Text-to-image diffusion models are now capable of generating images that are often indistinguishable from real images. To generate such images, these models must understand the semantics of the objects they are asked to generate. In this work we show that, without any training, one can leverage this semantic knowledge within diffusion models to find semantic correspondences - locations in multiple images that have the same semantic meaning. Specifically, given an image, we optimize the prompt embeddings of these models for maximum attention on the regions of interest. These optimized embeddings capture semantic information about the location, which can then be transferred to another image. By doing so we obtain results on par with the strongly supervised state of the art on the PF-Willow dataset and significantly outperform (20.9% relative for the SPair-71k dataset) any existing weakly or unsupervised method on PF-Willow, CUB-200 and SPair-71k datasets.
Submitted 23 December, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Impact of Anomalous Active Regions on the Large-scale Magnetic Field of the Sun
Authors:
Shaonwita Pal,
Prantika Bhowmik,
Sushant S. Mahajan,
Dibyendu Nandy
Abstract:
One of the major sources of perturbation in the solar cycle amplitude is believed to be the emergence of anomalous active regions which do not obey Hale's polarity law and Joy's law of tilt angles. Anomalous regions containing high magnetic flux that disproportionately impact the polar field are sometimes referred to as "rogue regions". In this study -- utilizing a surface flux transport model -- we analyze the large-scale dipole moment build-up due to the emergence of anomalous active regions on the solar surface. Although these active regions comprise a small fraction of the total sunspot number, they can substantially influence the magnetic dipole moment build-up and subsequent solar cycle amplitude. Our numerical simulations demonstrate that the impact of "Anti-Joy" regions on the solar cycle is similar to that of "Anti-Hale" regions. We also find that the emergence time, emergence latitude, relative number and flux distribution of anomalous regions influence the large-scale magnetic field dynamics in diverse ways. We establish that the results of our numerical study are consistent with the algebraic (analytic) approach to explaining the Sun's dipole moment evolution. Our results are relevant for understanding how anomalous active regions modulate the Sun's large-scale dipole moment build-up and its reversal timing within the framework of the Babcock-Leighton dynamo mechanism -- now believed to be the primary source of solar cycle variations.
Submitted 7 May, 2024; v1 submitted 22 May, 2023;
originally announced May 2023.
-
A Bayesian Analysis of Technological Intelligence in Land and Oceans
Authors:
Manasvi Lingam,
Amedeo Balbi,
Swadesh M. Mahajan
Abstract:
Current research indicates that (sub)surface ocean worlds essentially devoid of subaerial landmasses (e.g., continents) are common in the Milky Way, and that these worlds could host habitable conditions, thence raising the possibility that life and technological intelligence (TI) may arise in such aquatic settings. It is known, however, that TI on Earth (i.e., humans) arose on land. Motivated by these considerations, we present a Bayesian framework to assess the prospects for the emergence of TIs in land- and ocean-based habitats (LBHs and OBHs). If all factors are equally conducive for TIs to arise in LBHs and OBHs, we demonstrate that the evolution of TIs in LBHs (which includes humans) might have very low odds of roughly $1$-in-$10^3$ to $1$-in-$10^4$, thus outwardly contradicting the Copernican Principle. Hence, we elucidate three avenues whereby the Copernican Principle can be preserved: (i) the emergence rate of TIs is much lower in OBHs, (ii) the habitability interval for TIs is much shorter in OBHs, and (iii) only a small fraction of worlds with OBHs comprise appropriate conditions for effectuating TIs. We also briefly discuss methods for empirically falsifying our predictions, and comment on the feasibility of supporting TIs in aerial environments.
Submitted 10 May, 2023;
originally announced May 2023.
-
Interpretable (not just posthoc-explainable) heterogeneous survivor bias-corrected treatment effects for assignment of postdischarge interventions to prevent readmissions
Authors:
Hongjing Xia,
Joshua C. Chang,
Sarah Nowak,
Sonya Mahajan,
Rohit Mahajan,
Ted L. Chang,
Carson C. Chow
Abstract:
We used survival analysis to quantify the impact of postdischarge evaluation and management (E/M) services in preventing hospital readmission or death. Our approach avoids a specific pitfall of applying machine learning to this problem, which is an inflated estimate of the effect of interventions, due to survivor bias -- where the magnitude of inflation may be conditional on heterogeneous confounders in the population. This bias arises simply because in order to receive an intervention after discharge, a person must not have been readmitted in the intervening period. After deriving an expression for this phantom effect, we controlled for this and other biases within an inherently interpretable Bayesian survival framework. We identified case management services as being the most impactful for reducing readmissions overall.
Submitted 3 August, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
Removal Of Active Region Inflows Reveals a Weak Solar Cycle Scale Trend In Near-surface Meridional Flow
Authors:
Sushant S. Mahajan,
Xudong Sun,
Junwei Zhao
Abstract:
Using time-distance local helioseismology flow maps within 1 Mm of the solar photosphere, we detect inflows toward activity belts that contribute to solar cycle scale variations in near-surface meridional flow. These inflows stretch out as far as 30 degrees away from active region centroids. If active region neighborhoods are excluded, the solar cycle scale variation in background meridional flow diminishes to below 2~m~s$^{-1}$, but still shows systematic variations in the absence of active regions between Sunspot Cycles 24 and 25. We, therefore, propose that the near-surface meridional flow is a three component flow made up of: a constant baseline flow profile that can be derived from quiet Sun regions, variations due to inflows around active regions, and solar cycle scale variation of the order of 2~m~s$^{-1}$. Torsional oscillation, on the other hand, is found to be a global phenomenon i.e. exclusion of active region neighborhoods does not affect its magnitude or phase significantly. This non-variation of torsional oscillation with distance away from active regions and the three-component breakdown of the near-surface meridional flow serve as vital constraints for solar dynamo models and surface flux transport simulations.
Submitted 4 April, 2023;
originally announced April 2023.
-
Dashing through the cluster: An X-ray to radio view of UGC 10420 undergoing ram-pressure stripping
Authors:
Smriti Mahajan,
Kulinder Pal Singh,
Juhi Tiwari,
Somak Raychaudhury
Abstract:
We present multi-wavelength data and analysis, including new FUV AstroSat/UVIT observations of the spiral galaxy UGC 10420 (z=0.032), a member of the cluster Abell 2199. UGC 10420 is present on the edge of the X-ray emitting region of the cluster at a distance of ~ 680 kpc from the centre. The FUV data shows intense knots of star formation on the leading edge of the galaxy, accompanied by a tail of the same on the diametrically opposite side. Our analysis shows that the images of the galaxy disk in the optical and mid-infrared are much smaller in size than that in the FUV. While the broadband optical colours of UGC 10420 are typical of a post-starburst galaxy, the SFR derived from a UV-to-IR spectral energy distribution is at least a factor of nine higher than that expected for a star-forming field galaxy of similar mass at its redshift. A careful removal of the contribution of the diffuse intracluster gas shows that the significant diffuse X-ray emission associated with the inter-stellar medium of UGC 10420 has a temperature, T_X = 0.24^{+0.09}_{-0.06} keV (0.4-2.0 keV) and luminosity, L_X = 1.8+/-0.9 x 10^{40} erg/s, which are typical of the X-ray emission from late-type spiral galaxies.
Our analysis favours a scenario where the interaction of a galaxy with the hot intra-cluster medium of the cluster perturbs the gas in the galaxy, causing a starburst in the leading edge of the disk. On the other hand, the turbulence thus developed may also push some of the gas out of the disk. Interactions between the gas ejected from the galaxy and the intracluster medium can then locally trigger star formation in the wake of the galaxy experiencing ram-pressure stripping. Our data, however, do not rule out the possibility of a flyby encounter with a neighbouring galaxy, although no relevant candidates are observed in the vicinity of UGC 10420. (abridged)
Submitted 3 February, 2023;
originally announced February 2023.
-
Lexi: Self-Supervised Learning of the UI Language
Authors:
Pratyay Banerjee,
Shweti Mahajan,
Kushal Arora,
Chitta Baral,
Oriana Riva
Abstract:
Humans can learn to operate the user interface (UI) of an application by reading an instruction manual or how-to guide. Along with text, these resources include visual content such as UI screenshots and images of application icons referenced in the text. We explore how to leverage this data to learn generic visio-linguistic representations of UI screens and their components. These representations are useful in many real applications, such as accessibility, voice navigation, and task automation. Prior UI representation models rely on UI metadata (UI trees and accessibility labels), which is often missing, incompletely defined, or not accessible. We avoid such a dependency, and propose Lexi, a pre-trained vision and language model designed to handle the unique features of UI screens, including their text richness and context sensitivity. To train Lexi we curate the UICaption dataset consisting of 114k UI images paired with descriptions of their functionality. We evaluate Lexi on four tasks: UI action entailment, instruction-based UI image retrieval, grounding referring expressions, and UI entity recognition.
Submitted 23 January, 2023;
originally announced January 2023.
-
Exploring the Solar Poles: The Last Great Frontier of the Sun
Authors:
Dibyendu Nandy,
Dipankar Banerjee,
Prantika Bhowmik,
Allan Sacha Brun,
Robert H. Cameron,
S. E. Gibson,
Shravan Hanasoge,
Louise Harra,
Donald M. Hassler,
Rekha Jain,
Jie Jiang,
Laurène Jouve,
Duncan H. Mackay,
Sushant S. Mahajan,
Cristina H. Mandrini,
Mathew Owens,
Shaonwita Pal,
Rui F. Pinto,
Chitradeep Saha,
Xudong Sun,
Durgesh Tripathi,
Ilya G. Usoskin
Abstract:
Despite investments in multiple space and ground-based solar observatories by the global community, the Sun's polar regions remain uncharted territory - the last great frontier for solar observations. Breaching this frontier is fundamental to understanding the solar cycle - the ultimate driver of short-to-long term solar activity that encompasses space weather and space climate. Magnetohydrodynamic dynamo models and empirically observed relationships have established that the polar field is the primary determinant of the future solar cycle amplitude. Models of solar surface evolution of tilted active regions indicate that the mid to high latitude surges of magnetic flux govern dynamics leading to the reversal and build-up of polar fields. Our theoretical understanding and numerical models of these high latitude magnetic field dynamics and plasma flows - which are a critical component of the sunspot cycle - lack precise observational constraints. This limitation compromises our ability to observe the enigmatic kilo Gauss polar flux patches and constrain the polar field distribution at high latitudes. The lack of these observations handicaps our understanding of how high latitude magnetic fields power polar jets, plumes, and the fast solar wind that extend to the boundaries of the heliosphere and modulate solar open flux and cosmic ray flux within the solar system. Accurate observation of the Sun's polar regions, therefore, is the single most outstanding challenge that confronts Heliophysics. This paper argues the scientific case for novel out of ecliptic observations of the Sun's polar regions, in conjunction with existing, or future multi-vantage point heliospheric observatories. Such a mission concept can revolutionize the field of Heliophysics like no other mission concept has - with relevance that transcends spatial regimes from the solar interior to the heliosphere.
Submitted 30 December, 2022;
originally announced January 2023.
-
Variations in Differential Rotation and Meridional Flow within the Sun's Surface Shear Layer 1996-2022
Authors:
David H. Hathaway,
Lisa A. Upton,
Sushant S. Mahajan
Abstract:
We measure differential rotation and meridional flow in the Sun's surface shear layer by tracking the motions of the magnetic network seen in magnetograms from SOHO/MDI and SDO/HMI over solar cycles 23, 24, and the start of 25 (1996-2022). We examine the axisymmetric flows derived from 15-24 daily measurements averaged over individual 27-day Carrington rotations. Variations in the differential rotation include the equatorial torsional oscillation - cyclonic flows centered on the active latitudes with slower flows on the poleward sides of the active latitudes and faster flows equatorward. The fast flow band starts at $\sim$45$^\circ$ latitude during the declining phase of the previous cycle and drifts equatorward, terminating at the equator at about the time of cycle minimum. Variations in the differential rotation also include a polar oscillation above 45$^\circ$ with faster rotation at cycle maxima and slower rotation at cycle minima. The equatorial variations were stronger in cycle 24 than in cycle 23 but the polar variations were weaker. Variations in the meridional flow include a slowing of the poleward flow in the active latitudes during cycle rise and maximum and a speeding up of the poleward flow during cycle decline and minimum. The slowing in the active latitudes was less pronounced in cycle 24 than in cycle 23. Polar countercells (equatorward flow) extend from the poles down to $\sim$60$^\circ$ latitude from time to time (1996-2000 and 2016-2022 in the south and 2001-2011 and 2017-2022 in the north). Both axisymmetric flows vary in strength with depth. The rotation rate increases inward while the meridional flow weakens inward.
Submitted 20 December, 2022;
originally announced December 2022.
-
Gamma Rays Bursts: A Viable Cosmological Probe?
Authors:
Darshan Kumar,
Nisha Rani,
Deepak Jain,
Shobhit Mahajan,
Amitabha Mukherjee
Abstract:
In this work, our focus is on exploring the potential of current GRB measurements to provide reliable constraints on cosmological model parameters at high redshift. This work is divided into two parts. First, we calibrate the Amati relation in a model-independent way by using Hubble parameter measurements obtained from the differential ages of the galaxies. We further check if the Amati relation parameters evolve with the GRBs' redshift or not, using the data of Old Astrophysical Objects. The results indicate that GRBs do seem to evolve with redshift. In the second part, we test different cosmological models with the calibrated GRB data obtained by using constant and dynamical Amati relation. Our results indicate that the present quality of GRB data is not good enough to put tight constraints on the cosmological parameters. Hence we perform a joint analysis with the combined data of GRBs and Type Ia Supernovae (SNe) and find that this can considerably enhance cosmological constraints in contrast to solely relying on GRBs.
Submitted 27 June, 2023; v1 submitted 12 December, 2022;
originally announced December 2022.
-
Parametric amplification of electromagnetic plasma waves in resonance with a dispersive background gravitational wave
Authors:
Swadesh M. Mahajan,
Felipe A. Asenjo
Abstract:
It is shown that a sub-luminal electromagnetic plasma wave, propagating in phase with a background sub-luminal gravitational wave in a dispersive medium, can undergo parametric amplification. For this phenomenon to occur, the dispersive characteristics of the two waves must properly match. The response frequencies of the two waves (medium dependent) must lie within a definite and restrictive range. The combined dynamics is represented by a Whittaker-Hill equation, the quintessential model for parametric instabilities. The exponential growth of the electromagnetic wave is displayed at the resonance; the plasma wave grows at the expense of the background gravitational wave. Different physical scenarios in which the phenomenon may be possible are discussed.
Submitted 28 November, 2022;
originally announced November 2022.
-
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Authors:
Tanzila Rahman,
Hsin-Ying Lee,
Jian Ren,
Sergey Tulyakov,
Shweta Mahajan,
Leonid Sigal
Abstract:
There has been a recent explosion of impressive generative models that can produce high quality images (or videos) conditioned on text descriptions. However, all such approaches rely on conditional sentences that contain unambiguous descriptions of scenes and main actors in them. Therefore, employing such models for the more complex task of story visualization, where references and co-references naturally exist and one needs to reason about when to maintain consistency of actors and backgrounds across frames/scenes, and when not to, based on story progression, remains a challenge. In this work, we address the aforementioned challenges and propose a novel autoregressive diffusion-based framework with a visual memory module that implicitly captures the actor and background context across the generated frames. Sentence-conditioned soft attention over the memories enables effective reference resolution and learns to maintain scene and actor consistency when needed. To validate the effectiveness of our approach, we extend the MUGEN dataset and introduce additional characters, backgrounds and referencing in multi-sentence storylines. Our experiments for story generation on the MUGEN, the PororoSV and the FlintstonesSV datasets show that our method not only outperforms prior state-of-the-art in generating frames with high visual quality, which are consistent with the story, but also models appropriate correspondences between the characters and the background.
Submitted 5 May, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Quasi-localized vibrational modes, Boson peak and sound attenuation in model mass-spring networks
Authors:
Shivam Mahajan,
Massimo Pica Ciamarra
Abstract:
We introduce an algorithm that constructs disordered mass-spring networks whose elastic properties mimic those of glasses by tuning the fluctuations of the local elastic properties, keeping fixed connectivity and controlling the prestress. In two dimensions, the algorithm reproduces the dependence of glasses' vibrational properties, such as quasi-localised vibrational modes and Boson peak, on the degree of stability. The sound attenuation displays Rayleigh scattering and disorder-broadening regimes at different frequencies, and the attenuation rate increases with increased stability. Our results establish a strong connection between the vibrational features of disordered solids and the fluctuations of the local elastic properties and provide a new approach to investigating glasses' vibrational anomalies.
Submitted 14 April, 2023; v1 submitted 2 November, 2022;
originally announced November 2022.
-
Deepest far ultraviolet view of a central field in the Coma cluster by AstroSat UVIT
Authors:
Smriti Mahajan,
Kulinder Pal Singh,
Joseph E. Postma,
Kala G. Pradeep,
Koshy George,
Patrick Côté
Abstract:
We present analysis of the far ultraviolet (FUV) emission of sources in the central region of the Coma cluster (z=0.023) using the data taken by the UVIT aboard the multi-wavelength satellite mission AstroSat. We find a good correlation between the UVIT FUV flux and the fluxes in both wavebands of the Galex mission, for the common sources. We detect stars and galaxies, amongst which the brightest (r <= 17 mag) galaxies in the field of view are mostly members of the Coma cluster. We also detect three quasars (z = 0.38, 0.51, 2.31), one of which is likely the farthest object observed by the UVIT so far. In almost all the optical and UV colour-colour and colour-magnitude planes explored in this work, the Coma galaxies, other galaxies and bright stars could be separately identified, but the fainter stars and quasars often coincide with the faint galaxies. We have also investigated galaxies with unusual FUV morphology which are likely to be galaxies experiencing ram-pressure stripping in the cluster. Amongst others, two confirmed cluster members which were not investigated in the literature earlier, have been found to show unusual FUV emission. All the distorted sources are likely to have fallen into the cluster recently, and hence have not virialised yet. A subset of our data have optical spectroscopic information available from the archives. For these sources (~ 10% of the sample), we find that 17 galaxies identify as star-forming, 18 as composite and 13 as host galaxies for active galactic nuclei, respectively on the emission-line diagnostic diagram.
Submitted 13 September, 2022;
originally announced September 2022.
-
Gyrokinetic Simulations Compared with Magnetic Fluctuations Diagnosed with a Faraday-Effect Radial Interferometer-Polarimeter in the DIII-D pedestal
Authors:
M. T. Curie,
D. R. Hatch,
M. Halfmoon,
J. Chen,
D. L. Brower,
E. Hassan,
M. Kotschenreuther,
S. M. Mahajan,
R. J. Groebner
Abstract:
Experimental data on electromagnetic fluctuations in DIII-D, made available by the Faraday-effect Radial Interferometer-Polarimeter (RIP) diagnostic, is examined in comparison with detailed gyrokinetic simulations using Gyrokinetic Electromagnetic Numerical Experiment (GENE). The diagnostic has the unique capability of making internal measurements of fluctuating magnetic fields $\frac{\int n_e δB_r dR}{\int n_e dR}$. Local linear simulations identify microtearing modes (MTMs) over a substantial range of toroidal mode numbers (peaking at $n=15$) with frequencies in good agreement with the experimental data. Local nonlinear simulations reinforce this result by producing a magnetic frequency spectrum in good agreement with that diagnosed by RIP. Simulated heat fluxes are in the range of experimental expectations. However, magnetic fluctuation amplitudes are substantially lower than the experimental expectations. Possible sources of this discrepancy are discussed, notably the fact that the diagnostics are localized at the mid-plane -- the poloidal location where the simulations predict the fluctuation amplitudes to be smallest. Despite some discrepancies, several connections between simulations and experiments, combined with general criteria discriminating between potential pedestal instabilities, strongly point to MTMs as the source of the observed magnetic fluctuations.
Submitted 4 October, 2022; v1 submitted 29 August, 2022;
originally announced August 2022.
-
Interpretable (not just posthoc-explainable) medical claims modeling for discharge placement to prevent avoidable all-cause readmissions or death
Authors:
Joshua C. Chang,
Ted L. Chang,
Carson C. Chow,
Rohit Mahajan,
Sonya Mahajan,
Joe Maisog,
Shashaank Vattikuti,
Hongjing Xia
Abstract:
We developed an inherently interpretable multilevel Bayesian framework for representing variation in regression coefficients that mimics the piecewise linearity of ReLU-activated deep neural networks. We used the framework to formulate a survival model for using medical claims to predict hospital readmission and death that focuses on discharge placement, adjusting for confounding in estimating causal local average treatment effects. We trained the model on a 5% sample of Medicare beneficiaries from 2008 and 2011, based on their 2009--2011 inpatient episodes, and then tested the model on 2012 episodes. The model scored an AUROC of approximately 0.76 on predicting all-cause readmissions -- defined using official Centers for Medicare and Medicaid Services (CMS) methodology -- or death within 30 days of discharge, being competitive against XGBoost and a Bayesian deep neural network, demonstrating that one need not sacrifice interpretability for accuracy. Crucially, as a regression model, we provide what black boxes cannot -- the exact gold-standard global interpretation of the model, identifying relative risk factors and quantifying the effect of discharge placement. We also show that the posthoc explainer SHAP fails to provide accurate explanations.
Submitted 29 January, 2023; v1 submitted 28 August, 2022;
originally announced August 2022.
-
On the fraction of particles involved in magneto-centrifugally generated ultra-high energy electrons in the Crab pulsar
Authors:
Z. N. Osmanov,
S. M. Mahajan
Abstract:
The earthward journey of ultra-high energy electrons ($\sim 600$ TeV), produced in the pulsar atmosphere by Landau damping of magneto-centrifugally excited Langmuir waves (drawing energy from the rotational slowdown) on primary electrons, is charted. It is shown that just as they escape the light cylinder zone, the ultra-high energy particles, interacting with the medium of the Crab nebula, rapidly lose their energy via the quantum synchrotron process, producing highly energetic gamma rays of $\sim 0.6$ PeV. Interacting with the cosmic background radiation in the interstellar medium, only a tiny fraction of these ultra-high energy photons (via the $γγ$ channel) are then transformed into electron-positron pairs. The detected flux of these photons imposes an upper limit on the fraction ($4\times 10^{-7}$) of the magnetospheric particles involved in the process of generation of ultra-high energy photons (up to $600$ TeV).
Submitted 3 August, 2022;
originally announced August 2022.
-
Resonant energization of particles by radio AGN
Authors:
S. M. Mahajan,
Z. N. Osmanov
Abstract:
A new mechanism of particle acceleration, based on the resonant interaction of a classical electromagnetic wave (EM) with a quantum wave (associated with a relativistic particle), is explored.
In an illustrative model calculation, we study the fate of a Klein-Gordon wave subjected to the intense radio frequency waves generated in the vicinity of an active galactic nucleus (AGN). In the framework of the paper we examine a quantum wave associated with a relativistic particle, and it is shown that the group velocity of the wave approaches the speed of light, implying that the particles resonantly exchange energy with EM waves, eventually leading to acceleration of particles to very high energies.
For typical parameters of under accreting Eddington radio AGN, it is shown that the resonant energization could catapult particles to extreme energies $\sim 10^{16-20}$eV.
Submitted 11 June, 2022;
originally announced June 2022.
-
Adversarial synthesis based data-augmentation for code-switched spoken language identification
Authors:
Parth Shastri,
Chirag Patil,
Poorval Wanere,
Dr. Shrinivas Mahajan,
Dr. Abhishek Bhatt,
Dr. Hardik Sailor
Abstract:
Spoken Language Identification (LID) is an important sub-task of Automatic Speech Recognition (ASR) that is used to classify the language(s) in an audio segment. Automatic LID plays a useful role in multilingual countries. In various countries, identifying a language becomes hard because of the multilingual scenario in which two or more languages are mixed together during conversation. This speech phenomenon is called code-mixing or code-switching. It occurs not only in India but also in many Asian countries. Such code-mixed data is hard to find, which further reduces the capabilities of spoken LID. Hence, this work primarily addresses the data scarcity of the code-switched class using data augmentation as a solution. This study focuses on Indic language code-mixed with English. Spoken LID is performed on Hindi, code-mixed with English. This research proposes a Generative Adversarial Network (GAN) based data augmentation technique performed using Mel spectrograms for audio data. GANs have already been proven to be accurate in representing the real data distribution in the image domain. The proposed research exploits these capabilities of GANs in speech domains such as speech classification, automatic speech recognition, etc. GANs are trained to generate Mel spectrograms of the minority code-mixed class, which are then used to augment data for the classifier. Utilizing GANs gives an overall improvement in Unweighted Average Recall of 3.5% compared to a Convolutional Recurrent Neural Network (CRNN) classifier used as the baseline reference.
Submitted 1 June, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Constraints on the Transition Redshift using Hubble Phase Space Portrait
Authors:
Darshan Kumar,
Deepak Jain,
Shobhit Mahajan,
Amitabha Mukherjee,
Akshay Rana
Abstract:
One of the most significant discoveries in modern cosmology is that the universe is currently in a phase of accelerated expansion after a switch from a decelerated expansion. The redshift corresponding to this epoch is referred to as the transition redshift $z_t$. In this work we put constraints on the $z_t$ with both model-independent and model-dependent approaches. We consider 32 Hubble parameter measurements and the Pantheon sample of Type Ia Supernovae (SNe). In order to include the possible systematic effects in this analysis, we use the full covariance matrix of systematic uncertainties for the Hubble parameter measurements. We plot a Hubble Phase Space Portrait (HPSP) between $\dot{H}(z)$ and $H(z)$ in a model-independent way. From this HPSP diagram, we estimate the transition redshift as well as the current value of the equation of state parameter $ω_0$ in a model-independent way. By considering H(z) measurements, we find the best fit value of $z_t=0.591^{+0.332}_{-0.332}$ and $ω_0=-0.677^{+0.238}_{-0.238}$. We obtain the best fit value of $z_t=0.849^{+0.117}_{-0.117}$ and $ω_0=-0.870^{+0.013}_{-0.013}$ using the Pantheon database. Further, we also use a model dependent approach to determine $z_t$. Here, we consider a non-flat $Λ$CDM model as a background cosmological model. We reconstruct the cosmic triangle plot among $\log(Ω_{m0})$, $-\log(2Ω_{\Lambda0})$ and $3\log(1+z_t)$ where the constraints of each parameter are determined by the location in this triangle plot. Using $Ω_{m0}$ and $Ω_{\Lambda0}$ values, we find the best value of the transition redshift $z_t=0.619^{+0.580}_{-0.758}$, which is in good agreement with the Planck 2018 results at $1σ$ confidence level. We also simulate the observed Hubble parameter measurements in the redshift range $0<z<2$ and perform the same analysis to estimate the transition redshift.
Submitted 27 March, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.