doi 10.18653/v1/2023.emnlp-industry.53

InsightNet: Structured Insight Mining from Customer Feedback

Authors: Sandeep Sricharan Mukku, Manan Soni, Jitenkumar Rana, Chetan Aggarwal, Promod Yenigalla, Rashmi Patange, Shyam Mohan

Abstract: We propose InsightNet, a novel approach for the automated extraction of structured insights from customer reviews. Our end-to-end machine learning framework is designed to overcome the limitations of current solutions, including the absence of structure for identified topics, non-standard aspect names, and lack of abundant training data. The proposed solution builds a semi-supervised multi-level taxonomy from raw reviews, uses a semantic similarity heuristic to generate labelled data, and employs a multi-task insight extraction architecture by fine-tuning an LLM. InsightNet identifies granular actionable topics with customer sentiments and verbatim for each topic. Evaluations on real-world customer review data show that InsightNet performs better than existing solutions in terms of structure, hierarchy and completeness. We empirically demonstrate that InsightNet outperforms the current state-of-the-art methods in multi-label topic classification, achieving an F1 score of 0.85, which is an improvement of 11% F1-score over the previous best results. Additionally, InsightNet generalises well for unseen aspects and suggests new topics to be added to the taxonomy.

Submitted 12 May, 2024; originally announced May 2024.

Comments: EMNLP 2023

arXiv:2404.09920 [pdf, other]

Combined Pre-Supernova Alert System with KamLAND and Super-Kamiokande

Authors: KamLAND and Super-Kamiokande Collaborations: Seisho Abe, Minori Eizuka, Sawako Futagi, Azusa Gando, Yoshihito Gando, Shun Goto, Takahiko Hachiya, Kazumi Hata, Koichi Ichimura, Sei Ieki, Haruo Ikeda, Kunio Inoue, Koji Ishidoshiro, Yuto Kamei, Nanami Kawada, Yasuhiro Kishimoto, Masayuki Koga, Maho Kurasawa, Tadao Mitsui, Haruhiko Miyake, Daisuke Morita, Takeshi Nakahata, et al. (290 additional authors not shown)

Abstract: Preceding a core-collapse supernova, various processes produce an increasing amount of neutrinos of all flavors characterized by mounting energies from the interior of massive stars. Among them, the electron antineutrinos are potentially detectable by terrestrial neutrino experiments such as KamLAND and Super-Kamiokande via inverse beta decay interactions. Once these pre-supernova neutrinos are observed, an early warning of the upcoming core-collapse supernova can be provided. In light of this, KamLAND and Super-Kamiokande, both located in the Kamioka mine in Japan, have been monitoring pre-supernova neutrinos since 2015 and 2021, respectively. Recently, we performed a joint study between KamLAND and Super-Kamiokande on pre-supernova neutrino detection. A pre-supernova alert system combining the KamLAND detector and the Super-Kamiokande detector was developed and put into operation, which can provide a supernova alert to the astrophysics community. Fully leveraging the complementary properties of these two detectors, the combined alert is expected to resolve a pre-supernova neutrino signal from a 15 M$_{\odot}$ star within 510 pc of the Earth, at a significance level corresponding to a false alarm rate of no more than 1 per century. For a Betelgeuse-like model with optimistic parameters, it can provide early warnings up to 12 hours in advance.

Submitted 1 July, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: Resubmitted to ApJ. 22 pages, 16 figures, for more information about the combined pre-supernova alert system, see https://www.lowbg.org/presnalarm/

arXiv:2404.01158 [pdf, other]

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community

Authors: Casey Kennington, Malihe Alikhani, Heather Pon-Barry, Katherine Atwell, Yonatan Bisk, Daniel Fried, Felix Gervits, Zhao Han, Mert Inan, Michael Johnston, Raj Korpan, Diane Litman, Matthew Marge, Cynthia Matuszek, Ross Mead, Shiwali Mohan, Raymond Mooney, Natalie Parde, Jivko Sinapov, Angela Stewart, Matthew Stone, Stefanie Tellex, Tom Williams

Abstract: The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces, but speech interfaces and not just with computers, but with all machines including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots. The three proposals should act as white papers for any researcher to take and build upon.

Submitted 1 April, 2024; originally announced April 2024.

Comments: NSF Report on the "Dialogue with Robots" Workshop held in Pittsburgh, PA, April 2023

arXiv:2403.11337 [pdf, other]

Enhancing Bandwidth Efficiency for Video Motion Transfer Applications using Deep Learning Based Keypoint Prediction

Authors: Xue Bai, Tasmiah Haque, Sumit Mohan, Yuliang Cai, Byungheon Jeong, Adam Halasz, Srinjoy Das

Abstract: We propose a deep learning based novel prediction framework for enhanced bandwidth reduction in motion transfer enabled video applications such as video conferencing, virtual reality gaming and privacy preservation for patient health monitoring. To model complex motion, we use the First Order Motion Model (FOMM) that represents dynamic objects using learned keypoints along with their local affine transformations. Keypoints are extracted by a self-supervised keypoint detector and organized in a time series corresponding to the video frames. Prediction of keypoints, to enable transmission using lower frames per second on the source device, is performed using a Variational Recurrent Neural Network (VRNN). The predicted keypoints are then synthesized to video frames using an optical flow estimator and a generator network. The efficacy of leveraging keypoint based representations in conjunction with VRNN based prediction for both video animation and reconstruction is demonstrated on three diverse datasets. For real-time applications, our results show the effectiveness of our proposed architecture by enabling up to 2x additional bandwidth reduction over existing keypoint based video motion transfer frameworks without significantly compromising video quality.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2402.18434 [pdf, other]

Graph Regularized Encoder Training for Extreme Classification

Authors: Anshul Mittal, Shikhar Mohan, Deepak Saini, Suchith C. Prabhu, Jain jiao, Sumeet Agarwal, Soumen Chakrabarti, Purushottam Kar, Manik Varma

Abstract: Deep extreme classification (XC) aims to train an encoder architecture and an accompanying classifier architecture to tag a data point with the most relevant subset of labels from a very large universe of labels. XC applications in ranking, recommendation and tagging routinely encounter tail labels for which the amount of training data is exceedingly small. Graph convolutional networks (GCN) present a convenient but computationally expensive way to leverage task metadata and enhance model accuracies in these settings. This paper formally establishes that in several use cases, the steep computational cost of GCNs is entirely avoidable by replacing GCNs with non-GCN architectures. The paper observes that in these settings, it is much more effective to use graph data to regularize encoder training than to implement a GCN. Based on these insights, an alternative paradigm RAMEN is presented to utilize graph metadata in XC settings that offers significant performance boosts with zero increase in inference computational costs. RAMEN scales to datasets with up to 1M labels and offers prediction accuracy up to 15% higher on benchmark datasets than state-of-the-art methods, including those that use graph metadata to train GCNs. RAMEN also offers 10% higher accuracy over the best baseline on a proprietary recommendation dataset sourced from click logs of a popular search engine. Code for RAMEN will be released publicly.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.06964 [pdf, other]

NLP for Knowledge Discovery and Information Extraction from Energetics Corpora

Authors: Francis G. VanGessel, Efrem Perry, Salil Mohan, Oliver M. Barham, Mark Cavolowsky

Abstract: We present a demonstration of the utility of NLP for aiding research into energetic materials and associated systems. The NLP method enables machine understanding of textual data, offering an automated route to knowledge discovery and information extraction from energetics text. We apply three established unsupervised NLP models: Latent Dirichlet Allocation, Word2Vec, and the Transformer to a large curated dataset of energetics-related scientific articles. We demonstrate that each NLP algorithm is capable of identifying energetic topics and concepts, generating a language model which aligns with Subject Matter Expert knowledge. Furthermore, we present a document classification pipeline for energetics text. Our classification pipeline achieves 59-76% accuracy depending on the NLP model used, with the highest performing Transformer model rivaling inter-annotator agreement metrics. The NLP approaches studied in this work can identify concepts germane to energetics and therefore hold promise as a tool for accelerating energetics research efforts and energetics material development.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2402.00234 [pdf, other]

Are Generative AI systems Capable of Supporting Information Needs of Patients?

Authors: Shreya Rajagopal, Subhashis Hazarika, Sookyung Kim, Yan-ming Chiou, Jae Ho Sohn, Hari Subramonyam, Shiwali Mohan

Abstract: Patients managing a complex illness such as cancer face a complex information challenge where they not only must learn about their illness but also how to manage it. Close interaction with healthcare experts (radiologists, oncologists) can improve patient learning and thereby their disease outcome. However, this approach is resource intensive and takes expert time away from other critical tasks. Given the recent advancements in Generative AI models aimed at improving the healthcare system, our work investigates whether and how generative visual question answering systems can responsibly support patient information needs in the context of radiology imaging data. We conducted a formative need-finding study in which participants discussed chest computed tomography (CT) scans and associated radiology reports of a fictitious close relative with a cardiothoracic radiologist. Using thematic analysis of the conversation between participants and medical experts, we identified commonly occurring themes across interactions, including clarifying medical terminology, locating the problems mentioned in the report in the scanned image, understanding disease prognosis, discussing the next diagnostic steps, and comparing treatment options. Based on these themes, we evaluated two state-of-the-art generative visual language models against the radiologist's responses. Our results reveal variability in the quality of responses generated by the models across various themes. We highlight the importance of patient-facing generative AI systems to accommodate a diverse range of conversational themes, catering to the real-world informational needs of patients.

Submitted 31 January, 2024; originally announced February 2024.

arXiv:2401.00909 [pdf, other]

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Authors: Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra

Abstract: Despite the remarkable performance of score distillation in text-to-3D generation, such techniques notoriously suffer from view inconsistency issues, also known as the "Janus" artifact, where the generated objects fake each view with multiple front faces. Although empirically effective methods have approached this problem via score debiasing or prompt engineering, a more rigorous perspective to explain and tackle this problem remains elusive. In this paper, we reveal that the existing score distillation-based text-to-3D generation frameworks degenerate to maximal likelihood seeking on each view independently and thus suffer from the mode collapse problem, manifesting as the Janus artifact in practice. To tame mode collapse, we improve score distillation by re-establishing the entropy term in the corresponding variational objective, which is applied to the distribution of rendered images. Maximizing the entropy encourages diversity among different views in generated 3D assets, thereby mitigating the Janus problem. Based on this new objective, we derive a new update rule for 3D score distillation, dubbed Entropic Score Distillation (ESD). We theoretically reveal that ESD can be simplified and implemented by just adopting the classifier-free guidance trick upon variational score distillation. Although embarrassingly straightforward, our extensive experiments successfully demonstrate that ESD can be an effective treatment for Janus artifacts in score distillation.

Submitted 29 March, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Project page: https://vita-group.github.io/3D-Mode-Collapse/

arXiv:2401.00604 [pdf, other]

SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity

Authors: Peihao Wang, Zhiwen Fan, Dejia Xu, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra

Abstract: Score distillation has emerged as one of the most prevalent approaches for text-to-3D asset synthesis. Essentially, score distillation updates 3D parameters by lifting and back-propagating scores averaged over different views. In this paper, we reveal that the gradient estimation in score distillation inherently has high variance. Through the lens of variance reduction, the effectiveness of SDS and VSD can be interpreted as applications of various control variates to the Monte Carlo estimator of the distilled score. Motivated by this rethinking and based on Stein's identity, we propose a more general solution to reduce variance for score distillation, termed Stein Score Distillation (SSD). SSD incorporates control variates constructed by Stein identity, allowing for arbitrary baseline functions. This enables us to include flexible guidance priors and network architectures to explicitly optimize for variance reduction. In our experiments, the overall pipeline, dubbed SteinDreamer, is implemented by instantiating the control variate with a monocular depth estimator. The results suggest that SSD can effectively reduce the distillation variance and consistently improve visual quality for both object- and scene-level generation. Moreover, we demonstrate that SteinDreamer achieves faster convergence than existing methods due to more stable gradient updates.

Submitted 29 March, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Project page: https://vita-group.github.io/SteinDreamer/
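
Note: the SteinDreamer abstract frames SDS and VSD as control variates applied to a Monte Carlo estimator. The sketch below illustrates only the generic, textbook control-variate idea on a toy integral; it is not the Stein-identity construction of SSD, and the integrand, the baseline function and the function name are all illustrative assumptions rather than anything taken from the paper.

```python
import math
import random
import statistics


def estimate_with_control_variate(n_samples, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    Toy example only: the baseline g(U) = U has a known mean of 0.5, so
    subtracting c * (g - 0.5) leaves the expectation unchanged while
    cancelling part of the sampling noise.
    """
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n_samples)]
    f = [math.exp(x) for x in u]  # target integrand, true mean is e - 1
    g = u                         # baseline (control variate), known mean 0.5

    mean_f = statistics.fmean(f)
    mean_g = statistics.fmean(g)
    # Sample covariance of f and g, and the variance-minimising coefficient c.
    cov_fg = sum((a - mean_f) * (b - mean_g) for a, b in zip(f, g)) / (n_samples - 1)
    c = cov_fg / statistics.variance(g)

    adjusted = [a - c * (b - 0.5) for a, b in zip(f, g)]
    return {
        "plain_mean": mean_f,
        "cv_mean": statistics.fmean(adjusted),
        "plain_var": statistics.variance(f),
        "cv_var": statistics.variance(adjusted),
    }


if __name__ == "__main__":
    result = estimate_with_control_variate(10_000)
    print(result)
```

Both estimators target the same value, e - 1; the adjusted samples should show a much smaller per-sample variance because exp(U) and U are strongly correlated. Estimating c from the same samples adds a small bias that shrinks as the sample count grows.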

arXiv:2312.10214 [pdf, other]

Healthcare Policy Compliance: A Blockchain Smart Contract-Based Approach

Authors: Md Al Amin, Hemanth Tummala, Seshamalini Mohan, Indrajit Ray

Abstract: This paper addresses the critical challenge of ensuring healthcare policy compliance in the context of Electronic Health Records (EHRs). Despite stringent regulations like HIPAA, significant gaps in policy compliance often remain undetected until a data breach occurs. To bridge this gap, we propose a novel blockchain-powered, smart contract-based access control model. This model is specifically designed to enforce patient-provider agreements (PPAs) and other relevant policies, thereby ensuring both policy compliance and provenance. Our approach integrates components of informed consent into PPAs, employing blockchain smart contracts to automate and secure policy enforcement. The authorization module utilizes these contracts to make informed access decisions, recording all actions in a transparent, immutable blockchain ledger. This system not only ensures that policies are rigorously applied but also maintains a verifiable record of all actions taken, thus facilitating an easy audit and proving compliance. We implement this model in a private Ethereum blockchain setup, focusing on maintaining the integrity and lineage of policies and ensuring that audit trails are accurately and securely recorded. The Proof of Compliance (PoC) consensus mechanism enables decentralized, independent auditor nodes to verify compliance status based on the audit trails recorded. Experimental evaluation demonstrates the effectiveness of the proposed model in a simulated healthcare environment. The results show that our approach not only strengthens policy compliance and provenance but also enhances the transparency and accountability of the entire process. In summary, this paper presents a comprehensive, blockchain-based solution to a longstanding problem in healthcare data management, offering a robust framework for ensuring policy compliance and provenance through smart contracts and blockchain technology.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2310.14835 [pdf, other]

doi 10.1103/PhysRevB.108.125133

Multiple exciton generation in VO2

Authors: S. R. Sahu, S. Khan, A. Tripathy, K. Dey, N. Bano, S. Raj Mohan, M. P. Joshi, S. Verma, B. T. Rao, V. G. Sathe, D. K. Shukla

Abstract: Multiple exciton generation (MEG) is a widely studied phenomenon in semiconductor nanocrystals and quantum dots, aimed at improving the energy conversion efficiency of solar cells. MEG is the process wherein incident photon energy is significantly larger than the band gap, and the resulting photoexcited carriers relax by generating additional electron-hole pairs, rather than decaying by heat dissipation. Here, we present an experimental demonstration of MEG in a prototype strongly correlated material, VO2, through photocurrent spectroscopy and ultrafast transient reflectivity measurements, both of which are considered the most prominent ways for detecting MEG in working devices. The key result of this paper is the observation of MEG at room temperature (in a correlated insulating phase of VO2), and the estimated threshold for MEG is 3Eg. We demonstrate an escalated photocurrent due to MEG in VO2, and quantum efficiency is found to exceed 100%. Our studies suggest that this phenomenon is a manifestation of expeditious impact ionization due to stronger electron correlations and could be exploited in a large number of strongly correlated materials.

Submitted 23 October, 2023; originally announced October 2023.

Comments: 6 pages, 5 figures, Physical Review B

Journal ref: Physical Review B 108, 125133 (2023)

arXiv:2310.10290 [pdf, other]

Autonomous Mapping and Navigation using Fiducial Markers and Pan-Tilt Camera for Assisting Indoor Mobility of Blind and Visually Impaired People

Authors: Dharmateja Adapa, Virendra Singh Shekhawat, Avinash Gautam, Sudeept Mohan

Abstract: Large indoor spaces have complex layouts making them difficult to navigate. Indoor spaces in hospitals, universities, shopping complexes, etc., carry multi-modal information in the form of text and symbols. Hence, it is difficult for Blind and Visually Impaired (BVI) people to independently navigate such spaces. Indoor environments are usually GPS-denied; therefore, Bluetooth-based, WiFi-based, or Range-based methods are used for localization. These methods have high setup costs, lower accuracy, and sometimes need special sensing equipment. We propose a Visual Assist (VA) system for the indoor navigation of BVI individuals using visual Fiducial markers for localization. State-of-the-art (SOTA) approaches for visual localization using Fiducial markers use fixed cameras having a narrow field of view. These approaches stop tracking the markers when they are out of sight. We employ a Pan-Tilt turret-mounted camera which enhances the field of view to 360° for enhanced marker tracking. We, therefore, need fewer markers for mapping and navigation. The efficacy of the proposed VA system is measured on three metrics, i.e., RMSE (Root Mean Square Error), ADNN (Average Distance to Nearest Neighbours), and ATE (Absolute Trajectory Error). Our system outperforms Hector-SLAM, ORB-SLAM3, and UcoSLAM. The proposed system achieves localization accuracy within $\pm8cm$ compared to $\pm12cm$ and $\pm10cm$ for ORB-SLAM3 and UcoSLAM, respectively.

Submitted 16 October, 2023; originally announced October 2023.

ACM Class: I.3.5; H.5.2
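
Note: the abstract above reports RMSE as one of its localization metrics. As a rough illustration of what such a figure measures, the sketch below computes the root mean square of the per-pose position error between an estimated and a reference 2D trajectory. It assumes the two trajectories are already time-synchronised and expressed in the same frame; the function name and the sample coordinates are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math


def trajectory_rmse(estimated, ground_truth):
    """Root mean square position error between two equal-length 2D trajectories.

    Each trajectory is a sequence of (x, y) positions in metres. No alignment
    step is performed; the inputs are assumed to share one coordinate frame.
    """
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        (ex - gx) ** 2 + (ey - gy) ** 2
        for (ex, ey), (gx, gy) in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical poses: the estimate drifts by 8 cm on two of three samples.
    reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    estimate = [(0.0, 0.0), (1.0, 0.08), (2.0, -0.08)]
    print(f"RMSE = {trajectory_rmse(estimate, reference):.4f} m")
```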

arXiv:2309.09522 [pdf, other]

TOPr: Enhanced Static Code Pruning for Fast and Precise Directed Fuzzing

Authors: Chaitra Niddodi, Stefan Nagy, Darko Marinov, Sibin Mohan

Abstract: Directed fuzzing is a dynamic testing technique that focuses exploration on specific, pre-targeted program locations. Like other types of fuzzers, directed fuzzers are most effective when maximizing testing speed and precision. To this end, recent directed fuzzers have begun leveraging path pruning: preventing the wasteful testing of program paths deemed irrelevant to reaching a desired target location. Yet, despite code pruning's substantial speedup, current approaches are imprecise, failing to capture indirect control flow and requiring additional dynamic analyses that diminish directed fuzzers' speeds. Thus, without code pruning that is both fast and precise, directed fuzzers' effectiveness will continue to remain limited. This paper aims to tackle the challenge of upholding both speed and precision in pruning-based directed fuzzing. We show that existing pruning approaches fail to recover common-case indirect control flow, and identify opportunities to enhance them with lightweight heuristics (namely, function signature matching), enabling them to maximize precision without the burden of dynamic analysis. We implement our enhanced pruning as a prototype, TOPr (Target Oriented Pruning), and evaluate it against the leading pruning-based and pruning-agnostic directed fuzzers, SieveFuzz and AFLGo. We show that TOPr's enhanced pruning outperforms these fuzzers in (1) speed (achieving 222% and 73% higher test case throughput, respectively); (2) reachability (achieving 149% and 9% more target-relevant coverage, respectively); and (3) bug discovery time (triggering bugs 85% and 8% faster, respectively). Furthermore, TOPr's balance of speed and precision enables it to find 24 new bugs in 5 open-source applications, with 18 confirmed by developers, 12 bugs labelled as "Priority - 1. High", and 12 bugs fixed, underscoring the effectiveness of our framework.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 13 pages

arXiv:2308.03329 [pdf]

New radio lobes at parsec scale from the East-West protostellar jet RAFGL2591

Authors: A. G. Cheriyan, S. Vig, Sreelekshmi Mohan

Abstract: RAFGL2591 is a massive star-forming complex in the Cygnus-X region comprising a cluster of embedded protostars and young stellar objects located at a distance of 3.33 kpc. We investigate low-frequency radio emission from the protostellar jet associated with RAFGL2591 using the Giant Metrewave Radio Telescope (GMRT) at 325, 610 and 1280 MHz. For the first time, we have detected radio jet lobes in the E-W direction, labelled as GMRT-1 and GMRT-2. While GMRT-1 displays a flat radio spectral index of $α$ = -0.10, GMRT-2 shows a steeply negative value of $α$ = -0.62, suggestive of non-thermal emission. H$_2$ emission maps show the presence of numerous knots, arcs and extended emission towards the East-West jet, excited by the protostar VLA 3. In addition, we report a few H$_2$ knots in the North-East and South-West for the first time. The radio lobes (GMRT-1, GMRT-2) and H$_2$ emission towards this region are understood in the context of the prominent East-West jet as well as its lesser-known sibling jet in the North-East and South-West direction. To model the radio emission from the lobes, we have employed a numerical model including both thermal and non-thermal emission and found number densities towards these lobes in the range 100 - 1000 cm$^{-3}$. The misalignment of the East-West jet lobes exhibits a reflection symmetry with a bending of $\sim$ 20$^\circ$. We attempt to understand this misalignment through precession caused by a binary partner and/or a supersonic side wind from source(s) in the vicinity.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 15 pages, 9 figures, Accepted for publication in MNRAS
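
Note: the abstract above quotes radio spectral indices for the two lobes. Under the common convention that flux density scales as S ∝ ν^α, a two-point spectral index follows from the ratio of fluxes at two frequencies. The sketch below shows that arithmetic only; the flux values are made up for illustration, and the paper may well derive its indices from a fit across all three GMRT bands rather than from a two-point estimate.

```python
import math


def spectral_index(flux_1, freq_1, flux_2, freq_2):
    """Two-point spectral index alpha, assuming S proportional to nu**alpha.

    Fluxes may be in any common unit (e.g. mJy) and frequencies in any common
    unit (e.g. MHz), since only their ratios enter the formula.
    """
    if min(flux_1, flux_2, freq_1, freq_2) <= 0 or freq_1 == freq_2:
        raise ValueError("fluxes and frequencies must be positive, frequencies distinct")
    return math.log(flux_1 / flux_2) / math.log(freq_1 / freq_2)


if __name__ == "__main__":
    # Hypothetical flux densities (mJy) at two of the GMRT bands named above.
    alpha = spectral_index(flux_1=5.0, freq_1=325.0, flux_2=2.2, freq_2=1280.0)
    print(f"alpha = {alpha:.2f}")
```

A value near zero corresponds to a flat spectrum, while a clearly negative value means the source is brighter at lower frequencies, which is the behaviour the abstract associates with non-thermal emission.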

arXiv:2307.01292 [pdf, other]

Pareto-Secure Machine Learning (PSML): Fingerprinting and Securing Inference Serving Systems

Authors: Debopam Sanyal, Jui-Tse Hung, Manav Agrawal, Prahlad Jasti, Shahab Nikkhoo, Somesh Jha, Tianhao Wang, Sibin Mohan, Alexey Tumanov

Abstract: Model-serving systems have become increasingly popular, especially in real-time web applications. In such systems, users send queries to the server and specify the desired performance metrics (e.g., desired accuracy, latency). The server maintains a set of models (model zoo) in the back-end and serves the queries based on the specified metrics. This paper examines the security, specifically robustness against model extraction attacks, of such systems. Existing black-box attacks assume a single model can be repeatedly selected for serving inference requests. Modern inference serving systems break this assumption. Thus, such attacks cannot be directly applied to extract a victim model, as models are hidden behind a layer of abstraction exposed by the serving system. An attacker can no longer identify which model she is interacting with. To this end, we first propose a query-efficient fingerprinting algorithm to enable the attacker to trigger any desired model consistently. We show that by using our fingerprinting algorithm, model extraction can have fidelity and accuracy scores within $1\%$ of the scores obtained when attacking a single, explicitly specified model, as well as up to $14.6\%$ gain in accuracy and up to $7.7\%$ gain in fidelity compared to the naive attack. Second, we counter the proposed attack with a noise-based defense mechanism that thwarts fingerprinting by adding noise to the specified performance metrics. The proposed defense strategy reduces the attack's accuracy and fidelity by up to $9.8\%$ and $4.8\%$, respectively (on medium-sized model extraction). Third, we show that the proposed defense induces a fundamental trade-off between the level of protection and system goodput, achieving configurable and significant victim model extraction protection while maintaining acceptable goodput ($>80\%$). We implement the proposed defense in a real system with plans to open source.

Submitted 6 August, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: 17 pages, 9 figures, 6 tables

arXiv:2306.06272 [pdf, other]

A Domain-Independent Agent Architecture for Adaptive Operation in Evolving Open Worlds

Authors: Shiwali Mohan, Wiktor Piotrowski, Roni Stern, Sachin Grover, Sookyung Kim, Jacob Le, Johan De Kleer

Abstract: Model-based reasoning agents are ill-equipped to act in novel situations in which their model of the environment no longer sufficiently represents the world. We propose HYDRA - a framework for designing model-based agents operating in mixed discrete-continuous worlds, that can autonomously detect when the environment has evolved from its canonical setup, understand how it has evolved, and adapt the agents' models to perform effectively. HYDRA is based upon PDDL+, a rich modeling language for planning in mixed, discrete-continuous environments. It augments the planning module with visual reasoning, task selection, and action execution modules for closed-loop interaction with complex environments. HYDRA implements a novel meta-reasoning process that enables the agent to monitor its own behavior from a variety of aspects. The process employs a diverse set of computational methods to maintain expectations about the agent's own behavior in an environment. Divergences from those expectations are useful in detecting when the environment has evolved and identifying opportunities to adapt the underlying models. HYDRA builds upon ideas from diagnosis and repair and uses a heuristics-guided search over model changes such that they become competent in novel conditions. The HYDRA framework has been used to implement novelty-aware agents for three diverse domains - CartPole++ (a higher dimension variant of a classic control problem), Science Birds (an IJCAI competition problem), and PogoStick (a specific problem domain in Minecraft). We report empirical observations from these domains to demonstrate the efficacy of various components in the novelty meta-reasoning process.

Submitted 9 June, 2023; originally announced June 2023.

Comments: Under review in Artificial Intelligence Journal - Open World Learning track

ACM Class: I.2.4; I.2.6

arXiv:2305.09011 [pdf, other]

The Brain Tumor Segmentation (BraTS) Challenge 2023: Brain MR Image Synthesis for Tumor Segmentation (BraSyn)

Authors: Hongwei Bran Li, Gian Marco Conte, Syed Muhammad Anwar, Florian Kofler, Ivan Ezhov, Koen van Leemput, Marie Piraud, Maria Diaz, Byrone Cole, Evan Calabrese, Jeff Rudie, Felix Meissen, Maruf Adewole, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Ahmed W. Moawad, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, et al. (43 additional authors not shown)

Abstract: Automated brain tumor segmentation methods have become well-established and reached performance levels offering clear clinical utility. These methods typically rely on four input magnetic resonance imaging (MRI) modalities: T1-weighted images with and without contrast enhancement, T2-weighted images, and FLAIR images. However, some sequences are often missing in clinical practice due to time constraints or image artifacts, such as patient motion. Consequently, the ability to substitute missing modalities and gain segmentation performance is highly desirable and necessary for the broader adoption of these algorithms in the clinical routine. In this work, we present the establishment of the Brain MR Image Synthesis Benchmark (BraSyn) in conjunction with the Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2023. The primary objective of this challenge is to evaluate image synthesis methods that can realistically generate missing MRI modalities when multiple available images are provided. The ultimate aim is to facilitate automated brain tumor segmentation pipelines. The image dataset used in the benchmark is diverse and multi-modal, created through collaboration with various hospitals and research institutions.

Submitted 28 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: Technical report of BraSyn

arXiv:2305.08992 [pdf, other]

The Brain Tumor Segmentation (BraTS) Challenge 2023: Local Synthesis of Healthy Brain Tissue via Inpainting

Authors: Florian Kofler, Felix Meissen, Felix Steinbauer, Robert Graf, Eva Oswald, Ezequiel de da Rosa, Hongwei Bran Li, Ujjwal Baid, Florian Hoelzl, Oezguen Turgut, Izabela Horvath, Diana Waldmannstetter, Christina Bukas, Maruf Adewole, Syed Muhammad Anwar, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Ahmed W Moawad, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, et al. (43 additional authors not shown)

Abstract: A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with a scan that is already pathological. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantees for images featuring lesions. Examples include but are not limited to algorithms for brain anatomy parcellation, tissue segmentation, and brain extraction. To solve this dilemma, we introduce the BraTS 2023 inpainting challenge. Here, the participants' task is to explore inpainting techniques to synthesize healthy brain scans from lesioned ones. The following manuscript contains the task formulation, dataset, and submission procedure. Later it will be updated to summarize the findings of the challenge. The challenge is organized as part of the BraTS 2023 challenge hosted at the MICCAI 2023 conference in Vancouver, Canada.

Submitted 9 August, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: 5 pages, 1 figure

arXiv:2305.07291 [pdf, other]

Impact of errors in the magnetic field measurement on the precision determination of neutrino oscillation parameters at the proposed ICAL detector at INO

Authors: Honey Khindri, D. Indumathi, Lakshmi S. Mohan

Abstract: The magnetised iron calorimeter (ICAL) detector proposed at the India-based Neutrino Observatory will be a 51 kton detector made up of 151 layers of 56 mm thick soft iron with 40 mm air gap in between where the RPCs, the active detectors, will be placed. The main goal of ICAL is to make precision measurements of the neutrino oscillation parameters using the atmospheric neutrinos as source. The charged current interactions of the atmospheric muon neutrinos and anti-neutrinos in the detector produce charged muons. The magnetic field, with a maximum value of $\sim$ 1.5 T in the central region of ICAL, is a critical component since it will be used to distinguish the charges and determine the momentum and direction of these muons. It is difficult to measure the magnetic field inside the iron. The existing methods can only estimate the internal field and hence will be prone to error. This paper presents the first simulations study of the effect of errors in the measurement of the magnetic field in ICAL on its physics potential, especially the neutrino mass ordering and precision measurement of oscillation parameters in the 2--3 sector. The study is a GEANT4-based analysis, using measurements of the magnetic field at the prototype ICAL detector. We find that there is only a small effect on the determination of the mass ordering. While local fluctuations in the magnetic field measurement are well-tolerated, calibration errors must remain well within 5\% to retain good precision determination of the parameters $\sin^2θ_{23}$ and $Δm^2_{32}$.

Submitted 15 May, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

Comments: 25 pages, 10 figures latex; had uploaded Fig6(R) and Fig 7 (R) wrongly; now replaced in version2

Report number: IMSc/2023/03

arXiv:2305.07214 [pdf, other]

MMG-Ego4D: Multi-Modal Generalization in Egocentric Action Recognition

Authors: Xinyu Gong, Sreyas Mohan, Naina Dhingra, Jean-Charles Bazin, Yilei Li, Zhangyang Wang, Rakesh Ranjan

Abstract: In this paper, we study a novel problem in egocentric action recognition, which we term as "Multimodal Generalization" (MMG). MMG aims to study how systems can generalize when data from certain modalities is limited or even completely missing. We thoroughly investigate MMG in the context of standard supervised action recognition and the more challenging few-shot setting for learning new action categories. MMG consists of two novel scenarios, designed to support security, and efficiency considerations in real-world applications: (1) missing modality generalization where some modalities that were present during the train time are missing during the inference time, and (2) cross-modal zero-shot generalization, where the modalities present during the inference time and the training time are disjoint. To enable this investigation, we construct a new dataset MMG-Ego4D containing data points with video, audio, and inertial motion sensor (IMU) modalities. Our dataset is derived from Ego4D dataset, but processed and thoroughly re-annotated by human experts to facilitate research in the MMG problem. We evaluate a diverse array of models on MMG-Ego4D and propose new methods with improved generalization ability. In particular, we introduce a new fusion module with modality dropout training, contrastive-based alignment training, and a novel cross-modal prototypical loss for better few-shot performance. We hope this study will serve as a benchmark and guide future research in multimodal generalization problems. The benchmark and code will be available at https://github.com/facebookresearch/MMG_Ego4D.

Submitted 11 May, 2023; originally announced May 2023.

Comments: Accepted to CVPR 2023

arXiv:2304.13956 [pdf, other]

You Can't Always Check What You Wanted: Selective Checking and Trusted Execution to Prevent False Actuations in Cyber-Physical Systems

Authors: Monowar Hasan, Sibin Mohan

Abstract: Cyber-physical systems (CPS) are vulnerable to attacks targeting outgoing actuation commands that modify their physical behaviors. The limited resources in such systems, coupled with their stringent timing constraints, often prevents the checking of every outgoing command. We present a "selective checking" mechanism that uses game-theoretic modeling to identify the right subset of commands to be checked in order to deter an adversary. This mechanism is coupled with a "delay-aware" trusted execution environment (TEE) to ensure that only verified actuation commands are ever sent to the physical system, thus maintaining their safety and integrity. The selective checking and trusted execution (SCATE) framework is implemented on an off-the-shelf ARM platform running standard embedded Linux. We demonstrate the effectiveness of SCATE using four realistic cyber-physical systems (a ground rover, a flight controller, a robotic arm and an automated syringe pump) and study design trade-offs. Not only does SCATE provide a high level of security and high performance, it also suffers from significantly lower overheads (30.48%-47.32% less) in the process. In fact, SCATE can work with more systems without negatively affecting the safety of the system. Considering that most CPS do not have any such checking mechanisms, and SCATE is guaranteed to meet all the timing requirements (i.e., ensure the safety/integrity of the system), our methods can significantly improve the security (and, hence, safety) of the system.

Submitted 27 April, 2023; originally announced April 2023.

Comments: Extended version of SCATE published in ISORC'23

arXiv:2304.00714 [pdf, other]

Ensemble prosody prediction for expressive speech synthesis

Authors: Tian Huey Teh, Vivian Hu, Devang S Ram Mohan, Zack Hodari, Christopher G. R. Wallis, Tomás Gomez Ibarrondo, Alexandra Torresquintero, James Leoni, Mark Gales, Simon King

Abstract: Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech. Most efforts have focused on sophisticated neural architectures intended to better model the data distribution. Yet, in evaluations it is generally found that no single model is preferred for all input texts. This suggests an approach that has rarely been used before for Text-to-Speech: an ensemble of models. We apply ensemble learning to prosody prediction. We construct simple ensembles of prosody predictors by varying either model architecture or model parameter values. To automatically select amongst the models in the ensemble when performing Text-to-Speech, we propose a novel, and computationally trivial, variance-based criterion. We demonstrate that even a small ensemble of prosody predictors yields useful diversity, which, combined with the proposed selection criterion, outperforms any individual model from the ensemble.

Submitted 3 April, 2023; originally announced April 2023.

Comments: ICASSP 2023

arXiv:2303.16967 [pdf, other]

Heuristic Search For Physics-Based Problems: Angry Birds in PDDL+

Authors: Wiktor Piotrowski, Yoni Sher, Sachin Grover, Roni Stern, Shiwali Mohan

Abstract: This paper studies how a domain-independent planner and combinatorial search can be employed to play Angry Birds, a well established AI challenge problem. To model the game, we use PDDL+, a planning language for mixed discrete/continuous domains that supports durative processes and exogenous events. The paper describes the model and identifies key design decisions that reduce the problem complexity. In addition, we propose several domain-specific enhancements including heuristics and a search technique similar to preferred operators. Together, they alleviate the complexity of combinatorial search. We evaluate our approach by comparing its performance with dedicated domain-specific solvers on a range of Angry Birds levels. The results show that our performance is on par with these domain-specific approaches in most levels, even without using our domain-specific search enhancements.

Submitted 29 March, 2023; originally announced March 2023.

arXiv:2303.14272 [pdf, other]

Learning to Operate in Open Worlds by Adapting Planning Models

Authors: Wiktor Piotrowski, Roni Stern, Yoni Sher, Jacob Le, Matthew Klenk, Johan deKleer, Shiwali Mohan

Abstract: Planning agents are ill-equipped to act in novel situations in which their domain model no longer accurately represents the world. We introduce an approach for such agents operating in open worlds that detects the presence of novelties and effectively adapts their domain models and consequent action selection. It uses observations of action execution and measures their divergence from what is expected, according to the environment model, to infer existence of a novelty. Then, it revises the model through a heuristics-guided search over model changes. We report empirical evaluations on the CartPole problem, a standard Reinforcement Learning (RL) benchmark. The results show that our approach can deal with a class of novelties very quickly and in an interpretable fashion.

Submitted 24 March, 2023; originally announced March 2023.

Comments: To appear in the Proceedings of the 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023)

ACM Class: I.2.6; I.2.8

arXiv:2303.13150 [pdf, ps, other]

doi 10.1007/s12036-023-09947-7

Modeling of thermal and non-thermal radio emission from HH80-81 jet

Authors: Sreelekshmi Mohan, Sarita Vig, Samir Mandal

Abstract: Protostellar jets are one of the primary signposts of star formation. A handful of protostellar objects exhibit radio emission from ionized jets, of which a few display negative spectral indices, indicating the presence of synchrotron emission. In this study, we characterize the radio spectra of HH80-81 jet with the help of a numerical model that we have developed earlier, which takes into account both thermal free-free and non-thermal synchrotron emission mechanisms. For modeling the HH80-81 jet, we consider jet emission towards the central region close to the driving source along with two Herbig-Haro objects, HH80 and HH81. We have obtained the best-fit parameters for each of these sources by fitting the model to radio observational data corresponding to two frequency windows taken across two epochs. Considering an electron number density in the range $10^3 - 10^5$ cm$^{-3}$, we obtained the thickness of the jet edges and fraction of relativistic electrons that contribute to non-thermal emission in the range $0.01^{\circ} - 0.1^{\circ}$ and $10^{-7} - 10^{-4}$, respectively. For the best-fit parameter sets, the model spectral indices lie in the range of -0.15 to +0.11 within the observed frequency windows.

Submitted 24 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: 14 pages, 6 figures, Accepted for publication in the Journal of Astrophysics and Astronomy

arXiv:2303.09446 [pdf, other]

Controllable Prosody Generation With Partial Inputs

Authors: Dan Andrei Iliescu, Devang Savita Ram Mohan, Tian Huey Teh, Zack Hodari

Abstract: We address the problem of human-in-the-loop control for generating prosody in the context of text-to-speech synthesis. Controlling prosody is challenging because existing generative models lack an efficient interface through which users can modify the output quickly and precisely. To solve this, we introduce a novel framework whereby the user provides partial inputs and the generative model generates the missing features. We propose a model that is specifically designed to encode partial prosodic features and output complete audio. We show empirically that our model displays two essential qualities of a human-in-the-loop control mechanism: efficiency and robustness. With even a very small number of input values (~4), our model enables users to improve the quality of the output significantly in terms of listener preference (4:1).

Submitted 15 April, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: 5 pages

arXiv:2303.03883 [pdf, ps, other]

A note on the Bures-Wasserstein metric

Authors: Shravan Mohan

Abstract: In this brief note, it is shown that the Bures-Wasserstein (BW) metric on the space of positive definite matrices lends itself to convex optimization. In other words, the computation of the BW metric can be posed as a convex optimization problem. In turn, this leads to efficient computations of (i) the BW distance between convex subsets of positive definite matrices, (ii) the BW barycenter, and (iii) incorporating BW distance from a given matrix as a convex constraint. Computations are provided for corroboration.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2211.08012 [pdf, ps, other]

doi 10.3847/1538-4357/aca413

Imaging of HH80-81 jet in the NIR shock tracers H$_2$ and [Fe II]

Authors: Sreelekshmi Mohan, Sarita Vig, Watson P. Varricatt, Anandmayee Tej

Abstract: The HH80-81 system is one of the most powerful jets driven by a massive protostar. We present new near-infrared (NIR) line imaging observations of the HH80-81 jet in the H$_2$ (2.122 $μ$m) and [Fe II] (1.644 $μ$m) lines. These lines trace not only the jet close to the exciting source but also the knots located farther away. We have detected nine groups of knot-like structures in the jet including HH80 and HH81 spaced $0.2-0.9$ pc apart. The knots in the northern arm of the jet show only [Fe II] emission closer to the exciting source, a combination of [Fe II] and H$_2$ at intermediate distances, and solely H$_2$ emission farther outwards. Towards the southern arm, all the knots exhibit both H$_2$ and [Fe II] emission. The nature of the shocks is inferred by assimilating the NIR observations with radio and X-ray observations from literature. In the northern arm, we infer the presence of strong dissociative shocks, in the knots located close to the exciting source. The knots in the southern arm that include HH80 and HH81 are explicable as a combination of strong and weak shocks. The mass-loss rates of the knots determined from [Fe II] luminosities are in the range $\sim 3.0\times 10^{-7}-5.2\times 10^{-5}$ M$_{\odot}$ yr$^{-1}$, consistent with those from massive protostars. Towards the central region, close to the driving source of the jet, we have observed various arcs in H$_2$ emission which resemble bow shocks, and strings of H$_2$ knots which reveal traces of multiple outflows.

Submitted 15 November, 2022; originally announced November 2022.

Comments: 20 pages, 5 figures, 3 tables, Accepted for publication in The Astrophysical Journal

arXiv:2211.06827 [pdf, other]

A note on power allocation for optimal capacity

Authors: Shravan Mohan

Abstract: The problems of determining the optimal power allocation, within maximum power bounds, to (i) maximize the minimum Shannon capacity, and (ii) minimize the weighted latency are considered. In the first case, the global optimum can be achieved in polynomial time by solving a sequence of linear programs (LP). In the second case, the original non-convex problem is replaced by a convex surrogate (a geometric program), using a functional approximation. Since the approximation error is relatively low, the optimum of the surrogate is close to the global optimal point of the original problem. In either case, there is no assumption on the SINR range. The use of LPs and geometric programming makes the proposed algorithms numerically efficient. Computations are provided for corroboration.

Submitted 13 November, 2022; originally announced November 2022.

arXiv:2210.11731 [pdf, other]

Analogical Concept Memory for Architectures Implementing the Common Model of Cognition

Authors: Shiwali Mohan, Matthew Klenk

Abstract: Architectures that implement the Common Model of Cognition - Soar, ACT-R, and Sigma - have a prominent place in research on cognitive modeling as well as on designing complex intelligent agents. In this paper, we explore how computational models of analogical processing can be brought into these architectures to enable concept acquisition from examples obtained interactively. We propose a new analogical concept memory for Soar that augments its current system of declarative long-term memories. We frame the problem of concept learning as embedded within the larger context of interactive task learning (ITL) and embodied language processing (ELP). We demonstrate that the analogical learning methods implemented in the proposed memory can quickly learn diverse types of novel concepts that are useful not only in recognition of a concept in the environment but also in action selection. Our approach has been instantiated in an implemented cognitive system AILEEN and evaluated on a simulated robotic domain.

Submitted 21 October, 2022; originally announced October 2022.

Comments: Under review at Cognitive Systems Research. arXiv admin note: substantial text overlap with arXiv:2006.01962

arXiv:2210.05553 [pdf, other]

Evaluating Unsupervised Denoising Requires Unsupervised Metrics

Authors: Adria Marcos-Morales, Matan Leibovich, Sreyas Mohan, Joshua Lawrence Vincent, Piyush Haluai, Mai Tan, Peter Crozier, Carlos Fernandez-Granda

Abstract: Unsupervised denoising is a crucial challenge in real-world imaging applications. Unsupervised deep-learning methods have demonstrated impressive performance on benchmarks based on synthetic noise. However, no metrics are available to evaluate these methods in an unsupervised fashion. This is highly problematic for the many practical applications where ground-truth clean images are not available. In this work, we propose two novel metrics: the unsupervised mean squared error (MSE) and the unsupervised peak signal-to-noise ratio (PSNR), which are computed using only noisy data. We provide a theoretical analysis of these metrics, showing that they are asymptotically consistent estimators of the supervised MSE and PSNR. Controlled numerical experiments with synthetic noise confirm that they provide accurate approximations in practice. We validate our approach on real-world data from two imaging modalities: videos in raw format and transmission electron microscopy. Our results demonstrate that the proposed metrics enable unsupervised evaluation of denoising methods based exclusively on noisy data.

Submitted 30 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

arXiv:2210.02241 [pdf, other]

HeartSpot: Privatized and Explainable Data Compression for Cardiomegaly Detection

Authors: Elvin Johnson, Shreshta Mohan, Alex Gaudio, Asim Smailagic, Christos Faloutsos, Aurélio Campilho

Abstract: Advances in data-driven deep learning for chest X-ray image analysis underscore the need for explainability, privacy, large datasets and significant computational resources. We frame privacy and explainability as a lossy single-image compression problem to reduce both computational and data requirements without training. For Cardiomegaly detection in chest X-ray images, we propose HeartSpot and four spatial bias priors. HeartSpot priors define how to sample pixels based on domain knowledge from medical literature and from machines. HeartSpot privatizes chest X-ray images by discarding up to 97% of pixels, such as those that reveal the shape of the thoracic cage, bones, small lesions and other sensitive features. HeartSpot priors are ante-hoc explainable and give a human-interpretable image of the preserved spatial features that clearly outlines the heart. HeartSpot offers strong compression, with up to 32x fewer pixels and 11x smaller filesize. Cardiomegaly detectors using HeartSpot are up to 9x faster to train or at least as accurate (up to +.01 AUC ROC) when compared to a baseline DenseNet121. HeartSpot is post-hoc explainable by re-using existing attribution methods without requiring access to the original non-privatized image. In summary, HeartSpot improves speed and accuracy, reduces image size, improves privacy and ensures explainability. Source code: https://www.github.com/adgaudio/HeartSpot

Submitted 5 October, 2022; originally announced October 2022.

Comments: Accepted to IEEE-EMBS International Conference on Biomedical and Health Informatics 2022. IEEE copyrights may apply

arXiv:2208.02699 [pdf, other]

Ellipsis: Towards Efficient System Auditing for Real-Time Systems

Authors: Ayoosh Bansal, Anant Kandikuppa, Chien-Ying Chen, Monowar Hasan, Adam Bates, Sibin Mohan

Abstract: System auditing is a powerful tool that provides insight into the nature of suspicious events in computing systems, allowing machine operators to detect and subsequently investigate security incidents. While auditing has proven invaluable to the security of traditional computers, existing audit frameworks are rarely designed with consideration for Real-Time Systems (RTS). The transparency provided by system auditing would be of tremendous benefit in a variety of security-critical RTS domains, (e.g., autonomous vehicles); however, if audit mechanisms are not carefully integrated into RTS, auditing can be rendered ineffectual and violate the real-world temporal requirements of the RTS. In this paper, we demonstrate how to adapt commodity audit frameworks to RTS. Using Linux Audit as a case study, we first demonstrate that the volume of audit events generated by commodity frameworks is unsustainable within the temporal and resource constraints of real-time (RT) applications. To address this, we present Ellipsis, a set of kernel-based reduction techniques that leverage the periodic repetitive nature of RT applications to aggressively reduce the costs of system-level auditing. Ellipsis generates succinct descriptions of RT applications' expected activity while retaining a detailed record of unexpected activities, enabling analysis of suspicious activity while meeting temporal constraints. Our evaluation of Ellipsis, using ArduPilot (an open-source autopilot application suite) demonstrates up to 93% reduction in audit log generation.

Submitted 4 August, 2022; originally announced August 2022.

Comments: Extended version of a paper accepted at ESORICS 2022

ACM Class: D.4.6; C.3

arXiv:2207.14040 [pdf, other]

doi 10.1093/mnras/stac2124

Investigating star-formation activity towards the southern HII region RCW 42

Authors: Vipin Kumar, S. Vig, V. S. Veena, S. Mohan, S. K. Ghosh, A. Tej, D. K. Ojha

Abstract: The star-forming activity in the HII region RCW 42 is investigated using multiple wavebands, from near-infrared to radio wavelengths. Located at a distance of 5.8 kpc, this southern region has a bolometric luminosity of 1.8 $\times$ 10$^6$ L$_{\odot}$. The ionized gas emission has been imaged at low radio frequencies of 610 and 1280 MHz using the Giant Metrewave Radio Telescope, India and shows a large expanse of the HII region, spanning $20\times 15$ pc$^2$. The average electron number density in the region is estimated to be $\sim70$ cm$^{-3}$, which suggests an average ionization fraction of the cloud to be $11\%$. An extended green object EGO G274.0649-01.1460 and several young stellar objects have been identified in the region using data from the 2MASS and Spitzer surveys. The dust emission from the associated molecular cloud is probed using Herschel Space Telescope, which reveals the presence of 5 clumps, C1-C5, in this region. Two millimetre emission cores of masses 380 and 390 M$_{\odot}$ towards the radio emission peak have been identified towards C1 from the ALMA map at 1.4 mm. The clumps are investigated for their evolutionary stages based on association with various star-formation tracers, and we find that all the clumps are in active/evolved stage.

Submitted 29 July, 2022; v1 submitted 28 July, 2022; originally announced July 2022.

Comments: 14 pages, 13 figures, 3 tables, Accepted by MNRAS

arXiv:2207.08122 [pdf, ps, other]

Simulation analysis with rock muons from atmospheric neutrino interactions in the ICAL detector at INO

Authors: R. Kanishka, D. Indumathi, Lakshmi S. Mohan, V. Bhatnagar

Abstract: The proposed magnetized Iron CALorimeter detector (ICAL) to be built in the India-based Neutrino Observatory (INO) laboratory aims to study atmospheric neutrinos and its properties such as precision measurements of oscillation parameters and the neutrino mass hierarchy. High energy charged current (CC) interactions of atmospheric neutrinos with the rock surrounding the detector produce so-called "rock muons" along with hadrons. While the hadron component of these events are absorbed in the rock itself, the rock muons traverse the rock and are detected in the detector. These rock muon events can be distinguished from cosmic muons only in the upward direction and can provide an independent measurement of the oscillation parameters. A simulation study of these events at the ICAL detector shows that, although reduced in significance compared to muons produced in direct CC neutrino interactions with the detector, these events are indeed sensitive to the oscillation parameters, achieving a possible $1σ$ precision of 10\% and 27\% in determining $Δm_{32}^2$ and $\sin^2θ_{23}$, respectively. Hence a combination of the standard atmospheric neutrino analysis which is the main goal of ICAL, with these rock muon events, will improve the precision reach of ICAL for these parameters.

Submitted 21 November, 2023; v1 submitted 17 July, 2022; originally announced July 2022.

arXiv:2204.12272 [pdf, ps, other]

doi 10.1093/mnras/stac1159

Radio spectra of protostellar jets: Thermal and non-thermal emission

Authors: Sreelekshmi Mohan, Sarita Vig, Samir Mandal

Abstract: Protostellar jets and outflows are pointers of star-formation and serve as important sources of momentum and energy transfer to the interstellar medium. Radio emission from ionized jets have been detected towards a number of protostellar objects. In few cases, negative spectral indices and polarized emission have also been observed suggesting the presence of synchrotron emission from relativistic electrons. In this work, we develop a numerical model that incorporates both thermal free-free and non-thermal synchrotron emission mechanisms in the jet geometry. The flux densities include contribution from an inner thermal jet, and a combination of emission from thermal and non-thermal distributions along the edges and extremities, where the jet interacts with the interstellar medium. We also include the effect of varying ionization fraction laterally across the jet. An investigation of radio emission and spectra along the jet shows the dependence of the emission process and optical depth along the line of sight. We explore the effect of various parameters on the turnover frequencies and the radio spectral indices (between 10 MHz and 300 GHz) associated with them.

Submitted 26 April, 2022; originally announced April 2022.

Comments: 18 pages, 14 figures, 2 Tables. Accepted for publication in MNRAS

arXiv:2204.12219 [pdf, other]

A note on load balancing in DC microgrids

Authors: Shravan Mohan, Bharath Bhikkaji

Abstract: A problem of load balancing in isolated DC microgrids is considered in this paper. Here, a DC load is fed by multiple heterogenous DC sources, each of which is connected to the load via a boost converter. The gains of the DCC's provide for a means to control the division of load current amongst the DC sources. The primary objective of the control scheme is to minimise the total losses in the network, while maintaining the output voltage within a desired range, serving the load current demand and adhering to VI-characteristics of the power sources. Under assumptions of concavity/monotonocity/piece-wise-linearity of the VI-characteristics, the problem is solved using a convex relaxation. It is shown that the solution to the relaxed problem is tight. Thus, the resulting algorithm is guaranteed to reach global optimality in a numerically efficient manner. Simulations are provided for corroboration.

Submitted 26 April, 2022; originally announced April 2022.

arXiv:2204.10836 [pdf, other]

doi 10.1038/s41467-022-33407-5

Federated Learning Enables Big Data for Rare Cancer Boundary Detection

Authors: Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah Sheller, Shih-Han Wang, G Anthony Reina, Patrick Foley, Alexey Gruzdev, Deepthi Karkada, Christos Davatzikos, Chiharu Sako, Satyam Ghodasara, Michel Bilello, Suyash Mohan, Philipp Vollmuth, Gianluca Brugnara, Chandrakanth J Preetha, Felix Sahm, Klaus Maier-Hein, Maximilian Zenk, Martin Bendszus, Wolfgang Wick, Evan Calabrese, Jeffrey Rudie, Javier Villanueva-Meyer , et al. (254 additional authors not shown)

Abstract: Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.

Submitted 25 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

Comments: federated learning, deep learning, convolutional neural network, segmentation, brain tumor, glioma, glioblastoma, FeTS, BraTS

arXiv:2204.09717 [pdf]

LSTM-RASA Based Agri Farm Assistant for Farmers

Authors: Narayana Darapaneni, Selvakumar Raj, Raghul V, Venkatesh Sivaraman, Sunil Mohan, Anwesh Reddy Paduri

Abstract: The application of Deep Learning and Natural Language based ChatBots are growing rapidly in recent years. They are used in many fields like customer support, reservation system and as personal assistant. The Enterprises are using such ChatBots to serve their customers in a better and efficient manner. Even after such technological advancement, the expert advice does not reach the farmers on timely manner. The farmers are still largely dependent on their peers knowledge in solving the problems they face in their field. These technologies have not been effectively used to give the required information to farmers on timely manner. This project aims to implement a closed domain ChatBot for the field of Agriculture Farmers Assistant. Farmers can have conversation with the Chatbot and get the expert advice in their field. Farmers Assistant is based on RASA Open Source Framework. The Chatbot identifies the intent and entity from user utterances and retrieve the remedy from the database and share it with the user. We tested the Bot with existing data and it showed promising results.

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2204.06584 [pdf, other]

A Distant Supervision Corpus for Extracting Biomedical Relationships Between Chemicals, Diseases and Genes

Authors: Dongxu Zhang, Sunil Mohan, Michaela Torkar, Andrew McCallum

Abstract: We introduce ChemDisGene, a new dataset for training and evaluating multi-class multi-label document-level biomedical relation extraction models. Our dataset contains 80k biomedical research abstracts labeled with mentions of chemicals, diseases, and genes, portions of which human experts labeled with 18 types of biomedical relationships between these entities (intended for evaluation), and the remainder of which (intended for training) has been distantly labeled via the CTD database with approximately 78\% accuracy. In comparison to similar preexisting datasets, ours is both substantially larger and cleaner; it also includes annotations linking mentions to their entities. We also provide three baseline deep neural network relation extraction models trained and evaluated on our new dataset.

Submitted 13 April, 2022; originally announced April 2022.

Comments: LREC 2022 (Oral)

arXiv:2111.10734 [pdf, other]

Deep Probability Estimation

Authors: Sheng Liu, Aakash Kaku, Weicheng Zhu, Matan Leibovich, Sreyas Mohan, Boyang Yu, Haoxiang Huang, Laure Zanna, Narges Razavian, Jonathan Niles-Weed, Carlos Fernandez-Granda

Abstract: Reliable probability estimation is of crucial importance in many real-world applications where there is inherent (aleatoric) uncertainty. Probability-estimation models are trained on observed outcomes (e.g. whether it has rained or not, or whether a patient has died or not), because the ground-truth probabilities of the events of interest are typically unknown. The problem is therefore analogous to binary classification, with the difference that the objective is to estimate probabilities rather than predicting the specific outcome. This work investigates probability estimation from high-dimensional data using deep neural networks. There exist several methods to improve the probabilities generated by these models but they mostly focus on model (epistemic) uncertainty. For problems with inherent uncertainty, it is challenging to evaluate performance without access to ground-truth probabilities. To address this, we build a synthetic dataset to study and compare different computable metrics. We evaluate existing methods on the synthetic data as well as on three real-world probability estimation tasks, all of which involve inherent uncertainty: precipitation forecasting from radar images, predicting cancer patient survival from histopathology images, and predicting car crashes from dashcam videos. We also give a theoretical analysis of a model for high-dimensional probability estimation which reproduces several of the phenomena evinced in our experiments.
Finally, we propose a new method for probability estimation using neural networks, which modifies the training process to promote output probabilities that are consistent with empirical probabilities computed from the data. The method outperforms existing approaches on most metrics on the simulated as well as real-world data.

Submitted 11 October, 2022; v1 submitted 20 November, 2021; originally announced November 2021.

Comments: SL, AK, WZ, ML, SM contributed equally to this work; 36 pages, 17 figures, 12 tables

Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:13746-13781, 2022

arXiv:2110.08811 [pdf, other]

Attention W-Net: Improved Skip Connections for better Representations

Authors: Shikhar Mohan, Saumik Bhattacharya, Sayantari Ghosh

Abstract: Segmentation of macro and microvascular structures in fundoscopic retinal images plays a crucial role in the detection of multiple retinal and systemic diseases, yet it is a difficult problem to solve. Most neural network approaches face several issues such as lack of enough parameters, overfitting and/or incompatibility between internal feature-spaces. We propose Attention W-Net, a new U-Net based architecture for retinal vessel segmentation to address these problems. In this architecture, we have two main contributions: Attention Block and regularisation measures. Our Attention Block uses attention between encoder and decoder features, resulting in higher compatibility upon addition. Our regularisation measures include augmentation and modifications to the ResNet Block used, which greatly prevent overfitting. We observe an F1 and AUC of 0.8407 and 0.9833 on the DRIVE and 0.8174 and 0.9865 respectively on the CHASE-DB1 datasets - a sizeable improvement over its backbone as well as competitive performance among contemporary state-of-the-art methods.

Submitted 29 June, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

Comments: Accepted at ICPR'22

arXiv:2110.07686 [pdf, other]

Making Document-Level Information Extraction Right for the Right Reasons

Authors: Liyan Tang, Dhruv Rajan, Suyash Mohan, Abhijeet Pradhan, R. Nick Bryan, Greg Durrett

Abstract: Document-level models for information extraction tasks like slot-filling are flexible: they can be applied to settings where information is not necessarily localized in a single sentence. For example, key features of a diagnosis in a radiology report may not be explicitly stated in one place, but nevertheless can be inferred from parts of the report's text. However, these models can easily learn spurious correlations between labels and irrelevant information. This work studies how to ensure that these models make correct inferences from complex text and make those inferences in an auditable way: beyond just being right, are these models "right for the right reasons?" We experiment with post-hoc evidence extraction in a predict-select-verify framework using feature attribution techniques. We show that regularization with small amounts of evidence supervision during training can substantially improve the quality of extracted evidence. We evaluate on two domains: a small-scale labeled dataset of brain MRI reports and a large-scale modified version of DocRED (Yao et al., 2019) and show that models' plausibility can be improved with no loss in accuracy.

Submitted 18 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: 9 pages (15 with references and appendix), 3 figures

arXiv:2109.05771 [pdf, other]

Perturbation CheckLists for Evaluating NLG Evaluation Metrics

Authors: Ananya B. Sai, Tanay Dixit, Dev Yashpal Sheth, Sreyas Mohan, Mitesh M. Khapra

Abstract: Natural Language Generation (NLG) evaluation is a multifaceted task requiring assessment of multiple desirable criteria, e.g., fluency, coherency, coverage, relevance, adequacy, overall quality, etc. Across existing datasets for 6 NLG tasks, we observe that the human evaluation scores on these multiple criteria are often not correlated. For example, there is a very low correlation between human scores on fluency and data coverage for the task of structured data to text generation. This suggests that the current recipe of proposing new automatic evaluation metrics for NLG by showing that they correlate well with scores assigned by humans for a single criteria (overall quality) alone is inadequate. Indeed, our extensive study involving 25 automatic evaluation metrics across 6 different tasks and 18 different evaluation criteria shows that there is no single metric which correlates well with human scores on all desirable criteria, for most NLG tasks. Given this situation, we propose CheckLists for better design and evaluation of automatic metrics. We design templates which target a specific criteria (e.g., coverage) and perturb the output such that the quality gets affected only along this specific criteria (e.g., the coverage drops). We show that existing evaluation metrics are not robust against even such simple perturbations and disagree with scores assigned by humans to the perturbed output.
The proposed templates thus allow for a fine-grained assessment of automatic evaluation metrics exposing their limitations and will facilitate better design, analysis and evaluation of such metrics.

Submitted 13 September, 2021; originally announced September 2021.

Comments: Accepted at EMNLP 2021. See https://iitmnlp.github.io/EvalEval/ for our templates and code

arXiv:2107.12815 [pdf, other]

Adaptive Denoising via GainTuning

Authors: Sreyas Mohan, Joshua L. Vincent, Ramon Manzorro, Peter A. Crozier, Eero P. Simoncelli, Carlos Fernandez-Granda

Abstract: Deep convolutional neural networks (CNNs) for image denoising are usually trained on large datasets. These models achieve the current state of the art, but they have difficulties generalizing when applied to data that deviate from the training distribution. Recent work has shown that it is possible to train denoisers on a single noisy image. These models adapt to the features of the test image, but their performance is limited by the small amount of information used to train them. Here we propose "GainTuning", in which CNN models pre-trained on large datasets are adaptively and selectively adjusted for individual test images. To avoid overfitting, GainTuning optimizes a single multiplicative scaling parameter (the "Gain") of each channel in the convolutional layers of the CNN. We show that GainTuning improves state-of-the-art CNNs on standard image-denoising benchmarks, boosting their denoising performance on nearly every image in a held-out test set. These adaptive improvements are even more substantial for test images differing systematically from the training data, either in noise level or image type. We illustrate the potential of adaptive denoising in a scientific application, in which a CNN is trained on synthetic data, and tested on real transmission-electron-microscope images. In contrast to the existing methodology, GainTuning is able to faithfully reconstruct the structure of catalytic nanoparticles from these data at extremely low signal-to-noise ratios.

Submitted 27 July, 2021; originally announced July 2021.

arXiv:2107.04635 [pdf, ps, other]

Playing Angry Birds with a Domain-Independent PDDL+ Planner

Authors: Wiktor Piotrowski, Roni Stern, Matthew Klenk, Alexandre Perez, Shiwali Mohan, Johan de Kleer, Jacob Le

Abstract: This demo paper presents the first system for playing the popular Angry Birds game using a domain-independent planner. Our system models Angry Birds levels using PDDL+, a planning language for mixed discrete/continuous domains. It uses a domain-independent PDDL+ planner to generate plans and executes them. In this demo paper, we present the system's PDDL+ model for this domain, identify key design decisions that reduce the problem complexity, and compare the performance of our system to model-specific methods for this domain. The results show that our system's performance is on par with other domain-specific systems for Angry Birds, suggesting the applicability of domain-independent planning to this benchmark AI challenge.

Submitted 9 July, 2021; originally announced July 2021.

Comments: 2 pages, submitted to ICAPS 2021 Demonstration Track

Journal ref: Proceedings of the International Conference on Automated Planning and Scheduling (2021) Demonstration Track

arXiv:2107.02314 [pdf, other]

The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification

Authors: Ujjwal Baid, Satyam Ghodasara, Suyash Mohan, Michel Bilello, Evan Calabrese, Errol Colak, Keyvan Farahani, Jayashree Kalpathy-Cramer, Felipe C. Kitamura, Sarthak Pati, Luciano M. Prevedello, Jeffrey D. Rudie, Chiharu Sako, Russell T. Shinohara, Timothy Bergquist, Rong Chai, James Eddy, Julia Elliott, Walter Reade, Thomas Schaffter, Thomas Yu, Jiaxin Zheng, Ahmed W. Moawad, Luiz Otavio Coelho, Olivia McDonnell , et al. (78 additional authors not shown)

Abstract: The BraTS 2021 challenge celebrates its 10th anniversary and is jointly organized by the Radiological Society of North America (RSNA), the American Society of Neuroradiology (ASNR), and the Medical Image Computing and Computer Assisted Interventions (MICCAI) society. Since its inception, BraTS has been focusing on being a common benchmarking venue for brain glioma segmentation algorithms, with well-curated multi-institutional multi-parametric magnetic resonance imaging (mpMRI) data. Gliomas are the most common primary malignancies of the central nervous system, with varying degrees of aggressiveness and prognosis. The RSNA-ASNR-MICCAI BraTS 2021 challenge targets the evaluation of computational algorithms assessing the same tumor compartmentalization, as well as the underlying tumor's molecular characterization, in pre-operative baseline mpMRI data from 2,040 patients. Specifically, the two tasks that BraTS 2021 focuses on are: a) the segmentation of the histologically distinct brain tumor sub-regions, and b) the classification of the tumor's O[6]-methylguanine-DNA methyltransferase (MGMT) promoter methylation status. The performance evaluation of all participating algorithms in BraTS 2021 will be conducted through the Sage Bionetworks Synapse platform (Task 1) and Kaggle (Task 2), concluding in distributing to the top ranked participants monetary awards of $60,000 collectively.

Submitted 12 September, 2021; v1 submitted 5 July, 2021; originally announced July 2021.

Comments: 19 pages, 2 figures, 1 table

arXiv:2106.10048 [pdf, other]

Comparative assessment of typical control realizations of grid forming converters based on their voltage source behaviour

Authors: Kanakesh Vatta Kkuni, Sibin Mohan, Guangya Yang, Wilsun Xu

Abstract: The converter control functions to provide the capabilities similar to synchronous generators are referred to as grid forming converters (GFC). Identical to a synchronous machine, a grid forming converter is expected to behave as a voltage source behind an impedance beyond the control bandwidth. However, GFC's realization has been different, with some utilizes inner current and voltage controllers while others do not. This paper studies the impact of the inner loop on the grid forming converter's ability to behave as a voltage source behind an impedance. Three of the most popular GFC structures, 1) GFC with cascaded voltage and current control, 2) with inner current control, 3) with no inner loop, are chosen for the comparison. The analysis revealed that MW level GFC with inner loops could potentially go unstable under weak power system. Additionally, the GFC with cascaded control can only operate stably within a narrow range of network impedances. Furthermore, it is also shown that slow response behavior based on cascaded inner loop can impact on dynamic reactive and active power-sharing.

Submitted 18 June, 2021; originally announced June 2021.

Comments: 22 pages, 28 figures

arXiv:2106.08352 [pdf, other]

Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis

Authors: Devang S Ram Mohan, Vivian Hu, Tian Huey Teh, Alexandra Torresquintero, Christopher G. R. Wallis, Marlene Staib, Lorenzo Foglianti, Jiameng Gao, Simon King

Abstract: Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text. One way to reduce the amount of unexplained variation in training data is to provide acoustic information as an additional learning signal. When generating speech, modifying this acoustic information enables multiple distinct renditions of a text to be produced. Since much of the unexplained variation is in the prosody, we propose a model that generates speech explicitly conditioned on the three primary acoustic correlates of prosody: $F_{0}$, energy and duration. The model is flexible about how the values of these features are specified: they can be externally provided, or predicted from text, or predicted then subsequently modified. Compared to a model that employs a variational auto-encoder to learn unsupervised latent features, our model provides more interpretable, temporally-precise, and disentangled control. When automatically predicting the acoustic features from text, it generates speech that is more natural than that from a Tacotron 2 model with reference encoder. Subsequent human-in-the-loop modification of the predicted acoustic features can significantly further increase naturalness.

Submitted 15 June, 2021; originally announced June 2021.

Comments: To be published in Interspeech 2021. 5 pages, 4 figures

arXiv:2106.08321 [pdf, other]

ADEPT: A Dataset for Evaluating Prosody Transfer

Authors: Alexandra Torresquintero, Tian Huey Teh, Christopher G. R. Wallis, Marlene Staib, Devang S Ram Mohan, Vivian Hu, Lorenzo Foglianti, Jiameng Gao, Simon King

Abstract: Text-to-speech is now able to achieve near-human naturalness and research focus has shifted to increasing expressivity. One popular method is to transfer the prosody from a reference speech sample. There have been considerable advances in using prosody transfer to generate more expressive speech, but the field lacks a clear definition of what successful prosody transfer means and a method for measuring it. We introduce a dataset of prosodically-varied reference natural speech samples for evaluating prosody transfer. The samples include global variations reflecting emotion and interpersonal attitude, and local variations reflecting topical emphasis, propositional attitude, syntactic phrasing and marked tonicity. The corpus only includes prosodic variations that listeners are able to distinguish with reasonable accuracy, and we report these figures as a benchmark against which text-to-speech prosody transfer can be compared. We conclude the paper with a demonstration of our proposed evaluation methodology, using the corpus to evaluate two text-to-speech models that perform prosody transfer.

Submitted 21 July, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: 5 pages, 1 figure, accepted to Interspeech 2021

Showing 1–50 of 156 results for author: Mohan, S