subscribe to arXiv mailings

arXiv:2407.08839 [pdf, other]

A Survey on the Application of Generative Adversarial Networks in Cybersecurity: Prospective, Direction and Open Research Scopes

Authors: Md Mashrur Arifin, Md Shoaib Ahmed, Tanmai Kumar Ghosh, Jun Zhuang, Jyh-haw Yeh

Abstract: With the proliferation of Artificial Intelligence, there has been a massive increase in the amount of data required to be accumulated and disseminated digitally. As the data are available online in digital landscapes with complex and sophisticated infrastructures, it is crucial to implement various defense mechanisms based on cybersecurity. Generative Adversarial Networks (GANs), which are deep le… ▽ More With the proliferation of Artificial Intelligence, there has been a massive increase in the amount of data required to be accumulated and disseminated digitally. As the data are available online in digital landscapes with complex and sophisticated infrastructures, it is crucial to implement various defense mechanisms based on cybersecurity. Generative Adversarial Networks (GANs), which are deep learning models, have emerged as powerful solutions for addressing the constantly changing security issues. This survey studies the significance of the deep learning model, precisely on GANs, in strengthening cybersecurity defenses. Our survey aims to explore the various works completed in GANs, such as Intrusion Detection Systems (IDS), Mobile and Network Trespass, BotNet Detection, and Malware Detection. The focus is to examine how GANs can be influential tools to strengthen cybersecurity defenses in these domains. Further, the paper discusses the challenges and constraints of using GANs in these areas and suggests future research directions. Overall, the paper highlights the potential of GANs in enhancing cybersecurity measures and addresses the need for further exploration in this field. △ Less

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.07452 [pdf]

Missile detection and destruction robot using detection algorithm

Authors: Md Kamrul Siam, Shafayet Ahmed, Md Habibur Rahman, Amir Hossain Mollah

Abstract: This research is based on the present missile detection technologies in the world and the analysis of these technologies to find a cost effective solution to implement the system in Bangladesh. The paper will give an idea of the missile detection technologies using the electro-optical sensor and the pulse doppler radar. The system is made to detect the target missile. Automatic detection and destr… ▽ More This research is based on the present missile detection technologies in the world and the analysis of these technologies to find a cost effective solution to implement the system in Bangladesh. The paper will give an idea of the missile detection technologies using the electro-optical sensor and the pulse doppler radar. The system is made to detect the target missile. Automatic detection and destruction with the help of ultrasonic sonar, a metal detector sensor, and a smoke detector sensor. The system is mainly based on an ultrasonic sonar sensor. It has a transducer, a transmitter, and a receiver. Transducer is connected with the connected with controller. When it detects an object by following the algorithm, it finds its distance and angle. It can also assure whether the system can destroy the object or not by using another algorithm's simulation. △ Less

Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: 67 pages

arXiv:2407.04743 [pdf, other]

Black hole spin-down via cosmic strings

Authors: Sameer Ahmed, Michael J. Kavic, Steven L. Liebling, Matthew Lippert, Mohammad Mian, John Simonetti

Abstract: Cosmic strings that are attached to rapidly spinning black holes can extract significant amounts of rotational energy and angular momentum. Here we study the effect on primordial black holes, which are expected to form with one or more cosmic strings attached. Although large primordial black holes are predicted to rapidly spin up due to accretion soon after forming, we argue that cosmic strings wi… ▽ More Cosmic strings that are attached to rapidly spinning black holes can extract significant amounts of rotational energy and angular momentum. Here we study the effect on primordial black holes, which are expected to form with one or more cosmic strings attached. Although large primordial black holes are predicted to rapidly spin up due to accretion soon after forming, we argue that cosmic strings will spin them down again. As a result, the spins of primordial black holes should consequently be observed to be near zero. We also investigate the effect on a supermassive black hole of capturing a cosmic string and the possibility of observing the subsequent spin down by its effect on a pulsar orbiting the black hole. △ Less

Submitted 3 July, 2024; originally announced July 2024.

Comments: 11 pages

Report number: OW-2024-2

arXiv:2407.03830 [pdf, other]

DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification

Authors: Saifullah Saifullah, Stefan Agne, Andreas Dengel, Sheraz Ahmed

Abstract: Deep learning (DL) has revolutionized the field of document image analysis, showcasing superhuman performance across a diverse set of tasks. However, the inherent black-box nature of deep learning models still presents a significant challenge to their safe and robust deployment in industry. Regrettably, while a plethora of research has been dedicated in recent years to the development of DL-powere… ▽ More Deep learning (DL) has revolutionized the field of document image analysis, showcasing superhuman performance across a diverse set of tasks. However, the inherent black-box nature of deep learning models still presents a significant challenge to their safe and robust deployment in industry. Regrettably, while a plethora of research has been dedicated in recent years to the development of DL-powered document analysis systems, research addressing their transparency aspects has been relatively scarce. In this paper, we aim to bridge this research gap by introducing DocXplain, a novel model-agnostic explainability method specifically designed for generating high interpretability feature attribution maps for the task of document image classification. In particular, our approach involves independently segmenting the foreground and background features of the documents into different document elements and then ablating these elements to assign feature importance. We extensively evaluate our proposed approach in the context of document image classification, utilizing 4 different evaluation metrics, 2 widely recognized document benchmark datasets, and 10 state-of-the-art document image classification models. By conducting a thorough quantitative and qualitative analysis against 9 existing state-of-the-art attribution methods, we demonstrate the superiority of our approach in terms of both faithfulness and interpretability. To the best of the authors' knowledge, this work presents the first model-agnostic attribution-based explainability method specifically tailored for document images. We anticipate that our work will significantly contribute to advancing research on transparency, fairness, and robustness of document image classification models. △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted in ICDAR 2024

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions… ▽ More Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components. △ Less

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.17999 [pdf, other]

Entangling Schrödinger's cat states by seeding a Bell state or swapping the cats

Authors: Daisuke Hoshi, Toshiaki Nagase, Sangil Kwon, Daisuke Iyama, Takahiko Kamiya, Shiori Fujii, Hiroto Mukai, Shahnawaz Ahmed, Anton Frisk Kockum, Shohei Watabe, Fumiki Yoshihara, Jaw-Shen Tsai

Abstract: In quantum information processing, two primary research directions have emerged: one based on discrete variables (DV) and the other on the structure of quantum states in a continuous-variable (CV) space. It is increasingly recognized that integrating these two approaches could unlock new potentials, overcoming the inherent limitations of each. Here, we show that such a DV-CV hybrid approach, appli… ▽ More In quantum information processing, two primary research directions have emerged: one based on discrete variables (DV) and the other on the structure of quantum states in a continuous-variable (CV) space. It is increasingly recognized that integrating these two approaches could unlock new potentials, overcoming the inherent limitations of each. Here, we show that such a DV-CV hybrid approach, applied to superconducting Kerr parametric oscillators (KPOs), enables us to entangle a pair of Schrödinger's cat states by two straightforward methods. The first method involves the entanglement-preserving and deterministic conversion between Bell states in the Fock-state basis (DV encoding) and those in the cat-state basis (CV encoding). This method would allow us to construct quantum networks in the cat-state basis using conventional schemes originally developed for the Fock-state basis. In the second method, the $\sqrt{\textrm{iSWAP}}$ gate operation is implemented between two cat states following the procedure used for Fock-state encoding. This DV-like gate operation on CV encoding not only completes the demonstration of a universal quantum gate set in a KPO system but also enables faster and simpler gate operations compared to previous SWAP gate implementations on bosonic modes. Our work offers a simple yet powerful application of DV-CV hybridization while also highlighting the scalability of this planar KPO system. △ Less

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.15953 [pdf, other]

Exploring the Influence of Online Videos on Parents or Caregivers of Children with Developmental Delays

Authors: Saquib Ahmed, Md Nazmus Sakib, Sanorita Dey

Abstract: Developmental Delays and Disabilities (DDDs) refer to conditions where children are slower or unable to reach developmental milestones compared to typically developing children. This can cause significant stress for parents, leading to social isolation and loneliness. Online videos, particularly those on YouTube, aim to support these parents and caregivers by offering guidance and assistance. Stud… ▽ More Developmental Delays and Disabilities (DDDs) refer to conditions where children are slower or unable to reach developmental milestones compared to typically developing children. This can cause significant stress for parents, leading to social isolation and loneliness. Online videos, particularly those on YouTube, aim to support these parents and caregivers by offering guidance and assistance. Studies show that parents of children with DDDs create videos on YouTube to enhance authenticity and build connections. However, there is limited knowledge about how other parents with children with DDDs perceive and are impacted by these videos. Our study used a mixed-method approach to annotate and analyze more than fifteen hundred YouTube videos on children's DDDs. We found that these videos provide crucial informational content and offer mental and emotional support through shared personal experiences. Comments analysis revealed a strong sense of community among YouTubers and viewers. Interviews with parents of children with DDDs showed that they find these videos relatable and essential for managing their children's diagnosis and treatments. We concluded by discussing platform-centric design implications for supporting parents and other caregivers of children with DDDs. △ Less

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2406.07840 [pdf, other]

SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models

Authors: Abhay Rawat, Shubham Dokania, Astitva Srivastava, Shuaib Ahmed, Haiwen Feng, Rahul Tallamraju

Abstract: Recent advancements in generative models have unlocked the capabilities to render photo-realistic data in a controllable fashion. Trained on the real data, these generative models are capable of producing realistic samples with minimal to no domain gap, as compared to the traditional graphics rendering. However, using the data generated using such models for training downstream tasks remains under… ▽ More Recent advancements in generative models have unlocked the capabilities to render photo-realistic data in a controllable fashion. Trained on the real data, these generative models are capable of producing realistic samples with minimal to no domain gap, as compared to the traditional graphics rendering. However, using the data generated using such models for training downstream tasks remains under-explored, mainly due to the lack of 3D consistent annotations. Moreover, controllable generative models are learned from massive data and their latent space is often too vast to obtain meaningful sample distributions for downstream task with limited generation. To overcome these challenges, we extract 3D consistent annotations from an existing controllable generative model, making the data useful for downstream tasks. Our experiments show competitive performance against state-of-the-art models using only generated synthetic data, demonstrating potential for solving downstream tasks. Project page: https://synth-forge.github.io △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 11 pages, 4 figures, 3 tables. Under Review

arXiv:2406.07647 [pdf, other]

FP-Inconsistent: Detecting Evasive Bots using Browser Fingerprint Inconsistencies

Authors: Hari Venugopalan, Shaoor Munir, Shuaib Ahmed, Tangbaihe Wang, Samuel T. King, Zubair Shafiq

Abstract: As browser fingerprinting is increasingly being used for bot detection, bots have started altering their fingerprints for evasion. We conduct the first large-scale evaluation of evasive bots to investigate whether and how altering fingerprints helps bots evade detection. To systematically investigate evasive bots, we deploy a honey site incorporating two anti-bot services (DataDome and BotD) and s… ▽ More As browser fingerprinting is increasingly being used for bot detection, bots have started altering their fingerprints for evasion. We conduct the first large-scale evaluation of evasive bots to investigate whether and how altering fingerprints helps bots evade detection. To systematically investigate evasive bots, we deploy a honey site incorporating two anti-bot services (DataDome and BotD) and solicit bot traffic from 20 different bot services that purport to sell "realistic and undetectable traffic". Across half a million requests from 20 different bot services on our honey site, we find an average evasion rate of 52.93% against DataDome and 44.56% evasion rate against BotD. Our comparison of fingerprint attributes from bot services that evade each anti-bot service individually as well as bot services that evade both shows that bot services indeed alter different browser fingerprint attributes for evasion. Further, our analysis reveals the presence of inconsistent fingerprint attributes in evasive bots. Given evasive bots seem to have difficulty in ensuring consistency in their fingerprint attributes, we propose a data-driven approach to discover rules to detect such inconsistencies across space (two attributes in a given browser fingerprint) and time (a single attribute at two different points in time). These rules, which can be readily deployed by anti-bot services, reduce the evasion rate of evasive bots against DataDome and BotD by 48.11% and 44.95% respectively. △ Less

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.05447 [pdf, other]

The PLATO Mission

Authors: Heike Rauer, Conny Aerts, Juan Cabrera, Magali Deleuil, Anders Erikson, Laurent Gizon, Mariejo Goupil, Ana Heras, Jose Lorenzo-Alvarez, Filippo Marliani, Cesar Martin-Garcia, J. Miguel Mas-Hesse, Laurence O'Rourke, Hugh Osborn, Isabella Pagano, Giampaolo Piotto, Don Pollacco, Roberto Ragazzoni, Gavin Ramsay, Stéphane Udry, Thierry Appourchaux, Willy Benz, Alexis Brandeker, Manuel Güdel, Eduardo Janot-Pacheco , et al. (801 additional authors not shown)

Abstract: PLATO (PLAnetary Transits and Oscillations of stars) is ESA's M3 mission designed to detect and characterise extrasolar planets and perform asteroseismic monitoring of a large number of stars. PLATO will detect small planets (down to <2 R_(Earth)) around bright stars (<11 mag), including terrestrial planets in the habitable zone of solar-like stars. With the complement of radial velocity observati… ▽ More PLATO (PLAnetary Transits and Oscillations of stars) is ESA's M3 mission designed to detect and characterise extrasolar planets and perform asteroseismic monitoring of a large number of stars. PLATO will detect small planets (down to <2 R_(Earth)) around bright stars (<11 mag), including terrestrial planets in the habitable zone of solar-like stars. With the complement of radial velocity observations from the ground, planets will be characterised for their radius, mass, and age with high accuracy (5 %, 10 %, 10 % for an Earth-Sun combination respectively). PLATO will provide us with a large-scale catalogue of well-characterised small planets up to intermediate orbital periods, relevant for a meaningful comparison to planet formation theories and to better understand planet evolution. It will make possible comparative exoplanetology to place our Solar System planets in a broader context. In parallel, PLATO will study (host) stars using asteroseismology, allowing us to determine the stellar properties with high accuracy, substantially enhancing our knowledge of stellar structure and evolution. The payload instrument consists of 26 cameras with 12cm aperture each. For at least four years, the mission will perform high-precision photometric measurements. Here we review the science objectives, present PLATO's target samples and fields, provide an overview of expected core science performance as well as a description of the instrument and the mission profile at the beginning of the serial production of the flight cameras. PLATO is scheduled for a launch date end 2026. This overview therefore provides a summary of the mission to the community in preparation of the upcoming operational phases. △ Less

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2405.18894 [pdf, other]

Few-Shot Testing: Estimating Uncertainty of Memristive Deep Neural Networks Using One Bayesian Test Vector

Authors: Soyed Tuhin Ahmed, Mehdi Tahoori

Abstract: The performance of deep learning algorithms such as neural networks (NNs) has increased tremendously recently, and they can achieve state-of-the-art performance in many domains. However, due to memory and computation resource constraints, implementing NNs on edge devices is a challenging task. Therefore, hardware accelerators such as computation-in-memory (CIM) with memristive devices have been de… ▽ More The performance of deep learning algorithms such as neural networks (NNs) has increased tremendously recently, and they can achieve state-of-the-art performance in many domains. However, due to memory and computation resource constraints, implementing NNs on edge devices is a challenging task. Therefore, hardware accelerators such as computation-in-memory (CIM) with memristive devices have been developed to accelerate the most common operations, i.e., matrix-vector multiplication. However, due to inherent device properties, external environmental factors such as temperature, and an immature fabrication process, memristors suffer from various non-idealities, including defects and variations occurring during manufacturing and runtime. Consequently, there is a lack of complete confidence in the predictions made by the model. To improve confidence in NN predictions made by hardware accelerators in the presence of device non-idealities, in this paper, we propose a Bayesian test vector generation framework that can estimate the model uncertainty of NNs implemented on memristor-based CIM hardware. Compared to the conventional point estimate test vector generation method, our method is more generalizable across different model dimensions and requires storing only one test Bayesian vector in the hardware. Our method is evaluated on different model dimensions, tasks, fault rates, and variation noise to show that it can consistently achieve $100\%$ coverage with only $0.024$ MB of memory overhead. △ Less

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.10426 [pdf, other]

Memory-efficient Energy-adaptive Inference of Pre-Trained Models on Batteryless Embedded Systems

Authors: Pietro Farina, Subrata Biswas, Eren Yıldız, Khakim Akhunov, Saad Ahmed, Bashima Islam, Kasım Sinan Yıldırım

Abstract: Batteryless systems frequently face power failures, requiring extra runtime buffers to maintain inference progress and leaving only a memory space for storing ultra-tiny deep neural networks (DNNs). Besides, making these models responsive to stochastic energy harvesting dynamics during inference requires a balance between inference accuracy, latency, and energy overhead. Recent works on compressio… ▽ More Batteryless systems frequently face power failures, requiring extra runtime buffers to maintain inference progress and leaving only a memory space for storing ultra-tiny deep neural networks (DNNs). Besides, making these models responsive to stochastic energy harvesting dynamics during inference requires a balance between inference accuracy, latency, and energy overhead. Recent works on compression mostly focus on time and memory, but often ignore energy dynamics or significantly reduce the accuracy of pre-trained DNNs. Existing energy-adaptive inference works modify the architecture of pre-trained models and have significant memory overhead. Thus, energy-adaptive and accurate inference of pre-trained DNNs on batteryless devices with extreme memory constraints is more challenging than traditional microcontrollers. We combat these issues by proposing FreeML, a framework to optimize pre-trained DNN models for memory-efficient and energy-adaptive inference on batteryless systems. FreeML comprises (1) a novel compression technique to reduce the model footprint and runtime memory requirements simultaneously, making them executable on extremely memory-constrained batteryless platforms; and (2) the first early exit mechanism that uses a single exit branch for all exit points to terminate inference at any time, making models energy-adaptive with minimal memory overhead. Our experiments showed that FreeML reduces the model sizes by up to $95 \times$, supports adaptive inference with a $2.03-19.65 \times$ less memory overhead, and provides significant time and energy benefits with only a negligible accuracy drop compared to the state-of-the-art. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: This paper has been selected for publication at the 21st International Conference on Embedded Wireless Systems and Networks (EWSN'24)

arXiv:2405.08696 [pdf, other]

A magnetically silent optically pumped magnetometer for a neutron electric dipole moment experiment

Authors: Wolfgang Klassen, Shomi Ahmed, Kiera Pond Grehan, Chris Hovde, Kirk W. Madison, Russel R. Mammei, Jeffery W. Martin, Mark McCrea, Tahereh Mohammadi, Takamasa Momose, Patrick Opsahl, David C. M. Ostapchuk

Abstract: We report the performance of a magnetically silent optically pumped cesium magnetometer with a statistical sensitivity of 3.9 pT and a stability of 90 fT over 150 seconds of measurement. Optical pumping with coherent, linearly-polarized, resonant light leads to a relatively long-lived polarized ground state of the cesium vapour contained in a measurement cell. The state precesses at its Larmor fre… ▽ More We report the performance of a magnetically silent optically pumped cesium magnetometer with a statistical sensitivity of 3.9 pT and a stability of 90 fT over 150 seconds of measurement. Optical pumping with coherent, linearly-polarized, resonant light leads to a relatively long-lived polarized ground state of the cesium vapour contained in a measurement cell. The state precesses at its Larmor frequency in the magnetic field to be measured. Nonlinear magneto-optical rotation then leads to the rotation of the plane of polarization of a linearly polarized probe laser beam. The rotation angle is modulated at twice the Larmor frequency. A measurement of this frequency constitutes an absolute measurement of the magnetic field magnitude. Featuring purely optical operation, non-magnetic construction, low noise floor, and high stability, this sensor will be used for the upcoming TUCAN electric dipole moment experiment and other highly sensitive magnetic applications. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: 13 pages, 9 figures. Submitted to The European Physical Journal C

arXiv:2405.08252 [pdf, other]

Smart Sampling: Self-Attention and Bootstrapping for Improved Ensembled Q-Learning

Authors: Muhammad Junaid Khan, Syed Hammad Ahmed, Gita Sukthankar

Abstract: We present a novel method aimed at enhancing the sample efficiency of ensemble Q learning. Our proposed approach integrates multi-head self-attention into the ensembled Q networks while bootstrapping the state-action pairs ingested by the ensemble. This not only results in performance improvements over the original REDQ (Chen et al. 2021) and its variant DroQ (Hi-raoka et al. 2022), thereby enhanc… ▽ More We present a novel method aimed at enhancing the sample efficiency of ensemble Q learning. Our proposed approach integrates multi-head self-attention into the ensembled Q networks while bootstrapping the state-action pairs ingested by the ensemble. This not only results in performance improvements over the original REDQ (Chen et al. 2021) and its variant DroQ (Hi-raoka et al. 2022), thereby enhancing Q predictions, but also effectively reduces both the average normalized bias and standard deviation of normalized bias within Q-function ensembles. Importantly, our method also performs well even in scenarios with a low update-to-data (UTD) ratio. Notably, the implementation of our proposed method is straightforward, requiring minimal modifications to the base model. △ Less

Submitted 13 May, 2024; originally announced May 2024.

Comments: FLAIRS-37 (2024)

arXiv:2405.08226 [pdf, other]

SeNMo: A Self-Normalizing Deep Learning Model for Enhanced Multi-Omics Data Analysis in Oncology

Authors: Asim Waqas, Aakash Tripathi, Sabeen Ahmed, Ashwin Mukund, Hamza Farooq, Matthew B. Schabath, Paul Stewart, Mia Naeini, Ghulam Rasool

Abstract: Multi-omics research has enhanced our understanding of cancer heterogeneity and progression. Investigating molecular data through multi-omics approaches is crucial for unraveling the complex biological mechanisms underlying cancer, thereby enabling effective diagnosis, treatment, and prevention strategies. However, predicting patient outcomes through integration of all available multi-omics data i… ▽ More Multi-omics research has enhanced our understanding of cancer heterogeneity and progression. Investigating molecular data through multi-omics approaches is crucial for unraveling the complex biological mechanisms underlying cancer, thereby enabling effective diagnosis, treatment, and prevention strategies. However, predicting patient outcomes through integration of all available multi-omics data is an under-study research direction. Here, we present SeNMo (Self-normalizing Network for Multi-omics), a deep neural network trained on multi-omics data across 33 cancer types. SeNMo is efficient in handling multi-omics data characterized by high-width (many features) and low-length (fewer samples) attributes. We trained SeNMo for the task of overall survival using pan-cancer data involving 33 cancer sites from Genomics Data Commons (GDC). The training data includes gene expression, DNA methylation, miRNA expression, DNA mutations, protein expression modalities, and clinical data. We evaluated the model's performance in predicting overall survival using concordance index (C-Index). SeNMo performed consistently well in training regime, with the validation C-Index of 0.76 on GDC's public data. In the testing regime, SeNMo performed with a C-Index of 0.758 on a held-out test set. The model showed an average accuracy of 99.8% on the task of classifying the primary cancer type on the pan-cancer test cohort. SeNMo proved to be a mini-foundation model for multi-omics oncology data because it demonstrated robust performance, and adaptability not only across molecular data types but also on the classification task of predicting the primary cancer type of patients. SeNMo can be further scaled to any cancer site and molecular data type. We believe SeNMo and similar models are poised to transform the oncology landscape, offering hope for more effective, efficient, and patient-centric cancer care. △ Less

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.06836 [pdf, other]

Improving Targeted Molecule Generation through Language Model Fine-Tuning Via Reinforcement Learning

Authors: Salma J. Ahmed, Mustafa A. Elattar

Abstract: Developing new drugs is laborious and costly, demanding extensive time investment. In this study, we introduce an innovative de-novo drug design strategy, which harnesses the capabilities of language models to devise targeted drugs for specific proteins. Employing a Reinforcement Learning (RL) framework utilizing Proximal Policy Optimization (PPO), we refine the model to acquire a policy for gener… ▽ More Developing new drugs is laborious and costly, demanding extensive time investment. In this study, we introduce an innovative de-novo drug design strategy, which harnesses the capabilities of language models to devise targeted drugs for specific proteins. Employing a Reinforcement Learning (RL) framework utilizing Proximal Policy Optimization (PPO), we refine the model to acquire a policy for generating drugs tailored to protein targets. Our method integrates a composite reward function, combining considerations of drug-target interaction and molecular validity. Following RL fine-tuning, our approach demonstrates promising outcomes, yielding notable improvements in molecular validity, interaction efficacy, and critical chemical properties, achieving 65.37 for Quantitative Estimation of Drug-likeness (QED), 321.55 for Molecular Weight (MW), and 4.47 for Octanol-Water Partition Coefficient (logP), respectively. Furthermore, out of the generated drugs, only 0.041\% do not exhibit novelty. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.06507 [pdf, other]

EcoEdgeTwin: Enhanced 6G Network via Mobile Edge Computing and Digital Twin Integration

Authors: Synthia Hossain Karobi, Shakil Ahmed, Saifur Rahman Sabuj, Ashfaq Khokhar

Abstract: In the 6G era, integrating Mobile Edge Computing (MEC) and Digital Twin (DT) technologies presents a transformative approach to enhance network performance through predictive, adaptive control for energy-efficient, low-latency communication. This paper presents the EcoEdgeTwin model, an innovative framework that harnesses the synergy between MEC and DT technologies to ensure efficient network oper… ▽ More In the 6G era, integrating Mobile Edge Computing (MEC) and Digital Twin (DT) technologies presents a transformative approach to enhance network performance through predictive, adaptive control for energy-efficient, low-latency communication. This paper presents the EcoEdgeTwin model, an innovative framework that harnesses the synergy between MEC and DT technologies to ensure efficient network operation. We optimize the utility function within the EcoEdgeTwin model to balance enhancing users' Quality of Experience (QoE) and minimizing latency and energy consumption at edge servers. This approach ensures efficient and adaptable network operations, utilizing DT to synchronize and integrate real-time data seamlessly. Our framework achieves this by implementing robust mechanisms for task offloading, service caching, and cost-effective service migration. Additionally, it manages energy consumption related to task processing, communication, and the influence of DT predictions, all essential for optimizing latency and minimizing energy usage. Through the utility model, we also prioritize QoE, fostering a user-centric approach to network management that balances network efficiency with user satisfaction. A cornerstone of our approach is integrating the advantage actor-critic algorithm, marking a pioneering use of deep reinforcement learning for dynamic network management. This strategy addresses challenges in service mobility and network variability, ensuring optimal network performance matrices. Our extensive simulations demonstrate that compared to benchmark models lacking DT integration, EcoEdgeTwin framework significantly reduces energy usage and latency while enhancing QoE. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.06128 [pdf, other]

Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion

Authors: Syed Hammad Ahmed, Muhammad Junaid Khan, Gita Sukthankar

Abstract: Due to the rise in video content creation targeted towards children, there is a need for robust content moderation schemes for video hosting platforms. A video that is visually benign may include audio content that is inappropriate for young children while being impossible to detect with a unimodal content moderation system. Popular video hosting platforms for children such as YouTube Kids still p… ▽ More Due to the rise in video content creation targeted towards children, there is a need for robust content moderation schemes for video hosting platforms. A video that is visually benign may include audio content that is inappropriate for young children while being impossible to detect with a unimodal content moderation system. Popular video hosting platforms for children such as YouTube Kids still publish videos which contain audio content that is not conducive to a child's healthy behavioral and physical development. A robust classification of malicious videos requires audio representations in addition to video features. However, recent content moderation approaches rarely employ multimodal architectures that explicitly consider non-speech audio cues. To address this, we present an efficient adaptation of CLIP (Contrastive Language-Image Pre-training) that can leverage contextual audio cues for enhanced content moderation. We incorporate 1) the audio modality and 2) prompt learning, while keeping the backbone modules of each modality frozen. We conduct our experiments on a multimodal version of the MOB (Malicious or Benign) dataset in supervised and few-shot settings. △ Less

Submitted 9 May, 2024; originally announced May 2024.

Comments: 8 pages, 3 figures, Accepted at The 37th International FLAIRS Conference

arXiv:2405.05286 [pdf, other]

Tiny Deep Ensemble: Uncertainty Estimation in Edge AI Accelerators via Ensembling Normalization Layers with Shared Weights

Authors: Soyed Tuhin Ahmed, Michael Hefenbrock, Mehdi B. Tahoori

Abstract: The applications of artificial intelligence (AI) are rapidly evolving, and they are also commonly used in safety-critical domains, such as autonomous driving and medical diagnosis, where functional safety is paramount. In AI-driven systems, uncertainty estimation allows the user to avoid overconfidence predictions and achieve functional safety. Therefore, the robustness and reliability of model pr… ▽ More The applications of artificial intelligence (AI) are rapidly evolving, and they are also commonly used in safety-critical domains, such as autonomous driving and medical diagnosis, where functional safety is paramount. In AI-driven systems, uncertainty estimation allows the user to avoid overconfidence predictions and achieve functional safety. Therefore, the robustness and reliability of model predictions can be improved. However, conventional uncertainty estimation methods, such as the deep ensemble method, impose high computation and, accordingly, hardware (latency and energy) overhead because they require the storage and processing of multiple models. Alternatively, Monte Carlo dropout (MC-dropout) methods, although having low memory overhead, necessitate numerous ($\sim 100$) forward passes, leading to high computational overhead and latency. Thus, these approaches are not suitable for battery-powered edge devices with limited computing and memory resources. In this paper, we propose the Tiny-Deep Ensemble approach, a low-cost approach for uncertainty estimation on edge devices. In our approach, only normalization layers are ensembled $M$ times, with all ensemble members sharing common weights and biases, leading to a significant decrease in storage requirements and latency. Moreover, our approach requires only one forward pass in a hardware architecture that allows batch processing for inference and uncertainty estimation. Furthermore, it has approximately the same memory overhead compared to a single model. Therefore, latency and memory overhead are reduced by a factor of up to $\sim M\times$. Nevertheless, our method does not compromise accuracy, with an increase in inference accuracy of up to $\sim 1\%$ and a reduction in RMSE of $17.17\%$ in various benchmark datasets, tasks, and state-of-the-art architectures. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04252 [pdf, other]

VAEneu: A New Avenue for VAE Application on Probabilistic Forecasting

Authors: Alireza Koochali, Ensiye Tahaei, Andreas Dengel, Sheraz Ahmed

Abstract: This paper presents VAEneu, an innovative autoregressive method for multistep ahead univariate probabilistic time series forecasting. We employ the conditional VAE framework and optimize the lower bound of the predictive distribution likelihood function by adopting the Continuous Ranked Probability Score (CRPS), a strictly proper scoring rule, as the loss function. This novel pipeline results in f… ▽ More This paper presents VAEneu, an innovative autoregressive method for multistep ahead univariate probabilistic time series forecasting. We employ the conditional VAE framework and optimize the lower bound of the predictive distribution likelihood function by adopting the Continuous Ranked Probability Score (CRPS), a strictly proper scoring rule, as the loss function. This novel pipeline results in forecasting sharp and well-calibrated predictive distribution. Through a comprehensive empirical study, VAEneu is rigorously benchmarked against 12 baseline models across 12 datasets. The results unequivocally demonstrate VAEneu's remarkable forecasting performance. VAEneu provides a valuable tool for quantifying future uncertainties, and our extensive empirical study lays the foundation for future comparative studies for univariate multistep ahead probabilistic forecasting. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.01597 [pdf, other]

Improving Disease Detection from Social Media Text via Self-Augmentation and Contrastive Learning

Authors: Pervaiz Iqbal Khan, Andreas Dengel, Sheraz Ahmed

Abstract: Detecting diseases from social media has diverse applications, such as public health monitoring and disease spread detection. While language models (LMs) have shown promising performance in this domain, there remains ongoing research aimed at refining their discriminating representations. In this paper, we propose a novel method that integrates Contrastive Learning (CL) with language modeling to a… ▽ More Detecting diseases from social media has diverse applications, such as public health monitoring and disease spread detection. While language models (LMs) have shown promising performance in this domain, there remains ongoing research aimed at refining their discriminating representations. In this paper, we propose a novel method that integrates Contrastive Learning (CL) with language modeling to address this challenge. Our approach introduces a self-augmentation method, wherein hidden representations of the model are augmented with their own representations. This method comprises two branches: the first branch, a traditional LM, learns features specific to the given data, while the second branch incorporates augmented representations from the first branch to encourage generalization. CL further refines these representations by pulling pairs of original and augmented versions closer while pushing other samples away. We evaluate our method on three NLP datasets encompassing binary, multi-label, and multi-class classification tasks involving social media posts related to various diseases. Our approach demonstrates notable improvements over traditional fine-tuning methods, achieving up to a 2.48% increase in F1-score compared to baseline approaches and a 2.1% enhancement over state-of-the-art methods. △ Less

Submitted 30 April, 2024; originally announced May 2024.

arXiv:2404.19238 [pdf, other]

Pilot Contamination in Massive MIMO Systems: Challenges and Future Prospects

Authors: Muhammad Kamran Saeed, Ashfaq Khokhar, Shakil Ahmed

Abstract: Massive multiple input multiple output (M-MIMO) technology plays a pivotal role in fifth-generation (5G) and beyond communication systems, offering a wide range of benefits, from increased spectral efficiency (SE) to enhanced energy efficiency and higher reliability. However, these advantages are contingent upon precise channel state information (CSI) availability at the base station (BS). Ensurin… ▽ More Massive multiple input multiple output (M-MIMO) technology plays a pivotal role in fifth-generation (5G) and beyond communication systems, offering a wide range of benefits, from increased spectral efficiency (SE) to enhanced energy efficiency and higher reliability. However, these advantages are contingent upon precise channel state information (CSI) availability at the base station (BS). Ensuring precise CSI is challenging due to the constrained size of the coherence interval and the resulting limitations on pilot sequence length. Therefore, reusing pilot sequences in adjacent cells introduces pilot contamination, hindering SE enhancement. This paper reviews recent advancements and addresses research challenges in mitigating pilot contamination and improving channel estimation, categorizing the existing research into three broader categories: pilot assignment schemes, advanced signal processing methods, and advanced channel estimation techniques. Salient representative pilot mitigation/assignment techniques are analyzed and compared in each category. Lastly, possible future research directions are discussed. △ Less

Submitted 29 April, 2024; originally announced April 2024.

Comments: Accepted At IWCMC 2024 Comm & SP Symposium

arXiv:2404.18396 [pdf, other]

DRAM-Profiler: An Experimental DRAM RowHammer Vulnerability Profiling Mechanism

Authors: Ranyang Zhou, Jacqueline T. Liu, Nakul Kochar, Sabbir Ahmed, Adnan Siraj Rakin, Shaahin Angizi

Abstract: RowHammer stands out as a prominent example, potentially the pioneering one, showcasing how a failure mechanism at the circuit level can give rise to a significant and pervasive security vulnerability within systems. Prior research has approached RowHammer attacks within a static threat model framework. Nonetheless, it warrants consideration within a more nuanced and dynamic model. This paper pres… ▽ More RowHammer stands out as a prominent example, potentially the pioneering one, showcasing how a failure mechanism at the circuit level can give rise to a significant and pervasive security vulnerability within systems. Prior research has approached RowHammer attacks within a static threat model framework. Nonetheless, it warrants consideration within a more nuanced and dynamic model. This paper presents a low-overhead DRAM RowHammer vulnerability profiling technique termed DRAM-Profiler, which utilizes innovative test vectors for categorizing memory cells into distinct security levels. The proposed test vectors intentionally weaken the spatial correlation between the aggressors and victim rows before an attack for evaluation, thus aiding designers in mitigating RowHammer vulnerabilities in the mapping phase. While there has been no previous research showcasing the impact of such profiling to our knowledge, our study methodically assesses 128 commercial DDR4 DRAM products. The results uncover the significant variability among chips from different manufacturers in the type and quantity of RowHammer attacks that can be exploited by adversaries. △ Less

Submitted 28 April, 2024; originally announced April 2024.

Comments: 6 pages, 6 figures

arXiv:2404.17944 [pdf]

Mobile Edge Computing

Authors: Sohaib Ahmed, Hassan Khalid, Muhammad Hamza, Danyal Farhat

Abstract: Mobile Edge Computing (MEC) has emerged as a solution to the high latency and suboptimal Quality of Experience (QoE) associated with Mobile Cloud Computing (MCC). By processing data near the source, MEC reduces the need to send information to distant data centers, resulting in faster response times and lower latency. This paper explores the differences between MEC and traditional cloud computing,… ▽ More Mobile Edge Computing (MEC) has emerged as a solution to the high latency and suboptimal Quality of Experience (QoE) associated with Mobile Cloud Computing (MCC). By processing data near the source, MEC reduces the need to send information to distant data centers, resulting in faster response times and lower latency. This paper explores the differences between MEC and traditional cloud computing, emphasizing architecture, data flow, and resource allocation. Key technologies like Network Function Virtualization (NFV) and Software-Defined Networking (SDN) are discussed for their role in achieving scalability and flexibility. Additionally, security and privacy challenges are addressed, underscoring the need for robust frameworks. We conclude with an examination of various edge computing applications and suggest future research directions to enhance the effectiveness and adoption of MEC in the evolving technological landscape. △ Less

Submitted 27 April, 2024; originally announced April 2024.

Comments: 13 pages, 5 figures

arXiv:2404.16397 [pdf, other]

Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology

Authors: Tiago Gonçalves, Dagoberto Pulido-Arias, Julian Willett, Katharina V. Hoebel, Mason Cleveland, Syed Rakin Ahmed, Elizabeth Gerstner, Jayashree Kalpathy-Cramer, Jaime S. Cardoso, Christopher P. Bridge, Albert E. Kim

Abstract: The interactions between tumor cells and the tumor microenvironment (TME) dictate therapeutic efficacy of radiation and many systemic therapies in breast cancer. However, to date, there is not a widely available method to reproducibly measure tumor and immune phenotypes for each patient's tumor. Given this unmet clinical need, we applied multiple instance learning (MIL) algorithms to assess activi… ▽ More The interactions between tumor cells and the tumor microenvironment (TME) dictate therapeutic efficacy of radiation and many systemic therapies in breast cancer. However, to date, there is not a widely available method to reproducibly measure tumor and immune phenotypes for each patient's tumor. Given this unmet clinical need, we applied multiple instance learning (MIL) algorithms to assess activity of ten biologically relevant pathways from the hematoxylin and eosin (H&E) slide of primary breast tumors. We employed different feature extraction approaches and state-of-the-art model architectures. Using binary classification, our models attained area under the receiver operating characteristic (AUROC) scores above 0.70 for nearly all gene expression pathways and on some cases, exceeded 0.80. Attention maps suggest that our trained models recognize biologically relevant spatial patterns of cell sub-populations from H&E. These efforts represent a first step towards developing computational H&E biomarkers that reflect facets of the TME and hold promise for augmenting precision oncology. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: Paper accepted at the First Workshop on Imageomics (Imageomics-AAAI-24) - Discovering Biological Knowledge from Images using AI (https://sites.google.com/vt.edu/imageomics-aaai-24/home), held as part of the 38th Annual AAAI Conference on Artificial Intelligence (https://aaai.org/aaai-conference/)

MSC Class: 92C55 ACM Class: I.5.1; I.5.4; I.2.10; J.3

arXiv:2404.12627 [pdf, other]

A Soft e-Textile Sensor for Enhanced Deep Learning-based Shape Sensing of Soft Continuum Robots

Authors: Eric Vincent Galeta, Ayman A. Nada, Sabah M. Ahmed, Victor Parque, Haitham El-Hussieny

Abstract: The safety and accuracy of robotic navigation hold paramount importance, especially in the realm of soft continuum robotics, where the limitations of traditional rigid sensors become evident. Encoders, piezoresistive, and potentiometer sensors often fail to integrate well with the flexible nature of these robots, adding unwanted bulk and rigidity. To overcome these hurdles, our study presents a ne… ▽ More The safety and accuracy of robotic navigation hold paramount importance, especially in the realm of soft continuum robotics, where the limitations of traditional rigid sensors become evident. Encoders, piezoresistive, and potentiometer sensors often fail to integrate well with the flexible nature of these robots, adding unwanted bulk and rigidity. To overcome these hurdles, our study presents a new approach to shape sensing in soft continuum robots through the use of soft e-textile resistive sensors. This sensor, designed to flawlessly integrate with the robot's structure, utilizes a resistive material that adjusts its resistance in response to the robot's movements and deformations. This adjustment facilitates the capture of multidimensional force measurements across the soft sensor layers. A deep Convolutional Neural Network (CNN) is employed to decode the sensor signals, enabling precise estimation of the robot's shape configuration based on the detailed data from the e-textile sensor. Our research investigates the efficacy of this e-textile sensor in determining the curvature parameters of soft continuum robots. The findings are encouraging, showing that the soft e-textile sensor not only matches but potentially exceeds the capabilities of traditional rigid sensors in terms of shape sensing and estimation. This advancement significantly boosts the safety and efficiency of robotic navigation systems. △ Less

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12185 [pdf, other]

An Adaptive Metaheuristic Framework for Changing Environments

Authors: Bestoun S. Ahmed

Abstract: The rapidly changing landscapes of modern optimization problems require algorithms that can be adapted in real-time. This paper introduces an Adaptive Metaheuristic Framework (AMF) designed for dynamic environments. It is capable of intelligently adapting to changes in the problem parameters. The AMF combines a dynamic representation of problems, a real-time sensing system, and adaptive techniques… ▽ More The rapidly changing landscapes of modern optimization problems require algorithms that can be adapted in real-time. This paper introduces an Adaptive Metaheuristic Framework (AMF) designed for dynamic environments. It is capable of intelligently adapting to changes in the problem parameters. The AMF combines a dynamic representation of problems, a real-time sensing system, and adaptive techniques to navigate continuously changing optimization environments. Through a simulated dynamic optimization problem, the AMF's capability is demonstrated to detect environmental changes and proactively adjust its search strategy. This framework utilizes a differential evolution algorithm that is improved with an adaptation module that adjusts solutions in response to detected changes. The capability of the AMF to adjust is tested through a series of iterations, demonstrating its resilience and robustness in sustaining solution quality despite the problem's development. The effectiveness of AMF is demonstrated through a series of simulations on a dynamic optimization problem. Robustness and agility characterize the algorithm's performance, as evidenced by the presented fitness evolution and solution path visualizations. The findings show that AMF is a practical solution to dynamic optimization and a major step forward in the creation of algorithms that can handle the unpredictability of real-world problems. △ Less

Submitted 18 April, 2024; originally announced April 2024.

Comments: Accepted in 2024 IEEE Congress on Evolutionary Computation (CEC)

arXiv:2404.10486 [pdf, other]

doi 10.1051/0004-6361/202449763

Discovery of a dormant 33 solar-mass black hole in pre-release Gaia astrometry

Authors: Gaia Collaboration, P. Panuzzo, T. Mazeh, F. Arenou, B. Holl, E. Caffau, A. Jorissen, C. Babusiaux, P. Gavras, J. Sahlmann, U. Bastian, Ł. Wyrzykowski, L. Eyer, N. Leclerc, N. Bauchet, A. Bombrun, N. Mowlavi, G. M. Seabroke, D. Teyssier, E. Balbinot, A. Helmi, A. G. A. Brown, A. Vallenari, T. Prusti, J. H. J. de Bruijne , et al. (390 additional authors not shown)

Abstract: Gravitational waves from black-hole merging events have revealed a population of extra-galactic BHs residing in short-period binaries with masses that are higher than expected based on most stellar evolution models - and also higher than known stellar-origin black holes in our Galaxy. It has been proposed that those high-mass BHs are the remnants of massive metal-poor stars. Gaia astrometry is exp… ▽ More Gravitational waves from black-hole merging events have revealed a population of extra-galactic BHs residing in short-period binaries with masses that are higher than expected based on most stellar evolution models - and also higher than known stellar-origin black holes in our Galaxy. It has been proposed that those high-mass BHs are the remnants of massive metal-poor stars. Gaia astrometry is expected to uncover many Galactic wide-binary systems containing dormant BHs, which may not have been detected before. The study of this population will provide new information on the BH-mass distribution in binaries and shed light on their formation mechanisms and progenitors. As part of the validation efforts in preparation for the fourth Gaia data release (DR4), we analysed the preliminary astrometric binary solutions, obtained by the Gaia Non-Single Star pipeline, to verify their significance and to minimise false-detection rates in high-mass-function orbital solutions. The astrometric binary solution of one source, Gaia BH3, implies the presence of a 32.70 \pm 0.82 M\odot BH in a binary system with a period of 11.6 yr. Gaia radial velocities independently validate the astrometric orbit. Broad-band photometric and spectroscopic data show that the visible component is an old, very metal-poor giant of the Galactic halo, at a distance of 590 pc. The BH in the Gaia BH3 system is more massive than any other Galactic stellar-origin BH known thus far. The low metallicity of the star companion supports the scenario that metal-poor massive stars are progenitors of the high-mass BHs detected by gravitational-wave telescopes. The Galactic orbit of the system and its metallicity indicate that it might belong to the Sequoia halo substructure. Alternatively, and more plausibly, it could belong to the ED-2 stream, which likely originated from a globular cluster that had been disrupted by the Milky Way. △ Less

Submitted 19 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: 23 pages, accepted fro publication in A&A Letters. New version with small fixes

arXiv:2404.10356 [pdf, other]

Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery

Authors: Payal Varshney, Adriano Lucieri, Christoph Balada, Andreas Dengel, Sheraz Ahmed

Abstract: Trustworthiness is a major prerequisite for the safe application of opaque deep learning models in high-stakes domains like medicine. Understanding the decision-making process not only contributes to fostering trust but might also reveal previously unknown decision criteria of complex models that could advance the state of medical research. The discovery of decision-relevant concepts from black bo… ▽ More Trustworthiness is a major prerequisite for the safe application of opaque deep learning models in high-stakes domains like medicine. Understanding the decision-making process not only contributes to fostering trust but might also reveal previously unknown decision criteria of complex models that could advance the state of medical research. The discovery of decision-relevant concepts from black box models is a particularly challenging task. This study proposes Concept Discovery through Latent Diffusion-based Counterfactual Trajectories (CDCT), a novel three-step framework for concept discovery leveraging the superior image synthesis capabilities of diffusion models. In the first step, CDCT uses a Latent Diffusion Model (LDM) to generate a counterfactual trajectory dataset. This dataset is used to derive a disentangled representation of classification-relevant concepts using a Variational Autoencoder (VAE). Finally, a search algorithm is applied to identify relevant concepts in the disentangled latent space. The application of CDCT to a classifier trained on the largest public skin lesion dataset revealed not only the presence of several biases but also meaningful biomarkers. Moreover, the counterfactuals generated within CDCT show better FID scores than those produced by a previously established state-of-the-art method, while being 12 times more resource-efficient. Unsupervised concept discovery holds great potential for the application of trustworthy AI and the further development of human knowledge in various domains. CDCT represents a further step in this direction. △ Less

Submitted 16 April, 2024; originally announced April 2024.

Comments: Submitted to International Conference on Pattern Recognition (ICPR) 2024

arXiv:2404.09517 [pdf, other]

Odd-frequency superconducting pairing and multiple Majorana edge modes in driven topological superconductors

Authors: Eslam Ahmed, Shun Tamura, Yukio Tanaka, Jorge Cayao

Abstract: Majorana zero modes have been shown to be the simplest quasiparticles exhibiting pure odd-frequency pairing, an effect that has so far been theoretically established in the static regime. In this work we investigate the formation of Majorana modes and odd-frequency pairing in $p$-wave spin-polarized superconductors under a time-dependent drive. We first show that the driven system hosts multiple M… ▽ More Majorana zero modes have been shown to be the simplest quasiparticles exhibiting pure odd-frequency pairing, an effect that has so far been theoretically established in the static regime. In this work we investigate the formation of Majorana modes and odd-frequency pairing in $p$-wave spin-polarized superconductors under a time-dependent drive. We first show that the driven system hosts multiple Majorana modes emerging at zero and $π$, whose formation can be controlled by an appropriate tuning of the drive frequency and chemical potential. Then we explore the induced pair correlations and find that odd-frequency spin-polarized $s$-wave pairing is broadly induced, acquiring large values in the presence of Majorana modes. We discover that, while odd-frequency pairing is proportional to $\sim1/ω$ in the presence of Majorana zero modes, it is proportional to $\sim 1/(ω-π\hbar/T)$ in the presence of Majorana $π$ modes, where $T$ is the periodicity of the drive. Furthermore, we find that the amount of odd-frequency pairing becomes larger when multiple Majorana modes appear but the overall divergent profile as a function of frequency remains. Our work thus paves the way for understanding the emergent pair correlations in driven topological superconductors △ Less

Submitted 15 April, 2024; originally announced April 2024.

Comments: 12 pages, 6 figures

arXiv:2404.08949 [pdf, other]

Multimodal Cross-Document Event Coreference Resolution Using Linear Semantic Transfer and Mixed-Modality Ensembles

Authors: Abhijnan Nath, Huma Jamil, Shafiuddin Rehan Ahmed, George Baker, Rahul Ghosh, James H. Martin, Nathaniel Blanchard, Nikhil Krishnaswamy

Abstract: Event coreference resolution (ECR) is the task of determining whether distinct mentions of events within a multi-document corpus are actually linked to the same underlying occurrence. Images of the events can help facilitate resolution when language is ambiguous. Here, we propose a multimodal cross-document event coreference resolution method that integrates visual and textual cues with a simple l… ▽ More Event coreference resolution (ECR) is the task of determining whether distinct mentions of events within a multi-document corpus are actually linked to the same underlying occurrence. Images of the events can help facilitate resolution when language is ambiguous. Here, we propose a multimodal cross-document event coreference resolution method that integrates visual and textual cues with a simple linear map between vision and language models. As existing ECR benchmark datasets rarely provide images for all event mentions, we augment the popular ECB+ dataset with event-centric images scraped from the internet and generated using image diffusion models. We establish three methods that incorporate images and text for coreference: 1) a standard fused model with finetuning, 2) a novel linear mapping method without finetuning and 3) an ensembling approach based on splitting mention pairs by semantic and discourse-level difficulty. We evaluate on 2 datasets: the augmented ECB+, and AIDA Phase 1. Our ensemble systems using cross-modal linear mapping establish an upper limit (91.9 CoNLL F1) on ECB+ ECR performance given the preprocessing assumptions used, and establish a novel baseline on AIDA Phase 1. Our results demonstrate the utility of multimodal information in ECR for certain challenging coreference problems, and highlight a need for more multimodal resources in the coreference resolution space. △ Less

Submitted 13 April, 2024; originally announced April 2024.

Comments: To appear at LREC-COLING 2024

arXiv:2404.08656 [pdf, other]

Linear Cross-document Event Coreference Resolution with X-AMR

Authors: Shafiuddin Rehan Ahmed, George Arthur Baker, Evi Judge, Michael Regan, Kristin Wright-Bettner, Martha Palmer, James H. Martin

Abstract: Event Coreference Resolution (ECR) as a pairwise mention classification task is expensive both for automated systems and manual annotations. The task's quadratic difficulty is exacerbated when using Large Language Models (LLMs), making prompt engineering for ECR prohibitively costly. In this work, we propose a graphical representation of events, X-AMR, anchored around individual mentions using a \… ▽ More Event Coreference Resolution (ECR) as a pairwise mention classification task is expensive both for automated systems and manual annotations. The task's quadratic difficulty is exacerbated when using Large Language Models (LLMs), making prompt engineering for ECR prohibitively costly. In this work, we propose a graphical representation of events, X-AMR, anchored around individual mentions using a \textbf{cross}-document version of \textbf{A}bstract \textbf{M}eaning \textbf{R}epresentation. We then linearize the ECR with a novel multi-hop coreference algorithm over the event graphs. The event graphs simplify ECR, making it a) LLM cost-effective, b) compositional and interpretable, and c) easily annotated. For a fair assessment, we first enrich an existing ECR benchmark dataset with these event graphs using an annotator-friendly tool we introduce. Then, we employ GPT-4, the newest LLM by OpenAI, for these annotations. Finally, using the ECR algorithm, we assess GPT-4 against humans and analyze its limitations. Through this research, we aim to advance the state-of-the-art for efficient ECR and shed light on the potential shortcomings of current LLMs at this task. Code and annotations: \url{https://github.com/ahmeshaf/gpt_coref} △ Less

Submitted 24 March, 2024; originally announced April 2024.

Comments: LREC-COLING 2024 main conference

arXiv:2404.05826 [pdf, other]

Decade-long Utilization Patterns of ICSE Technical Papers and Associated Artifacts

Authors: Sharif Ahmed, Rey Ortiz, Nasir U. Eisty

Abstract: Context: Annually, ICSE acknowledges a range of papers, a subset of which are paired with research artifacts such as source code, datasets, and supplementary materials, adhering to the Open Science Policy. However, no prior systematic inquiry dives into gauging the influence of ICSE papers using artifact attributes. Objective: We explore the mutual impact between artifacts and their associated pap… ▽ More Context: Annually, ICSE acknowledges a range of papers, a subset of which are paired with research artifacts such as source code, datasets, and supplementary materials, adhering to the Open Science Policy. However, no prior systematic inquiry dives into gauging the influence of ICSE papers using artifact attributes. Objective: We explore the mutual impact between artifacts and their associated papers presented at ICSE over ten years. Method: We collect data on usage attributes from papers and their artifacts, conduct a statistical assessment to identify differences, and analyze the top five papers in each attribute category. Results: There is a significant difference between paper citations and the usage of associated artifacts. While statistical analyses show no notable difference between paper citations and GitHub stars, variations exist in views and/or downloads of papers and artifacts. Conclusion: We provide a thorough overview of ICSE's accepted papers from the last decade, emphasizing the intricate relationship between research papers and their artifacts. To enhance the assessment of artifact influence in software research, we recommend considering key attributes that may be present in one platform but not in another. △ Less

Submitted 8 April, 2024; originally announced April 2024.

Comments: This paper has been accepted for publication and presentation at The 22nd IEEE/ACIS International Conference on Software Engineering, Management and Applications (SERA 2024) to be held in Honolulu, USA on May 30-June 1, 2024

arXiv:2403.20084 [pdf]

IPA Transcription of Bengali Texts

Authors: Kanij Fatema, Fazle Dawood Haider, Nirzona Ferdousi Turpa, Tanveer Azmal, Sourav Ahmed, Navid Hasan, Mohammad Akhlaqur Rahman, Biplab Kumar Sarkar, Afrar Jahin, Md. Rezuwan Hassan, Md Foriduzzaman Zihad, Rubayet Sabbir Faruque, Asif Sushmit, Mashrur Imtiaz, Farig Sadeque, Syed Shahrier Rahman

Abstract: The International Phonetic Alphabet (IPA) serves to systematize phonemes in language, enabling precise textual representation of pronunciation. In Bengali phonology and phonetics, ongoing scholarly deliberations persist concerning the IPA standard and core Bengali phonemes. This work examines prior research, identifies current and potential issues, and suggests a framework for a Bengali IPA standa… ▽ More The International Phonetic Alphabet (IPA) serves to systematize phonemes in language, enabling precise textual representation of pronunciation. In Bengali phonology and phonetics, ongoing scholarly deliberations persist concerning the IPA standard and core Bengali phonemes. This work examines prior research, identifies current and potential issues, and suggests a framework for a Bengali IPA standard, facilitating linguistic analysis and NLP resource creation and downstream technology development. In this work, we present a comprehensive study of Bengali IPA transcription and introduce a novel IPA transcription framework incorporating a novel dataset with DL-based benchmarks. △ Less

Submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.19717 [pdf, other]

A Picture is Worth 500 Labels: A Case Study of Demographic Disparities in Local Machine Learning Models for Instagram and TikTok

Authors: Jack West, Lea Thiemt, Shimaa Ahmed, Maggie Bartig, Kassem Fawaz, Suman Banerjee

Abstract: Mobile apps have embraced user privacy by moving their data processing to the user's smartphone. Advanced machine learning (ML) models, such as vision models, can now locally analyze user images to extract insights that drive several functionalities. Capitalizing on this new processing model of locally analyzing user images, we analyze two popular social media apps, TikTok and Instagram, to reveal… ▽ More Mobile apps have embraced user privacy by moving their data processing to the user's smartphone. Advanced machine learning (ML) models, such as vision models, can now locally analyze user images to extract insights that drive several functionalities. Capitalizing on this new processing model of locally analyzing user images, we analyze two popular social media apps, TikTok and Instagram, to reveal (1) what insights vision models in both apps infer about users from their image and video data and (2) whether these models exhibit performance disparities with respect to demographics. As vision models provide signals for sensitive technologies like age verification and facial recognition, understanding potential biases in these models is crucial for ensuring that users receive equitable and accurate services. We develop a novel method for capturing and evaluating ML tasks in mobile apps, overcoming challenges like code obfuscation, native code execution, and scalability. Our method comprises ML task detection, ML pipeline reconstruction, and ML performance assessment, specifically focusing on demographic disparities. We apply our methodology to TikTok and Instagram, revealing significant insights. For TikTok, we find issues in age and gender prediction accuracy, particularly for minors and Black individuals. In Instagram, our analysis uncovers demographic disparities in the extraction of over 500 visual concepts from images, with evidence of spurious correlations between demographic features and certain concepts. △ Less

Submitted 27 March, 2024; originally announced March 2024.

Comments: 18 pages, 13 figures, to appear at IEEE Symposium on Security and Privacy 2024

ACM Class: K.4.2; C.4; D.2.2

arXiv:2403.15407 [pdf, other]

X-AMR Annotation Tool

Authors: Shafiuddin Rehan Ahmed, Jon Z. Cai, Martha Palmer, James H. Martin

Abstract: This paper presents a novel Cross-document Abstract Meaning Representation (X-AMR) annotation tool designed for annotating key corpus-level event semantics. Leveraging machine assistance through the Prodigy Annotation Tool, we enhance the user experience, ensuring ease and efficiency in the annotation process. Through empirical analyses, we demonstrate the effectiveness of our tool in augmenting a… ▽ More This paper presents a novel Cross-document Abstract Meaning Representation (X-AMR) annotation tool designed for annotating key corpus-level event semantics. Leveraging machine assistance through the Prodigy Annotation Tool, we enhance the user experience, ensuring ease and efficiency in the annotation process. Through empirical analyses, we demonstrate the effectiveness of our tool in augmenting an existing event corpus, highlighting its advantages when integrated with GPT-4. Code and annotations: https://github.com/ahmeshaf/gpt_coref △ Less

Submitted 29 February, 2024; originally announced March 2024.

Comments: EACL 2024 System Demonstration

arXiv:2403.12849 [pdf, other]

Optimizing Service Placement in Edge-to-Cloud AR/VR Systems using a Multi-Objective Genetic Algorithm

Authors: Mohammadsadeq Garshasbi Herabad, Javid Taheri, Bestoun S. Ahmed, Calin Curescu

Abstract: Augmented Reality (AR) and Virtual Reality (VR) systems involve computationally intensive image processing algorithms that can burden end-devices with limited resources, leading to poor performance in providing low latency services. Edge-to-cloud computing overcomes the limitations of end-devices by offloading their computations to nearby edge devices or remote cloud servers. Although this proves… ▽ More Augmented Reality (AR) and Virtual Reality (VR) systems involve computationally intensive image processing algorithms that can burden end-devices with limited resources, leading to poor performance in providing low latency services. Edge-to-cloud computing overcomes the limitations of end-devices by offloading their computations to nearby edge devices or remote cloud servers. Although this proves to be sufficient for many applications, optimal placement of latency sensitive AR/VR services in edge-to-cloud infrastructures (to provide desirable service response times and reliability) remain a formidable challenging. To address this challenge, this paper develops a Multi-Objective Genetic Algorithm (MOGA) to optimize the placement of AR/VR-based services in multi-tier edge-to-cloud environments. The primary objective of the proposed MOGA is to minimize the response time of all running services, while maximizing the reliability of the underlying system from both software and hardware perspectives. To evaluate its performance, we mathematically modeled all components and developed a tailor-made simulator to assess its effectiveness on various scales. MOGA was compared with several heuristics to prove that intuitive solutions, which are usually assumed sufficient, are not efficient enough for the stated problem. The experimental results indicated that MOGA can significantly reduce the response time of deployed services by an average of 67\% on different scales, compared to other heuristic methods. MOGA also ensures reliability of the 97\% infrastructure (hardware) and 95\% services (software). △ Less

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12044 [pdf, other]

Mobile Application for Oral Disease Detection using Federated Learning

Authors: Shankara Narayanan V, Sneha Varsha M, Syed Ashfaq Ahmed, Guruprakash J

Abstract: The mouth, often regarded as a window to the internal state of the body, plays an important role in reflecting one's overall health. Poor oral hygiene has far-reaching consequences, contributing to severe conditions like heart disease, cancer, and diabetes, while inadequate care leads to discomfort, pain, and costly treatments. Federated Learning (FL) for object detection can be utilized for this… ▽ More The mouth, often regarded as a window to the internal state of the body, plays an important role in reflecting one's overall health. Poor oral hygiene has far-reaching consequences, contributing to severe conditions like heart disease, cancer, and diabetes, while inadequate care leads to discomfort, pain, and costly treatments. Federated Learning (FL) for object detection can be utilized for this use case due to the sensitivity of the oral image data of the patients. FL ensures data privacy by storing the images used for object detection on the local device and trains the model on the edge. The updated weights are federated to a central server where all the collected weights are updated via The Federated Averaging algorithm. Finally, we have developed a mobile app named OralH which provides user-friendly solutions, allowing people to conduct self-assessments through mouth scans and providing quick oral health insights. Upon detection of the issues, the application alerts the user about potential oral health concerns or diseases and provides details about dental clinics in the user's locality. Designed as a Progressive Web Application (PWA), the platform ensures ubiquitous access, catering to users across devices for a seamless experience. The application aims to provide state-of-the-art segmentation and detection techniques, leveraging the YOLOv8 object detection model to identify oral hygiene issues and diseases. This study deals with the benefits of leveraging FL in healthcare with promising real-world results. △ Less

Submitted 27 October, 2023; originally announced March 2024.

arXiv:2403.10173 [pdf, other]

A Hybrid SNN-ANN Network for Event-based Object Detection with Spatial and Temporal Attention

Authors: Soikat Hasan Ahmed, Jan Finkbeiner, Emre Neftci

Abstract: Event cameras offer high temporal resolution and dynamic range with minimal motion blur, making them promising for object detection tasks. While Spiking Neural Networks (SNNs) are a natural match for event-based sensory data and enable ultra-energy efficient and low latency inference on neuromorphic hardware, Artificial Neural Networks (ANNs) tend to display more stable training dynamics and faste… ▽ More Event cameras offer high temporal resolution and dynamic range with minimal motion blur, making them promising for object detection tasks. While Spiking Neural Networks (SNNs) are a natural match for event-based sensory data and enable ultra-energy efficient and low latency inference on neuromorphic hardware, Artificial Neural Networks (ANNs) tend to display more stable training dynamics and faster convergence resulting in greater task performance. Hybrid SNN-ANN approaches are a promising alternative, enabling to leverage the strengths of both SNN and ANN architectures. In this work, we introduce the first Hybrid Attention-based SNN-ANN backbone for object detection using event cameras. We propose a novel Attention-based SNN-ANN bridge module to capture sparse spatial and temporal relations from the SNN layer and convert them into dense feature maps for the ANN part of the backbone. Experimental results demonstrate that our proposed method surpasses baseline hybrid and SNN-based approaches by significant margins, with results comparable to existing ANN-based methods. Extensive ablation studies confirm the effectiveness of our proposed modules and architectural choices. These results pave the way toward a hybrid SNN-ANN architecture that achieves ANN like performance at a drastically reduced parameter budget. We implemented the SNN blocks on digital neuromorphic hardware to investigate latency and power consumption and demonstrate the feasibility of our approach. △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.07059 [pdf, other]

Better than classical? The subtle art of benchmarking quantum machine learning models

Authors: Joseph Bowles, Shahnawaz Ahmed, Maria Schuld

Abstract: Benchmarking models via classical simulations is one of the main ways to judge ideas in quantum machine learning before noise-free hardware is available. However, the huge impact of the experimental design on the results, the small scales within reach today, as well as narratives influenced by the commercialisation of quantum technologies make it difficult to gain robust insights. To facilitate be… ▽ More Benchmarking models via classical simulations is one of the main ways to judge ideas in quantum machine learning before noise-free hardware is available. However, the huge impact of the experimental design on the results, the small scales within reach today, as well as narratives influenced by the commercialisation of quantum technologies make it difficult to gain robust insights. To facilitate better decision-making we develop an open-source package based on the PennyLane software framework and use it to conduct a large-scale study that systematically tests 12 popular quantum machine learning models on 6 binary classification tasks used to create 160 individual datasets. We find that overall, out-of-the-box classical machine learning models outperform the quantum classifiers. Moreover, removing entanglement from a quantum model often results in as good or better performance, suggesting that "quantumness" may not be the crucial ingredient for the small learning tasks considered here. Our benchmarks also unlock investigations beyond simplistic leaderboard comparisons, and we identify five important questions for quantum model design that follow from our results. △ Less

Submitted 14 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.02913 [pdf, other]

Scientific machine learning for closure models in multiscale problems: a review

Authors: Benjamin Sanderse, Panos Stinis, Romit Maulik, Shady E. Ahmed

Abstract: Closure problems are omnipresent when simulating multiscale systems, where some quantities and processes cannot be fully prescribed despite their effects on the simulation's accuracy. Recently, scientific machine learning approaches have been proposed as a way to tackle the closure problem, combining traditional (physics-based) modeling with data-driven (machine-learned) techniques, typically thro… ▽ More Closure problems are omnipresent when simulating multiscale systems, where some quantities and processes cannot be fully prescribed despite their effects on the simulation's accuracy. Recently, scientific machine learning approaches have been proposed as a way to tackle the closure problem, combining traditional (physics-based) modeling with data-driven (machine-learned) techniques, typically through enriching differential equations with neural networks. This paper reviews the different reduced model forms, distinguished by the degree to which they include known physics, and the different objectives of a priori and a posteriori learning. The importance of adhering to physical laws (such as symmetries and conservation laws) in choosing the reduced model form and choosing the learning method is discussed. The effect of spatial and temporal discretization and recent trends toward discretization-invariant models are reviewed. In addition, we make the connections between closure problems and several other research disciplines: inverse problems, Mori-Zwanzig theory, and multi-fidelity methods. In conclusion, much progress has been made with scientific machine learning approaches for solving closure problems, but many challenges remain. In particular, the generalizability and interpretability of learned models is a major issue that needs to be addressed further. △ Less

Submitted 5 March, 2024; originally announced March 2024.

Report number: PNNL-SA-195444 MSC Class: 65MXX; 76MXX; 68T07

arXiv:2403.00988 [pdf, other]

Optimal Robot Formations: Balancing Range-Based Observability and User-Defined Configurations

Authors: Syed Shabbir Ahmed, Mohammed Ayman Shalaby, Jerome Le Ny, James Richard Forbes

Abstract: This paper introduces a set of customizable and novel cost functions that enable the user to easily specify desirable robot formations, such as a ``high-coverage'' infrastructure-inspection formation, while maintaining high relative pose estimation accuracy. The overall cost function balances the need for the robots to be close together for good ranging-based relative localization accuracy and the… ▽ More This paper introduces a set of customizable and novel cost functions that enable the user to easily specify desirable robot formations, such as a ``high-coverage'' infrastructure-inspection formation, while maintaining high relative pose estimation accuracy. The overall cost function balances the need for the robots to be close together for good ranging-based relative localization accuracy and the need for the robots to achieve specific tasks, such as minimizing the time taken to inspect a given area. The formations found by minimizing the aggregated cost function are evaluated in a coverage path planning task in simulation and experiment, where the robots localize themselves and unknown landmarks using a simultaneous localization and mapping algorithm based on the extended Kalman filter. Compared to an optimal formation that maximizes ranging-based relative localization accuracy, these formations significantly reduce the time to cover a given area with minimal impact on relative pose estimation accuracy. △ Less

Submitted 1 March, 2024; originally announced March 2024.

Comments: 8 pages, 9 figures, submitted to IEEE International Conference on Intelligent Robots and Systems 2024

arXiv:2402.16757 [pdf, other]

Towards Environmental Preference Based Speech Enhancement For Individualised Multi-Modal Hearing Aids

Authors: Jasper Kirton-Wingate, Shafique Ahmed, Adeel Hussain, Mandar Gogate, Kia Dashtipour, Jen-Cheng Hou, Tassadaq Hussain, Yu Tsao, Amir Hussain

Abstract: Since the advent of Deep Learning (DL), Speech Enhancement (SE) models have performed well under a variety of noise conditions. However, such systems may still introduce sonic artefacts, sound unnatural, and restrict the ability for a user to hear ambient sound which may be of importance. Hearing Aid (HA) users may wish to customise their SE systems to suit their personal preferences and day-to-da… ▽ More Since the advent of Deep Learning (DL), Speech Enhancement (SE) models have performed well under a variety of noise conditions. However, such systems may still introduce sonic artefacts, sound unnatural, and restrict the ability for a user to hear ambient sound which may be of importance. Hearing Aid (HA) users may wish to customise their SE systems to suit their personal preferences and day-to-day lifestyle. In this paper, we introduce a preference learning based SE (PLSE) model for future multi-modal HAs that can contextually exploit audio information to improve listening comfort, based upon the preferences of the user. The proposed system estimates the Signal-to-noise ratio (SNR) as a basic objective speech quality measure which quantifies the relative amount of background noise present in speech, and directly correlates to the intelligibility of the signal. Additionally, to provide contextual information we predict the acoustic scene in which the user is situated. These tasks are achieved via a multi-task DL model, which surpasses the performance of inferring the acoustic scene or SNR separately, by jointly leveraging a shared encoded feature space. These environmental inferences are exploited in a preference elicitation framework, which linearly learns a set of predictive functions to determine the target SNR of an AV (Audio-Visual) SE system. By greatly reducing noise in challenging listening conditions, and by novelly scaling the output of the SE model, we are able to provide HA users with contextually individualised SE. Preliminary results suggest an improvement over the non-individualised baseline model in some participants. △ Less

Submitted 26 February, 2024; originally announced February 2024.

Comments: This has been submitted to the Trends in Hearing journal

arXiv:2402.08769 [pdf, other]

FLASH: Federated Learning Across Simultaneous Heterogeneities

Authors: Xiangyu Chang, Sk Miraj Ahmed, Srikanth V. Krishnamurthy, Basak Guler, Ananthram Swami, Samet Oymak, Amit K. Roy-Chowdhury

Abstract: The key premise of federated learning (FL) is to train ML models across a diverse set of data-owners (clients), without exchanging local data. An overarching challenge to this date is client heterogeneity, which may arise not only from variations in data distribution, but also in data quality, as well as compute/communication latency. An integrated view of these diverse and concurrent sources of h… ▽ More The key premise of federated learning (FL) is to train ML models across a diverse set of data-owners (clients), without exchanging local data. An overarching challenge to this date is client heterogeneity, which may arise not only from variations in data distribution, but also in data quality, as well as compute/communication latency. An integrated view of these diverse and concurrent sources of heterogeneity is critical; for instance, low-latency clients may have poor data quality, and vice versa. In this work, we propose FLASH(Federated Learning Across Simultaneous Heterogeneities), a lightweight and flexible client selection algorithm that outperforms state-of-the-art FL frameworks under extensive sources of heterogeneity, by trading-off the statistical information associated with the client's data quality, data distribution, and latency. FLASH is the first method, to our knowledge, for handling all these heterogeneities in a unified manner. To do so, FLASH models the learning dynamics through contextual multi-armed bandits (CMAB) and dynamically selects the most promising clients. Through extensive experiments, we demonstrate that FLASH achieves substantial and consistent improvements over state-of-the-art baselines -- as much as 10% in absolute accuracy -- thanks to its unified approach. Importantly, FLASH also outperforms federated aggregation methods that are designed to handle highly heterogeneous settings and even enjoys a performance boost when integrated with them. △ Less

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.08566 [pdf, other]

Gaussian-Sum Filter for Range-based 3D Relative Pose Estimation in the Presence of Ambiguities

Authors: Syed S. Ahmed, Mohammed A. Shalaby, Charles C. Cossette, Jerome Le Ny, James R. Forbes

Abstract: Multi-robot systems must have the ability to accurately estimate relative states between robots in order to perform collaborative tasks, possibly with no external aiding. Three-dimensional relative pose estimation using range measurements oftentimes suffers from a finite number of non-unique solutions, or ambiguities. This paper: 1) identifies and accurately estimates all possible ambiguities in 2… ▽ More Multi-robot systems must have the ability to accurately estimate relative states between robots in order to perform collaborative tasks, possibly with no external aiding. Three-dimensional relative pose estimation using range measurements oftentimes suffers from a finite number of non-unique solutions, or ambiguities. This paper: 1) identifies and accurately estimates all possible ambiguities in 2D; 2) treats them as components of a Gaussian mixture model; and 3) presents a computationally-efficient estimator, in the form of a Gaussian-sum filter (GSF), to realize range-based relative pose estimation in an infrastructure-free, 3D, setup. This estimator is evaluated in simulation and experiment and is shown to avoid divergence to local minima induced by the ambiguous poses. Furthermore, the proposed GSF outperforms an extended Kalman filter, demonstrates similar performance to the computationally-demanding particle filter, and is shown to be consistent. △ Less

Submitted 13 February, 2024; originally announced February 2024.

Comments: 7 pages, 9 figures, submitted to IEEE Conference on Control Technology and Applications 2024

arXiv:2401.14628 [pdf, other]

Inferring Data Preconditions from Deep Learning Models for Trustworthy Prediction in Deployment

Authors: Shibbir Ahmed, Hongyang Gao, Hridesh Rajan

Abstract: Deep learning models are trained with certain assumptions about the data during the development stage and then used for prediction in the deployment stage. It is important to reason about the trustworthiness of the model's predictions with unseen data during deployment. Existing methods for specifying and verifying traditional software are insufficient for this task, as they cannot handle the comp… ▽ More Deep learning models are trained with certain assumptions about the data during the development stage and then used for prediction in the deployment stage. It is important to reason about the trustworthiness of the model's predictions with unseen data during deployment. Existing methods for specifying and verifying traditional software are insufficient for this task, as they cannot handle the complexity of DNN model architecture and expected outcomes. In this work, we propose a novel technique that uses rules derived from neural network computations to infer data preconditions for a DNN model to determine the trustworthiness of its predictions. Our approach, DeepInfer involves introducing a novel abstraction for a trained DNN model that enables weakest precondition reasoning using Dijkstra's Predicate Transformer Semantics. By deriving rules over the inductive type of neural network abstract representation, we can overcome the matrix dimensionality issues that arise from the backward non-linear computation from the output layer to the input layer. We utilize the weakest precondition computation using rules of each kind of activation function to compute layer-wise precondition from the given postcondition on the final output of a deep neural network. We extensively evaluated DeepInfer on 29 real-world DNN models using four different datasets collected from five different sources and demonstrated the utility, effectiveness, and performance improvement over closely related work. DeepInfer efficiently detects correct and incorrect predictions of high-accuracy models with high recall (0.98) and high F-1 score (0.84) and has significantly improved over prior technique, SelfChecker. The average runtime overhead of DeepInfer is low, 0.22 sec for all unseen datasets. We also compared runtime overhead using the same hardware settings and found that DeepInfer is 3.27 times faster than SelfChecker. △ Less

Submitted 25 January, 2024; originally announced January 2024.

Comments: Accepted for publication at the 46th International Conference on Software Engineering (ICSE 2024)

arXiv:2401.12959 [pdf]

doi 10.1145/3643787.3648035

Understanding Emojis :) in Useful Code Review Comments

Authors: Sharif Ahmed, Nasir U. Eisty

Abstract: Emojis and emoticons serve as non-verbal cues and are increasingly prevalent across various platforms, including Modern Code Review. These cues often carry emotive or instructive weight for developers. Our study dives into the utility of Code Review comments (CR comments) by scrutinizing the sentiments and semantics conveyed by emojis within these comments. To assess the usefulness of CR comments,… ▽ More Emojis and emoticons serve as non-verbal cues and are increasingly prevalent across various platforms, including Modern Code Review. These cues often carry emotive or instructive weight for developers. Our study dives into the utility of Code Review comments (CR comments) by scrutinizing the sentiments and semantics conveyed by emojis within these comments. To assess the usefulness of CR comments, we augment traditional 'textual' features and pre-trained embeddings with 'emoji-specific' features and pre-trained embeddings. To fortify our inquiry, we expand an existing dataset with emoji annotations, guided by existing research on GitHub emoji usage, and re-evaluate the CR comments accordingly. Our models, which incorporate textual and emoji-based sentiment features and semantic understandings of emojis, substantially outperform baseline metrics. The often-overlooked emoji elements in CR comments emerge as key indicators of usefulness, suggesting that these symbols carry significant weight. △ Less

Submitted 23 January, 2024; originally announced January 2024.

Comments: This paper has been accepted for inclusion in the Proceedings of the 3rd Intl. Workshop on NL-based Software Engineering co-located at 46th International Conference on Software Engineering (NLBSE@ICSE 2024)

arXiv:2401.12416 [pdf, other]

Enhancing Reliability of Neural Networks at the Edge: Inverted Normalization with Stochastic Affine Transformations

Authors: Soyed Tuhin Ahmed, Kamal Danouchi, Guillaume Prenat, Lorena Anghel, Mehdi B. Tahoori

Abstract: Bayesian Neural Networks (BayNNs) naturally provide uncertainty in their predictions, making them a suitable choice in safety-critical applications. Additionally, their realization using memristor-based in-memory computing (IMC) architectures enables them for resource-constrained edge applications. In addition to predictive uncertainty, however, the ability to be inherently robust to noise in comp… ▽ More Bayesian Neural Networks (BayNNs) naturally provide uncertainty in their predictions, making them a suitable choice in safety-critical applications. Additionally, their realization using memristor-based in-memory computing (IMC) architectures enables them for resource-constrained edge applications. In addition to predictive uncertainty, however, the ability to be inherently robust to noise in computation is also essential to ensure functional safety. In particular, memristor-based IMCs are susceptible to various sources of non-idealities such as manufacturing and runtime variations, drift, and failure, which can significantly reduce inference accuracy. In this paper, we propose a method to inherently enhance the robustness and inference accuracy of BayNNs deployed in IMC architectures. To achieve this, we introduce a novel normalization layer combined with stochastic affine transformations. Empirical results in various benchmark datasets show a graceful degradation in inference accuracy, with an improvement of up to $58.11\%$. △ Less

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2401.11621 [pdf]

A Novel Decision Ensemble Framework: Customized Attention-BiLSTM and XGBoost for Speculative Stock Price Forecasting

Authors: Riaz Ud Din, Salman Ahmed, Saddam Hussain Khan

Abstract: Forecasting speculative stock prices is essential for effective investment risk management that drives the need for the development of innovative algorithms. However, the speculative nature, volatility, and complex sequential dependencies within financial markets present inherent challenges which necessitate advanced techniques. This paper proposes a novel framework, CAB-XDE (customized attention… ▽ More Forecasting speculative stock prices is essential for effective investment risk management that drives the need for the development of innovative algorithms. However, the speculative nature, volatility, and complex sequential dependencies within financial markets present inherent challenges which necessitate advanced techniques. This paper proposes a novel framework, CAB-XDE (customized attention BiLSTM-XGB decision ensemble), for predicting the daily closing price of speculative stock Bitcoin-USD (BTC-USD). CAB-XDE framework integrates a customized bi-directional long short-term memory (BiLSTM) with the attention mechanism and the XGBoost algorithm. The customized BiLSTM leverages its learning capabilities to capture the complex sequential dependencies and speculative market trends. Additionally, the new attention mechanism dynamically assigns weights to influential features, thereby enhancing interpretability, and optimizing effective cost measures and volatility forecasting. Moreover, XGBoost handles nonlinear relationships and contributes to the proposed CAB-XDE framework robustness. Additionally, the weight determination theory-error reciprocal method further refines predictions. This refinement is achieved by iteratively adjusting model weights. It is based on discrepancies between theoretical expectations and actual errors in individual customized attention BiLSTM and XGBoost models to enhance performance. Finally, the predictions from both XGBoost and customized attention BiLSTM models are concatenated to achieve diverse prediction space and are provided to the ensemble classifier to enhance the generalization capabilities of CAB-XDE. The proposed CAB-XDE framework is empirically validated on volatile Bitcoin market, sourced from Yahoo Finance and outperforms state-of-the-art models with a MAPE of 0.0037, MAE of 84.40, and RMSE of 106.14. △ Less

Submitted 5 January, 2024; originally announced January 2024.

Comments: 30 pages, 16 Figures, 4 Tables

arXiv:2401.11131 [pdf, other]

Towards a Non-Ideal Methodological Framework for Responsible ML

Authors: Ramaravind Kommiya Mothilal, Shion Guha, Syed Ishtiaque Ahmed

Abstract: Though ML practitioners increasingly employ various Responsible ML (RML) strategies, their methodological approach in practice is still unclear. In particular, the constraints, assumptions, and choices of practitioners with technical duties -- such as developers, engineers, and data scientists -- are often implicit, subtle, and under-scrutinized in HCI and related fields. We interviewed 22 technic… ▽ More Though ML practitioners increasingly employ various Responsible ML (RML) strategies, their methodological approach in practice is still unclear. In particular, the constraints, assumptions, and choices of practitioners with technical duties -- such as developers, engineers, and data scientists -- are often implicit, subtle, and under-scrutinized in HCI and related fields. We interviewed 22 technically oriented ML practitioners across seven domains to understand the characteristics of their methodological approaches to RML through the lens of ideal and non-ideal theorizing of fairness. We find that practitioners' methodological approaches fall along a spectrum of idealization. While they structured their approaches through ideal theorizing, such as by abstracting RML workflow from the inquiry of applicability of ML, they did not pay deliberate attention and systematically documented their non-ideal approaches, such as diagnosing imperfect conditions. We end our paper with a discussion of a new methodological approach, inspired by elements of non-ideal theory, to structure technical practitioners' RML process and facilitate collaboration with other stakeholders. △ Less

Submitted 20 January, 2024; originally announced January 2024.

Comments: 20 pages, single-column, preprint for conference

Showing 1–50 of 914 results for author: Ahmed, S