Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model

Authors: Qi Song, Ziyuan Luo, Ka Chun Cheung, Simon See, Renjie Wan

Abstract: Neural Radiance Fields (NeRFs) have become a key method for 3D scene representation. With the rising prominence and influence of NeRF, safeguarding its intellectual property has become increasingly important. In this paper, we propose NeRFProtector, which adopts a plug-and-play strategy to protect NeRF's copyright during its creation. NeRFProtector utilizes a pre-trained watermarking base model, enabling NeRF creators to embed binary messages directly while creating their NeRF. Our plug-and-play property ensures NeRF creators can flexibly choose NeRF variants without excessive modifications. Leveraging our newly designed progressive distillation, we demonstrate performance on par with several leading-edge neural rendering methods. Our project is available at https://qsong2001.github.io/NeRFProtector.

Submitted 10 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2406.17245 [pdf, other]

Unlocking Continual Learning Abilities in Language Models

Authors: Wenyu Du, Shuang Cheng, Tongxu Luo, Zihan Qiu, Zeyu Huang, Ka Chun Cheung, Reynold Cheng, Jie Fu

Abstract: Language models (LMs) exhibit impressive performance and generalization capabilities. However, LMs struggle with the persistent challenge of catastrophic forgetting, which undermines their long-term sustainability in continual learning (CL). Existing approaches usually address the issue by incorporating old task data or task-wise inductive bias into LMs. However, old data and accurate task information are often unavailable or costly to collect, hindering the availability of current CL approaches for LMs. To address this limitation, we introduce MIGU (MagnItude-based Gradient Updating for continual learning), a rehearsal-free and task-label-free method that only updates the model parameters with large magnitudes of output in LMs' linear layers. MIGU is based on our observation that the L1-normalized magnitude distribution of the output in LMs' linear layers is different when the LMs deal with different task data. By imposing this simple constraint on the gradient update process, we can leverage the inherent behaviors of LMs, thereby unlocking their innate CL abilities. Our experiments demonstrate that MIGU is universally applicable to all three LM architectures (T5, RoBERTa, and Llama2), delivering state-of-the-art or on-par performance across continual finetuning and continual pre-training settings on four CL benchmarks. For example, MIGU brings a 15.2% average accuracy improvement over conventional parameter-efficient finetuning baselines in a 15-task CL benchmark. MIGU can also seamlessly integrate with all three existing CL types to further enhance performance. Code is available at https://github.com/wenyudu/MIGU.
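
Illustrative sketch (not the authors' code): the abstract describes updating only the parameters tied to large L1-normalized output magnitudes in a linear layer. The pure-Python toy below shows one way such a masked update could look for a single layer. The function names, the `keep_ratio` parameter, the row-wise masking rule, and the plain SGD step are all assumptions made for illustration; the actual method is in the repository linked above.

```python
def l1_normalized_magnitude(outputs):
    """Return |y_i| / sum_j |y_j| for one linear-layer output vector."""
    total = sum(abs(y) for y in outputs)
    if total == 0:
        return [0.0 for _ in outputs]
    return [abs(y) / total for y in outputs]


def magnitude_mask(outputs, keep_ratio):
    """Select the output units with the largest normalized magnitudes.

    keep_ratio is a hypothetical hyperparameter: the fraction of units kept.
    """
    scores = l1_normalized_magnitude(outputs)
    keep = max(1, int(keep_ratio * len(scores)))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:keep]
    mask = [False] * len(scores)
    for i in top:
        mask[i] = True
    return mask


def masked_sgd_step(weights, grads, mask, lr):
    """Apply an SGD step only to weight rows whose output unit is selected.

    weights and grads are lists of rows, one row per output unit.
    Unselected rows are copied unchanged.
    """
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if selected else list(w_row)
        for w_row, g_row, selected in zip(weights, grads, mask)
    ]
```

For example, calling `magnitude_mask([0.1, -3.0, 0.5, 2.0], 0.5)` selects the second and fourth output units, and `masked_sgd_step` then changes only those two weight rows.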

Submitted 24 June, 2024; originally announced June 2024.

Comments: preprint, 19 pages

arXiv:2406.12018 [pdf, other]

CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling

Authors: Yu Bai, Xiyuan Zou, Heyan Huang, Sanxing Chen, Marc-Antoine Rondeau, Yang Gao, Jackie Chi Kit Cheung

Abstract: Long sequence modeling has gained broad interest as large language models (LLMs) continue to advance. Recent research has identified that a large portion of hidden states within the key-value caches of Transformer models can be discarded (also termed evicted) without affecting the perplexity performance in generating long sequences. However, we show that these methods, despite preserving perplexity performance, often drop information that is important for solving downstream tasks, a problem which we call information neglect. To address this issue, we introduce Chunked Instruction-aware State Eviction (CItruS), a novel modeling technique that integrates the attention preferences useful for a downstream task into the eviction process of hidden states. In addition, we design a method for chunked sequence processing to further improve efficiency. Our training-free method exhibits superior performance on long sequence comprehension and retrieval tasks over several strong baselines under the same memory budget, while preserving language modeling perplexity.
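
Illustrative sketch (not the authors' implementation): the abstract describes evicting cached hidden states according to attention preferences tied to the downstream instruction. The toy below keeps the cached keys that receive the most softmax attention from a set of instruction query vectors, under a fixed budget. The function names, the dot-product scoring, and the summed-attention ranking are assumptions for illustration only, and the chunked processing mentioned in the abstract is not modeled.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def instruction_aware_keep(cache_keys, instruction_queries, budget):
    """Return sorted indices of cached states to keep (the rest are evicted).

    Each cached key is scored by the total attention it receives from the
    (hypothetical) instruction query vectors; the top `budget` keys survive.
    """
    if not cache_keys:
        return []
    scores = [0.0] * len(cache_keys)
    for q in instruction_queries:
        attn = softmax([dot(q, k) for k in cache_keys])
        for i, a in enumerate(attn):
            scores[i] += a
    ranked = sorted(range(len(cache_keys)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])
```

With keys `[[1, 0], [0, 1], [0.9, 0.1], [-1, 0]]`, a single query `[2, 0]`, and a budget of 2, the states at indices 0 and 2 are kept because they align best with the query.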

Submitted 17 June, 2024; originally announced June 2024.

Comments: Work in progress

arXiv:2406.08723 [pdf, other]

ECBD: Evidence-Centered Benchmark Design for NLP

Authors: Yu Lu Liu, Su Lin Blodgett, Jackie Chi Kit Cheung, Q. Vera Liao, Alexandra Olteanu, Ziang Xiao

Abstract: Benchmarking is seen as critical to assessing progress in NLP. However, creating a benchmark involves many design decisions (e.g., which datasets to include, which metrics to use) that often rely on tacit, untested assumptions about what the benchmark is intended to measure or is actually measuring. There is currently no principled way of analyzing these decisions and how they impact the validity of the benchmark's measurements. To address this gap, we draw on evidence-centered design in educational assessments and propose Evidence-Centered Benchmark Design (ECBD), a framework which formalizes the benchmark design process into five modules. ECBD specifies the role each module plays in helping practitioners collect evidence about capabilities of interest. Specifically, each module requires benchmark designers to describe, justify, and support benchmark design choices -- e.g., clearly specifying the capabilities the benchmark aims to measure or how evidence about those capabilities is collected from model responses. To demonstrate the use of ECBD, we conduct case studies with three benchmarks: BoolQ, SuperGLUE, and HELM. Our analysis reveals common trends in benchmark design and documentation that could threaten the validity of benchmarks' measurements.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.04062 [pdf, other]

Online Learning in Betting Markets: Profit versus Prediction

Authors: Haiqing Zhu, Alexander Soen, Yun Kuen Cheung, Lexing Xie

Abstract: We examine two types of binary betting markets, whose primary goal is either to make a profit (such as sports gambling) or to gain information (such as prediction markets). We articulate the interplay between belief and price-setting to analyse both types of markets, and show that the goals of maximising bookmaker profit and eliciting information are fundamentally incompatible. A key insight is that profit hinges on the deviation between (the distribution of) bettor and true beliefs, and that heavier tails in bettor belief distribution imply higher profit. Our algorithmic contribution is to introduce online learning methods for price-setting. Traditionally, bookmakers update their prices rather infrequently; we present two algorithms that guide price updates upon seeing each bet, assuming very little of bettor belief distributions. The online pricing algorithm achieves stochastic regret of $\mathcal{O}(\sqrt{T})$ against the worst local maximum, or $\mathcal{O}(\sqrt{T \log T})$ with high probability against the global maximum under fair odds. More broadly, the inherent trade-off between profit and information-seeking in binary betting may inspire new understandings of large-scale multi-agent behaviour.

Submitted 6 June, 2024; originally announced June 2024.

Comments: ICML 2024

arXiv:2406.02724 [pdf, other]

The LiteBIRD mission to explore cosmic inflation

Authors: T. Ghigna, A. Adler, K. Aizawa, H. Akamatsu, R. Akizawa, E. Allys, A. Anand, J. Aumont, J. Austermann, S. Azzoni, C. Baccigalupi, M. Ballardini, A. J. Banday, R. B. Barreiro, N. Bartolo, S. Basak, A. Basyrov, S. Beckman, M. Bersanelli, M. Bortolami, F. Bouchet, T. Brinckmann, P. Campeti, E. Carinos, A. Carones, et al. (134 additional authors not shown)

Abstract: LiteBIRD, the next-generation cosmic microwave background (CMB) experiment, aims for a launch in Japan's fiscal year 2032, marking a major advancement in the exploration of primordial cosmology and fundamental physics. Orbiting the Sun-Earth Lagrangian point L2, this JAXA-led strategic L-class mission will conduct a comprehensive mapping of the CMB polarization across the entire sky. During its 3-year mission, LiteBIRD will employ three telescopes within 15 unique frequency bands (ranging from 34 through 448 GHz), targeting a sensitivity of 2.2\,$μ$K-arcmin and a resolution of 0.5$^\circ$ at 100\,GHz. Its primary goal is to measure the tensor-to-scalar ratio $r$ with an uncertainty $δr = 0.001$, including systematic errors and margin. If $r \geq 0.01$, LiteBIRD expects to achieve a $>5σ$ detection in the $\ell=$2-10 and $\ell=$11-200 ranges separately, providing crucial insight into the early Universe. We describe LiteBIRD's scientific objectives, the application of systems engineering to mission requirements, the anticipated scientific impact, and the operations and scanning strategies vital to minimizing systematic effects. We will also highlight LiteBIRD's synergies with concurrent CMB projects.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 23 pages, 9 figures, 1 table, SPIE Astronomical Telescopes + Instrumentation 2024

arXiv:2406.01727 [pdf, other]

Federated Learning-based Collaborative Wideband Spectrum Sensing and Scheduling for UAVs in UTM Systems

Authors: Sravan Reddy Chintareddy, Keenan Roach, Kenny Cheung, Morteza Hashemi

Abstract: In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users (SUs) to opportunistically utilize detected "spectrum holes". Our overall framework consists of three main stages. Firstly, in the model training stage, we explore dataset generation in a multi-cell environment and the training of a machine learning (ML) model using the federated learning (FL) architecture. Unlike the existing studies on FL for wireless that presume datasets are readily available for training, we propose a novel architecture that directly integrates wireless dataset generation, which involves capturing I/Q samples from over-the-air signals in a multi-cell environment, into the FL training process. Secondly, in the collaborative spectrum inference stage, we propose a collaborative spectrum fusion strategy that is compatible with the unmanned aircraft system traffic management (UTM) ecosystem. Finally, in the spectrum scheduling stage, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users. To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using MATLAB LTE toolbox by incorporating base-station (BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary users' channel usage in terms of I/Q samples. This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.

Submitted 3 June, 2024; originally announced June 2024.

Comments: This is a preprint version submitted to IEEE Transactions on Machine learning in Communications and Networking. arXiv admin note: text overlap with arXiv:2308.05036

arXiv:2406.01480 [pdf, other]

Towards Automating the Retrospective Generation of BIM Models: A Unified Framework for 3D Semantic Reconstruction of the Built Environment

Authors: Ka Lung Cheung, Chi Chung Lee

Abstract: The adoption of Building Information Modeling (BIM) is beneficial in construction projects. However, it faces challenges due to the lack of a unified and scalable framework for converting 3D model details into BIM. This paper introduces SRBIM, a unified semantic reconstruction architecture for BIM generation. Our approach's effectiveness is demonstrated through extensive qualitative and quantitative evaluations, establishing a new paradigm for automated BIM modeling.

Submitted 3 June, 2024; originally announced June 2024.

Comments: CVPRW 2024, Oral

arXiv:2406.01337 [pdf, other]

ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds

Authors: Ka Lung Cheung, Chi Chung Lee

Abstract: Precise segmentation of architectural structures provides detailed information about various building components, enhancing our understanding of and interaction with our built environment. Nevertheless, existing outdoor 3D point cloud datasets have limited detailed annotations on architectural exteriors due to privacy concerns and the high costs of data acquisition and annotation. To overcome this shortfall, this paper introduces a semantically-enriched, photo-realistic 3D architectural models dataset and benchmark for semantic segmentation. It features real-world buildings serving 4 different purposes as well as an open architectural landscape in Hong Kong. Each point cloud is annotated into one of 14 semantic classes.

Submitted 3 June, 2024; originally announced June 2024.

Comments: CVPRW 2024 (Oral)

arXiv:2405.18632 [pdf]

Large Language Models as Partners in Student Essay Evaluation

Authors: Toru Ishida, Tongxi Liu, Hailong Wang, William K. Cheung

Abstract: As the importance of comprehensive evaluation in workshop courses increases, there is a growing demand for efficient and fair assessment methods that reduce the workload for faculty members. This paper presents an evaluation conducted with Large Language Models (LLMs) using actual student essays in three scenarios: 1) without providing guidance such as rubrics, 2) with pre-specified rubrics, and 3) through pairwise comparison of essays. Quantitative analysis of the results revealed a strong correlation between LLM and faculty member assessments in the pairwise comparison scenario with pre-specified rubrics, although concerns about the quality and stability of evaluations remained. Therefore, we conducted a qualitative analysis of LLM assessment comments, showing that: 1) LLMs can match the assessment capabilities of faculty members, 2) variations in LLM assessments should be interpreted as diversity rather than confusion, and 3) assessments by humans and LLMs can differ and complement each other. In conclusion, this paper suggests that LLMs should not be seen merely as assistants to faculty members but as partners in evaluation committees and outlines directions for further research.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.15724 [pdf, other]

Reconfiguration Algorithms for Cubic Modular Robots with Realistic Movement Constraints

Authors: MIT--NASA Space Robots Team, Josh Brunner, Kenneth C. Cheung, Erik D. Demaine, Jenny Diomidova, Christine Gregg, Della H. Hendrickson, Irina Kostitsyna

Abstract: We introduce and analyze a model for self-reconfigurable robots made up of unit-cube modules. Compared to past models, our model aims to newly capture two important practical aspects of real-world robots. First, modules often do not occupy an exact unit cube, but rather have features like bumps extending outside the allotted space so that modules can interlock. Thus, for example, our model forbids modules from squeezing in between two other modules that are one unit distance apart. Second, our model captures the practical scenario of many passive modules assembled by a single robot, instead of requiring all modules to be able to move on their own. We prove two universality results. First, with a supply of auxiliary modules, we show that any connected polycube structure can be constructed by a carefully aligned plane sweep. Second, without additional modules, we show how to construct any structure for which a natural notion of external feature size is at least a constant; this property largely consolidates forbidden-pattern properties used in previous works on reconfigurable modular robots.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.05265 [pdf, ps, other]

On the Properties of the Semigroup Generated by the RL Fractional Integral

Authors: Kyan Ka Hin Cheung, Ethan Jon Yi Soh

Abstract: For operators $A$, it is sometimes possible to define $e^{At}$ as an operator in and of itself provided it meets certain regularity conditions. Like $e^{λx}$ for ODEs, this operator is useful for solving PDEs involving the operator $A$. We call the set of $e^{At}$ a semigroup generated by $A$. In this paper, we discuss the properties of semigroups generated by the fractional integral, an operator appearing in PDEs in increasingly many fields, over Bochner-Lebesgue spaces.

Submitted 23 April, 2024; originally announced May 2024.

Comments: The earlier version of this paper will appear in the 2023 Hang Lung Mathematics Awards Winners' papers

arXiv:2404.14833 [pdf, other]

Exploring interference effects between two ALP effective operators at the LHC

Authors: Kingman Cheung, Chih-Ting Lu, C. J. Ouseph, Priyanka Sarmah

Abstract: We observe that most studies of axion-like particle (ALP) production channels at the Large Hadron Collider (LHC) focus on a single type of ALP operator for each process in the effective field theory framework. In this work, we propose an alternative approach that considers two or more types of relevant ALP effective operators together in some specific ALP production channels and study their interference effects. Using the $p p \rightarrow t j a$ process with $a \rightarrow γγ$ as an example, we show that this approach allows us to constrain the ALP interactions with both the $W$ boson and the top quark, as well as their interference in a single process. For the final state with two isolated photons and a top quark decaying semi-leptonically, we predict that the future bounds on the ALP decay constant can reach around $f_a \sim 10\;(20)$ TeV for $25$ GeV $< M_a < 100$ GeV at the LHC with 300 (3000) fb$^{-1}$ luminosity.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 18 Pages, 8 Figures, 3 Tables

arXiv:2404.09955 [pdf, other]

Effects of Superradiance in Active Galactic Nuclei

Authors: Priyanka Sarmah, Himanshu Verma, Kingman Cheung, Joseph Silk

Abstract: A spinning supermassive black hole (SMBH) at the core of an active galactic nucleus (AGN) provides room for the elusive ultra-light scalar particles (ULSP) to be produced through a phenomenon called superradiance. As a result of this phenomenon, a cloud of scalar particles forms around the black hole by draining the spin angular momentum of the SMBH. In this work, we present a study of the superradiant instability due to a scalar field in the vicinity of the central SMBH in an AGN. We begin by showing that the time-evolution of the gravitational coupling $α$ in a realistic ambiance created by the accretion disk around the SMBH in AGN leads to interesting consequences such as the amplified growth of the scalar cloud, enhancement of the gravitational wave emission rate, and appearance of higher modes of superradiance within the age of the Universe ($\sim 10^{10}$ years). We then explore the consequence of superradiance on the characteristics of the AGN. Using the Novikov-Thorne model for an accretion disk, we divide the full spectrum into three distinct wavelength bands: X-ray ($10^{-4}-10^{-2}$ $μ$m), UV (0.010-0.4 $μ$m), and Vis-IR (0.4 $μ$m-100 $μ$m), and observe sudden drops in the time-variations of the luminosities across these bands and Eddington ratio ($f_{\textrm{Edd}}$) with a characteristic timescale of superradiance. Using a uniform distribution of spin and mass of the SMBHs in AGNs, we demonstrate the appearance of depleted regions and accumulations along the boundaries of these regions in the planes of different band-luminosities and $f_{\textrm{Edd}}$. Finally, we discuss some possible signatures of superradiance that can be drawn from the observed time-variation of the AGN luminosities.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 18 pages, 9 figures, 1 table. Comments are welcome

arXiv:2404.06126 [pdf, other]

Quark flavor violation and axion-like particles from top-quark decays at the LHC

Authors: Kingman Cheung, Fei-Tung Chung, Giovanna Cottin, Zeren Simon Wang

Abstract: We study axion-like particles (ALPs) with quark-flavor-violating couplings at the LHC. Specifically, we focus on the theoretical scenario with ALP-top-up and ALP-top-charm interactions, in addition to the more common quark-flavor-diagonal couplings. The ALPs can thus originate from decays of top quarks which are pair produced in large numbers at the LHC, and then decay to jets. If these couplings to the quarks are tiny and the ALPs have $\mathcal{O}(10)$ GeV masses, they are long-lived, leading to signatures of displaced vertex plus multiple jets, which have the advantage of suppression of background events at the LHC. We recast a recent ATLAS search for the same signature and reinterpret the results in terms of bounds on the long-lived ALP in our theoretical scenario. We find that the LHC with the full Run 2 dataset can place stringent limits, while at the future high-luminosity LHC with 3 ab$^{-1}$ integrated luminosity stronger sensitivities are expected.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 16 pages plus references, 2 figures, 11 tables

arXiv:2404.00727 [pdf, other]

A Controlled Reevaluation of Coreference Resolution Models

Authors: Ian Porada, Xiyuan Zou, Jackie Chi Kit Cheung

Abstract: All state-of-the-art coreference resolution (CR) models involve finetuning a pretrained language model. Whether the superior performance of one CR model over another is due to the choice of language model or other factors, such as the task-specific architecture, is difficult or impossible to determine due to lack of a standardized experimental setup. To resolve this ambiguity, we systematically evaluate five CR models and control for certain design decisions including the pretrained language model used by each. When controlling for language model size, encoder-based CR models outperform more recent decoder-based models in terms of both accuracy and inference speed. Surprisingly, among encoder-based CR models, more recent models are not always more accurate, and the oldest CR model that we test generalizes the best to out-of-domain textual genres. We conclude that controlling for the choice of language model reduces most, but not all, of the increase in F1 score reported in the past five years.

Submitted 22 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

Comments: LREC-COLING 2024

arXiv:2403.18167 [pdf, other]

Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations

Authors: Lei Yu, Meng Cao, Jackie Chi Kit Cheung, Yue Dong

Abstract: State-of-the-art language models (LMs) sometimes generate non-factual hallucinations that misalign with world knowledge. To explore the mechanistic causes of these hallucinations, we create diagnostic datasets with subject-relation queries and adapt interpretability methods to trace hallucinations through internal model representations. We discover two general and distinct mechanistic causes of hallucinations shared across LMs (Llama-2, Pythia, GPT-J): 1) knowledge enrichment hallucinations: insufficient subject attribute knowledge in lower layer MLPs, and 2) answer extraction hallucinations: failure to select the correct object attribute in upper layer attention heads. We also found these two internal mechanistic causes of hallucinations are reflected in external manifestations. Based on insights from our mechanistic analysis, we propose a novel hallucination mitigation method through targeted restoration of the LM's internal fact recall pipeline, demonstrating superior performance compared to baselines.

Submitted 17 June, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.16763 [pdf, other]

LiteBIRD Science Goals and Forecasts: Primordial Magnetic Fields

Authors: D. Paoletti, J. Rubino-Martin, M. Shiraishi, D. Molinari, J. Chluba, F. Finelli, C. Baccigalupi, J. Errard, A. Gruppuso, A. I. Lonappan, A. Tartari, E. Allys, A. Anand, J. Aumont, M. Ballardini, A. J. Banday, R. B. Barreiro, N. Bartolo, M. Bersanelli, M. Bortolami, T. Brinckmann, E. Calabrese, P. Campeti, A. Carones, F. J. Casas, et al. (75 additional authors not shown)

Abstract: We present detailed forecasts for the constraints on primordial magnetic fields (PMFs) that will be obtained with the LiteBIRD satellite. The constraints are driven by the effects of PMFs on the CMB anisotropies: the gravitational effects of magnetically-induced perturbations; the effects on the thermal and ionization history of the Universe; the Faraday rotation imprint on the CMB polarization; and the non-Gaussianities induced in polarization anisotropies. LiteBIRD represents a sensitive probe for PMFs and by exploiting all the physical effects, it will be able to improve the current limit coming from Planck. In particular, thanks to its accurate $B$-mode polarization measurement, LiteBIRD will improve the constraints on infrared configurations for the gravitational effect, giving $B_{\rm 1\,Mpc}^{n_{\rm B} =-2.9} < 0.8$ nG at 95% C.L., potentially opening the possibility to detect nanogauss fields with high significance. We also observe a significant improvement in the limits when marginalized over the spectral index, $B_{1\,{\rm Mpc}}^{\rm marg}< 2.2$ nG at 95% C.L. From the thermal history effect, which relies mainly on $E$-mode polarization data, we obtain a significant improvement for all PMF configurations, with the marginalized case, $\sqrt{\langle B^2\rangle}^{\rm marg}<0.50$ nG at 95% C.L. Faraday rotation constraints will take advantage of the wide frequency coverage of LiteBIRD and the high sensitivity in $B$ modes, improving the limits by orders of magnitude with respect to current results, $B_{1\,{\rm Mpc}}^{n_{\rm B} =-2.9} < 3.2$ nG at 95% C.L. Finally, non-Gaussianities of the $B$-mode polarization can probe PMFs at the level of 1 nG, again significantly improving the current bounds from Planck. Altogether our forecasts represent a broad collection of complementary probes, providing conservative limits on PMF characteristics that will be achieved with LiteBIRD.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 51 pages, 24 figures, abstract shortened

arXiv:2403.06197 [pdf, other]

DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency

Authors: Wenfang Yao, Kejing Yin, William K. Cheung, Jia Liu, Jing Qin

Abstract: The combination of electronic health records (EHR) and medical images is crucial for clinicians in making diagnoses and forecasting prognosis. Strategically fusing these two data modalities has great potential to improve the accuracy of machine learning models in clinical prediction tasks. However, the asynchronous and complementary nature of EHR and medical images presents unique challenges. Missing modalities due to clinical and administrative factors are inevitable in practice, and the significance of each data modality varies depending on the patient and the prediction target, resulting in inconsistent predictions and suboptimal model performance. To address these challenges, we propose DrFuse to achieve effective clinical multi-modal fusion. It tackles the missing modality issue by disentangling the features shared across modalities and those unique within each modality. Furthermore, we address the modal inconsistency issue via a disease-wise attention layer that produces the patient- and disease-wise weighting for each modality to make the final prediction. We validate the proposed method using real-world large-scale datasets, MIMIC-IV and MIMIC-CXR. Experimental results show that the proposed method significantly outperforms the state-of-the-art models. Our implementation is publicly available at https://github.com/dorothy-yao/drfuse.

Submitted 10 March, 2024; originally announced March 2024.

Comments: Accepted by AAAI-24

arXiv:2403.02330 [pdf, other]

RegionGPT: Towards Region Understanding Vision Language Model

Authors: Qiushan Guo, Shalini De Mello, Hongxu Yin, Wonmin Byeon, Ka Chun Cheung, Yizhou Yu, Ping Luo, Sifei Liu

Abstract: Vision language models (VLMs) have experienced rapid advancements through the integration of large language models (LLMs) with image-text pairs, yet they struggle with detailed regional visual understanding due to limited spatial awareness of the vision encoder, and the use of coarse-grained training data that lacks detailed, region-specific captions. To address this, we introduce RegionGPT (RGPT for short), a novel framework designed for complex region-level captioning and understanding. RGPT enhances the spatial awareness of regional representation with simple yet effective modifications to existing visual encoders in VLMs. We further improve performance on tasks requiring a specific output scope by integrating task-guided instruction prompts during both training and inference phases, while maintaining the model's versatility for general-purpose tasks. Additionally, we develop an automated region caption data generation pipeline, enriching the training set with detailed region-level captions. We demonstrate that a universal RGPT model can be effectively applied to, and significantly enhances performance across, a range of region-level tasks, including but not limited to complex region descriptions, reasoning, object classification, and referring expression comprehension.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2402.19457 [pdf, other]

$\texttt{COSMIC}$: Mutual Information for Task-Agnostic Summarization Evaluation

Authors: Maxime Darrin, Philippe Formont, Jackie Chi Kit Cheung, Pablo Piantanida

Abstract: Assessing the quality of summarizers poses significant challenges. In response, we propose a novel task-oriented evaluation approach that assesses summarizers based on their capacity to produce summaries that are useful for downstream tasks, while preserving task outcomes. We theoretically establish a direct relationship between the resulting error probability of these tasks and the mutual information between source texts and generated summaries. We introduce $\texttt{COSMIC}$ as a practical implementation of this metric, demonstrating its strong correlation with human judgment-based metrics and its effectiveness in predicting downstream task performance. Comparative analyses against established metrics like $\texttt{BERTScore}$ and $\texttt{ROUGE}$ highlight the competitive performance of $\texttt{COSMIC}$.

Submitted 1 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.10550 [pdf, other]

Probing the Gauge-boson Couplings of Axion-like Particle at the LHC and High-Luminosity LHC

Authors: Kingman Cheung, Wanyon Hsiao, C. J. Ouseph, Chen Wang

Abstract: In this work, we calculate the sensitivities on the gauge-boson couplings $g_{aZZ}$, $g_{aZγ}$, and $g_{aWW}$ of an axion-like particle (ALP) that one can achieve at the LHC with $\sqrt{s}=14$ TeV and integrated luminosities of 300 fb$^{-1}$ (current run) and 3000 fb$^{-1}$ (High-Luminosity LHC). We focus on the associated production processes $pp\to Za \to (l^+l^-)(γγ)$ and $pp\to W^\pm a \to (l^\pm ν)(γγ)$. We show that better sensitivities on these gauge couplings can be achieved at the LHC for $M_a = 1-100$ GeV, down to the level of $10^{-4}\,{\rm GeV}^{-1}$. In conclusion, this study emphasizes the significance of the investigated channels in constraining the ALP couplings at the LHC, offering valuable insights for future experiments dedicated to ALP detection.

Submitted 7 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 22 Pages, 8 Figures and 7 Tables, JHEP Accepted Version

arXiv:2402.05678 [pdf, other]

Interpretation of excess in $H \to Z γ$ using a light axion-like particle

Authors: Kingman Cheung, C. J. Ouseph

Abstract: We interpret the recent excess in a rare decay of the Higgs boson, $H\to Zγ$, using a light axion-like particle (ALP) with a mass below 0.2 GeV. The dominant decay of such a light ALP is into a pair of photons, which are so close to each other that they mimic a single photon in the ECAL detector. It can explain the excess with a coupling $C^{\rm eff}_{aZH} / Λ\sim 4 \times 10^{-5}\;{\rm GeV}^{-1}$. A potential test would be the rare decay of the $Z$ boson $Z \to a H^* \to a (b \bar b)$ at the Tera-$Z$ option of the future FCC and CEPC. However, it has a branching ratio of only $O(10^{-12})$, and is thus barely testable. The production cross section for $pp \to Z^* \to a H$ via the same coupling $C^{\rm eff}_{aZH} / Λ$ at the LHC is too small for detection.

Submitted 8 February, 2024; originally announced February 2024.

Comments: 8 pages, 2 figures

arXiv:2401.15977 [pdf, other]

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling

Authors: Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li

Abstract: We introduce Motion-I2V, a novel framework for consistent and controllable image-to-video generation (I2V). In contrast to previous methods that directly learn the complicated image-to-video mapping, Motion-I2V factorizes I2V into two stages with explicit motion modeling. For the first stage, we propose a diffusion-based motion field predictor, which focuses on deducing the trajectories of the reference image's pixels. For the second stage, we propose motion-augmented temporal attention to enhance the limited 1-D temporal attention in video latent diffusion models. This module can effectively propagate the reference image's features to synthesized frames with the guidance of predicted trajectories from the first stage. Compared with existing methods, Motion-I2V can generate more consistent videos even in the presence of large motion and viewpoint variation. By training a sparse trajectory ControlNet for the first stage, Motion-I2V allows users to precisely control motion trajectories and motion regions with sparse trajectory and region annotations. This offers more controllability of the I2V process than solely relying on textual instructions. Additionally, Motion-I2V's second stage naturally supports zero-shot video-to-video translation. Both qualitative and quantitative comparisons demonstrate the advantages of Motion-I2V over prior approaches in consistent and controllable image-to-video generation. Please see our project page at https://xiaoyushi97.github.io/Motion-I2V/.

Submitted 31 January, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: Project page: https://xiaoyushi97.github.io/Motion-I2V/

arXiv:2401.14619 [pdf, other]

Resilient Practical Test-Time Adaptation: Soft Batch Normalization Alignment and Entropy-driven Memory Bank

Authors: Xingzhi Zhou, Zhiliang Tian, Ka Chun Cheung, Simon See, Nevin L. Zhang

Abstract: Test-time domain adaptation effectively adjusts the source domain model to accommodate unseen domain shifts in a target domain during inference. However, the model performance can be significantly impaired by continuous distribution changes in the target domain and non-independent and identically distributed (non-i.i.d.) test samples often encountered in practical scenarios. While existing memory bank methodologies use memory to store samples and mitigate non-i.i.d. effects, they do not inherently prevent potential model degradation. To address this issue, we propose a resilient practical test-time adaptation (ResiTTA) method focused on parameter resilience and data quality. Specifically, we develop a resilient batch normalization with estimation on normalization statistics and soft alignments to mitigate overfitting and model degradation. We use an entropy-driven memory bank that accounts for timeliness, the persistence of over-confident samples, and sample uncertainty for high-quality data in adaptation. Our framework periodically adapts the source domain model using a teacher-student model through a self-training loss on the memory samples, incorporating soft alignment losses on batch normalization. We empirically validate ResiTTA across various benchmark datasets, demonstrating state-of-the-art performance.

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.11371 [pdf, other]

Modeling Considerations for Developing Deep Space Autonomous Spacecraft and Simulators

Authors: Christopher Agia, Guillem Casadesus Vila, Saptarshi Bandyopadhyay, David S. Bayard, Kar-Ming Cheung, Charles H. Lee, Eric Wood, Ian Aenishanslin, Steven Ardito, Lorraine Fesq, Marco Pavone, Issa A. D. Nesnas

Abstract: To extend the limited scope of autonomy used in prior missions for operation in distant and complex environments, there is a need to further develop and mature autonomy that jointly reasons over multiple subsystems, which we term system-level autonomy. System-level autonomy establishes situational awareness that resolves conflicting information across subsystems, which may necessitate the refinement and interconnection of the underlying spacecraft and environment onboard models. However, with a limited understanding of the assumptions and tradeoffs of modeling to arbitrary extents, designing onboard models to support system-level capabilities presents a significant challenge. In this paper, we provide a detailed analysis of the increasing levels of model fidelity for several key spacecraft subsystems, with the goal of informing future spacecraft functional- and system-level autonomy algorithms and the physics-based simulators on which they are validated. We do not argue for the adoption of a particular fidelity class of models but, instead, highlight the potential tradeoffs and opportunities associated with the use of models for onboard autonomy and in physics-based simulators at various fidelity levels. We ground our analysis in the context of deep space exploration of small bodies, an emerging frontier for autonomous spacecraft operation in space, where the choice of models employed onboard the spacecraft may determine mission success. We conduct our experiments in the Multi-Spacecraft Concept and Autonomy Tool (MuSCAT), a software suite for developing spacecraft autonomy algorithms.

Submitted 20 January, 2024; originally announced January 2024.

Comments: Project page: https://sites.google.com/stanford.edu/spacecraft-models. 20 pages, 8 figures. Accepted to the IEEE Conference on Aerospace (AeroConf) 2024

ACM Class: I.2.8; I.2.9; I.6.1; I.6.3; I.6.4; I.6.6; J.2

arXiv:2401.11323 [pdf, other]

Identifying and Analyzing Task-Encoding Tokens in Large Language Models

Authors: Yu Bai, Heyan Huang, Cesare Spinoso-Di Piano, Marc-Antoine Rondeau, Sanxing Chen, Yang Gao, Jackie Chi Kit Cheung

Abstract: In-context learning (ICL) has become an effective solution for few-shot learning in natural language processing. However, our understanding of ICL's working mechanisms is limited, specifically regarding how models learn to perform tasks from ICL demonstrations. For example, unexpectedly large changes in performance can arise from small changes in the prompt, leaving prompt design a largely empirical endeavour. In this paper, we investigate this problem by identifying and analyzing task-encoding tokens on whose representations the task performance depends. Using experiments that ablate the representations of different token types, we find that template and stopword tokens are the most prone to be task-encoding. In addition, we demonstrate experimentally that lexical meaning, repetition, and text formatting are the main distinguishing characteristics of these tokens. Our work sheds light on how large language models (LLMs) learn to perform a task from demonstrations, deepens our understanding of the varied roles different types of tokens play in LLMs, and provides insights for avoiding instability from improperly utilizing task-encoding tokens.

Submitted 16 February, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

Comments: Work in progress

arXiv:2401.11084 [pdf, other]

Interference-Aware Queuing Analysis for Distributed Transmission Control in UAV Networks

Authors: Masoud Ghazikor, Keenan Roach, Kenny Cheung, Morteza Hashemi

Abstract: In this paper, we investigate the problem of distributed transmission control for unmanned aerial vehicles (UAVs) operating in unlicensed spectrum bands. We develop a rigorous interference-aware queuing analysis framework that jointly considers two inter-dependent factors: (i) limited-size queues with delay-constrained packet arrival, and (ii) in-band interference introduced by other ground/aerial users. We aim to optimize the expected throughput by jointly analyzing these factors. In the queuing analysis, we explore two packet loss probabilities: a buffer overflow model and a time threshold model. For interference analysis, we investigate the outage probability and packet losses due to low signal-to-interference-plus-noise ratio (SINR). We introduce two algorithms, namely Interference-Aware Transmission Control (IA-TC) and Interference-Aware Distributed Transmission Control (IA-DTC). These algorithms maximize the expected throughput by adjusting transmission policies to balance the trade-off between packet drops from queues and transmission errors due to low SINRs. We implement the proposed algorithms and demonstrate that the optimal transmission policy under various scenarios is found.

Submitted 19 January, 2024; originally announced January 2024.

Comments: IEEE International Conference on Communications (ICC)

arXiv:2401.05914 [pdf, other]

How Teachers Can Use Large Language Models and Bloom's Taxonomy to Create Educational Quizzes

Authors: Sabina Elkins, Ekaterina Kochmar, Jackie C. K. Cheung, Iulian Serban

Abstract: Question generation (QG) is a natural language processing task with an abundance of potential benefits and use cases in the educational domain. In order for this potential to be realized, QG systems must be designed and validated with pedagogical needs in mind. However, little research has assessed or designed QG approaches with input from real teachers or students. This paper applies a large language model-based QG approach where questions are generated with learning goals derived from Bloom's taxonomy. The automatically generated questions are used in multiple experiments designed to assess how teachers use them in practice. The results demonstrate that teachers prefer to write quizzes with automatically generated questions, and that such quizzes have no loss in quality compared to handwritten versions. Further, several metrics indicate that automatically generated questions can even improve the quality of the quizzes created, showing the promise for large-scale use of QG in the classroom setting.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 8 pages, 8 figures. Accepted to the main track of the EAAI-24: The 14th Symposium on Educational Advances in Artificial Intelligence

arXiv:2401.03168 [pdf, other]

Probing dark photons from a light scalar at Belle II

Authors: Kingman Cheung, Yongkyu Kim, Youngjoon Kwon, C. J. Ouseph, Abner Soffer, Zeren Simon Wang

Abstract: In the minimal $U(1)$ extension of the Standard Model (SM), a new gauge boson referred to as "dark photon" is predicted. The dark-photon mass can be generated from an additional Higgs mechanism associated with a dark scalar boson. At $B$-factories such as Belle II, large numbers of $B$-mesons are produced and can decay to a kaon plus the dark scalar via the latter's mixing with the SM Higgs boson. We evaluate the sensitivity of Belle II for the case in which the dark scalar decays exclusively into a pair of dark photons via the new $U(1)$ gauge coupling, and the dark photons are long lived owing to a small kinetic mixing $ε$. We study the experimental signature in which each dark photon decays into a pair of charged leptons, pions, or kaons, resulting in a pair of displaced vertices, and argue that the search is essentially background-free. We perform detailed Monte-Carlo simulations to determine the expected number of signal events at Belle II with an integrated luminosity of 50 ab$^{-1}$, taking into account the efficiencies for both final-state-particle identification and displaced tracking. We find that for experimentally allowed values of the scalar mixing angle and kinematically allowed dark-photon and dark-scalar masses, the proposed search is uniquely sensitive to the medium-$ε$ regime, which is currently mostly unexcluded by experiments.

Submitted 18 April, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

Comments: v1: 14 pages plus appendix and refs, 13 figures and 2 tables; v2: 16 pages plus appendix and refs, 14 figures and 2 tables, major improvement in presentation and accepted for publication in JHEP

arXiv:2312.11509 [pdf, other]

Toward a Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency

Authors: Pavlos Constas, Vikram Rawal, Matthew Honorio Oliveira, Andreas Constas, Aditya Khan, Kaison Cheung, Najma Sultani, Carrie Chen, Micol Altomare, Michael Akzam, Jiacheng Chen, Vhea He, Lauren Altomare, Heraa Murqi, Asad Khan, Nimit Amikumar Bhanshali, Youssef Rachad, Michael Guerzhoy

Abstract: We propose a reinforcement learning (RL)-based system that would automatically prescribe a hypothetical patient medication that may help the patient with their mental health-related speech disfluency, and adjust the medication and the dosages in response to zero-cost frequent measurement of the fluency of the patient. We demonstrate the components of the system: a module that detects and evaluates speech disfluency on a large dataset we built, and an RL algorithm that automatically finds good combinations of medications. To support the two modules, we collect data on the effect of psychiatric medications for speech disfluency from the literature, and build a plausible patient simulation system. We demonstrate that the RL system is, under some circumstances, able to converge to a good medication regime. We collect and label a dataset of people with possible speech disfluency and demonstrate our methods using that dataset. Our work is a proof of concept: we show that there is promise in the idea of using automatic data collection to address speech disfluency.

Submitted 5 February, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: In Proc. Machine Learning for Cognitive and Mental Health Workshop (ML4CMH) at AAAI 2024

arXiv:2312.09001 [pdf, other]

Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD

Authors: C. Leloup, G. Patanchon, J. Errard, C. Franceschet, J. E. Gudmundsson, S. Henrot-Versillé, H. Imada, H. Ishino, T. Matsumura, G. Puglisi, W. Wang, A. Adler, J. Aumont, R. Aurlien, C. Baccigalupi, M. Ballardini, A. J. Banday, R. B. Barreiro, N. Bartolo, A. Basyrov, M. Bersanelli, D. Blinov, M. Bortolami, T. Brinckmann, P. Campeti, et al. (86 additional authors not shown)

Abstract: We present a study of the impact of an uncertainty in the beam far side-lobe knowledge on the measurement of the Cosmic Microwave Background $B$-mode signal at large scale. It is expected to be one of the main sources of systematic effects in future CMB observations. Because it is crucial for all-sky survey missions to take into account the interplay between beam systematic effects and all the data analysis steps, the primary goal of this paper is to provide the methodology to carry out the end-to-end study of their effect for a space-borne CMB polarization experiment, up to the cosmological results in the form of a bias $δr$ on the tensor-to-scalar ratio $r$. LiteBIRD is dedicated to measuring CMB primordial $B$ modes by reaching a sensitivity of $σ\left( r \right) \leq 10^{-3}$ assuming $r=0$. As a demonstration of our framework, we derive the relationship between the knowledge of the beam far side-lobes and the tentatively allocated error budget under given assumptions on design, simulation and component separation method. We assume no mitigation of the far side-lobes effect at any stage of the analysis pipeline. We show that $δr$ is mostly due to the integrated fractional power difference between the estimated beams and the true beams in the far side-lobes region, with little dependence on the actual shape of the beams, for low enough $δr$. Under our set of assumptions, in particular considering the specific foreground cleaning method we used, we find that the integrated fractional power in the far side-lobes should be known at a level as tight as $\sim 10^{-4}$, to achieve the required limit on the bias $δr < 1.9 \times 10^{-5}$. The framework and tools developed for this study can be easily adapted to provide requirements under different design, data analysis frameworks and for other future space-borne experiments beyond LiteBIRD.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.07146 [pdf, other]

CompdVision: Combining Near-Field 3D Visual and Tactile Sensing Using a Compact Compound-Eye Imaging System

Authors: Lifan Luo, Boyang Zhang, Zhijie Peng, Yik Kin Cheung, Guanlan Zhang, Zhigang Li, Michael Yu Wang, Hongyu Yu

Abstract: As automation technologies advance, the need for compact and multi-modal sensors in robotic applications is growing. To address this demand, we introduce CompdVision, a novel sensor that employs a compound-eye imaging system to combine near-field 3D visual and tactile sensing within a compact form factor. CompdVision utilizes two types of vision units to address diverse sensing needs, eliminating the need for complex modality conversion. Stereo units with far-focus lenses can see through the transparent elastomer for depth estimation beyond the contact surface. Simultaneously, tactile units with near-focus lenses track the movement of markers embedded in the elastomer to obtain contact deformation. Experimental results validate the sensor's superior performance in 3D visual and tactile sensing, proving its capability for reliable external object depth estimation and precise measurement of tangential and normal contact forces. The dual modalities and compact design make the sensor a versatile tool for robotic manipulation.

Submitted 15 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.05194 [pdf, other]

LiteBIRD Science Goals and Forecasts: Improving Sensitivity to Inflationary Gravitational Waves with Multitracer Delensing

Authors: T. Namikawa, A. I. Lonappan, C. Baccigalupi, N. Bartolo, D. Beck, K. Benabed, A. Challinor, P. Diego-Palazuelos, J. Errard, S. Farrens, A. Gruppuso, N. Krachmalnicoff, M. Migliaccio, E. Martínez-González, V. Pettorino, G. Piccirilli, M. Ruiz-Granda, B. Sherwin, J. Starck, P. Vielva, R. Akizawa, A. Anand, J. Aumont, R. Aurlien, S. Azzoni, et al. (97 additional authors not shown)

Abstract: We estimate the efficiency of mitigating the lensing $B$-mode polarization, the so-called delensing, for the $LiteBIRD$ experiment with multiple external data sets of lensing-mass tracers. The current best bound on the tensor-to-scalar ratio, $r$, is limited by lensing rather than Galactic foregrounds. Delensing will be a critical step to improve sensitivity to $r$ as measurements of $r$ become more and more limited by lensing. In this paper, we extend the analysis of the recent $LiteBIRD$ forecast paper to include multiple mass tracers, i.e., the CMB lensing maps from $LiteBIRD$ and a CMB-S4-like experiment, the cosmic infrared background, and galaxy number density from $Euclid$- and LSST-like surveys. We find that multi-tracer delensing will further improve the constraint on $r$ by about $20\%$. In $LiteBIRD$, the residual Galactic foregrounds also significantly contribute to uncertainties of the $B$-modes, and delensing becomes more important if the residual foregrounds are further reduced by an improved component separation method.

Submitted 8 December, 2023; originally announced December 2023.

Comments: 21 pages, 7 figures

arXiv:2312.05184 [pdf, other]

doi 10.1088/1475-7516/2024/06/009

LiteBIRD Science Goals and Forecasts: A full-sky measurement of gravitational lensing of the CMB

Authors: A. I. Lonappan, T. Namikawa, G. Piccirilli, P. Diego-Palazuelos, M. Ruiz-Granda, M. Migliaccio, C. Baccigalupi, N. Bartolo, D. Beck, K. Benabed, A. Challinor, J. Errard, S. Farrens, A. Gruppuso, N. Krachmalnicoff, E. Martínez-González, V. Pettorino, B. Sherwin, J. Starck, P. Vielva, R. Akizawa, A. Anand, J. Aumont, R. Aurlien, S. Azzoni, et al. (97 additional authors not shown)

Abstract: We explore the capability of measuring lensing signals in $LiteBIRD$ full-sky polarization maps. With a $30$ arcmin beam width and an impressively low polarization noise of $2.16\,μ$K-arcmin, $LiteBIRD$ will be able to measure the full-sky polarization of the cosmic microwave background (CMB) very precisely. This unique sensitivity also enables the reconstruction of a nearly full-sky lensing map using only polarization data, even considering its limited capability to capture small-scale CMB anisotropies. In this paper, we investigate the ability to construct a full-sky lensing measurement in the presence of Galactic foregrounds, finding that several possible biases from Galactic foregrounds should be negligible after component separation by harmonic-space internal linear combination. We find that the signal-to-noise ratio of the lensing is approximately $40$ using only polarization data measured over $90\%$ of the sky. This achievement is comparable to $Planck$'s recent lensing measurement with both temperature and polarization and represents a four-fold improvement over $Planck$'s polarization-only lensing measurement. The $LiteBIRD$ lensing map will complement the $Planck$ lensing map and provide several opportunities for cross-correlation science, especially in the northern hemisphere.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.01858 [pdf, other]

Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness

Authors: Zichao Li, Ines Arous, Siva Reddy, Jackie C. K. Cheung

Abstract: The potential of using a large language model (LLM) as a knowledge base (KB) has sparked significant interest. To manage the knowledge acquired by LLMs, we need to ensure that the editing of learned facts respects internal logical constraints, which are known as dependency of knowledge. Existing work on editing LLMs has partially addressed the issue of dependency, when the editing of a fact should apply to its lexical variations without disrupting irrelevant ones. However, they neglect the dependency between a fact and its logical implications. We propose an evaluation protocol with an accompanying question-answering dataset, DepEdit, that provides a comprehensive assessment of the editing process considering the above notions of dependency. Our protocol involves setting up a controlled environment in which we edit facts and monitor their impact on LLMs, along with their implications based on If-Then rules. Extensive experiments on DepEdit show that existing knowledge editing methods are sensitive to the surface form of knowledge, and that they have limited performance in inferring the implications of edited facts.

Submitted 4 December, 2023; originally announced December 2023.

Comments: Findings of EMNLP2023

arXiv:2311.11103 [pdf, other]

Responsible AI Considerations in Text Summarization Research: A Review of Current Practices

Authors: Yu Lu Liu, Meng Cao, Su Lin Blodgett, Jackie Chi Kit Cheung, Alexandra Olteanu, Adam Trischler

Abstract: AI and NLP publication venues have increasingly encouraged researchers to reflect on possible ethical considerations, adverse impacts, and other responsible AI issues their work might engender. However, for specific NLP tasks our understanding of how prevalent such issues are, or when and why these issues are likely to arise, remains limited. Focusing on text summarization -- a common NLP task largely overlooked by the responsible AI community -- we examine research and reporting practices in the current literature. We conduct a multi-round qualitative analysis of 333 summarization papers from the ACL Anthology published between 2020-2022. We focus on how, which, and when responsible AI issues are covered, which relevant stakeholders are considered, and mismatches between stated and realized research goals. We also discuss current evaluation practices and consider how authors discuss the limitations of both prior work and their own work. Overall, we find that relatively few papers engage with possible stakeholders or contexts of use, which limits their consideration of potential downstream adverse impacts or other responsible AI issues. Based on our findings, we make recommendations on concrete practices and research directions.

Submitted 18 November, 2023; originally announced November 2023.

arXiv:2311.10049 [pdf, other]

Inherently Interpretable Time Series Classification via Multiple Instance Learning

Authors: Joseph Early, Gavin KC Cheung, Kurt Cutajar, Hanting Xie, Jas Kandola, Niall Twomey

Abstract: Conventional Time Series Classification (TSC) methods are often black boxes that obscure inherent interpretation of their decision-making processes. In this work, we leverage Multiple Instance Learning (MIL) to overcome this issue, and propose a new framework called MILLET: Multiple Instance Learning for Locally Explainable Time series classification. We apply MILLET to existing deep learning TSC models and show how they become inherently interpretable without compromising (and in some cases, even improving) predictive performance. We evaluate MILLET on 85 UCR TSC datasets and also present a novel synthetic dataset that is specially designed to facilitate interpretability evaluation. On these datasets, we show MILLET produces sparse explanations quickly that are of higher quality than other well-known interpretability methods. To the best of our knowledge, our work with MILLET, which is available on GitHub (https://github.com/JAEarly/MILTimeSeriesClassification), is the first to develop general MIL methods for TSC and apply them to an extensive variety of domains.

Submitted 16 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Published at ICLR 2024. 29 pages (9 main, 3 ref, 17 appendix)

arXiv:2311.04921 [pdf, other]

Successor Features for Efficient Multisubject Controlled Text Generation

Authors: Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian

Abstract: While large language models (LLMs) have achieved impressive performance in generating fluent and realistic text, controlling the generated text so that it exhibits properties such as safety, factuality, and non-toxicity remains challenging. Existing decoding-based methods are static in terms of the dimension of control; if the target subject is changed, they require new training. Moreover, it can quickly become prohibitive to concurrently control multiple subjects. In this work, we introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) to decouple the LLM's dynamics from task-specific rewards, and language model rectification to proportionally adjust the probability of selecting a token based on the likelihood that the finished text becomes undesired. SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters. Thanks to the decoupling effect induced by successor features, our method proves to be memory-wise and computationally efficient for training as well as decoding, especially when dealing with multiple target subjects. To the best of our knowledge, our research represents the first application of successor features in text generation. In addition to its computational efficiency, the resultant language produced by our method is comparable to the SOTA (and outperforms baselines) in both control measures as well as language quality, which we demonstrate through a series of experiments in various controllable text generation tasks.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.01717 [pdf, other]

Ensemble Distillation for Unsupervised Constituency Parsing

Authors: Behzad Shayegh, Yanshuai Cao, Xiaodan Zhu, Jackie C. K. Cheung, Lili Mou

Abstract: We investigate the unsupervised constituency parsing task, which organizes words and phrases of a sentence into a hierarchical structure without using linguistically annotated data. We observe that existing unsupervised parsers capture differing aspects of parsing structures, which can be leveraged to enhance unsupervised parsing performance. To this end, we propose a notion of "tree averaging," based on which we further propose a novel ensemble method for unsupervised parsing. To improve inference efficiency, we further distill the ensemble knowledge into a student model; such an ensemble-then-distill process is an effective approach to mitigate the over-smoothing problem existing in common multi-teacher distilling methods. Experiments show that our method surpasses all previous approaches, consistently demonstrating its effectiveness and robustness across various runs, with different ensemble components, and under domain-shift conditions.

Submitted 25 April, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Accepted by International Conference on Learning Representations (ICLR) 2024

arXiv:2309.03116 [pdf]

Strong magnon-magnon coupling in an ultralow damping all-magnetic-insulator heterostructure

Authors: Jiacheng Liu, Yuzan Xiong, Jingming Liang, Xuezhao Wu, Chen Liu, Shun Kong Cheung, Zheyu Ren, Ruizi Liu, Andrew Christy, Zehan Chen, Ferris Prima Nugraha, Xi-Xiang Zhang, Chi Wah Leung, Wei Zhang, Qiming Shao

Abstract: Magnetic insulators such as yttrium iron garnets (YIGs) are of paramount importance for spin-wave or magnonic devices as their ultralow damping enables ultralow power dissipation that is free of Joule heating, exotic magnon quantum state, and coherent coupling to other wave excitations. Magnetic insulator heterostructures bestow superior structural and magnetic properties and house immense design space thanks to the strong and engineerable exchange interaction between individual layers. To fully unleash their potential, realizing low damping and strong exchange coupling simultaneously is critical, which often requires high quality interface. Here, we show that such a demand is realized in an all-insulator thulium iron garnet (TmIG)/YIG bilayer system. The ultralow dissipation rates in both YIG and TmIG, along with their significant spin-spin interaction at the interface, enable strong and coherent magnon-magnon coupling with a benchmarking cooperativity value larger than the conventional ferromagnetic metal-based heterostructures. The coupling strength can be tuned by varying the magnetic insulator layer thickness and magnon modes, which is consistent with analytical calculations and micromagnetic simulations. Our results demonstrate TmIG/YIG as a novel platform for investigating hybrid magnonic phenomena and open opportunities in magnon devices comprising all-insulator heterostructures.

Submitted 6 September, 2023; originally announced September 2023.

Comments: 45 pages, 18 figures, and 2 tables

arXiv:2308.15664 [pdf, other]

Comparison between $μ^{-}μ^{+}$ and $e^{-}e^{+}$ colliders for charged Higgs production in 2HDM

Authors: Brahim Ait Ouazghour, Abdesslam Arhrib, Kingman Cheung, Es-said Ghourmin, Larbi Rahili

Abstract: We study the phenomenology of the charged Higgs boson at future muon colliders. We investigate both the pair production $μ^+ μ^- \to H^+ H^-$, the single production $μ^+ μ^- \to W^\pm H^\mp$, as well as the Vector Boson Fusion (VBF) $\{e^+e^-, μ^+ μ^-\} \to ν\barν H^+ H^-$. We show that the neutral Higgs exchange diagrams in the muon collider case can lead to a significant boost in the cross sections through their Yukawa couplings. Our results for the muon collider are systematically compared to the corresponding ones at $e^+e^-$ machines. It is demonstrated that the vector boson fusion (VBF) $e^+e^- \to ν\barν H^+ H^-$ can compete with the mentioned $2\to 2$ processes. We select benchmark points and perform signal-background analyses, taking into account detector simulations. We demonstrate the discovery region at $5σ$ and the excluded region at $2σ$ levels at a 3 TeV muon collider.

Submitted 11 May, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: 43 pages, 14 figures, 10 tables, accepted to be published in PRD

arXiv:2308.12925 [pdf, other]

doi 10.1109/MLSP55844.2023.10285979

Low-count Time Series Anomaly Detection

Authors: Philipp Renz, Kurt Cutajar, Niall Twomey, Gavin K. C. Cheung, Hanting Xie

Abstract: Low-count time series describe sparse or intermittent events, which are prevalent in large-scale online platforms that capture and monitor diverse data types. Several distinct challenges surface when modelling low-count time series, particularly low signal-to-noise ratios (when anomaly signatures are provably undetectable), and non-uniform performance (when average metrics are not representative of local behaviour). The time series anomaly detection community currently lacks explicit tooling and processes to model and reliably detect anomalies in these settings. We address this gap by introducing a novel generative procedure for creating benchmark datasets comprising of low-count time series with anomalous segments. Via a mixture of theoretical and empirical analysis, our work explains how widely-used algorithms struggle with the distribution overlap between normal and anomalous segments. In order to mitigate this shortcoming, we then leverage our findings to demonstrate how anomaly score smoothing consistently improves performance. The practical utility of our analysis and recommendation is validated on a real-world dataset containing sales data for retail stores.

Submitted 24 August, 2023; originally announced August 2023.

Comments: 6 pages, 7 figures, to be published in IEEE 2023 Workshop on Machine Learning for Signal Processing (MLSP)

Journal ref: 2023 IEEE 33rd International Workshop on Machine Learning for Signal Processing (MLSP)

arXiv:2308.06647 [pdf, other]

Energy-Efficient Deadline-Aware Edge Computing: Bandit Learning with Partial Observations in Multi-Channel Systems

Authors: Babak Badnava, Keenan Roach, Kenny Cheung, Morteza Hashemi, Ness B Shroff

Abstract: In this paper, we consider a task offloading problem in a multi-access edge computing (MEC) network, in which edge users can either use their local processing unit to compute their tasks or offload their tasks to a nearby edge server through multiple communication channels each with different characteristics. The main objective is to maximize the energy efficiency of the edge users while meeting computing tasks deadlines. In the multi-user multi-channel offloading scenario, users are distributed with partial observations of the system states. We formulate this problem as a stochastic optimization problem and leverage \emph{contextual neural multi-armed bandit} models to develop an energy-efficient deadline-aware solution, dubbed E2DA. The proposed E2DA framework only relies on partial state information (i.e., computation task features) to make offloading decisions. Through extensive numerical analysis, we demonstrate that the E2DA algorithm can efficiently learn an offloading policy and achieve close-to-optimal performance in comparison with several baseline policies that optimize energy consumption and/or response time. Furthermore, we provide a comprehensive set of results on the MEC system performance for various applications such as augmented reality (AR) and virtual reality (VR).

Submitted 12 August, 2023; originally announced August 2023.

Comments: 2023 IEEE Global Communications Conference

arXiv:2308.05187 [pdf, other]

Exploring the Interplay of Interference and Queues in Unlicensed Spectrum Bands for UAV Networks

Authors: Masoud Ghazikor, Keenan Roach, Kenny Cheung, Morteza Hashemi

Abstract: In this paper, we present an analytical framework to explore the interplay of signal interference and transmission queue management, and their impacts on the performance of unmanned aerial vehicles (UAVs) when operating in the unlicensed spectrum bands. In particular, we develop a comprehensive framework to investigate the impact of other interference links on the UAV as it communicates with the ground users. To this end, we provide closed-form expressions for packet drop probabilities in the queue due to buffer overflow or large queuing delay, which are expressed in terms of a transmission policy as a function of the channel fading threshold $β$. The overall packet loss caused either by interference signals or queuing packet drop is obtained, which, in turn, yields in obtaining the expected throughput performance. Through extensive numerical results, we investigate the impact of the channel fading threshold $β$, which plays an important role in balancing the trade-offs between packet loss due to queue drop or transmission error due to large interference levels.

Submitted 24 November, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

Comments: Asilomar Conference on Signals, Systems, and Computers

arXiv:2308.05036 [pdf, other]

Collaborative Wideband Spectrum Sensing and Scheduling for Networked UAVs in UTM Systems

Authors: Sravan Reddy Chintareddy, Keenan Roach, Kenny Cheung, Morteza Hashemi

Abstract: In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users to opportunistically utilize detected spectrum holes. To this end, we propose a multi-class classification problem for wideband spectrum sensing to detect vacant spectrum spots based on collected I/Q samples. To enhance the accuracy of the spectrum sensing module, the outputs from the multi-class classification by each individual UAV are fused at a server in the unmanned aircraft system traffic management (UTM) ecosystem. In the spectrum scheduling phase, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users (i.e., UAVs). To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using MATLAB LTE toolbox by incorporating base-station~(BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary users channel usage in terms of I/Q samples. This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.

Submitted 9 August, 2023; originally announced August 2023.

arXiv:2308.00009 [pdf]

A 3D deep learning classifier and its explainability when assessing coronary artery disease

Authors: Wing Keung Cheung, Jeremy Kalindjian, Robert Bell, Arjun Nair, Leon J. Menezes, Riyaz Patel, Simon Wan, Kacy Chou, Jiahang Chen, Ryo Torii, Rhodri H. Davies, James C. Moon, Daniel C. Alexander, Joseph Jacob

Abstract: Early detection and diagnosis of coronary artery disease (CAD) could save lives and reduce healthcare costs. In this study, we propose a 3D Resnet-50 deep learning model to directly classify normal subjects and CAD patients on computed tomography coronary angiography images. Our proposed method outperforms a 2D Resnet-50 model by 23.65%. Explainability is also provided by using a Grad-GAM. Furthermore, we link the 3D CAD classification to a 2D two-class semantic segmentation for improved explainability and accurate abnormality localisation.

Submitted 29 July, 2023; originally announced August 2023.

arXiv:2308.00008 [pdf]

A data-centric deep learning approach to airway segmentation

Authors: Wing Keung Cheung, Ashkan Pakzad, Nesrin Mogulkoc, Sarah Needleman, Bojidar Rangelov, Eyjolfur Gudmundsson, An Zhao, Mariam Abbas, Davina McLaverty, Dimitrios Asimakopoulos, Robert Chapman, Recep Savas, Sam M Janes, Yipeng Hu, Daniel C. Alexander, John R Hurst, Joseph Jacob

Abstract: The morphology and distribution of airway tree abnormalities enables diagnosis and disease characterisation across a variety of chronic respiratory conditions. In this regard, airway segmentation plays a critical role in the production of the outline of the entire airway tree to enable estimation of disease extent and severity. In this study, we propose a data-centric deep learning technique to segment the airway tree. The proposed technique utilises interpolation and image split to improve data usefulness and quality. Then, an ensemble learning strategy is implemented to aggregate the segmented airway trees at different scales. In terms of segmentation performance (dice similarity coefficient), our method outperforms the baseline model by 2.5% on average when a combined loss is used. Further, our proposed technique has a low GPU usage and high flexibility enabling it to be deployed on any 2D deep learning model.

Submitted 29 July, 2023; originally announced August 2023.

arXiv:2307.11526 [pdf, other]

CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields

Authors: Ziyuan Luo, Qing Guo, Ka Chun Cheung, Simon See, Renjie Wan

Abstract: Neural Radiance Fields (NeRF) have the potential to be a major representation of media. Since training a NeRF has never been an easy task, the protection of its model copyright should be a priority. In this paper, by analyzing the pros and cons of possible copyright protection solutions, we propose to protect the copyright of NeRF models by replacing the original color representation in NeRF with a watermarked color representation. Then, a distortion-resistant rendering scheme is designed to guarantee robust message extraction in 2D renderings of NeRF. Our proposed method can directly protect the copyright of NeRF models while maintaining high rendering quality and bit accuracy when compared among optional solutions.

Submitted 29 July, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

Comments: 11 pages, 6 figures, accepted by ICCV 2023 non-camera-ready version

arXiv:2307.08046 [pdf, other]

NANOGrav Signal and PBH from the Modified Higgs Inflation

Authors: Kingman Cheung, C. J. Ouseph, Po-Yan Tseng

Abstract: This study investigates the classical Higgs inflation model with a modified Higgs potential featuring a dip. We examine the implications of this modification on the generation of curvature perturbations, stochastic gravitational wave production, and the potential formation of primordial black holes (PBHs). Unlike the classical model, the modified potential allows for enhanced power spectra and the existence of PBHs within a wide mass range $1.5\times10^{20}$ g -- $9.72\times10^{32}$ g. We identify parameter space regions that align with inflationary constraints and have the potential to contribute significantly to the observed dark matter content. Additionally, the study explores the consistency of the obtained parameter space with cosmological constraints and discusses the implications for explaining the observed excess in gravitational wave signals, particularly in the NANOGrav experiment. Overall, this investigation highlights the relevance of the modified Higgs potential in the classical Higgs inflation model, shedding light on the formation of PBHs, the nature of dark matter, and the connection to gravitational wave observations.

Submitted 16 July, 2023; originally announced July 2023.

Comments: 26 Pages, 7 figures and 1 Table
