-
FedFDP: Fairness-Aware Federated Learning with Differential Privacy
Authors:
Xinpeng Ling,
Jie Fu,
Kuncan Wang,
Huifa Li,
Tong Cheng,
Zhili Chen
Abstract:
Federated learning (FL) is a new machine learning paradigm to overcome the challenge of data silos and has garnered significant attention. However, through our observations, a globally effective trained model may exhibit performance disparities across different clients. This implies that the models jointly trained by clients may lead to unfair outcomes. On the other hand, relevant studies indicate that the transmission of gradients or models in federated learning can also give rise to privacy leakage issues, such as membership inference attacks.
To address the first issue mentioned above, we propose a fairness-aware federated learning algorithm, termed FedFair. Building upon FedFair, we introduce privacy protection to form the FedFDP algorithm to address the second issue mentioned above. In FedFDP, we devise a fairness-aware clipping strategy to achieve differential privacy while adjusting fairness. Additionally, for the extra uploaded loss values, we present an adaptive clipping approach to maximize utility. Furthermore, we theoretically prove that our algorithm converges and ensures differential privacy. Lastly, extensive experimental results demonstrate that FedFair and FedFDP significantly outperform state-of-the-art solutions in terms of model performance and fairness. Code and data are accessible at https://anonymous.4open.science/r/FedFDP-5607.
Submitted 20 May, 2024; v1 submitted 25 February, 2024;
originally announced February 2024.
-
A Markovian regime-switching stochastic SEQIR epidemic model with governmental policy
Authors:
Hongjie Fan,
Kai Wang,
Yanling Zhu
Abstract:
In this paper, a stochastic SEQIR epidemic model with Markovian regime-switching is proposed and investigated. Governmental policy and implementation efficiency are accounted for through a generalized incidence function of the susceptible class. We establish the existence and uniqueness of the globally positive solution to the stochastic model by using the Lyapunov method. In addition, we study the dynamical behaviors of the disease and obtain sufficient conditions for extinction and persistence in mean. Finally, numerical simulations are presented to demonstrate the theoretical results.
Submitted 24 February, 2024;
originally announced February 2024.
-
NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation
Authors:
Jiazhao Zhang,
Kunyu Wang,
Rongtao Xu,
Gengze Zhou,
Yicong Hong,
Xiaomeng Fang,
Qi Wu,
Zhizheng Zhang,
He Wang
Abstract:
Vision-and-language navigation (VLN) stands as a key research problem of Embodied AI, aiming at enabling agents to navigate in unseen environments following linguistic instructions. In this field, generalization is a long-standing challenge, either to out-of-distribution scenes or from Sim to Real. In this paper, we propose NaVid, a video-based large vision language model (VLM), to mitigate such a generalization gap. NaVid makes the first endeavor to showcase the capability of VLMs to achieve state-of-the-art level navigation performance without any maps, odometers, or depth inputs. Following human instruction, NaVid only requires an on-the-fly video stream from a monocular RGB camera equipped on the robot to output the next-step action. Our formulation mimics how humans navigate and naturally gets rid of the problems introduced by odometer noise and the Sim2Real gaps from map or depth inputs. Moreover, our video-based approach can effectively encode the historical observations of robots as spatio-temporal contexts for decision making and instruction following. We train NaVid with 510k navigation samples collected from continuous environments, including action-planning and instruction-reasoning samples, along with 763k large-scale web data. Extensive experiments show that NaVid achieves state-of-the-art performance in simulation environments and the real world, demonstrating superior cross-dataset and Sim2Real transfer. We thus believe our proposed VLM approach plans the next step for not only the navigation agents but also this research field.
Submitted 30 June, 2024; v1 submitted 24 February, 2024;
originally announced February 2024.
-
Tunable incommensurability and spontaneous symmetry breaking in the reconstructed moiré-of-moiré lattices
Authors:
Daesung Park,
Changwon Park,
Eunjung Ko,
Kunihiro Yananose,
Rebecca Engelke,
Xi Zhang,
Konstantin Davydov,
Matthew Green,
Sang Hwa Park,
Jae Heon Lee,
Kenji Watanabe,
Takashi Taniguchi,
Sang Mo Yang,
Ke Wang,
Philip Kim,
Young-Woo Son,
Hyobin Yoo
Abstract:
Imposing incommensurable periodicity on the periodic atomic lattice can lead to complex structural phases consisting of locally periodic structure bounded by topological defects. Twisted trilayer graphene (TTG) is an ideal material platform to study the interplay between different atomic periodicities, which can be tuned by twist angles between the layers, leading to moiré-of-moiré lattices. Interlayer and intralayer interactions between two interfaces in TTG transform this moiré-of-moiré lattice into an intricate network of domain structures at small twist angles, which can harbor exotic electronic behaviors. Here we report a complete structural phase diagram of TTG with atomic scale lattice reconstruction. Using transmission electron microscopy combined with a new interatomic potential simulation, we show that a cornucopia of large-scale moiré lattices, ranging from triangular and kagome to a corner-shared hexagram-shaped domain pattern, is present. For small twist angles below 0.1°, all domains are bounded by a network of two-dimensional domain wall lattices. In particular, in the limit of small twist angles, the competition between interlayer stacking energy and the formation of discommensurate domain walls leads to unique spontaneous symmetry breaking structures with nematic orders, suggesting the pivotal role of long-range interactions across entire layers. The diverse tessellation of distinct domains, whose topological network can be tuned by the adjustment of the twist angles, establishes TTG as a platform for exploring the interplay between emerging quantum properties and controllable nontrivial lattices.
Submitted 24 February, 2024;
originally announced February 2024.
-
Probing critical phenomena in open quantum systems using atom arrays
Authors:
Fang Fang,
Kenneth Wang,
Vincent S. Liu,
Yu Wang,
Ryan Cimmino,
Julia Wei,
Marcus Bintz,
Avery Parr,
Jack Kemp,
Kang-Kuen Ni,
Norman Y. Yao
Abstract:
At continuous phase transitions, quantum many-body systems exhibit scale-invariance and complex, emergent universal behavior. Most strikingly, at a quantum critical point, correlations decay as a power law, with exponents determined by a set of universal scaling dimensions. Experimentally probing such power-law correlations is extremely challenging, owing to the complex interplay between decoherence, the vanishing energy gap, and boundary effects. Here, we employ a Rydberg quantum simulator to adiabatically prepare critical ground states of both a one-dimensional ring and a two-dimensional square lattice. By accounting for and tuning the openness of our quantum system, which is well-captured by the introduction of a single phenomenological length scale, we are able to directly observe power-law correlations and extract the corresponding scaling dimensions. Moreover, in two dimensions, we observe a decoupling between phase transitions in the bulk and on the boundary, allowing us to identify two distinct boundary universality classes. Our work demonstrates that direct adiabatic preparation of critical states in quantum simulators can complement recent approaches to studying quantum criticality using the Kibble-Zurek mechanism or digital quantum circuits.
Submitted 23 February, 2024;
originally announced February 2024.
-
Difference Learning for Air Quality Forecasting Transport Emulation
Authors:
Reed River Chen,
Christopher Ribaudo,
Jennifer Sleeman,
Chace Ashcraft,
Collin Kofroth,
Marisa Hughes,
Ivanka Stajner,
Kevin Viner,
Kai Wang
Abstract:
Human health is negatively impacted by poor air quality, including increased risk for respiratory and cardiovascular disease. Due to a recent increase in extreme air quality events, both globally and locally in the United States, finer resolution air quality forecasting guidance is needed to effectively adapt to these events. The National Oceanic and Atmospheric Administration provides air quality forecasting guidance for the Continental United States. Their air quality forecasting model is based on a 15 km spatial resolution; however, the goal is to reach a 3 km spatial resolution. This is currently not feasible due in part to prohibitive computational requirements for modeling the transport of chemical species. In this work, we describe a deep learning transport emulator that is able to reduce computations while maintaining skill comparable with the existing numerical model. We show how this method maintains skill in the presence of extreme air quality events, making it a potential candidate for operational use. We also explore evaluating how well this model maintains the physical properties of the modeled transport for a given set of species.
Submitted 22 February, 2024;
originally announced February 2024.
-
Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
Authors:
Ke Wang,
Junting Pan,
Weikang Shi,
Zimu Lu,
Mingjie Zhan,
Hongsheng Li
Abstract:
Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models approaching human-level performance on existing benchmarks such as MathVista. However, we observe significant limitations in the diversity of questions and breadth of subjects covered by these benchmarks. To address this issue, we present the MATH-Vision (MATH-V) dataset, a meticulously curated collection of 3,040 high-quality mathematical problems with visual contexts sourced from real math competitions. Spanning 16 distinct mathematical disciplines and graded across 5 levels of difficulty, our dataset provides a comprehensive and diverse set of challenges for evaluating the mathematical reasoning abilities of LMMs. Through extensive experimentation, we unveil a notable performance gap between current LMMs and human performance on MATH-V, underscoring the imperative for further advancements in LMMs. Moreover, our detailed categorization allows for a thorough error analysis of LMMs, offering valuable insights to guide future research and development. The project is available at https://mathvision-cuhk.github.io
Submitted 22 February, 2024;
originally announced February 2024.
-
CaT-GNN: Enhancing Credit Card Fraud Detection via Causal Temporal Graph Neural Networks
Authors:
Yifan Duan,
Guibin Zhang,
Shilong Wang,
Xiaojiang Peng,
Wang Ziqi,
Junyuan Mao,
Hao Wu,
Xinke Jiang,
Kun Wang
Abstract:
Credit card fraud poses a significant threat to the economy. While Graph Neural Network (GNN)-based fraud detection methods perform well, they often overlook the causal effect of a node's local structure on predictions. This paper introduces a novel method for credit card fraud detection, the Causal Temporal Graph Neural Network (CaT-GNN), which leverages causal invariant learning to reveal inherent correlations within transaction data. By decomposing the problem into discovery and intervention phases, CaT-GNN identifies causal nodes within the transaction graph and applies a causal mixup strategy to enhance the model's robustness and interpretability. CaT-GNN consists of two key components: Causal-Inspector and Causal-Intervener. The Causal-Inspector utilizes attention weights in the temporal attention mechanism to identify causal and environment nodes without introducing additional parameters. Subsequently, the Causal-Intervener performs a causal mixup enhancement on environment nodes based on the set of nodes. Evaluated on three datasets, including a private financial dataset and two public datasets, CaT-GNN demonstrates superior performance over existing state-of-the-art methods. Our findings highlight the potential of integrating causal reasoning with graph neural networks to improve fraud detection capabilities in financial transactions.
Submitted 22 February, 2024;
originally announced February 2024.
-
Voltage tunable sign inversion of magnetoresistance in van der Waals Fe3GeTe2/MoSe2/Fe3GeTe2 tunnel junctions
Authors:
Shouguo Zhu,
Hailong Lin,
Wenkai Zhu,
Weihao Li,
Jing Zhang,
Kaiyou Wang
Abstract:
The magnetic tunnel junctions (MTJ) based on van der Waals (vdW) materials possess atomically smooth interfaces with minimal element intermixing. This characteristic ensures that spin polarization is well maintained during transport, leading to the emergence of richer magnetoresistance behaviors. Here, using all 2D vdW MTJs based on magnetic metal Fe3GeTe2 and non-magnetic semiconductor MoSe2, we demonstrate that the magnitude and even sign of the magnetoresistance can be tuned by the applied voltage. The sign inversion of the magnetoresistance is observed in a wide temperature range below the Curie temperature. This tunable magnetoresistance sign may be attributed to the spin polarizations of the tunneling carriers and the band structure of the two ferromagnetic electrodes. Such robust electrical tunability of magnetoresistance extends the functionalities of low-dimensional spintronics and makes it more appealing for next-generation spintronics with all-vdW MTJs.
Submitted 22 February, 2024;
originally announced February 2024.
-
MeTMaP: Metamorphic Testing for Detecting False Vector Matching Problems in LLM Augmented Generation
Authors:
Guanyu Wang,
Yuekang Li,
Yi Liu,
Gelei Deng,
Tianlin Li,
Guosheng Xu,
Yang Liu,
Haoyu Wang,
Kailong Wang
Abstract:
Augmented generation techniques such as Retrieval-Augmented Generation (RAG) and Cache-Augmented Generation (CAG) have revolutionized the field by enhancing large language model (LLM) outputs with external knowledge and cached information. However, the integration of vector databases, which serve as a backbone for these augmentations, introduces critical challenges, particularly in ensuring accurate vector matching. False vector matching in these databases can significantly compromise the integrity and reliability of LLM outputs, leading to misinformation or erroneous responses. Despite the crucial impact of these issues, there is a notable research gap in methods to effectively detect and address false vector matches in LLM-augmented generation. This paper presents MeTMaP, a metamorphic testing framework developed to identify false vector matching in LLM-augmented generation systems. We derive eight metamorphic relations (MRs) from six NLP datasets, which form our method's core, based on the idea that semantically similar texts should match and dissimilar ones should not. MeTMaP uses these MRs to create sentence triplets for testing, simulating real-world LLM scenarios. Our evaluation of MeTMaP over 203 vector matching configurations, involving 29 embedding models and 7 distance metrics, uncovers significant inaccuracies. The results, showing a maximum accuracy of only 41.51% on our tests compared to the original datasets, emphasize the widespread issue of false matches in vector matching methods and the critical need for effective detection and mitigation in LLM-augmented applications.
Submitted 22 February, 2024;
originally announced February 2024.
-
Existence and upper semicontinuity of pullback attractors for Kirchhoff wave equations in time-dependent spaces
Authors:
Bin Yang,
Yuming Qin,
Alain Miranville,
Ke Wang
Abstract:
In this paper, we shall investigate the existence and upper semicontinuity of pullback attractors for non-autonomous Kirchhoff wave equations with a strong damping in the time-dependent space $X_t$. After deriving the existence and uniqueness of solutions by the Faedo-Galerkin approximation method, we establish the existence of pullback attractors. Later on, we prove the upper semicontinuity of pullback attractors between the Kirchhoff-type wave equations with $δ\geq 0$ and the conventional wave equations with $δ=0$ by a series of complex energy estimates.
Submitted 22 February, 2024;
originally announced February 2024.
-
BIRCO: A Benchmark of Information Retrieval Tasks with Complex Objectives
Authors:
Xiaoyue Wang,
Jianyou Wang,
Weili Cao,
Kaicheng Wang,
Ramamohan Paturi,
Leon Bergen
Abstract:
We present the Benchmark of Information Retrieval (IR) tasks with Complex Objectives (BIRCO). BIRCO evaluates the ability of IR systems to retrieve documents given multi-faceted user objectives. The benchmark's complexity and compact size make it suitable for evaluating large language model (LLM)-based information retrieval systems. We present a modular framework for investigating factors that may influence LLM performance on retrieval tasks, and identify a simple baseline model which matches or outperforms existing approaches and more complex alternatives. No approach achieves satisfactory performance on all benchmark tasks, suggesting that stronger models and new retrieval protocols are necessary to address complex user needs.
Submitted 3 April, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Affective Computing for Healthcare: Recent Trends, Applications, Challenges, and Beyond
Authors:
Yuanyuan Liu,
Ke Wang,
Lin Wei,
Jingying Chen,
Yibing Zhan,
Dapeng Tao,
Zhe Chen
Abstract:
Affective computing, which aims to recognize, interpret, and understand human emotions, provides benefits in healthcare, such as improving patient care and enhancing doctor-patient communication. However, there is a noticeable absence of a comprehensive summary of recent advancements in affective computing for healthcare, which could pose difficulties for researchers entering this field. To address this, our paper aims to provide an extensive literature review of related studies published in the last five years. We begin by analyzing trends, benefits, and limitations of recent datasets and affective computing methods devised for healthcare. Subsequently, we highlight several healthcare application hotspots of current technologies that could be promising for real-world deployment. Through our analysis, we identify and discuss some ongoing challenges in the field as evidenced by the literature. Concluding with a thorough review, we further offer potential future research directions and hope our findings and insights could guide related researchers to make better contributions to the evolution of affective computing in healthcare.
Submitted 21 February, 2024;
originally announced February 2024.
-
ARL2: Aligning Retrievers for Black-box Large Language Models via Self-guided Adaptive Relevance Labeling
Authors:
Lingxi Zhang,
Yue Yu,
Kuan Wang,
Chao Zhang
Abstract:
Retrieval-augmented generation enhances large language models (LLMs) by incorporating relevant information from external knowledge sources. This enables LLMs to adapt to specific domains and mitigate hallucinations in knowledge-intensive tasks. However, existing retrievers are often misaligned with LLMs due to their separate training processes and the black-box nature of LLMs. To address this challenge, we propose ARL2, a retriever learning technique that harnesses LLMs as labelers. ARL2 leverages LLMs to annotate and score relevant evidence, enabling learning the retriever from robust LLM supervision. Furthermore, ARL2 uses an adaptive self-training strategy for curating high-quality and diverse relevance data, which can effectively reduce the annotation cost. Extensive experiments demonstrate the effectiveness of ARL2, achieving accuracy improvements of 5.4% on NQ and 4.6% on MMLU compared to the state-of-the-art methods. Additionally, ARL2 exhibits robust transfer learning capabilities and strong zero-shot generalization abilities. Our code will be published at https://github.com/zhanglingxi-cs/ARL2.
Submitted 4 June, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Inverse-designed Photonic Computing Core for Parallel Matrix-vector Multiplication
Authors:
Kaiyuan Wang,
Yunlong Li,
Tiange Wu,
Deming Liu,
Shuang Zheng,
Minming Zhang
Abstract:
On-chip optical neural networks (ONNs) have recently emerged as an attractive hardware accelerator for deep learning applications, characterized by high computing density, low latency, and compact size. As these networks rely heavily on massive matrix multiplication, photonic computing cores for matrix computation become crucial components for on-chip ONNs, which harness the degrees of freedom (DOFs) in photonics, including the space, wavelength, and mode dimensions. However, previous photonic computing devices have not fully utilized the orthogonality and the conversion characteristic of the waveguide modes, which, as we show here, allow for the simultaneous parallel computing of several independent matrix-vector multiplications within the same device. In this work, we propose an inverse-designed photonic computing core for parallel matrix-vector multiplication. The matrices are implemented through a mode conversion process, where the input fundamental modes are simultaneously converted into several orthogonal output modes. Specifically, we target the complex-valued conversion matrices between input and output modes and inversely design the dielectric distribution within the device to achieve parallel matrix-vector multiplication. As a demonstration, the proposed photonic computing core supports simultaneous parallel computing of two independent matrix-vector multiplications, with an ultra-compact footprint and high computing precision (relative error < 8%) at 1550 nm wavelength. The inverse-designed photonic computing devices hold great potential for high-performance on-chip ONNs with low energy consumption and high computing density.
Submitted 20 February, 2024;
originally announced February 2024.
-
Neural Network Parameter Diffusion
Authors:
Kai Wang,
Zhaopan Xu,
Yukun Zhou,
Zelin Zang,
Trevor Darrell,
Zhuang Liu,
Yang You
Abstract:
Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a standard latent diffusion model. The autoencoder extracts latent representations of a subset of the trained network parameters. A diffusion model is then trained to synthesize these latent parameter representations from random noise. It then generates new representations that are passed through the autoencoder's decoder, whose outputs are ready to use as new subsets of network parameters. Across various architectures and datasets, our diffusion process consistently generates models of comparable or improved performance over trained networks, with minimal additional cost. Notably, we empirically find that the generated models are not memorizing the trained networks. Our results encourage more exploration on the versatile use of diffusion models.
Submitted 28 May, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Neural-Network-Based Optimal Guidance for Lunar Vertical Landing
Authors:
Kun Wang,
Zheng Chen,
Fangmin Lu,
Jun Li
Abstract:
This paper addresses an optimal guidance problem concerning the vertical landing of a lunar lander with the objective of minimizing fuel consumption. The vertical landing imposes a final attitude constraint, which is treated as a final control constraint. To handle this constraint, we propose a nonnegative small regularization term to augment the original cost functional. This ensures the satisfaction of the final control constraint in accordance with Pontryagin's Minimum Principle. By leveraging the necessary conditions for optimality, we establish a parameterized system that facilitates the generation of numerous optimal trajectories, which contain the nonlinear mapping from the flight state to the optimal guidance command. Subsequently, a neural network is trained to approximate such mapping. Finally, numerical examples are presented to validate the proposed method.
Submitted 20 February, 2024;
originally announced February 2024.
-
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Authors:
Xinchen Zhang,
Ling Yang,
Yaqi Cai,
Zhaochen Yu,
Kai-Ni Wang,
Jiake Xie,
Ye Tian,
Minkai Xu,
Yong Tang,
Yujiu Yang,
Bin Cui
Abstract:
Diffusion models have achieved remarkable advancements in text-to-image generation. However, existing models still have many difficulties when faced with multiple-object compositional generation. In this paper, we propose RealCompo, a new training-free and transferred-friendly text-to-image generation framework, which aims to leverage the respective advantages of text-to-image models and spatial-aware image diffusion models (e.g., layout, keypoints and segmentation maps) to enhance both realism and compositionality of the generated images. An intuitive and novel balancer is proposed to dynamically balance the strengths of the two models in the denoising process, allowing plug-and-play use of any model without extra training. Extensive experiments show that our RealCompo consistently outperforms state-of-the-art text-to-image models and spatial-aware image diffusion models in multiple-object compositional generation while keeping satisfactory realism and compositionality of the generated images. Notably, our RealCompo can be seamlessly extended with a wide range of spatial-aware image diffusion models and stylized diffusion models. Our code is available at: https://github.com/YangLing0818/RealCompo
Submitted 24 May, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Smart Mobility Digital Twin Based Automated Vehicle Navigation System: A Proof of Concept
Authors:
Kui Wang,
Zongdian Li,
Kazuma Nonomura,
Tao Yu,
Kei Sakaguchi,
Omar Hashash,
Walid Saad
Abstract:
Digital twins (DTs) have driven major advancements across various industrial domains over the past two decades. With the rapid advancements in autonomous driving and vehicle-to-everything (V2X) technologies, integrating DTs into vehicular platforms is anticipated to further revolutionize smart mobility systems. In this paper, a new smart mobility DT (SMDT) platform is proposed for the control of connected and automated vehicles (CAVs) over next-generation wireless networks. In particular, the proposed platform enables cloud services to leverage the abilities of DTs to promote the autonomous driving experience. To enhance traffic efficiency and road safety measures, a novel navigation system that exploits available DT information is designed. The SMDT platform and navigation system are implemented with state-of-the-art products, e.g., CAVs and roadside units (RSUs), and emerging technologies, e.g., cloud and cellular V2X (C-V2X). In addition, proof-of-concept (PoC) experiments are conducted to validate system performance. The performance of SMDT is evaluated from two standpoints: (i) the rewards of the proposed navigation system on traffic efficiency and safety, and (ii) the latency and reliability of the SMDT platform. Our experimental results using SUMO-based large-scale traffic simulations show that the proposed SMDT can reduce the average travel time and the blocking probability due to unexpected traffic incidents. Furthermore, the results record peak overall latencies for DT modeling and route planning services of 155.15 ms and 810.59 ms, respectively, which validates that our proposed design aligns with the 3GPP requirements for emerging V2X use cases and fulfills the targets of the proposed design. Our demonstration video can be found at https://youtu.be/3waQwlaHQkk.
Submitted 19 February, 2024;
originally announced February 2024.
-
Interference Mitigation in LEO Constellations with Limited Radio Environment Information
Authors:
Fernando Moya Caceres,
Akram Al-Hourani,
Saman Atapattu,
Michael Aygur,
Sithamparanathan Kandeepan,
Jing Fu,
Ke Wang,
Wayne S. T. Rowe,
Mark Bowyer,
Zarko Krusevac,
Edward Arbon
Abstract:
This research paper delves into interference mitigation within Low Earth Orbit (LEO) satellite constellations, particularly when operating under constraints of limited radio environment information. Leveraging cognitive capabilities facilitated by the Radio Environment Map (REM), we explore strategies to mitigate the impact of both intentional and unintentional interference using planar antenna array (PAA) beamforming techniques. We address the complexities encountered in the design of beamforming weights, a challenge exacerbated by the array size and the increasing number of directions of interest and avoidance. Furthermore, we conduct an extensive analysis of beamforming performance from various perspectives associated with limited REM information: static versus dynamic, partial versus full, and perfect versus imperfect. To substantiate our findings, we provide simulation results and offer conclusions based on the outcomes of our investigation.
Submitted 19 February, 2024;
originally announced February 2024.
-
LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs
Authors:
Kai Wang,
Yuwei Xu,
Zhiyong Wu,
Siqiang Luo
Abstract:
Knowledge Graph (KG) inductive reasoning, which aims to infer missing facts from new KGs that are not seen during training, has been widely adopted in various applications. One critical challenge of KG inductive reasoning is handling low-resource scenarios with scarcity in both textual and structural aspects. In this paper, we attempt to address this challenge with Large Language Models (LLMs). Particularly, we utilize state-of-the-art LLMs to generate a graph-structural prompt to enhance pre-trained Graph Neural Networks (GNNs), which brings us new methodological insights into KG inductive reasoning methods, as well as high generalizability in practice. On the methodological side, we introduce a novel pretraining and prompting framework ProLINK, designed for low-resource inductive reasoning across arbitrary KGs without requiring additional training. On the practical side, we experimentally evaluate our approach on 36 low-resource KG datasets and find that ProLINK outperforms previous methods in three-shot, one-shot, and zero-shot reasoning tasks, exhibiting average performance improvements of 20%, 45%, and 147%, respectively. Furthermore, ProLINK demonstrates strong robustness for various LLM promptings as well as full-shot scenarios.
Submitted 19 June, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
-
95 GeV light Higgs in the top-pair-associated diphoton channel at the LHC in the minimal dilaton model
Authors:
Kun Wang,
Jingya Zhu
Abstract:
Motivated by experimental hints and theoretical frameworks indicating the existence of an extended Higgs sector, we explore the feasibility of detecting a 95 GeV light Higgs boson decaying into a diphoton within the minimal dilaton model at the 14 TeV LHC. Initially, we identify the correlations between the production cross section, decay branching ratios, and model parameters, e.g., the scalar mixing angle $\sinθ_S$. Subsequently, we utilize Monte Carlo simulations to generate the signal of the light Higgs boson via the $pp \to t\bar{t}(s\to γγ)$ process, along with the corresponding backgrounds. To effectively separate the signal from the dominant backgrounds $ttγγ$, we employ a meticulous cut-based selection process. Ultimately, we find that with an integrated luminosity of $L = 3000 {{~\rm fb}^{-1}}$, the regions of $|\sinθ_S|>0.2$ can be covered over the $3σ$ level.
Submitted 6 June, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
Search for the production of deuterons and antideuterons in e^+e^- annihilation at center-of-mass energies between 4.13 and 4.70 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (593 additional authors not shown)
Abstract:
Using a data sample of $e^+e^-$ collision data corresponding to an integrated luminosity of 19 fb$^{-1}$ collected with the BESIII detector at the BEPCII collider, we search for the production of deuterons and antideuterons via $e^+e^-\to ppπ^-\bar{d}+c.c.$ for the first time at center-of-mass energies between 4.13 and 4.70 GeV. No significant signal is observed and the upper limit of the $e^+e^-\to ppπ^-\bar{d}+c.c.$ cross section is determined to be from 9.0 to 145 fb depending on the center-of-mass energy at the $90\%$ confidence level.
Submitted 17 February, 2024;
originally announced February 2024.
-
GIM: Learning Generalizable Image Matcher From Internet Videos
Authors:
Xuelun Shen,
Zhipeng Cai,
Wei Yin,
Matthias Müller,
Zijun Li,
Kaixuan Wang,
Xiaozhi Chen,
Cheng Wang
Abstract:
Image matching is a fundamental computer vision problem. While learning-based methods achieve state-of-the-art performance on existing benchmarks, they generalize poorly to in-the-wild images. Such methods typically need to train separate models for different scene types and are impractical when the scene type is unknown in advance. One of the underlying problems is the limited scalability of existing data construction pipelines, which limits the diversity of standard image matching datasets. To address this problem, we propose GIM, a self-training framework for learning a single generalizable model based on any image matching architecture using internet videos, an abundant and diverse data source. Given an architecture, GIM first trains it on standard domain-specific datasets and then combines it with complementary matching methods to create dense labels on nearby frames of novel videos. These labels are filtered by robust fitting, and then enhanced by propagating them to distant frames. The final model is trained on propagated data with strong augmentations. We also propose ZEB, the first zero-shot evaluation benchmark for image matching. By mixing data from diverse domains, ZEB can thoroughly assess the cross-domain generalization performance of different methods. Applying GIM consistently improves the zero-shot performance of 3 state-of-the-art image matching architectures; with 50 hours of YouTube videos, the relative zero-shot performance improves by 8.4%-18.1%. GIM also enables generalization to extreme cross-domain data such as Bird Eye View (BEV) images of projected 3D point clouds (Fig. 1(c)). More importantly, our single zero-shot model consistently outperforms domain-specific baselines when evaluated on downstream tasks inherent to their respective domains. The video presentation is available at https://www.youtube.com/watch?v=FU_MJLD8LeY.
Submitted 16 February, 2024;
originally announced February 2024.
-
Black-Hole-to-Halo Mass Relation From UNIONS Weak Lensing
Authors:
Qinxun Li,
Martin Kilbinger,
Wentao Luo,
Kai Wang,
Huiyuan Wang,
Anna Wittje,
Hendrik Hildebrandt,
Ludovic van Waerbeke,
Michael J. Hudson,
Samuel Farrens,
Tobias I. Liaudat,
Huiling Liu,
Ziwen Zhang,
Qingqing Wang,
Elisa Russier,
Axel Guinot,
Lucie Baumont,
Fabian Hervas Peters,
Thomas de Boer,
Jiaqi Wang
Abstract:
This letter presents, for the first time, direct constraints on the black-hole-to-halo-mass relation using weak gravitational lensing measurements. We construct type I and type II Active Galactic Nuclei (AGNs) samples from the Sloan Digital Sky Survey (SDSS), with a mean redshift of 0.4 (0.1) for type I (type II) AGNs. This sample is cross-correlated with weak lensing shear from the Ultraviolet Near Infrared Northern Survey (UNIONS). We compute the excess surface mass density of the halos associated with $36,181$ AGNs from $94,308,561$ lensed galaxies and fit the halo mass in bins of black-hole mass. We find that more massive AGNs reside in more massive halos. We see no evidence of dependence on AGN type or redshift in the black-hole-to-halo-mass relationship when systematic errors in the measured black-hole masses are included. Our results are consistent with previous measurements for non-AGN galaxies. At a fixed black-hole mass, our weak-lensing halo masses are consistent with galaxy rotation curves, but significantly lower than galaxy clustering measurements. Finally, our results are broadly consistent with state-of-the-art hydro-dynamical cosmological simulations, providing a new constraint for black-hole masses in simulations.
Submitted 16 February, 2024;
originally announced February 2024.
-
Reward Generalization in RLHF: A Topological Perspective
Authors:
Tianyi Qiu,
Fanzhi Zeng,
Jiaming Ji,
Dong Yan,
Kaile Wang,
Jiayi Zhou,
Yang Han,
Josef Dai,
Xuehai Pan,
Yaodong Yang
Abstract:
Existing alignment methods share a common topology of information flow, where reward information is collected from humans, modeled with preference learning, and used to tune language models. However, this shared topology has not been systematically characterized, nor have its alternatives been thoroughly explored, leaving the problems of low data efficiency and unreliable generalization unaddressed. As a solution, we introduce a theoretical framework for investigating reward generalization in reinforcement learning from human feedback (RLHF), focusing on the topology of information flow at both macro and micro levels. At the macro level, we portray the RLHF information flow as an autoencoding process over behavior distributions, formalizing the RLHF objective of distributional consistency between human preference and model behavior. At the micro level, we present induced Bayesian networks as a theory of reward generalization in RLHF, introducing fine-grained dataset topologies into generalization bounds. Combining analysis on both levels, we propose reward modeling from tree-structured preference information. It is shown to reduce reward uncertainty by up to $Θ(\log n/\log\log n)$ times compared to baselines, where $n$ is the dataset size. Validation on three NLP tasks shows that our tree-based reward model achieves an average win rate of 65% against baseline methods, thus improving reward generalization for free via topology design.
Submitted 16 June, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
Dataset Clustering for Improved Offline Policy Learning
Authors:
Qiang Wang,
Yixin Deng,
Francisco Roldan Sanchez,
Keru Wang,
Kevin McGuinness,
Noel O'Connor,
Stephen J. Redmond
Abstract:
Offline policy learning aims to discover decision-making policies from previously-collected datasets without additional online interactions with the environment. As the training dataset is fixed, its quality becomes a crucial determining factor in the performance of the learned policy. This paper studies a dataset characteristic that we refer to as multi-behavior, indicating that the dataset is collected using multiple policies that exhibit distinct behaviors. In contrast, a uni-behavior dataset would be collected solely using one policy. We observed that policies learned from a uni-behavior dataset typically outperform those learned from multi-behavior datasets, despite the uni-behavior dataset having fewer examples and less diversity. Therefore, we propose a behavior-aware deep clustering approach that partitions multi-behavior datasets into several uni-behavior subsets, thereby benefiting downstream policy learning. Our approach is flexible and effective; it can adaptively estimate the number of clusters while demonstrating high clustering accuracy, achieving an average Adjusted Rand Index of 0.987 across various continuous control task datasets. Finally, we present improved policy learning examples using dataset clustering and discuss several potential scenarios where our approach might benefit the offline policy learning community.
Submitted 14 February, 2024;
originally announced February 2024.
-
Reconstructing a state-independent cost function in a mean-field game model
Authors:
Kui Ren,
Nathan Soedjak,
Kewei Wang,
Hongyu Zhai
Abstract:
In this short note, we consider an inverse problem to a mean-field games system where we are interested in reconstructing the state-independent running cost function from observed value-function data. We provide an elementary proof of a uniqueness result for the inverse problem using the standard multilinearization technique. One of the main features of our work is that we insist that the population distribution be a probability measure, a requirement that is not enforced in some of the existing literature on theoretical inverse mean-field games.
Submitted 14 February, 2024;
originally announced February 2024.
-
Coexistence of Superconductivity and Antiferromagnetism in Topological Magnet MnBi2Te4 Films
Authors:
Wei Yuan,
Zi-Jie Yan,
Hemian Yi,
Zihao Wang,
Stephen Paolini,
Yi-Fan Zhao,
Ling-Jie Zhou,
Annie G. Wang,
Ke Wang,
Thomas Prokscha,
Zaher Salman,
Andreas Suter,
Purnima P. Balakrishnan,
Alexander J. Grutter,
Laurel E. Winter,
John Singleton,
Moses H. W. Chan,
Cui-Zu Chang
Abstract:
The interface of two materials can harbor unexpected emergent phenomena. One example is interface-induced superconductivity. In this work, we employ molecular beam epitaxy to grow a series of heterostructures formed by stacking together two non-superconducting antiferromagnetic materials, an intrinsic antiferromagnetic topological insulator MnBi2Te4 and an antiferromagnetic iron chalcogenide FeTe. Our electrical transport measurements reveal interface-induced superconductivity in these heterostructures. By performing scanning tunneling microscopy and spectroscopy measurements, we observe a proximity-induced superconducting gap on the top surface of the MnBi2Te4 layer, confirming the interaction between superconductivity and antiferromagnetism in the MnBi2Te4 layer. Our findings will advance the fundamental inquiries into the topological superconducting phase in hybrid devices and provide a promising platform for the exploration of chiral Majorana physics in MnBi2Te4-based heterostructures.
Submitted 14 February, 2024;
originally announced February 2024.
-
Model Assessment and Selection under Temporal Distribution Shift
Authors:
Elise Han,
Chengpiao Huang,
Kaizheng Wang
Abstract:
We investigate model assessment and selection in a changing environment, by synthesizing datasets from both the current time period and historical epochs. To tackle unknown and potentially arbitrary temporal distribution shift, we develop an adaptive rolling window approach to estimate the generalization error of a given model. This strategy also facilitates the comparison between any two candidate models by estimating the difference of their generalization errors. We further integrate pairwise comparisons into a single-elimination tournament, achieving near-optimal model selection from a collection of candidates. Theoretical analyses and numerical experiments demonstrate the adaptivity of our proposed methods to the non-stationarity in data.
Submitted 3 June, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Dueling Over Dessert, Mastering the Art of Repeated Cake Cutting
Authors:
Simina Brânzei,
MohammadTaghi Hajiaghayi,
Reed Phillips,
Suho Shin,
Kun Wang
Abstract:
We consider the setting of repeated fair division between two players, denoted Alice and Bob, with private valuations over a cake. In each round, a new cake arrives, which is identical to the ones in previous rounds. Alice cuts the cake at a point of her choice, while Bob chooses the left piece or the right piece, leaving the remainder for Alice. We consider two versions: sequential, where Bob observes Alice's cut point before choosing left/right, and simultaneous, where he only observes her cut point after making his choice. The simultaneous version was first considered by Aumann and Maschler (1995).
We observe that if Bob is almost myopic and chooses his favorite piece too often, then he can be systematically exploited by Alice through a strategy akin to a binary search. This strategy allows Alice to approximate Bob's preferences with increasing precision, thereby securing a disproportionate share of the resource over time.
We analyze the limits of how much a player can exploit the other one and show that fair utility profiles are in fact achievable. Specifically, the players can enforce the equitable utility profile of $(1/2, 1/2)$ in the limit on every trajectory of play, by keeping the other player's utility to approximately $1/2$ on average while guaranteeing they themselves get at least approximately $1/2$ on average. We show this theorem using a connection with Blackwell approachability.
Finally, we analyze a natural dynamic known as fictitious play, where players best respond to the empirical distribution of the other player. We show that fictitious play converges to the equitable utility profile of $(1/2, 1/2)$ at a rate of $O(1/\sqrt{T})$.
Submitted 18 February, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Pandora: Jailbreak GPTs by Retrieval Augmented Generation Poisoning
Authors:
Gelei Deng,
Yi Liu,
Kailong Wang,
Yuekang Li,
Tianwei Zhang,
Yang Liu
Abstract:
Large Language Models (LLMs) have gained immense popularity and are being increasingly applied in various domains. Consequently, ensuring the security of these models is of paramount importance. Jailbreak attacks, which manipulate LLMs to generate malicious content, are recognized as a significant vulnerability. While existing research has predominantly focused on direct jailbreak attacks on LLMs, there has been limited exploration of indirect methods. The integration of various plugins into LLMs, notably Retrieval Augmented Generation (RAG), which enables LLMs to incorporate external knowledge bases into their response generation such as GPTs, introduces new avenues for indirect jailbreak attacks.
To fill this gap, we investigate indirect jailbreak attacks on LLMs, particularly GPTs, introducing a novel attack vector named Retrieval Augmented Generation Poisoning. This method, Pandora, exploits the synergy between LLMs and RAG through prompt manipulation to generate unexpected responses. Pandora uses maliciously crafted content to influence the RAG process, effectively initiating jailbreak attacks. Our preliminary tests show that Pandora successfully conducts jailbreak attacks in four different scenarios, achieving higher success rates than direct attacks, with 64.3% for GPT-3.5 and 34.8% for GPT-4.
Submitted 13 February, 2024;
originally announced February 2024.
-
Probing the interaction energy of two $^{85}$Rb atoms in an optical tweezer via spin-motion coupling
Authors:
Jun Zhuang,
Kun-Peng Wang,
Peng-Xiang Wang,
Ming-Rui Wei,
Bahtiyar Mamat,
Cheng Sheng,
Peng Xu,
Min Liu,
Jin Wang,
Xiao-Dong He,
Ming-Sheng Zhan
Abstract:
The inherent polarization gradients in tight optical tweezers can be used to couple the atomic spins to the two-body motion under the action of a microwave spin-flip transition, so that such a spin-motion coupling offers an important control knob on the motional states of two optically trapped colliding atoms. Here, after preparing two elastically scattering $^{85}$Rb atoms in the three-dimensional ground state in the optical tweezer, we employed this control in order to probe the colliding energies of elastic and inelastic channels. The combination of microwave spectra and the corresponding s-wave pseudopotential model allows us to infer the effect of the state-dependent trapping potentials on the elastic colliding energies, as well as to reveal how the presence of inelastic interactions affects the elastic part of the relative potential. Our work shows that the spin-motion coupling in a tight optical tweezer expands the experimental toolbox for fundamental studies of ultracold collisions in two-body systems with reactive collisions, and potentially for that of more complex interactions, such as optically trapped atom-molecule and molecule-molecule interactions.
Submitted 2 July, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Phonon and defect mediated quantum anomalous Hall insulator to metal transition in magnetically doped topological insulators
Authors:
Akiyoshi Park,
Adrian Llanos,
Chun-I Lu,
Yinan Chen,
Sebastien N. Abadi,
Chien-Chang Chen,
Marcus L. Teague,
Lixuan Tai,
Peng Zhang,
Kang L. Wang,
Nai-Chang Yeh
Abstract:
The quantum anomalous Hall (QAH) state in six-quintuple-layer Cr$_{0.1}$(Bi$_{0.2}$Sb$_{0.8}$)$_{1.9}$Te$_3$ thin films was studied through scanning tunneling spectroscopy (STS) and electrical transport measurements. While the surface state is gapless above the Curie temperature ($T_\mathrm{C} \approx 30$ K), STS of the sample reveals a topologically non-trivial gap with an average value of $\approx 13.5$ meV at 4.2 K, below the ferromagnetic transition. Nonetheless, areal STS scans of the magnetic topological insulator exhibit energy modulations on the order of several meV in the surface bands, which result in the valence band maximum in some regions becoming higher than the energy of the conduction band minimum of some other regions that are spatially separated by no more than 3 nm. First-principles calculations demonstrate that the observed inhomogeneous energy band alignment is an outcome of many-body interactions, namely electron-defect interactions and electron-phonon interactions. Defects play the role of locally modifying the energy landscape of surface bands, while electron-phonon interactions renormalize the surface bands such that the surface gap becomes reduced by more than 1 meV as temperature is raised from 0 to 4.2 K. These many-body interactions at a finite temperature result in a substantial increase of electron tunneling across the spatially separated conduction band pockets even for finite temperatures well below $T_\mathrm{C}$, thus driving the magnetic topological insulator out of its QAH insulating phase into a metallic phase at a relatively low temperature.
Submitted 12 February, 2024;
originally announced February 2024.
-
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Authors:
Kaiwen Wang,
Owen Oertell,
Alekh Agarwal,
Nathan Kallus,
Wen Sun
Abstract:
In this paper, we prove that Distributional Reinforcement Learning (DistRL), which learns the return distribution, can obtain second-order bounds in both online and offline RL in general settings with function approximation. Second-order bounds are instance-dependent bounds that scale with the variance of return, which we prove are tighter than the previously known small-loss bounds of distributional RL. To the best of our knowledge, our results are the first second-order bounds for low-rank MDPs and for offline RL. When specializing to contextual bandits (one-step RL problem), we show that a distributional learning based optimism algorithm achieves a second-order worst-case regret bound, and a second-order gap dependent bound, simultaneously. We also empirically demonstrate the benefit of DistRL in contextual bandits on real-world datasets. We highlight that our analysis with DistRL is relatively simple, follows the general framework of optimism in the face of uncertainty and does not require weighted regression. Our results suggest that DistRL is a promising framework for obtaining second-order bounds in general RL settings, thus further reinforcing the benefits of DistRL.
Submitted 11 February, 2024;
originally announced February 2024.
-
Modeling Spatio-temporal Dynamical Systems with Neural Discrete Learning and Levels-of-Experts
Authors:
Kun Wang,
Hao Wu,
Guibin Zhang,
Junfeng Fang,
Yuxuan Liang,
Yuankai Wu,
Roger Zimmermann,
Yang Wang
Abstract:
In this paper, we address the issue of modeling and estimating changes in the state of spatio-temporal dynamical systems based on a sequence of observations such as video frames. Traditional numerical simulation systems depend largely on the initial settings and the correctness of the constructed partial differential equations (PDEs). Despite recent efforts yielding significant success in discovering data-driven PDEs with neural networks, the limitations posed by singular scenarios and the absence of local insights prevent them from performing effectively in a broader real-world context. To this end, this paper proposes a universal expert module, namely an optical flow estimation component, to capture the evolution laws of general physical processes in a data-driven fashion. To enhance local insight, we painstakingly design a finer-grained physical pipeline, since local characteristics may be influenced by various internal contextual information, which may contradict the macroscopic properties of the whole system. Further, we harness the currently popular neural discrete learning to unveil the underlying important features in its latent space. This process better injects interpretability, which can help us obtain a powerful prior over these discrete random variables. We conduct extensive experiments and ablations to demonstrate that the proposed framework achieves large performance margins compared with the existing SOTA baselines.
Submitted 6 February, 2024;
originally announced February 2024.
-
EXGC: Bridging Efficiency and Explainability in Graph Condensation
Authors:
Junfeng Fang,
Xinglin Li,
Yongduo Sui,
Yuan Gao,
Guibin Zhang,
Kun Wang,
Xiang Wang,
Xiangnan He
Abstract:
Graph representation learning on vast datasets, like web data, has made significant strides. However, the associated computational and storage overheads raise concerns. In light of this, graph condensation (GCond) has been introduced to distill these large real datasets into a more concise yet information-rich synthetic graph. Despite acceleration efforts, existing GCond methods mainly grapple with efficiency, especially on expansive web data graphs. Hence, in this work, we pinpoint two major inefficiencies of current paradigms: (1) the concurrent updating of a vast parameter set, and (2) pronounced parameter redundancy. To counteract these two limitations correspondingly, we first (1) employ the Mean-Field variational approximation for convergence acceleration, and then (2) propose the objective of Gradient Information Bottleneck (GDIB) to prune redundancy. By incorporating leading explanation techniques (e.g., GNNExplainer and GSAT) to instantiate the GDIB, we propose EXGC, the Efficient and eXplainable Graph Condensation method, which can markedly boost efficiency and inject explainability. Our extensive evaluations across eight datasets underscore EXGC's superiority and relevance. Code is available at https://github.com/MangoKiller/EXGC.
Submitted 5 February, 2024;
originally announced February 2024.
-
Improving Token-Based World Models with Parallel Observation Prediction
Authors:
Lior Cohen,
Kaixin Wang,
Bingyi Kang,
Shie Mannor
Abstract:
Motivated by the success of Transformers when applied to sequences of discrete symbols, token-based world models (TBWMs) were recently proposed as sample-efficient methods. In TBWMs, the world model consumes agent experience as a language-like sequence of tokens, where each observation constitutes a sub-sequence. However, during imagination, the sequential token-by-token generation of next observations results in a severe bottleneck, leading to long training times, poor GPU utilization, and limited representations. To resolve this bottleneck, we devise a novel Parallel Observation Prediction (POP) mechanism. POP augments a Retentive Network (RetNet) with a novel forward mode tailored to our reinforcement learning setting. We incorporate POP in a novel TBWM agent named REM (Retentive Environment Model), showcasing a 15.4x faster imagination compared to prior TBWMs. REM attains superhuman performance on 12 out of 26 games of the Atari 100K benchmark, while training in less than 12 hours. Our code is available at https://github.com/leor-c/REM.
Submitted 29 May, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
A Non-Intrusive Neural Quality Assessment Model for Surface Electromyography Signals
Authors:
Cho-Yuan Lee,
Kuan-Chen Wang,
Kai-Chun Liu,
Yu-Te Wang,
Xugang Lu,
Ping-Cheng Yeh,
Yu Tsao
Abstract:
In practical scenarios involving the measurement of surface electromyography (sEMG) in muscles, particularly those areas near the heart, one of the primary sources of contamination is the presence of electrocardiogram (ECG) signals. To assess the quality of real-world sEMG data more effectively, this study proposes QASE-net, a new non-intrusive model that predicts the SNR of sEMG signals. QASE-net combines CNN-BLSTM with attention mechanisms and follows an end-to-end training strategy. Our experimental framework utilizes real-world sEMG and ECG data from two open-access databases, the Non-Invasive Adaptive Prosthetics Database and the MIT-BIH Normal Sinus Rhythm Database, respectively. The experimental results demonstrate the superiority of QASE-net over the previous assessment model, exhibiting significantly reduced prediction errors and notably higher linear correlations with the ground truth. These findings show the potential of QASE-net to substantially enhance the reliability and precision of sEMG quality assessment in practical applications.
Submitted 13 June, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
Authors:
Yuchen Zhang,
Tianle Zhang,
Kai Wang,
Ziyao Guo,
Yuxuan Liang,
Xavier Bresson,
Wei Jin,
Yang You
Abstract:
Graph condensation aims to reduce the size of a large-scale graph dataset by synthesizing a compact counterpart without sacrificing the performance of Graph Neural Networks (GNNs) trained on it, which has shed light on reducing the computational cost for training GNNs. Nevertheless, existing methods often fall short of accurately replicating the original graph for certain datasets, thereby failing to achieve the objective of lossless condensation. To understand this phenomenon, we investigate the potential reasons and reveal that the previous state-of-the-art trajectory matching method provides biased and restricted supervision signals from the original graph when optimizing the condensed one. This significantly limits both the scale and efficacy of the condensed graph. In this paper, we make the first attempt toward lossless graph condensation by bridging the previously neglected supervision signals. Specifically, we employ a curriculum learning strategy to train expert trajectories with more diverse supervision signals from the original graph, and then effectively transfer the information into the condensed graph with expanding window matching. Moreover, we design a loss function to further extract knowledge from the expert trajectories. Theoretical analysis justifies the design of our method and extensive experiments verify its superiority across different datasets. Code is released at https://github.com/NUS-HPC-AI-Lab/GEOM.
Submitted 18 June, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching
Authors:
Tianle Zhang,
Yuchen Zhang,
Kun Wang,
Kai Wang,
Beining Yang,
Kaipeng Zhang,
Wenqi Shao,
Ping Liu,
Joey Tianyi Zhou,
Yang You
Abstract:
Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have raised growing concerns. As one of the most promising directions, graph condensation methods address these issues by employing gradient matching, aiming to condense the full graph into a more concise yet information-rich synthetic set. Though encouraging, these strategies primarily emphasize matching directions of the gradients, which leads to deviations in the training trajectories. Such deviations are further magnified by the differences between the condensation and evaluation phases, culminating in accumulated errors, which detrimentally affect the performance of the condensed graphs. In light of this, we propose a novel graph condensation method named CrafTing RationaL trajectory (CTRL), which offers an optimized starting point closer to the original dataset's feature distribution and a more refined strategy for gradient matching. Theoretically, CTRL can effectively neutralize the impact of accumulated errors on the performance of condensed graphs. We provide extensive experiments on various graph datasets and downstream tasks to support the effectiveness of CTRL. Code is released at https://github.com/NUS-HPC-AI-Lab/CTRL.
Submitted 30 June, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer
Authors:
Mingxuan Liu,
Jiankai Tang,
Haoxiang Li,
Jiahao Qi,
Siwei Li,
Kegang Wang,
Yuntao Wang,
Hong Chen
Abstract:
Artificial neural networks (ANNs) can help camera-based remote photoplethysmography (rPPG) measure cardiac activity and physiological signals, such as pulse wave, heart rate, and respiration rate, from facial videos with better accuracy. However, most existing ANN-based methods require substantial computing resources, which poses challenges for effective deployment on mobile devices. Spiking neural networks (SNNs), on the other hand, hold immense potential for energy-efficient deep learning owing to their binary and event-driven architecture. To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. Specifically, the proposed Spiking-PhysFormer consists of an ANN-based patch embedding block, SNN-based transformer blocks, and an ANN-based predictor head. First, to simplify the transformer block while preserving its capacity to aggregate local and global spatio-temporal features, we design a parallel spike transformer block to replace sequential sub-blocks. Additionally, we propose a simplified spiking self-attention mechanism that omits the value parameter without compromising the model's performance. Experiments conducted on four datasets (PURE, UBFC-rPPG, UBFC-Phys, and MMPD) demonstrate that the proposed model achieves a 12.4% reduction in power consumption compared to PhysFormer. Additionally, the power consumption of the transformer block is reduced by a factor of 12.2, while maintaining performance comparable to that of PhysFormer and other ANN-based models.
Submitted 9 February, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models
Authors:
Chuhao Liu,
Ke Wang,
Jieqi Shi,
Zhijian Qiao,
Shaojie Shen
Abstract:
Semantic mapping based on supervised object detectors is sensitive to image distribution. In real-world environments, object detection and segmentation performance can drop substantially, preventing the use of semantic mapping in a wider domain. On the other hand, vision-language foundation models demonstrate strong zero-shot transferability across data distributions. This provides an opportunity to construct generalizable instance-aware semantic maps. Hence, this work explores how to boost instance-aware semantic mapping using object detections generated by foundation models. We propose a probabilistic label fusion method to predict closed-set semantic classes from open-set label measurements. An instance refinement module merges the over-segmented instances caused by inconsistent segmentation. We integrate all the modules into a unified semantic mapping system. Reading a sequence of RGB-D input, our system incrementally reconstructs an instance-aware semantic map. We evaluate the zero-shot performance of our method on the ScanNet and SceneNN datasets. Our method achieves 40.3 mean average precision (mAP) on the ScanNet semantic instance segmentation task. It significantly outperforms the traditional semantic mapping method.
Submitted 6 February, 2024;
originally announced February 2024.
-
Electrokinetic origin of swirling flow on nanoscale interface
Authors:
Shuangshuang Meng,
Yu Han,
Wei Zhao,
Yueqiang Zhu,
Chen Zhang,
Xiaoqiang Feng,
Ce Zhang,
Duyang Zang,
Guangyin Jing,
Kaige Wang
Abstract:
The zeta ($ζ$) potential is a pivotal metric for characterizing the electric field topology within an electric double layer, an important phenomenon at phase interfaces. It underpins critical processes in diverse realms such as chemistry, biomedical engineering, and micro/nanofluidics. Yet, local measurement of the $ζ$ potential at the interface has historically presented challenges, leading researchers to simplify a chemically homogenized surface as having a uniform $ζ$ potential. In the current investigation, we present evidence that, within a microchannel, the spatial distribution of the $ζ$ potential across a chemically homogeneous solid-liquid interface can become two-dimensional (2D) under an imposed flow regime, as disclosed by a state-of-the-art fluorescence photobleaching electrochemistry analyzer (FLEA) technique. The $ζ$ potential, which tends to become increasingly negative downstream, presents an approximately symmetric, V-shaped pattern in the spanwise orientation. Intriguingly, and of notable significance to chemistry and engineering, this 2D $ζ$ potential framework was found to electrokinetically induce swirling flows on the scale of tens of nanometers, aligned with the streamwise axis and bearing a remarkable resemblance to the well-documented hairpin vortices in turbulent boundary layers. Our findings point towards a novel perspective on the genesis of vortex structures at the nanoscale. Additionally, the FLEA technique emerges as a potent tool for discerning the $ζ$ potential at a local scale with high resolution, potentially accelerating the evolution and applications of novel surface materials.
Submitted 5 February, 2024;
originally announced February 2024.
-
A Mean-Field Study of Quantum Oscillations in Two-Dimensional Kondo Insulators
Authors:
Kaize Wang,
Yang Ge,
Yashar Komijani
Abstract:
Magnetic oscillations in strongly correlated insulating systems have garnered interest due to oscillations seemingly originating from the bulk, despite an anticipated gapped spectrum. We use the large-$N$ mean-field theory to study the behavior of normal and topological Kondo insulators under a magnetic field. In both cases spinons acquire a charge and hybridize with electrons, producing magnetic oscillations that resemble two-band noninteracting systems. We show that in such band insulators magnetic oscillations are exponentially suppressed at weak magnetic fields. A self-consistent mean-field calculation for the Kondo insulators reveals that the temperature dependence of the oscillations departs from the noninteracting case due to the temperature and magnetic-field dependence of the hybridization, even though mean-field parameters remain homogeneous at low fields. Larger magnetic fields result in the Kondo breakdown, where the magnetic oscillation is solely due to the decoupled conduction electrons. These findings offer new insights into the magnetic properties of Kondo insulators, with implications for interpreting experimental results in heavy fermion materials like SmB$_6$.
Submitted 6 February, 2024;
originally announced February 2024.
-
Precise Measurement of Born Cross Sections for $e^+e^-\to D\bar{D}$ and Observation of One Structure between $\sqrt{s} = 3.80-4.95$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (604 additional authors not shown)
Abstract:
Using data samples collected with the BESIII detector at the BEPCII collider at center-of-mass energies ranging from 3.80 to 4.95 GeV, corresponding to an integrated luminosity of 20 fb$^{-1}$, a measurement of Born cross sections for the $e^+e^-\to D^{0}\bar{D}^{0}$ and $D^{+}D^{-}$ processes is presented with unprecedented precision. By performing a simultaneous fit to the dressed cross sections for both processes, one possible new structure around 3.9 GeV/$c^2$ is observed for the first time, in addition to seven known resonances $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $Y(4230)$, $Y(4360)$, $ψ(4415)$, and $Y(4660)$. These results offer crucial experimental insights into the nature of hadron production in the open charm region.
Submitted 6 February, 2024;
originally announced February 2024.
-
SDEMG: Score-based Diffusion Model for Surface Electromyographic Signal Denoising
Authors:
Yu-Tung Liu,
Kuan-Chen Wang,
Kai-Chun Liu,
Sheng-Yu Peng,
Yu Tsao
Abstract:
Surface electromyography (sEMG) recordings can be influenced by electrocardiogram (ECG) signals when the muscle being monitored is close to the heart. Several existing methods use signal-processing-based approaches, such as high-pass filtering and template subtraction, while some derive mapping functions to restore clean sEMG signals from noisy sEMG (sEMG with ECG interference). Recently, the score-based diffusion model, a renowned generative model, has been introduced to generate high-quality and accurate samples from noisy input data. In this study, we propose a novel approach, termed SDEMG, as a score-based diffusion model for sEMG signal denoising. To evaluate the proposed SDEMG approach, we conduct experiments to reduce noise in sEMG signals, employing data from an openly accessible source, the Non-Invasive Adaptive Prosthetics database, along with ECG signals from the MIT-BIH Normal Sinus Rhythm Database. The experimental results indicate that SDEMG outperforms comparative methods and produces high-quality sEMG samples. The source code of the SDEMG framework is available at: https://github.com/tonyliu0910/SDEMG
Submitted 23 February, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
MolTC: Towards Molecular Relational Modeling In Language Models
Authors:
Junfeng Fang,
Shuai Zhang,
Chang Wu,
Zhengyi Yang,
Zhiyuan Liu,
Sihang Li,
Kun Wang,
Wenjie Du,
Xiang Wang
Abstract:
Molecular Relational Learning (MRL), aiming to understand interactions between molecular pairs, plays a pivotal role in advancing biochemical research. Recently, the adoption of large language models (LLMs), known for their vast knowledge repositories and advanced logical inference capabilities, has emerged as a promising way for efficient and effective MRL. Despite their potential, these methods predominantly rely on textual data, thus not fully harnessing the wealth of structural information inherent in molecular graphs. Moreover, the absence of a unified framework exacerbates the issue of information underutilization, as it hinders the sharing of interaction mechanisms learned across diverse datasets. To address these challenges, this work proposes a novel LLM-based multi-modal framework for Molecular inTeraction prediction following Chain-of-Thought (CoT) theory, termed MolTC, which effectively integrates the graphical information of the two molecules in a pair. To train MolTC efficiently, we introduce a Multi-hierarchical CoT concept to refine its training paradigm, and construct a comprehensive Molecular Interactive Instructions dataset for the development of biochemical LLMs involving MRL. Our experiments, conducted across various datasets involving over 4,000,000 molecular pairs, exhibit the superiority of our method over current GNN and LLM-based baselines. Code is available at https://github.com/MangoKiller/MolTC.
Submitted 10 June, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Joint Attention-Guided Feature Fusion Network for Saliency Detection of Surface Defects
Authors:
Xiaoheng Jiang,
Feng Yan,
Yang Lu,
Ke Wang,
Shuai Guo,
Tianzhu Zhang,
Yanwei Pang,
Jianwei Niu,
Mingliang Xu
Abstract:
Surface defect inspection plays an important role in the process of industrial manufacture and production. Though Convolutional Neural Network (CNN) based defect inspection methods have made huge leaps, they still confront a lot of challenges such as defect scale variation, complex background, low contrast, and so on. To address these issues, we propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects based on the encoder-decoder network. JAFFNet mainly incorporates a joint attention-guided feature fusion (JAFF) module into decoding stages to adaptively fuse low-level and high-level features. The JAFF module learns to emphasize defect features and suppress background noise during feature fusion, which is beneficial for detecting low-contrast defects. In addition, JAFFNet introduces a dense receptive field (DRF) module following the encoder to capture features with rich context information, which helps detect defects of different scales. The JAFF module mainly utilizes a learned joint channel-spatial attention map provided by high-level semantic features to guide feature fusion. The attention map makes the model pay more attention to defect features. The DRF module utilizes a sequence of multi-receptive-field (MRF) units with each taking as inputs all the preceding MRF feature maps and the original input. The obtained DRF features capture rich context information with a large range of receptive fields. Extensive experiments conducted on SD-saliency-900, Magnetic tile, and DAGM 2007 indicate that our method achieves promising performance in comparison with other state-of-the-art methods. Meanwhile, our method reaches a real-time defect detection speed of 66 FPS.
Submitted 5 February, 2024;
originally announced February 2024.
-
AI-Generated Content Enhanced Computer-Aided Diagnosis Model for Thyroid Nodules: A ChatGPT-Style Assistant
Authors:
Jincao Yao,
Yunpeng Wang,
Zhikai Lei,
Kai Wang,
Xiaoxian Li,
Jianhua Zhou,
Xiang Hao,
Jiafei Shen,
Zhenping Wang,
Rongrong Ru,
Yaqing Chen,
Yahan Zhou,
Chen Chen,
Yanming Zhang,
Ping Liang,
Dong Xu
Abstract:
An artificial intelligence-generated content-enhanced computer-aided diagnosis (AIGC-CAD) model, designated as ThyGPT, has been developed. This model, inspired by the architecture of ChatGPT, could assist radiologists in assessing the risk of thyroid nodules through semantic-level human-machine interaction. A dataset comprising 19,165 thyroid nodule ultrasound cases from Zhejiang Cancer Hospital was assembled to facilitate the training and validation of the model. After training, ThyGPT could automatically evaluate thyroid nodules and engage in effective communication with physicians through human-computer interaction. The performance of ThyGPT was rigorously quantified using established metrics such as the receiver operating characteristic (ROC) curve, area under the curve (AUC), sensitivity, and specificity. The empirical findings revealed that radiologists, when supplemented with ThyGPT, markedly surpassed the diagnostic acumen of their peers utilizing traditional methods as well as the performance of the model in isolation. These findings suggest that AIGC-CAD systems, exemplified by ThyGPT, hold the promise to fundamentally transform the diagnostic workflows of radiologists in forthcoming years.
Submitted 4 February, 2024;
originally announced February 2024.