-
MetaFood CVPR 2024 Challenge on Physically Informed 3D Food Reconstruction: Methods and Results
Authors:
Jiangpeng He,
Yuhao Chen,
Gautham Vinod,
Talha Ibn Mahmud,
Fengqing Zhu,
Edward Delp,
Alexander Wong,
Pengcheng Xi,
Ahmad AlMughrabi,
Umair Haroon,
Ricardo Marques,
Petia Radeva,
Jiadong Tang,
Dianyi Yang,
Yu Gao,
Zhaoxiang Liang,
Yawei Jueluo,
Chengyu Shi,
Pengyu Wang
Abstract:
The increasing interest in computer vision applications for nutrition and dietary monitoring has led to the development of advanced 3D reconstruction techniques for food items. However, the scarcity of high-quality data and limited collaboration between industry and academia have constrained progress in this field. Building on recent advancements in 3D reconstruction, we host the MetaFood Workshop and its challenge for Physically Informed 3D Food Reconstruction. This challenge focuses on reconstructing volume-accurate 3D models of food items from 2D images, using a visible checkerboard as a size reference. Participants were tasked with reconstructing 3D models for 20 selected food items of varying difficulty levels: easy, medium, and hard. The easy level provides 200 images, the medium level provides 30 images, and the hard level provides only 1 image for reconstruction. In total, 16 teams submitted results in the final testing phase. The solutions developed in this challenge achieved promising results in 3D food reconstruction, with significant potential for improving portion estimation for dietary assessment and nutritional monitoring. More details about this workshop challenge and access to the dataset can be found at https://sites.google.com/view/cvpr-metafood-2024.
Submitted 12 July, 2024;
originally announced July 2024.
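The size-reference idea in the abstract above (a visible checkerboard fixing the physical scale of the reconstruction) can be illustrated with a minimal sketch. It assumes the reconstruction is in arbitrary model units and that one checkerboard square has a known edge length; the function name and all numbers are hypothetical and are not taken from the challenge.

```python
def metric_volume(volume_model_units, square_edge_model_units, square_edge_cm):
    """Convert a reconstructed volume from arbitrary model units to cm^3.

    Assumes a uniform scale: a length ratio measured on one checkerboard
    square applies to the whole reconstruction.
    """
    if square_edge_model_units <= 0 or square_edge_cm <= 0:
        raise ValueError("edge lengths must be positive")
    # Linear scale factor: centimetres per model unit.
    scale = square_edge_cm / square_edge_model_units
    # Volume scales with the cube of the linear factor.
    return volume_model_units * scale ** 3


# Hypothetical example: a square that measures 0.1 units in the model
# is 1.0 cm wide in reality, and the food mesh encloses 0.25 cubic units.
volume_cm3 = metric_volume(0.25, 0.1, 1.0)
print(f"{volume_cm3:.1f} cm^3")
```

Because the volume depends on the cube of the length ratio, a small error in the measured square size is amplified roughly threefold in relative terms, which is one reason an accurate size reference matters for portion estimation.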
-
Rotating dipole and quadrupole quantum droplets in binary Bose-Einstein condensates
Authors:
Dongshuai Liu,
Yanxia Gao,
Dianyuan Fan,
Boris A. Malomed,
Lifu Zhang
Abstract:
Quantum droplets (QDs) are self-trapped modes stabilized by the Lee-Huang-Yang correction to the mean-field Hamiltonian of binary atomic Bose-Einstein condensates. The existence and stability of quiescent and rotating dipole-shaped and vortex QDs with vorticity $S=1$ (DQDs and VQDs, respectively) are numerically studied in the framework of the accordingly modified two-component system. The rotating DQDs trapped in an annular potential are built of two crescent-like components, stretching along the azimuthal direction with the increase of the rotation frequency. Rotating quadrupole QDs (QQDs) bifurcate from the VQDs with $S=2$. Above a certain rotation frequency, they transform back into VQDs with a flat-top shape. Rotating DQDs and QQDs are stable in a broad interval of values of the chemical potential. The results provide the first example of stable modes which are intermediate states between the rotating DQDs and QQDs on the one hand, and VQDs on the other.
Submitted 12 July, 2024;
originally announced July 2024.
-
Attribution Methods in Asset Pricing: Do They Account for Risk?
Authors:
Dangxing Chen,
Yuan Gao
Abstract:
Over the past few decades, machine learning models have been extremely successful. As a result of axiomatic attribution methods, feature contributions have been explained more clearly and rigorously. There are, however, few studies that have examined domain knowledge in conjunction with the axioms. In this study, we examine asset pricing in finance, a field closely related to risk management. Consequently, when applying machine learning models, we must ensure that the attribution methods reflect the underlying risks accurately. In this work, we present and study several axioms derived from asset pricing domain knowledge. It is shown that while Shapley value and Integrated Gradients preserve most axioms, neither can satisfy all axioms. Using extensive analytical and empirical examples, we demonstrate how attribution methods can reflect risks and when they should not be used.
Submitted 11 July, 2024;
originally announced July 2024.
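For readers unfamiliar with the Shapley value mentioned in the abstract above, the following is a minimal sketch of exact Shapley attribution for a tiny model. The toy function, the input, and the all-zero baseline are hypothetical illustrations and are not the axioms or models studied in the paper; the sketch only shows the standard completeness property (attributions sum to the change in model output).

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only usable for a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            # Weight of a coalition of size r that excludes feature i.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model with one additive term and one interaction term.
def toy_model(z):
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

In this toy case the interaction term is split evenly between the two interacting features, which is the kind of behaviour a domain-specific axiom might accept or reject.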
-
Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding
Authors:
Minghui Wu,
Chenxu Zhao,
Anyang Su,
Donglin Di,
Tianyu Fu,
Da An,
Min He,
Ya Gao,
Meng Ma,
Kun Yan,
Ping Wang
Abstract:
Understanding of video creativity and content often varies among individuals, with differences in focal points and cognitive levels across different ages, experiences, and genders. There is currently a lack of research in this area, and most existing benchmarks suffer from several drawbacks: 1) a limited number of modalities and answers with restrictive length; 2) the content and scenarios within the videos are excessively monotonous, transmitting allegories and emotions that are overly simplistic. To bridge the gap to real-world applications, we introduce a large-scale Subjective Response Indicators for Advertisement Videos dataset, namely SRI-ADV. Specifically, we collected real changes in Electroencephalographic (EEG) and eye-tracking regions from different demographics while they viewed identical video content. Utilizing this multi-modal dataset, we developed tasks and protocols to analyze and evaluate the extent of cognitive understanding of video content among different users. Along with the dataset, we designed a Hypergraph Multi-modal Large Language Model (HMLLM) to explore the associations among different demographics, video elements, EEG and eye-tracking indicators. HMLLM could bridge semantic gaps across rich modalities and integrate information beyond different modalities to perform logical reasoning. Extensive experimental evaluations on SRI-ADV and other additional video-based generative performance benchmarks demonstrate the effectiveness of our method. The codes and dataset will be released at https://github.com/suay1113/HMLLM.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15\sigma$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions
Authors:
Xuan Ju,
Yiming Gao,
Zhaoyang Zhang,
Ziyang Yuan,
Xintao Wang,
Ailing Zeng,
Yu Xiong,
Qiang Xu,
Ying Shan
Abstract:
Sora's high-motion intensity and long consistent videos have significantly impacted the field of video generation, attracting unprecedented attention. However, existing publicly available datasets are inadequate for generating Sora-like videos, as they mainly contain short videos with low motion intensity and brief captions. To address these issues, we propose MiraData, a high-quality video dataset that surpasses previous ones in video duration, caption detail, motion strength, and visual quality. We curate MiraData from diverse, manually selected sources and meticulously process the data to obtain semantically consistent clips. GPT-4V is employed to annotate structured captions, providing detailed descriptions from four different perspectives along with a summarized dense caption. To better assess temporal consistency and motion intensity in video generation, we introduce MiraBench, which enhances existing benchmarks by adding 3D consistency and tracking-based motion strength metrics. MiraBench includes 150 evaluation prompts and 17 metrics covering temporal consistency, motion strength, 3D consistency, visual quality, text-video alignment, and distribution similarity. To demonstrate the utility and effectiveness of MiraData, we conduct experiments using our DiT-based video generation model, MiraDiT. The experimental results on MiraBench demonstrate the superiority of MiraData, especially in motion strength.
Submitted 8 July, 2024;
originally announced July 2024.
-
MBA-Net: SAM-driven Bidirectional Aggregation Network for Ovarian Tumor Segmentation
Authors:
Yifan Gao,
Wei Xia,
Wenkui Wang,
Xin Gao
Abstract:
Accurate segmentation of ovarian tumors from medical images is crucial for early diagnosis, treatment planning, and patient management. However, the diverse morphological characteristics and heterogeneous appearances of ovarian tumors pose significant challenges to automated segmentation methods. In this paper, we propose MBA-Net, a novel architecture that integrates the powerful segmentation capabilities of the Segment Anything Model (SAM) with domain-specific knowledge for accurate and robust ovarian tumor segmentation. MBA-Net employs a hybrid encoder architecture, where the encoder consists of a prior branch, which inherits the SAM encoder to capture robust segmentation priors, and a domain branch, specifically designed to extract domain-specific features. The bidirectional flow of information between the two branches is facilitated by the robust feature injection network (RFIN) and the domain knowledge integration network (DKIN), enabling MBA-Net to leverage the complementary strengths of both branches. We extensively evaluate MBA-Net on the public multi-modality ovarian tumor ultrasound dataset and the in-house multi-site ovarian tumor MRI dataset. Our proposed method consistently outperforms state-of-the-art segmentation approaches. Moreover, MBA-Net demonstrates superior generalization capability across different imaging modalities and clinical sites.
Submitted 8 July, 2024;
originally announced July 2024.
-
CA-FedRC: Codebook Adaptation via Federated Reservoir Computing in 5G NR
Authors:
Ziqiang Ye,
Sikai Liao,
Yulan Gao,
Shu Fang,
Yue Xiao,
Ming Xiao,
Saviour Zammit
Abstract:
With the burgeoning deployment of fifth-generation new radio (5G NR) networks, the codebook plays a crucial role in enabling the base station (BS) to acquire the channel state information (CSI). Different 5G NR codebooks incur varying overheads and exhibit performance disparities under diverse channel conditions, necessitating codebook adaptation based on channel conditions to reduce feedback overhead while enhancing performance. However, existing methods for 5G NR codebook adaptation either require significant overhead for model training and feedback or fall short in performance. To address these limitations, this letter introduces a federated reservoir computing framework designed for efficient codebook adaptation in computationally and feedback resource-constrained mobile devices. This framework utilizes a novel series of indicators as input training data, striking an effective balance between performance and feedback overhead. Compared to conventional models, the proposed codebook adaptation via federated reservoir computing (CA-FedRC) achieves rapid convergence and significant loss reduction in both speed and accuracy. Extensive simulations under various channel conditions demonstrate that our algorithm not only reduces resource consumption of users but also accurately identifies channel types, thereby optimizing the trade-off between spectrum efficiency, computational complexity, and feedback overhead.
Submitted 8 July, 2024;
originally announced July 2024.
-
Cost-Efficient Computation Offloading in SAGIN: A Deep Reinforcement Learning and Perception-Aided Approach
Authors:
Yulan Gao,
Ziqiang Ye,
Han Yu
Abstract:
The Space-Air-Ground Integrated Network (SAGIN), crucial to the advancement of sixth-generation (6G) technology, plays a key role in ensuring universal connectivity, particularly by addressing the communication needs of remote areas lacking cellular network infrastructure. This paper delves into the role of unmanned aerial vehicles (UAVs) within SAGIN, where they act as a control layer owing to their adaptable deployment capabilities and their intermediary role. Equipped with millimeter-wave (mmWave) radar and vision sensors, these UAVs are capable of acquiring multi-source data, which helps to diminish uncertainty and enhance the accuracy of decision-making. Concurrently, UAVs collect tasks requiring computing resources from their coverage areas, originating from a variety of mobile devices moving at different speeds. These tasks are then allocated to ground base stations (BSs), low-earth-orbit (LEO) satellites, and local processing units to improve processing efficiency. Within this framework, our study concentrates on devising dynamic strategies for facilitating task hosting between mobile devices and UAVs, offloading computations, managing associations between UAVs and BSs, and allocating computing resources. The objective is to minimize the time-averaged network cost, considering the uncertainty of device locations, speeds, and even types. To tackle these complexities, we propose a deep reinforcement learning and perception-aided online approach (DRL-and-Perception-aided Approach) for this joint optimization in SAGIN, tailored for an environment filled with uncertainties. The effectiveness of our proposed approach is validated through extensive numerical simulations, which quantify its performance relative to various network parameters.
Submitted 7 July, 2024;
originally announced July 2024.
-
DMTG: One-Shot Differentiable Multi-Task Grouping
Authors:
Yuan Gao,
Shuguo Jiang,
Moran Li,
Jin-Gang Yu,
Gui-Song Xia
Abstract:
We aim to address Multi-Task Learning (MTL) with a large number of tasks by Multi-Task Grouping (MTG). Given N tasks, we propose to identify the best task groups from 2^N candidates and train the model weights simultaneously in one shot, with the high-order task-affinity fully exploited. This is distinct from the pioneering methods which sequentially identify the groups and train the model weights, where the group identification often relies on heuristics. As a result, our method not only improves the training efficiency, but also mitigates the objective bias introduced by the sequential procedures that potentially lead to a suboptimal solution. Specifically, we formulate MTG as a fully differentiable pruning problem on an adaptive network architecture determined by an underlying Categorical distribution. To categorize N tasks into K groups (represented by K encoder branches), we initially set up KN task heads, where each branch connects to all N task heads to exploit the high-order task-affinity. Then, we gradually prune the KN heads down to N by learning a relaxed differentiable Categorical distribution, ensuring that each task is exclusively and uniquely categorized into only one branch. Extensive experiments on CelebA and Taskonomy datasets with detailed ablations show the promising performance and efficiency of our method. The codes are available at https://github.com/ethanygao/DMTG.
Submitted 6 July, 2024;
originally announced July 2024.
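The grouping step described in the DMTG abstract above (each of N tasks ends up in exactly one of K branches via a relaxed Categorical distribution) can be sketched as follows. This sketch assumes a Gumbel-softmax relaxation, which is one common way to relax a Categorical distribution; the abstract does not say which relaxation the authors use, and the function names, logits, and temperature here are hypothetical.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed (soft) one-hot sample over the K branches."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def assign_tasks(task_logits, temperature=0.5, seed=0):
    """Return soft branch weights per task and the final hard grouping.

    task_logits is an N x K table: row n holds task n's (learnable)
    preference for each of the K branches.
    """
    rng = random.Random(seed)
    soft = [gumbel_softmax(row, temperature, rng) for row in task_logits]
    # After training, each task keeps only its highest-logit branch,
    # which prunes the K*N candidate heads down to N.
    hard = [max(range(len(row)), key=row.__getitem__) for row in task_logits]
    return soft, hard


# Hypothetical logits for N = 3 tasks and K = 2 branches.
logits = [[5.0, 0.0], [0.0, 5.0], [5.0, 0.0]]
soft, hard = assign_tasks(logits)
print(hard)
```

During training the soft weights would scale each task head's contribution so that gradients reach the logits; lowering the temperature pushes the soft samples toward one-hot vectors.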
-
OneRestore: A Universal Restoration Framework for Composite Degradation
Authors:
Yu Guo,
Yuan Gao,
Yuxu Lu,
Huilin Zhu,
Ryan Wen Liu,
Shengfeng He
Abstract:
In real-world scenarios, image impairments often manifest as composite degradations, presenting a complex interplay of elements such as low light, haze, rain, and snow. Despite this reality, existing restoration methods typically target isolated degradation types, thereby falling short in environments where multiple degrading factors coexist. To bridge this gap, our study proposes a versatile imaging model that consolidates four physical corruption paradigms to accurately represent complex, composite degradation scenarios. In this context, we propose OneRestore, a novel transformer-based framework designed for adaptive, controllable scene restoration. The proposed framework leverages a unique cross-attention mechanism, merging degraded scene descriptors with image features, allowing for nuanced restoration. Our model allows versatile input scene descriptors, ranging from manual text embeddings to automatic extractions based on visual attributes. Our methodology is further enhanced through a composite degradation restoration loss, using extra degraded images as negative samples to fortify model constraints. Comparative results on synthetic and real-world datasets demonstrate OneRestore as a superior solution, significantly advancing the state-of-the-art in addressing complex, composite degradations.
Submitted 10 July, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models
Authors:
Mengzhao Wang,
Haotian Wu,
Xiangyu Ke,
Yunjun Gao,
Xiaoliang Xu,
Lu Chen
Abstract:
Retrieval-augmented Large Language Models (LLMs) have reshaped traditional query-answering systems, offering unparalleled user experiences. However, existing retrieval techniques often struggle to handle multi-modal query contexts. In this paper, we present an interactive Multi-modal Query Answering (MQA) system, empowered by our newly developed multi-modal retrieval framework and navigation graph index, integrated with cutting-edge LLMs. It comprises five core components: Data Preprocessing, Vector Representation, Index Construction, Query Execution, and Answer Generation, all orchestrated by a dedicated coordinator to ensure smooth data flow from input to answer generation. One notable aspect of MQA is its utilization of contrastive learning to assess the significance of different modalities, facilitating precise measurement of multi-modal information similarity. Furthermore, the system achieves efficient retrieval through our advanced navigation graph index, refined using computational pruning techniques. Another highlight of our system is its pluggable processing framework, allowing seamless integration of embedding models, graph indexes, and LLMs. This flexibility provides users diverse options for gaining insights from their multi-modal knowledge base. A preliminary video introduction of MQA is available at https://youtu.be/xvUuo2ZIqWk.
Submitted 4 July, 2024;
originally announced July 2024.
-
Query-Guided Self-Supervised Summarization of Nursing Notes
Authors:
Ya Gao,
Hans Moen,
Saila Koivusalo,
Miika Koskinen,
Pekka Marttinen
Abstract:
Nursing notes, an important component of Electronic Health Records (EHRs), keep track of the progression of a patient's health status during a care episode. Distilling the key information in nursing notes through text summarization techniques can improve clinicians' efficiency in understanding patients' conditions when reviewing nursing notes. However, existing abstractive summarization methods in the clinical setting have often overlooked nursing notes and require the creation of reference summaries for supervision signals, which is time-consuming. In this work, we introduce QGSumm, a query-guided self-supervised domain adaptation framework for nursing note summarization. Using patient-related clinical queries as guidance, our approach generates high-quality, patient-centered summaries without relying on reference summaries for training. Through automatic and manual evaluation by an expert clinician, we demonstrate the strengths of our approach compared to the state-of-the-art Large Language Models (LLMs) in both zero-shot and few-shot settings. Ultimately, our approach provides a new perspective on conditional text summarization, tailored to the specific interests of clinical personnel.
Submitted 4 July, 2024;
originally announced July 2024.
-
EMPL: A novel Efficient Meta Prompt Learning Framework for Few-shot Unsupervised Domain Adaptation
Authors:
Wanqi Yang,
Haoran Wang,
Lei Wang,
Ge Song,
Yang Gao
Abstract:
Few-shot unsupervised domain adaptation (FS-UDA) utilizes few-shot labeled source domain data to realize effective classification in an unlabeled target domain. However, current FS-UDA methods still suffer from two issues: 1) the data from different domains cannot be effectively aligned by few-shot labeled data due to the large domain gaps, and 2) it is unstable and time-consuming to generalize to new FS-UDA tasks. To address these issues, we put forward a novel Efficient Meta Prompt Learning Framework for FS-UDA. Within this framework, we use a pre-trained CLIP model as the feature learning base model. First, we design domain-shared prompt learning vectors composed of virtual tokens, which mainly learn the meta knowledge from a large number of meta tasks to mitigate domain gaps. Second, we design a task-shared prompt learning network to adaptively learn specific prompt vectors for each task, which aims to realize fast adaptation and task generalization. Third, we learn a task-specific cross-domain alignment projection and a task-specific classifier with closed-form solutions for each meta task, which can efficiently adapt the model to new tasks in one step. The whole learning process is formulated as a bilevel optimization problem, and a good initialization of model parameters is learned through meta-learning. Extensive experiments demonstrate the promising performance of our framework on benchmark datasets. Our method achieves an improvement of at least 15.4% on 5-way 1-shot and 8.7% on 5-way 5-shot, compared with the state-of-the-art methods. Also, the performance of our method on all the test tasks is more stable than that of the other methods.
Submitted 4 July, 2024;
originally announced July 2024.
-
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Authors:
Pan Zhang,
Xiaoyi Dong,
Yuhang Zang,
Yuhang Cao,
Rui Qian,
Lin Chen,
Qipeng Guo,
Haodong Duan,
Bin Wang,
Linke Ouyang,
Songyang Zhang,
Wenwei Zhang,
Yining Li,
Yang Gao,
Peng Sun,
Xinyue Zhang,
Wei Li,
Jingwen Li,
Wenhai Wang,
Hang Yan,
Conghui He,
Xingcheng Zhang,
Kai Chen,
Jifeng Dai,
Yu Qiao
, et al. (2 additional authors not shown)
Abstract:
We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely a 7B LLM backend. Trained with 24K interleaved image-text contexts, it can seamlessly extend to 96K long contexts via RoPE extrapolation. This long-context capability allows IXC-2.5 to excel in tasks requiring extensive input and output contexts. Compared to its previous 2.0 version, InternLM-XComposer-2.5 features three major upgrades in vision-language comprehension: (1) Ultra-High Resolution Understanding, (2) Fine-Grained Video Understanding, and (3) Multi-Turn Multi-Image Dialogue. In addition to comprehension, IXC-2.5 extends to two compelling applications using extra LoRA parameters for text-image composition: (1) Crafting Webpages and (2) Composing High-Quality Text-Image Articles. IXC-2.5 has been evaluated on 28 benchmarks, outperforming existing open-source state-of-the-art models on 16 benchmarks. It also surpasses or competes closely with GPT-4V and Gemini Pro on 16 key tasks. InternLM-XComposer-2.5 is publicly available at https://github.com/InternLM/InternLM-XComposer.
Submitted 3 July, 2024;
originally announced July 2024.
-
Relating CNN-Transformer Fusion Network for Change Detection
Authors:
Yuhao Gao,
Gensheng Pei,
Mengmeng Sheng,
Zeren Sun,
Tao Chen,
Yazhou Yao
Abstract:
While deep learning, particularly convolutional neural networks (CNNs), has revolutionized remote sensing (RS) change detection (CD), existing approaches often miss crucial features due to neglecting global context and incomplete change learning. Additionally, transformer networks struggle with low-level details. RCTNet addresses these limitations by introducing (1) an early fusion backbone to exploit both spatial and temporal features early on, (2) a Cross-Stage Aggregation (CSA) module for enhanced temporal representation, (3) a Multi-Scale Feature Fusion (MSF) module for enriched feature extraction in the decoder, and (4) an Efficient Self-deciphering Attention (ESA) module utilizing transformers to capture global information and fine-grained details for accurate change detection. Extensive experiments demonstrate RCTNet's clear superiority over traditional RS image CD methods, showing significant improvement and an optimal balance between accuracy and computational cost.
Submitted 3 July, 2024;
originally announced July 2024.
-
Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI
Authors:
Luyi Han,
Tao Tan,
Tianyu Zhang,
Xin Wang,
Yuan Gao,
Chunyao Lu,
Xinglong Liang,
Haoran Dou,
Yunzhi Huang,
Ritse Mann
Abstract:
Adversarial learning helps generative models translate MRI from a source sequence to a target sequence when paired samples are lacking. However, implementing MRI synthesis with adversarial learning in clinical settings is challenging due to training instability and mode collapse. To address this issue, we leverage intermediate sequences to estimate the common latent space among multi-sequence MRI, enabling the reconstruction of distinct sequences from the common latent space. We propose a generative model that compresses discrete representations of each sequence to estimate the Gaussian distribution of the vector-quantized common (VQC) latent space between multiple sequences. Moreover, we improve the latent space consistency with contrastive learning and increase model stability by domain augmentation. Experiments using the BraTS2021 dataset show that our non-adversarial model outperforms other GAN-based methods, and that the VQC latent space helps our model achieve (1) anti-interference ability, which can eliminate the effects of noise, bias fields, and artifacts, and (2) solid semantic representation ability, with the potential of one-shot segmentation. Our code is publicly available.
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high-precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$, where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
Domain Generalizable Knowledge Tracing via Concept Aggregation and Relation-Based Attention
Authors:
Yuquan Xie,
Wanqi Yang,
Jinyu Wei,
Ming Yang,
Yang Gao
Abstract:
Knowledge Tracing (KT) is a critical task in online education systems, aiming to monitor students' knowledge states throughout a learning period. Common KT approaches involve predicting the probability of a student correctly answering the next question based on their exercise history. However, these methods often suffer from performance degradation when faced with the scarcity of student interactions in new education systems. To address this, we leverage student interactions from existing education systems to mitigate performance degradation caused by limited training data. Nevertheless, these interactions exhibit significant differences since they are derived from different education systems. To address this issue, we propose a domain generalization approach for knowledge tracing, where existing education systems are considered source domains, and new education systems with limited data are considered target domains. Additionally, we design a domain-generalizable knowledge tracing framework (DGKT) that can be applied to any KT model. Specifically, we present a concept aggregation approach designed to reduce conceptual disparities within sequences of student interactions from diverse domains. To further mitigate domain discrepancies, we introduce a novel normalization module called Sequence Instance Normalization (SeqIN). Moreover, to fully leverage exercise information, we propose a new knowledge tracing model tailored for the domain generalization KT task, named Domain-Generalizable Relation-based Knowledge Tracing (DGRKT). Extensive experiments across five benchmark datasets demonstrate that the proposed method performs well despite limited training data.
Submitted 2 July, 2024;
originally announced July 2024.
-
Non-degeneracy and new type of cylindrial solutions for a critical Grushin-type problem
Authors:
Yuan Gao,
Yuxia Guo,
Ning Zhou
Abstract:
In this paper, we consider a critical Grushin-type problem, which is closely related to the prescribed Webster scalar curvature problem on the CR sphere with cylindrically symmetric curvature. We first prove a non-degeneracy result through local Pohozaev identities; then, using the Lyapunov-Schmidt reduction method, we construct a new type of multi-bubbling solutions with cylindrical symmetry.
Submitted 2 July, 2024;
originally announced July 2024.
-
A Note on Improved bounds for the Oriented Radius of Mixed Multigraphs
Authors:
Hengzhe Li,
Zhiwei Ding,
Jianbing Liu,
Yanhong Gao,
Shuli Zhao
Abstract:
For a positive integer $r$, let $f(r)$ denote the smallest number such that any 2-edge connected mixed graph with radius $r$ has an oriented radius of at most $f(r)$. Recently, Babu, Benson, and Rajendraprasad significantly improved the upper bound of $f(r)$ by establishing that $f(r) \leq 1.5r^2 + r + 1$, see [Improved bounds for the oriented radius of mixed multigraphs, J. Graph Theory, 103 (2023), 674-689]. Additionally, they demonstrated that if each edge of a graph $G$ is contained within a cycle of length at most $η$, then the oriented radius of $G$ is at most $1.5rη$. The authors' results were derived through Observation 1, which served as the foundation for the development of Algorithm ORIENTOUT and Algorithm ORIENTIN. By integrating these algorithms, they obtained the improved bounds. However, an error has been identified in Observation 1, necessitating revisions to Algorithm ORIENTOUT and Algorithm ORIENTIN. In this note, we address the error and propose the necessary modifications to both algorithms, thereby ensuring the correctness of the conclusions.
Submitted 27 June, 2024;
originally announced July 2024.
-
DeepiSign-G: Generic Watermark to Stamp Hidden DNN Parameters for Self-contained Tracking
Authors:
Alsharif Abuadbba,
Nicholas Rhodes,
Kristen Moore,
Bushra Sabir,
Shuo Wang,
Yansong Gao
Abstract:
Deep learning solutions in critical domains like autonomous vehicles, facial recognition, and sentiment analysis require caution due to the severe consequences of errors. Research shows these models are vulnerable to adversarial attacks, such as data poisoning and neural trojaning, which can covertly manipulate model behavior, compromising reliability and safety. Current defense strategies like watermarking have limitations: they fail to detect all model modifications and primarily focus on attacks on CNNs in the image domain, neglecting other critical architectures like RNNs.
To address these gaps, we introduce DeepiSign-G, a versatile watermarking approach designed for comprehensive verification of leading DNN architectures, including CNNs and RNNs. DeepiSign-G enhances model security by embedding an invisible watermark within the Walsh-Hadamard transform coefficients of the model's parameters. This watermark is highly sensitive and fragile, ensuring prompt detection of any modifications. Unlike traditional hashing techniques, DeepiSign-G allows substantial metadata incorporation directly within the model, enabling detailed, self-contained tracking and verification.
We demonstrate DeepiSign-G's applicability across various architectures, including CNN models (VGG, ResNets, DenseNet) and RNNs (Text sentiment classifier). We experiment with four popular datasets: VGG Face, CIFAR10, GTSRB Traffic Sign, and Large Movie Review. We also evaluate DeepiSign-G under five potential attacks. Our comprehensive evaluation confirms that DeepiSign-G effectively detects these attacks without compromising CNN and RNN model performance, highlighting its efficacy as a robust security measure for deep learning applications. Detection of integrity breaches is nearly perfect, while hiding only a bit in approximately 1% of the Walsh-Hadamard coefficients.
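The role of the Walsh-Hadamard transform mentioned above can be illustrated with a minimal sketch. This is not the DeepiSign-G embedding scheme, which the abstract does not specify; it only shows, under assumed names (`fwht`, `ifwht`) and a made-up block of 16 random values standing in for model parameters, why such coefficients react to any tampering: every coefficient is a signed sum of all inputs, so changing one parameter shifts every coefficient.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^k vector."""
    a = np.asarray(x, dtype=float).copy()
    n = a.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h into sums and differences.
        for start in range(0, n, 2 * h):
            top = a[start:start + h].copy()
            bottom = a[start + h:start + 2 * h].copy()
            a[start:start + h] = top + bottom
            a[start + h:start + 2 * h] = top - bottom
        h *= 2
    return a

def ifwht(c):
    """Inverse transform: the unnormalized transform is its own inverse up to 1/n."""
    return fwht(c) / len(c)

rng = np.random.default_rng(0)
weights = rng.normal(size=16)          # hypothetical stand-in for model parameters
coeffs = fwht(weights)

tampered = weights.copy()
tampered[3] += 1e-3                    # a single small modification
changed = int(np.count_nonzero(np.abs(fwht(tampered) - coeffs) > 1e-6))
print(changed, "of", coeffs.size, "coefficients changed")
```

Because each row of the Hadamard matrix has entries of only +1 or -1, the single perturbation appears in all coefficients with the same magnitude, which is the property a fragile watermark in this domain can exploit.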
Submitted 1 July, 2024;
originally announced July 2024.
-
Human-like object concept representations emerge naturally in multimodal large language models
Authors:
Changde Du,
Kaicheng Fu,
Bincheng Wen,
Yi Sun,
Jie Peng,
Wei Wei,
Ying Gao,
Shengpei Wang,
Chuncheng Zhang,
Jinpeng Li,
Shuang Qiu,
Le Chang,
Huiguang He
Abstract:
The conceptualization and categorization of natural objects in the human mind have long intrigued cognitive scientists and neuroscientists, offering crucial insights into human perception and cognition. Recently, the rapid development of Large Language Models (LLMs) has raised the attractive question of whether these models can also develop human-like object representations through exposure to vast amounts of linguistic and multimodal data. In this study, we combined behavioral and neuroimaging analysis methods to uncover how the object concept representations in LLMs correlate with those of humans. By collecting large-scale datasets of 4.7 million triplet judgments from LLM and Multimodal LLM (MLLM), we were able to derive low-dimensional embeddings that capture the underlying similarity structure of 1,854 natural objects. The resulting 66-dimensional embeddings were found to be highly stable and predictive, and exhibited semantic clustering akin to human mental representations. Interestingly, the interpretability of the dimensions underlying these embeddings suggests that LLM and MLLM have developed human-like conceptual representations of natural objects. Further analysis demonstrated strong alignment between the identified model embeddings and neural activity patterns in many functionally defined brain ROIs (e.g., EBA, PPA, RSC and FFA). This provides compelling evidence that the object representations in LLMs, while not identical to those in the human, share fundamental commonalities that reflect key schemas of human conceptual knowledge. This study advances our understanding of machine intelligence and informs the development of more human-like artificial cognitive systems.
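For a sense of scale, the figures quoted above can be compared with the number of possible triplets. The short calculation below is only a back-of-the-envelope sketch: it assumes a triplet is an unordered choice of three distinct objects and that the 4.7 million judgments cover distinct triplets, neither of which the abstract states.

```python
import math

n_objects = 1854            # natural objects, from the abstract
judgments = 4_700_000       # triplet judgments, from the abstract

# Assumption: a triplet is an unordered set of three distinct objects.
total_triplets = math.comb(n_objects, 3)
fraction = judgments / total_triplets

print(f"possible triplets: {total_triplets:,}")
print(f"fraction sampled:  {fraction:.2%}")
```

Under these assumptions the collected judgments cover well under one percent of all possible triplets, which is why a low-dimensional embedding is needed to generalize across the full similarity structure.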
Submitted 1 July, 2024;
originally announced July 2024.
-
Periodic domain inversion in single crystal barium titanate-on-insulator thin film
Authors:
Pragati Aashna,
Hong-Lin Lin,
Yu Cao,
Yuhui Yin,
Yuan Gao,
Sakthi Sanjeev Mohanraj,
Di Zhu,
Aaron Danner
Abstract:
We report the first experimental demonstration of electric-field periodic poling of single-crystal barium titanate (BTO, or BaTiO3) thin film on insulator. Owing to the outstanding optical nonlinearities of BTO, this result is a key step towards achieving quasi-phase-matching in BTO. We first grow the BTO thin film on a dysprosium scandate substrate using pulsed laser deposition, with a thin layer of strontium ruthenate later serving as the bottom electrode for poling. We characterize the BTO thin film using x-ray diffraction and piezo-response force microscopy to clearly demonstrate single-crystal, single-domain growth of the film, which enables the desired periodic poling. To investigate the poling quality, we apply both non-destructive piezo-response force microscopy and destructive etching-assisted scanning electron microscopy, and we show that high-quality, uniform and intransient poling with 50 % duty cycle and periods ranging from 2 μm to 10 μm is achieved. The successful realization of periodic poling in BTO thin film unlocks the potential for highly efficient nonlinear processes under quasi-phase-matching, which seemed far-fetched with prior polycrystalline BTO thin films that predominantly relied on efficiency-limited random or non-phase-matching conditions. It is a key step towards the integration of BTO photonic devices.
Submitted 1 July, 2024;
originally announced July 2024.
-
SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models
Authors:
Zheng Lin,
Xuanjie Hu,
Yuxin Zhang,
Zhe Chen,
Zihan Fang,
Xianhao Chen,
Ang Li,
Praneeth Vepakomma,
Yue Gao
Abstract:
The scalability of large language models (LLMs) in handling high-complexity models and large-scale datasets has led to tremendous successes in pivotal domains. While there is an urgent need to acquire more training data for LLMs, a concerning reality is the depletion of high-quality public datasets within a few years. In view of this, the federated learning (FL) LLM fine-tuning paradigm has recently been proposed to facilitate collaborative LLM fine-tuning on distributed private data, where multiple data owners collaboratively fine-tune a shared LLM without sharing raw data. However, the staggering model size of LLMs imposes heavy computing and communication burdens on clients, posing significant barriers to the democratization of the FL LLM fine-tuning paradigm. To address this issue, split learning (SL) has emerged as a promising solution: it offloads the primary training workload to a server via model partitioning and exchanges activations and their gradients, which are smaller in size, rather than the entire LLM. Unfortunately, research on the SL LLM fine-tuning paradigm is still in its nascent stage. To fill this gap, in this paper, we propose the first SL LLM fine-tuning framework, named SplitLoRA. SplitLoRA is built on the split federated learning (SFL) framework, amalgamating the advantages of parallel training from FL and model splitting from SL and thus greatly enhancing the training efficiency. It is worth noting that SplitLoRA is the inaugural open-source benchmark for SL LLM fine-tuning, providing a foundation for research efforts dedicated to advancing SL LLM fine-tuning. Extensive simulations validate that SplitLoRA achieves target accuracy in significantly less time than state-of-the-art LLM fine-tuning frameworks, demonstrating the superior training performance of SplitLoRA. The project page is available at https://fduinc.github.io/splitlora/.
Submitted 1 July, 2024;
originally announced July 2024.
-
SpectralKAN: Kolmogorov-Arnold Network for Hyperspectral Images Change Detection
Authors:
Yanheng Wang,
Xiaohan Yu,
Yongsheng Gao,
Jianjun Sha,
Jian Wang,
Lianru Gao,
Yonggang Zhang,
Xianhui Rong
Abstract:
It has been verified that deep learning methods, including convolutional neural networks (CNNs), graph neural networks (GNNs), and transformers, can accurately extract features from hyperspectral images (HSIs). These algorithms perform exceptionally well on HSIs change detection (HSIs-CD). However, the downside of these impressive results is the enormous number of parameters, FLOPs, GPU memory, and training and test times required. In this paper, we propose a spectral Kolmogorov-Arnold Network for HSIs-CD (SpectralKAN). SpectralKAN represents a multivariate continuous function as a composition of activation functions to extract HSI features and perform classification. These activation functions are B-spline functions with different parameters that can simulate various functions. In SpectralKAN, a KAN encoder is proposed to enhance computational efficiency for HSIs. In addition, a spatial-spectral KAN encoder is introduced, where the spatial KAN encoder extracts spatial features and compresses the spatial dimensions from patch size to one. The spectral KAN encoder then extracts spectral features and classifies them into changed and unchanged categories. We use five HSIs-CD datasets to verify the effectiveness of SpectralKAN. Experimental verification has shown that SpectralKAN maintains high HSIs-CD accuracy while requiring fewer parameters, FLOPs, GPU memory, and training and testing times, thereby increasing the efficiency of HSIs-CD. The code will be available at https://github.com/yanhengwang-heu/SpectralKAN.
Submitted 1 July, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
MOSP: A User-interface Package for Simulating Metal Nanoparticle Structure and Reactivity under Operando Conditions
Authors:
Lei Ying,
Beien Zhu,
Yi Gao
Abstract:
Structures of metal nanoparticles (NPs) significantly influence their catalytic reactivities. Recent in situ experimental observations of dramatic structural changes in NPs have underscored the need to establish a dynamic structure-property relationship that accounts for the reconstruction of NPs in reactive environments. Here, we present MOSP, a free and open-source graphical user interface (GUI) package designed to simulate the structure and reactivity of metal NPs under operando conditions. MOSP integrates two models: the multiscale structure reconstruction (MSR) model, which predicts equilibrium metal NP structures under specific reaction conditions, and the kinetic Monte Carlo (KMC) model, which simulates the reaction dynamics. This combination allows for the exploration of the dynamic structure-property relationships of NPs. MOSP enhances user accessibility through its intuitive GUI, facilitating easy input, post-processing, and visualization of simulation data. This article is the release note of MOSP, focusing on its implementation and functionality.
Submitted 1 July, 2024; v1 submitted 28 June, 2024;
originally announced June 2024.
-
Scalable and Domain-General Abstractive Proposition Segmentation
Authors:
Mohammad Javad Hosseini,
Yang Gao,
Tim Baumgärtner,
Alex Fabrikant,
Reinald Kim Amplayo
Abstract:
Segmenting text into fine-grained units of meaning is important to a wide range of NLP applications. The default approach of segmenting text into sentences is often insufficient, especially since sentences are usually complex enough to include multiple units of meaning that merit separate treatment in the downstream task. We focus on the task of abstractive proposition segmentation: transforming text into simple, self-contained, well-formed sentences. Several recent works have demonstrated the utility of proposition segmentation with few-shot prompted LLMs for downstream tasks such as retrieval-augmented grounding and fact verification. However, this approach does not scale to large amounts of text and may not always extract all the facts from the input text. In this paper, we first introduce evaluation metrics for the task to measure several dimensions of quality. We then propose a scalable, yet accurate, proposition segmentation model. We model proposition segmentation as a supervised task by training LLMs on existing annotated datasets and show that training yields significantly improved results. We further show that by using the fine-tuned LLMs as teachers for annotating large amounts of multi-domain synthetic distillation data, we can train smaller student models with results similar to the teacher LLMs. We then demonstrate that our technique leads to effective domain generalization, by annotating data in two domains outside the original training data and evaluating on them. Finally, as a key contribution of the paper, we share an easy-to-use API for NLP practitioners to use.
Submitted 28 June, 2024;
originally announced June 2024.
-
Propagating Kink Waves in an Open Coronal Magnetic Flux Tube with Gravitational Stratification: Magnetohydrodynamic Simulation and Forward Modelling
Authors:
Yuhang Gao,
Tom Van Doorsselaere,
Hui Tian,
Mingzhe Guo,
Konstantinos Karampelas
Abstract:
Context. In the coronal open-field regions, such as coronal holes, there are many transverse waves propagating along magnetic flux tubes, generally interpreted as kink waves. Previous studies have highlighted their potential in coronal heating, solar wind acceleration, and seismological diagnostics of various physical parameters. Aims. This study aims to investigate propagating kink waves, considering both vertical and horizontal density inhomogeneity, using three-dimensional magnetohydrodynamic (MHD) simulations. Methods. We establish a 3D MHD model of a gravitationally stratified open flux tube, incorporating a velocity driver at the lower boundary to excite propagating kink waves. Forward modelling is conducted to synthesise observational signatures of the Fe IX 17.1 nm line. Results. It is found that resonant absorption and density stratification both affect the wave amplitude. When diagnosing the relative density profile with velocity amplitude, resonant damping needs to be properly considered to avoid possible underestimation. In addition, unlike standing modes, propagating waves are believed to be Kelvin-Helmholtz stable. In the presence of vertical stratification, however, phase mixing of transverse motions around the tube boundary can still induce small scales, partially dissipating wave energy and leading to a temperature increase, especially at higher altitudes. Moreover, the forward modelling reveals the promising potential of future coronal imaging spectrometers such as MUSE in resolving these wave-induced signatures. Also, the synthesised intensity signals exhibit apparent periodic variations, offering a potential method to indirectly observe propagating kink waves with current EUV imagers.
Submitted 27 June, 2024;
originally announced June 2024.
-
The Belle II Detector Upgrades Framework Conceptual Design Report
Authors:
H. Aihara,
A. Aloisio,
D. P. Auguste,
M. Aversano,
M. Babeluk,
S. Bahinipati,
Sw. Banerjee,
M. Barbero,
J. Baudot,
A. Beaubien,
F. Becherer,
T. Bergauer,
F. U. Bernlochner,
V. Bertacchi,
G. Bertolone,
C. Bespin,
M. Bessner,
S. Bettarini,
A. J. Bevan,
B. Bhuyan,
M. Bona,
J. F. Bonis,
J. Borah,
F. Bosi,
R. Boudagga
, et al. (186 additional authors not shown)
Abstract:
We describe the planned near-term and potential longer-term upgrades of the Belle II detector at the SuperKEKB electron-positron collider operating at the KEK laboratory in Tsukuba, Japan. These upgrades will allow increasingly sensitive searches for possible new physics beyond the Standard Model in flavor, tau, electroweak and dark sector physics that are both complementary to and competitive with the LHC and other experiments.
Submitted 4 July, 2024; v1 submitted 26 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
Evidential Concept Embedding Models: Towards Reliable Concept Explanations for Skin Disease Diagnosis
Authors:
Yibo Gao,
Zheyao Gao,
Xin Gao,
Yuanye Liu,
Bomin Wang,
Xiahai Zhuang
Abstract:
Due to the high stakes in medical decision-making, there is a compelling demand for interpretable deep learning methods in medical image analysis. Concept Bottleneck Models (CBM) have emerged as an active interpretable framework incorporating human-interpretable concepts into decision-making. However, their concept predictions may lack reliability when applied to clinical diagnosis, degrading the quality of concept explanations. To address this, we propose an evidential Concept Embedding Model (evi-CEM), which employs evidential learning to model the concept uncertainty. Additionally, we propose leveraging the concept uncertainty to rectify concept misalignments that arise when training CBMs using vision-language models without complete concept supervision. With the proposed methods, we can enhance the reliability of concept explanations in both supervised and label-efficient settings. Furthermore, we introduce concept uncertainty for effective test-time intervention. Our evaluation demonstrates that evi-CEM achieves superior performance in terms of concept prediction, and the proposed concept rectification effectively mitigates concept misalignments for label-efficient training. Our code is available at https://github.com/obiyoag/evi-CEM.
Submitted 27 June, 2024;
originally announced June 2024.
-
Exotic 4f Correlated Electronic States of Ferromagnetic Kondo Lattice Compounds ReRh$_6$Ge$_4$ (Re=Ce, Ho, Er, Tm)
Authors:
Yu Gao,
Jun Jiang,
Haiyan Lu,
Qiaoni Chen
Abstract:
CeRh$_6$Ge$_4$ stands out as the first stoichiometric metallic compound with a ferromagnetic quantum critical point, thereby garnering significant attention. Ferromagnetic Kondo lattice compounds ReRh$_6$Ge$_4$ (Re=Ce, Ho, Er, Tm) have been systematically investigated with density functional theory incorporating Coulomb interaction U and spin-orbit coupling. We determined that the magnetic easy axis of CeRh$_6$Ge$_4$ lies within the ab plane, in agreement with previous magnetization measurements conducted under external magnetic field and muSR experiments. We also predicted the magnetic easy axes for the other three compounds. For TmRh$_6$Ge$_4$, the magnetic easy axis aligns along the c axis, thus preserving the $C_3$ rotational symmetry about the c axis. Notably, there are triply degenerate nodal points along the $Γ-A$ direction in the band structure including spin-orbit coupling. A possible localized to itinerant crossover is revealed as the number of $4f$ electrons increases from CeRh$_6$Ge$_4$ to TmRh$_6$Ge$_4$. Specifically, the $4f$ electrons of TmRh$_6$Ge$_4$ contribute to the formation of a large Fermi surface, indicating their participation in the conduction process. Conversely, the $4f$ electrons in HoRh$_6$Ge$_4$, ErRh$_6$Ge$_4$ and CeRh$_6$Ge$_4$ remain localized, which results in smaller Fermi surfaces for these compounds. These theoretical investigations of electronic structure and magnetic properties shed light on the unique nature of $4f$ electrons, providing critical predictions for subsequent experimental studies.
Submitted 27 June, 2024;
originally announced June 2024.
-
Towards Human-Level 3D Relative Pose Estimation: Generalizable, Training-Free, with Single Reference
Authors:
Yuan Gao,
Yajing Luo,
Junhong Wang,
Kui Jia,
Gui-Song Xia
Abstract:
Humans can easily deduce the relative pose of an unseen object, without label/training, given only a single query-reference image pair. This is arguably achieved by incorporating (i) 3D/2.5D shape perception from a single image, (ii) render-and-compare simulation, and (iii) rich semantic cue awareness to furnish (coarse) reference-query correspondence. Existing methods implement (i) by a 3D CAD model or well-calibrated multiple images and (ii) by training a network on specific objects, which necessitate laborious ground-truth labeling and tedious training, potentially leading to challenges in generalization. Moreover, (iii) has been less exploited in the paradigm of (ii), even though the coarse correspondence from (iii) enhances the compare process by filtering out non-overlapped parts under substantial pose differences/occlusions. Motivated by this, we propose a novel 3D generalizable relative pose estimation method by elaborating (i) with a 2.5D shape from an RGB-D reference, (ii) with an off-the-shelf differentiable renderer, and (iii) with semantic cues from a pretrained model like DINOv2. Specifically, our differentiable renderer takes the 2.5D rotatable mesh textured by the RGB and the semantic maps (obtained by DINOv2 from the RGB input), then renders new RGB and semantic maps (with back-surface culling) under a novel rotated view. The refinement loss comes from comparing the rendered RGB and semantic maps with the query ones, back-propagating the gradients through the differentiable renderer to refine the 3D relative pose. As a result, our method can be readily applied to unseen objects, given only a single RGB-D reference, without label/training. Extensive experiments on LineMOD, LM-O, and YCB-V show that our training-free method significantly outperforms the SOTA supervised methods, especially under the rigorous Acc@5/10/15° metrics and the challenging cross-dataset settings.
Submitted 26 June, 2024;
originally announced June 2024.
-
A Pre-trained Deep Potential Model for Sulfide Solid Electrolytes with Broad Coverage and High Accuracy
Authors:
Ruoyu Wang,
Mingyu Guo,
Yuxiang Gao,
Xiaoxu Wang,
Yuzhi Zhang,
Bin Deng,
Xin Chen,
Mengchao Shi,
Linfeng Zhang,
Zhicheng Zhong
Abstract:
Solid electrolytes with fast ion transport are one of the key challenges for solid state lithium metal batteries. To improve ion conductivity, chemical doping has been the most effective strategy, and atomistic simulation with machine-learning potentials helps find optimized doping by predicting ion conductivity for arbitrary compositions. Yet most existing machine-learning models are trained on narrow chemistry, and a new model has to be trained for each system, wasting transferable knowledge and incurring significant cost. Here, we propose a pre-trained deep potential model with an attention mechanism, purpose-built for sulfide electrolytes and known as DPA-SSE. The training set encompasses 15 elements and consists of both equilibrium and extensive out-of-equilibrium configurations. DPA-SSE achieves a high energy resolution of less than 2 meV/atom for dynamical trajectories up to 1150 K, and reproduces experimental ion conductivity of sulfide electrolytes with remarkable accuracy. DPA-SSE exhibits good transferability, covering a range of complex electrolytes with mixes of cation and anion atoms. Highly efficient dynamical simulation with DPA-SSE can be realized by model distillation, which generates a faster model for given systems. DPA-SSE also serves as a platform for continuous learning, and fine-tuning the model requires only a portion of the downstream data. These results demonstrate the possibility of a new pathway for AI-driven development of solid electrolytes with exceptional performance.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
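As a rough illustration of the asymmetry definition quoted in the abstract above, the sketch below evaluates $R = (\mathcal{B}_S - \mathcal{B}_L)/(\mathcal{B}_S + \mathcal{B}_L)$ and propagates uncertainties to first order. It assumes the two branching fractions are uncorrelated, which is a simplification and not necessarily what the collaboration did; the function names `combine_in_quadrature` and `ks_kl_asymmetry` are hypothetical.

```python
import math


def combine_in_quadrature(*uncertainties):
    """Combine independent uncertainties (e.g. statistical and systematic)."""
    return math.sqrt(sum(u * u for u in uncertainties))


def ks_kl_asymmetry(b_s, sigma_s, b_l, sigma_l):
    """Return (R, sigma_R) for R = (B_S - B_L) / (B_S + B_L).

    Assumption: B_S and B_L are uncorrelated, so first-order propagation uses
    dR/dB_S = 2*B_L / (B_S + B_L)^2 and dR/dB_L = -2*B_S / (B_S + B_L)^2.
    """
    total = b_s + b_l
    r = (b_s - b_l) / total
    sigma_r = 2.0 / total**2 * math.sqrt((b_l * sigma_s) ** 2 + (b_s * sigma_l) ** 2)
    return r, sigma_r


# Total uncertainty on B(Lambda_c+ -> p K_L0) from the abstract, in percent:
# statistical 0.06 and systematic 0.04 combined in quadrature.
sigma_l_total = combine_in_quadrature(0.06, 0.04)
```

The $K_S^0$ branching fractions are not quoted in the abstract, so they would have to be supplied from an external source before calling `ks_kl_asymmetry`.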
Submitted 26 June, 2024;
originally announced June 2024.
-
Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification
Authors:
Yaqian Hao,
Chenguang Hu,
Yingying Gao,
Shilei Zhang,
Junlan Feng
Abstract:
The diverse nature of dialects presents challenges for models trained on specific linguistic patterns, rendering them susceptible to errors when confronted with unseen or out-of-distribution (OOD) data. This study introduces a novel margin-enhanced joint energy model (MEJEM) tailored specifically for OOD detection in dialects. By integrating a generative model and the energy margin loss, our approach aims to enhance the robustness of dialect identification systems. Furthermore, we explore two OOD scores for OOD dialect detection, and our findings conclusively demonstrate that the energy score outperforms the softmax score. Leveraging Sharpness-Aware Minimization to optimize the training process of the joint model, we enhance model generalization by minimizing both loss and sharpness. Experiments conducted on dialect identification tasks validate the efficacy of Energy-Based Models and provide valuable insights into their performance.
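To make the comparison between the two OOD scores in the abstract above concrete, here is a minimal sketch of how an energy score and a softmax score are commonly computed from classifier logits. It follows the standard definitions from the energy-based OOD literature and is not taken from the MEJEM implementation; the function names and the `temperature` parameter are assumptions.

```python
import numpy as np


def logsumexp(logits, axis=-1):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(logits, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(logits - m), axis=axis))


def energy_score(logits, temperature=1.0):
    """Negative free energy T * logsumexp(logits / T); higher means more in-distribution."""
    return temperature * logsumexp(logits / temperature)


def softmax_score(logits):
    """Maximum softmax probability; higher means more in-distribution."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    p = np.exp(z) / np.sum(np.exp(z), axis=-1, keepdims=True)
    return np.max(p, axis=-1)


def is_ood(scores, threshold):
    """Flag samples whose score falls below a chosen threshold (hypothetical rule)."""
    return scores < threshold
```

One property worth noting: adding a constant to every logit leaves the softmax score unchanged but shifts the energy score, so the energy score retains information about logit magnitude that the softmax score discards.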
Submitted 26 June, 2024;
originally announced June 2024.
-
On Calibration of Speech Classification Models: Insights from Energy-Based Model Investigations
Authors:
Yaqian Hao,
Chenguang Hu,
Yingying Gao,
Shilei Zhang,
Junlan Feng
Abstract:
For speech classification tasks, deep learning models often achieve high accuracy but exhibit shortcomings in calibration, manifesting as overconfident classifiers. The significance of calibration lies in its critical role in guaranteeing the reliability of decision-making within deep learning systems. This study explores the effectiveness of Energy-Based Models (EBMs) in calibrating confidence for speech classification tasks by training a joint EBM integrating a discriminative and a generative model, thereby enhancing the classifiers' calibration and mitigating overconfidence. Experimental evaluations are conducted on three speech classification tasks: age, emotion, and language recognition. Our findings highlight the competitive performance of EBMs in calibrating the speech classification models. This research emphasizes the potential of EBMs in speech classification tasks, demonstrating their ability to enhance calibration without sacrificing accuracy.
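Calibration of the kind discussed in the abstract above is often summarized by the expected calibration error (ECE). The abstract does not say which metric the authors report, so the sketch below is only an illustration of the common equal-width-bin ECE; the function name and the default of 10 bins are assumptions.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted confidence per sample, in (0, 1].
    correct: 1 if the prediction was right, else 0.
    Bins are half-open (lo, hi], so a confidence of exactly 0 is not counted.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not np.any(in_bin):
            continue
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        ece += in_bin.mean() * gap  # in_bin.mean() is the fraction of samples in the bin
    return float(ece)
```

An overconfident classifier, for example one that reports 0.95 confidence while being right half the time, yields a large ECE; a perfectly calibrated one yields zero.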
Submitted 26 June, 2024;
originally announced June 2024.
-
Probing many-body Bell correlation depth with superconducting qubits
Authors:
Ke Wang,
Weikang Li,
Shibo Xu,
Mengyao Hu,
Jiachen Chen,
Yaozu Wu,
Chuanyu Zhang,
Feitong Jin,
Xuhao Zhu,
Yu Gao,
Ziqi Tan,
Aosai Zhang,
Ning Wang,
Yiren Zou,
Tingting Li,
Fanhao Shen,
Jiarun Zhong,
Zehang Bao,
Zitian Zhu,
Zixuan Song,
Jinfeng Deng,
Hang Dong,
Xu Zhang,
Pengfei Zhang,
Wenjie Jiang
, et al. (10 additional authors not shown)
Abstract:
Quantum nonlocality describes a stronger form of quantum correlation than that of entanglement. It refutes Einstein's belief of local realism and is among the most distinctive and enigmatic features of quantum mechanics. It is a crucial resource for achieving quantum advantages in a variety of practical applications, ranging from cryptography and certified random number generation via self-testing to machine learning. Nevertheless, the detection of nonlocality, especially in quantum many-body systems, is notoriously challenging. Here, we report an experimental certification of genuine multipartite Bell correlations, which signal nonlocality in quantum many-body systems, up to 24 qubits with a fully programmable superconducting quantum processor. In particular, we employ energy as a Bell correlation witness and variationally decrease the energy of a many-body system across a hierarchy of thresholds, below which an increasing Bell correlation depth can be certified from experimental data. As an illustrative example, we variationally prepare the low-energy state of a two-dimensional honeycomb model with 73 qubits and certify its Bell correlations by measuring an energy that surpasses the corresponding classical bound by up to 48 standard deviations. In addition, we variationally prepare a sequence of low-energy states and certify their genuine multipartite Bell correlations up to 24 qubits via energies measured efficiently by parity oscillation and multiple quantum coherence techniques. Our results establish a viable approach for preparing and certifying multipartite Bell correlations, which provide not only a finer benchmark beyond entanglement for quantum devices, but also a valuable guide towards exploiting multipartite Bell correlation in a wide spectrum of practical applications.
Submitted 25 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
Submitted 25 June, 2024;
originally announced June 2024.
-
Extended alternating structure-adapted proximal gradient algorithm for nonconvex nonsmooth problems
Authors:
Ying Gao,
Chunfeng Cui,
Wenxing Zhang,
Deren Han
Abstract:
The alternating structure-adapted proximal (ASAP) gradient algorithm (M. Nikolova and P. Tan, SIAM J Optim, 29:2053-2078, 2019) has drawn much attention due to its efficiency in solving nonconvex nonsmooth optimization problems. However, the multiblock nonseparable structure limits the performance of ASAP on far-reaching practical problems, e.g., coupled tensor decomposition. In this paper, we propose an extended ASAP (eASAP) algorithm for nonconvex nonsmooth optimization whose objective is the sum of two nonseparable functions and a coupling one. By exploiting the blockwise restricted prox-regularity, eASAP is capable of minimizing the objective whose coupling function is multiblock nonseparable. Moreover, we analyze the global convergence of eASAP by virtue of the Aubin property on partial subdifferential mapping and the Kurdyka-Łojasiewicz property on the objective. Furthermore, the sublinear convergence rate of eASAP is built upon the proximal point algorithmic framework under some mild conditions. Numerical simulations on multimodal data fusion demonstrate the compelling performance of the proposed method.
Submitted 24 June, 2024;
originally announced June 2024.
-
Probing the nature of the $χ_{c1}(3872)$ state using radiative decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1094 additional authors not shown)
Abstract:
The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the nature of the $χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an integrated luminosity of 9 fb$^{-1}$. Using the $B^+\rightarrow χ_{c1}(3872)K^+$ decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and the ratio of its partial width to that of the $χ_{c1}(3872)\rightarrow J/ψγ$ decay is measured to be $$ \frac{Γ_{χ_{c1}(3872)\rightarrow ψ(2S)γ}}{Γ_{χ_{c1}(3872)\rightarrow J/ψγ}} = 1.67 \pm 0.21 \pm 0.12 \pm 0.04 , $$ where the first uncertainty is statistical, the second systematic and the third is due to the uncertainties on the branching fractions of the $ψ(2S)$ and $J/ψ$ mesons. The measured ratio makes the interpretation of the $χ_{c1}(3872)$ state as a pure $D^0\bar{D}^{*0}+\bar{D}^0D^{*0}$ molecule questionable and strongly indicates a sizeable compact charmonium or tetraquark component within the $χ_{c1}(3872)$ state.
Submitted 24 June, 2024;
originally announced June 2024.
-
Light-induced percolative topological phase transition in type-II Weyl semimetal WTe2
Authors:
Xiaoyue Zhou,
Fu Deng,
Yifan Gao,
Yi Chan,
Shulei Li,
Ning Wang,
Junwei Liu,
Jingdi Zhang
Abstract:
We report on an ultrafast terahertz free-carrier dynamics study of a photo-excited WTe2 thin film. In the photo-excited state, we observe an anomalous metastable electronic state, featuring a negative differential terahertz photoconductivity. Detailed electrodynamics analysis and first-principles calculations attribute it to a light-induced topological phase transition that reduces the density of states near the Fermi level. Furthermore, the emergence of an unconventional temporal isosbestic point marks a dynamic universality, strongly suggesting a percolative interaction between the two topologically distinct phases.
Submitted 24 June, 2024;
originally announced June 2024.
-
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
Authors:
Yifei Gao,
Jie Ou,
Lei Wang,
Yuting Xiao,
Zhiyuan Xiang,
Ruiting Dai,
Jun Cheng
Abstract:
Emergent Large Language Models (LLMs) distinguish themselves from traditional language models through their extraordinary performance and powerful deduction capacity. However, the computational and storage costs of these LLMs are enormous, which has made quantization a topic of growing interest. To address the accuracy decay caused by quantization, two streams of work in post-training quantization stand out. One uses other weights to compensate for existing quantization error, while the other transfers the quantization difficulty to other parts of the model. Combining the merits of both, we introduce Learnable Singular value Increment (LSI) as an advanced solution. LSI uses Singular Value Decomposition to extract the singular values of the weights and makes them learnable, helping weights compensate each other conditioned on activation. Incorporating LSI with existing techniques, we achieve state-of-the-art performance in diverse quantization settings, whether in weight-only, weight-activation, or extremely low-bit scenarios. By unleashing the potential of LSI, efficient finetuning on quantized models is no longer a prohibitive problem.
Submitted 23 June, 2024;
originally announced June 2024.
-
Decoupling Many-Body Interactions in CeO2 (111) Oxygen Vacancy Structure: Insights from Machine-Learning and Cluster Expansion
Authors:
Yujing Zhang,
Zhong-Kang Han,
Beien Zhu,
Xiaojuan Hu,
Maria Troppenz,
Santiago Riga-monti,
Hui Li,
Claudia Draxl,
M. Verónica Ganduglia-Pirovano,
Yi Gao
Abstract:
Oxygen vacancies (VO's) are of paramount importance in influencing the properties and applications of ceria (CeO2). Yet, comprehending the distribution and nature of the VO's poses a significant challenge due to the vast number of electronic configurations and intricate many-body interactions among VO's and polarons (Ce3+'s). In this study, we combined LASSO regression from machine learning with a cluster expansion model and first-principles calculations to decouple the interactions among the Ce3+'s and VO's, thereby circumventing the limitations associated with sampling electronic configurations. By separating these interactions, we identified specific electronic configurations characterized by the most favorable VO-Ce3+ attractions and the least Ce3+-Ce3+/VO-VO repulsions, which are crucial in determining the stability of vacancy structures. Through more than 10^8 Metropolis Monte Carlo samplings of VO's and Ce3+'s in the near-surface of CeO2(111), we explored potential configurations within an 8x8 supercell. Our findings revealed that oxygen vacancies tend to aggregate and are most abundant in the third oxygen layer, primarily due to extensive geometric relaxation, an aspect previously overlooked. This behavior is notably dependent on the concentration of VO's. This work introduces a novel theoretical framework for unraveling the complex vacancy structures in metal oxides, with potential applications in redox and catalytic chemistry.
Submitted 22 June, 2024;
originally announced June 2024.
-
Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior
Authors:
Junbo Peng,
Yuan Gao,
Chih-Wei Chang,
Richard Qiu,
Tonghe Wang,
Aparna Kesarwala,
Kailin Yang,
Jacob Scott,
David Yu,
Xiaofeng Yang
Abstract:
Background: Cone-beam computed tomography (CBCT) scans, performed fractionally (e.g., daily or weekly), are widely utilized for patient alignment in the image-guided radiotherapy (IGRT) process, making CBCT a potential imaging modality for the implementation of adaptive radiotherapy (ART) protocols. Nonetheless, significant artifacts and incorrect Hounsfield unit (HU) values hinder the application of CBCT scans in quantitative tasks such as target and organ segmentation and dose calculation. Therefore, acquiring CT-quality images from the CBCT scans is essential to implement online ART in clinical settings.
Purpose: This work aims to develop an unsupervised learning method using the patient-specific diffusion model for CBCT-based synthetic CT (sCT) generation to improve the image quality of CBCT.
Methods: The proposed method is in an unsupervised framework that utilizes a patient-specific score-based model as the image prior alongside a customized total variation (TV) regularization to enforce coherence across different transverse slices. The score-based model is unconditionally trained using the same patient's planning CT (pCT) images to characterize the manifold of CT-quality images and capture the unique anatomical information of the specific patient. The efficacy of the proposed method was assessed on images from anatomical sites including head and neck (H&N) cancer, pancreatic cancer, and lung cancer. The performance of the proposed CBCT correction method was evaluated using quantitative metrics including mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and normalized cross-correlation (NCC). Additionally, the proposed algorithm was benchmarked against two other unsupervised diffusion model-based CBCT correction algorithms.
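The three evaluation metrics named in the Methods paragraph above can be sketched as follows. The abstract does not give the exact definitions used, so this assumes the common forms: mean absolute error, PSNR with an explicit `data_range`, and zero-mean normalized cross-correlation. All function names and the `data_range` parameter are assumptions.

```python
import numpy as np


def mae(image, reference):
    """Mean absolute error, in the same units as the inputs (e.g. HU)."""
    return float(np.mean(np.abs(image - reference)))


def psnr(image, reference, data_range):
    """Peak signal-to-noise ratio in dB for a given dynamic range."""
    mse = np.mean((image - reference) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range**2 / mse))


def ncc(image, reference):
    """Zero-mean normalized cross-correlation, in [-1, 1]."""
    a = image - image.mean()
    b = reference - reference.mean()
    return float(np.sum(a * b) / (np.sqrt(np.sum(a**2)) * np.sqrt(np.sum(b**2))))
```

Note that a constant intensity offset between the two images changes MAE and PSNR but leaves this form of NCC at 1, which is one reason the metrics are usually reported together.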
Submitted 21 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies of 4.914 and 4.946 GeV by the BESIII detector, a search for the $e^+e^- \to φχ_{c1}(3872)$ process is performed for the first time. No significant signal is observed, and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information on the production of the $χ_{c1}(3872)$ at $e^+e^-$ colliders and deepen our understanding of the nature of this particle.
Submitted 21 June, 2024;
originally announced June 2024.
-
Thermal activated detection of dark particles in a weakly coupled quantum Ising ladder
Authors:
Yunjing Gao,
Jiahao Yang,
Huihang Lin,
Rong Yu,
Jianda Wu
Abstract:
The Ising$_h^2$ integrable field theory, which emerges when two quantum critical Ising chains are weakly coupled, possesses eight types of relativistic particles whose mass spectrum and scattering matrices are organized by the $\mathcal{D}_8^{(1)}$ algebra. It is predicted that all odd-parity particles are dark and cannot be directly excited from the ground state, which makes them hard to detect. Here, we study the local dynamical spin structure factor of the model at low frequencies and low temperatures. In contrast to the invisibility of the dark particles in THz spectroscopy or inelastic neutron scattering measurements, we find that the lightest dark particle is detectable, manifested as a thermal activation gap in nuclear magnetic resonance measurements. Our results provide a practical criterion for verifying the existence of dark particles.
Submitted 21 June, 2024;
originally announced June 2024.