-
Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing
Authors:
Jun Zhu,
Zihao Du,
Haotian Xu,
Fengbo Lan,
Zilong Zheng,
Bo Ma,
Shengjie Wang,
Tao Zhang
Abstract:
Task-aware navigation continues to be a challenging area of research, especially in scenarios involving open vocabulary. Previous studies primarily focus on finding suitable locations for task completion, often overlooking the importance of the robot's pose. However, the robot's orientation is crucial for successfully completing tasks because of how objects are arranged (e.g., to open a refrigerator door). Humans intuitively navigate to objects with the right orientation using semantics and common sense. For instance, when opening a refrigerator, we naturally stand in front of it rather than to the side. Recent advances suggest that Vision-Language Models (VLMs) can provide robots with similar common sense. Therefore, we develop a VLM-driven method called Navigation-to-Gaze (Navi2Gaze) for efficient navigation and object gazing based on task descriptions. This method uses the VLM to score and select the best pose from numerous candidates automatically. In evaluations on multiple photorealistic simulation benchmarks, Navi2Gaze significantly outperforms existing approaches and precisely determines the optimal orientation relative to target objects.
Submitted 12 July, 2024;
originally announced July 2024.
-
Exploring the role of criticality in the quantum Otto cycle fueled by the anisotropic quantum Rabi-Stark model
Authors:
He-Guang Xu,
Jiasen Jin,
Norton G. de Almeida,
G. D. de Moraes Neto
Abstract:
Quantum heat machines, encompassing heat engines, refrigerators, heaters, and accelerators, represent the forefront of quantum thermodynamics, offering a novel paradigm for converting heat energy into useful mechanical work. Leveraging quantum mechanical principles, these machines promise superior efficiency and performance compared to classical counterparts, with potential applications in renewable energy and quantum computing. This paper investigates a quantum Otto engine operating in both ideal and finite-time scenarios, employing a two-level system interacting with a harmonic oscillator within the framework of the anisotropic quantum Rabi-Stark model (AQRSM) as the working medium. This model is notable for exhibiting both first-order and continuous quantum phase transitions. By focusing on quantum heat engines, our study reveals that these phase transitions critically modulate the efficiency and power of AQRSM-based engines, which outperform quantum engines fueled by a working medium with a harmonic spectrum. Additionally, we explore the impacts of quantum friction and conduct limit cycle analysis in finite-time operations, providing insights into optimizing quantum heat engines for practical implementation.
Submitted 12 July, 2024;
originally announced July 2024.
-
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
Authors:
Runhui Huang,
Xinpeng Ding,
Chunwei Wang,
Jianhua Han,
Yulong Liu,
Hengshuang Zhao,
Hang Xu,
Lu Hou,
Wei Zhang,
Xiaodan Liang
Abstract:
High-resolution inputs enable Large Vision-Language Models (LVLMs) to discern finer visual details, enhancing their comprehension capabilities. To reduce the training and computation costs caused by high-resolution input, one promising direction is to use sliding windows to slice the input into uniform patches, each matching the input size of the well-trained vision encoder. Although efficient, this slicing strategy leads to the fragmentation of the original input, i.e., the continuity of contextual information and spatial geometry is lost across patches, adversely affecting performance in cross-patch context perception and position-specific tasks. To overcome these shortcomings, we introduce HiRes-LLaVA, a novel framework designed to efficiently process high-resolution input of any size without altering the original contextual and geometric information. HiRes-LLaVA comprises two innovative components: (i) a SliceRestore adapter that reconstructs sliced patches into their original form, efficiently extracting both global and local features via down-up-sampling and convolution layers, and (ii) a Self-Mining Sampler that compresses the vision tokens based on themselves, preserving the original context and positional information while reducing training overhead. To assess the ability to handle context fragmentation, we construct a new benchmark, EntityGrid-QA, consisting of edge-related and position-related tasks. Our comprehensive experiments demonstrate the superiority of HiRes-LLaVA on both existing public benchmarks and on EntityGrid-QA, particularly on document-oriented tasks, establishing new standards for handling high-resolution inputs.
Submitted 11 July, 2024;
originally announced July 2024.
-
Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation
Authors:
Kaiyan Chang,
Zhirong Chen,
Yunhao Zhou,
Wenlong Zhu,
Kun Wang,
Haobo Xu,
Cangyuan Li,
Mengdi Wang,
Shengwen Liang,
Huawei Li,
Yinhe Han,
Ying Wang
Abstract:
Natural language interfaces have exhibited considerable potential in the automation of Verilog generation derived from high-level specifications through the utilization of large language models, garnering significant attention. Nevertheless, this paper elucidates that visual representations contribute essential contextual information critical to design intent for hardware architectures possessing spatial complexity, potentially surpassing the efficacy of natural-language-only inputs. Expanding upon this premise, our paper introduces an open-source benchmark for multi-modal generative models tailored for Verilog synthesis from visual-linguistic inputs, addressing both singular and complex modules. Additionally, we introduce an open-source visual and natural language Verilog query language framework to facilitate efficient and user-friendly multi-modal queries. To evaluate the performance of the proposed multi-modal hardware generative AI in Verilog generation tasks, we compare it with a popular method that relies solely on natural language. Our results demonstrate a significant accuracy improvement in the multi-modal generated Verilog compared to queries based solely on natural language. We hope to reveal a new approach to hardware design in the large-hardware-design-model era, thereby fostering a more diversified and productive approach to hardware design.
Submitted 11 July, 2024;
originally announced July 2024.
-
Beyond Instruction Following: Evaluating Rule Following of Large Language Models
Authors:
Wangtao Sun,
Chenxiang Zhang,
Xueyou Zhang,
Ziyang Huang,
Haotian Xu,
Pei Chen,
Shizhu He,
Jun Zhao,
Kang Liu
Abstract:
Although Large Language Models (LLMs) have demonstrated strong instruction-following ability to be helpful, they must also be controlled and guided by rules in real-world scenarios to be safe and accurate in their responses. This requires LLMs to possess rule-following capability. However, few works have clearly evaluated the rule-following capability of LLMs. Previous studies that try to evaluate it fail to distinguish rule-following scenarios from instruction-following scenarios. Therefore, this paper first clarifies the concept of rule-following and curates a comprehensive benchmark, RuleBench, to evaluate a diversified range of rule-following abilities. Our experimental results on a variety of LLMs show that they are still limited in following rules. Our further analysis provides insights into how LLMs can be improved toward becoming better rule-following intelligent agents. The data and code can be found at: https://anonymous.4open.science/r/llm-rule-following-B3E3/
Submitted 11 July, 2024;
originally announced July 2024.
-
Raman study of the structural transition in LiVO$_2$
Authors:
Yuri S. Ponosov,
Evgenia V. Komleva,
Elizaveta A. Pankrushina,
Haohang Xu,
Yu Sui,
Sergey V. Streltsov
Abstract:
The results of polarization-dependent Raman spectroscopy of single-crystalline LiVO$_2$ exhibiting a transition to a diamagnetic state below $T_c \sim 500$ K are reported. Our measurements clearly detect additional peaks in the low-temperature phase, which disappear nearly completely when the sample is heated above $T_c$. The proposed $\sqrt{3}a \times \sqrt{3}a$ lattice reconstruction explains these new Raman peaks by Brillouin zone folding. The experiment, on the one hand, confirms that the symmetry of the non-magnetic phase is not lower than trigonal, but, on the other hand, our thermal cycling study suggests possible stacking faults. This agrees with the results of density functional theory calculations, which show that the energy difference between different types of stacking does not exceed 1 K per formula unit.
Submitted 11 July, 2024;
originally announced July 2024.
-
fairBERTs: Erasing Sensitive Information Through Semantic and Fairness-aware Perturbations
Authors:
Jinfeng Li,
Yuefeng Chen,
Xiangyu Liu,
Longtao Huang,
Rong Zhang,
Hui Xue
Abstract:
Pre-trained language models (PLMs) have revolutionized both the natural language processing research and applications. However, stereotypical biases (e.g., gender and racial discrimination) encoded in PLMs have raised negative ethical implications for PLMs, which critically limits their broader applications. To address the aforementioned unfairness issues, we present fairBERTs, a general framework for learning fair fine-tuned BERT series models by erasing the protected sensitive information via semantic and fairness-aware perturbations generated by a generative adversarial network. Through extensive qualitative and quantitative experiments on two real-world tasks, we demonstrate the great superiority of fairBERTs in mitigating unfairness while maintaining the model utility. We also verify the feasibility of transferring adversarial components in fairBERTs to other conventionally trained BERT-like models for yielding fairness improvements. Our findings may shed light on further research on building fairer fine-tuned PLMs.
Submitted 11 July, 2024;
originally announced July 2024.
-
Multi-task Prompt Words Learning for Social Media Content Generation
Authors:
Haochen Xue,
Chong Zhang,
Chengzhi Liu,
Fangyu Wu,
Xiaobo Jin
Abstract:
The rapid development of the Internet has profoundly changed human life. Humans are increasingly expressing themselves and interacting with others on social media platforms. However, although artificial intelligence technology has been widely used in many aspects of life, its application in social media content creation remains largely unexplored. To solve this problem, we propose a new prompt word generation framework based on multi-modal information fusion, which combines multiple tasks including topic classification, sentiment analysis, scene recognition and keyword extraction to generate more comprehensive prompt words. Subsequently, we use a template containing a set of prompt words to guide ChatGPT to generate high-quality tweets. Furthermore, in the absence of effective and objective evaluation criteria in the field of content generation, we use the ChatGPT tool to evaluate the results generated by the algorithm, making large-scale evaluation of content generation algorithms possible. Evaluation results on extensive content generation demonstrate that our prompt word generation framework generates higher quality content compared to manual methods and other prompting techniques, while topic classification, sentiment analysis, and scene recognition significantly enhance content clarity and its consistency with the image.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15\sigma$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
Bucket Pre-training is All You Need
Authors:
Hongtao Liu,
Qiyao Peng,
Qing Yang,
Kai Liu,
Hongyan Xu
Abstract:
Large language models (LLMs) have demonstrated exceptional performance across various natural language processing tasks. However, the conventional fixed-length data composition strategy for pretraining, which involves concatenating and splitting documents, can introduce noise and limit the model's ability to capture long-range dependencies. To address this, we first introduce three metrics for evaluating data composition quality: padding ratio, truncation ratio, and concatenation ratio. We further propose a multi-bucket data composition method that moves beyond the fixed-length paradigm, offering a more flexible and efficient approach to pretraining. Extensive experiments demonstrate that our proposed method can significantly improve both the efficiency and efficacy of LLM pretraining. Our approach not only reduces noise and preserves context but also accelerates training, making it a promising solution for LLM pretraining.
Submitted 10 July, 2024;
originally announced July 2024.
-
Review-LLM: Harnessing Large Language Models for Personalized Review Generation
Authors:
Qiyao Peng,
Hongtao Liu,
Hongyan Xu,
Qing Yang,
Minglai Shao,
Wenjun Wang
Abstract:
Product review generation is an important task in recommender systems, as it can provide explanation and persuasiveness for the recommendation. Recently, Large Language Models (LLMs, e.g., ChatGPT) have shown superior text modeling and generating ability, which could be applied in review generation. However, directly applying LLMs to generate reviews may suffer from the ``polite'' phenomenon of LLMs and fail to produce personalized reviews (e.g., negative reviews). In this paper, we propose Review-LLM, which customizes LLMs for personalized review generation. Firstly, we construct the prompt input by aggregating user historical behaviors, which include corresponding item titles and reviews. This enables the LLMs to capture user interest features and review writing style. Secondly, we incorporate ratings as indicators of satisfaction into the prompt, which could further improve the model's understanding of user preferences and the sentiment tendency control of generated reviews. Finally, we feed the prompt text into LLMs, and use Supervised Fine-Tuning (SFT) to make the model generate personalized reviews for the given user and target item. Experimental results on a real-world dataset show that our fine-tuned model could achieve better review generation performance than existing closed-source LLMs.
Submitted 10 July, 2024;
originally announced July 2024.
-
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Authors:
Guian Fang,
Wenbiao Yan,
Yuanfan Guo,
Jianhua Han,
Zutao Jiang,
Hang Xu,
Shengcai Liao,
Xiaodan Liang
Abstract:
Text-to-image diffusion models have significantly advanced in conditional image generation. However, these models usually struggle with accurately rendering images featuring humans, resulting in distorted limbs and other anomalies. This issue primarily stems from the insufficient recognition and evaluation of limb qualities in diffusion models. To address this issue, we introduce AbHuman, the first large-scale synthesized human benchmark focusing on anatomical anomalies. This benchmark consists of 56K synthesized human images, each annotated with detailed, bounding-box level labels identifying 147K human anomalies in 18 different categories. Based on this, the recognition of human anomalies can be established, which in turn enhances image generation through traditional techniques such as negative prompting and guidance. To further boost the improvement, we propose HumanRefiner, a novel plug-and-play approach for the coarse-to-fine refinement of human anomalies in text-to-image generation. Specifically, HumanRefiner utilizes a self-diagnostic procedure to detect and correct issues related to both coarse-grained abnormal human poses and fine-grained anomaly levels, facilitating pose-reversible diffusion generation. Experimental results on the AbHuman benchmark demonstrate that HumanRefiner significantly reduces generative discrepancies, achieving a 2.9x improvement in limb quality compared to the state-of-the-art open-source generator SDXL and a 1.4x improvement over DALL-E 3 in human evaluations. Our data and code are available at https://github.com/Enderfga/HumanRefiner.
Submitted 9 July, 2024;
originally announced July 2024.
-
RIS-Assisted Received Adaptive Spatial Modulation for Wireless Communication
Authors:
Chaorong Zhang,
Hui Xu,
Benjamin K. Ng,
Chan-Tong Lam
Abstract:
A novel wireless transmission scheme, named the reconfigurable intelligent surface (RIS)-assisted received adaptive spatial modulation (RASM) scheme, is proposed in this paper. In this scheme, the adaptive spatial modulation (ASM)-based antenna selection works at the receiver by employing the characteristics of the RIS in each time slot, where the signal-to-noise ratio at the selected antennas can be further enhanced with very little power. In addition to the bits carried by constellation symbols, extra bits can be mapped into the indices of receive antenna combinations and conveyed to the receiver through the ASM-based antenna-combination selection, thus providing higher spectral efficiency. To explicitly present the RASM scheme, its analytical bit error rate performance is discussed in this paper. As a trade-off, the proposed scheme achieves higher spectral efficiency while retaining satisfactory error performance. Simulation and analytical results demonstrate its better performance and its potential for application in practical wireless communication.
Submitted 9 July, 2024;
originally announced July 2024.
-
F2PAD: A General Optimization Framework for Feature-Level to Pixel-Level Anomaly Detection
Authors:
Chengyu Tao,
Hao Xu,
Juan Du
Abstract:
Image-based inspection systems have been widely deployed in manufacturing production lines. Due to the scarcity of defective samples, unsupervised anomaly detection that only leverages normal samples during training to detect various defects is popular. Existing feature-based methods, utilizing deep features from pretrained neural networks, show impressive performance in anomaly localization and require few training samples. However, the anomalous regions detected by these methods always exhibit inaccurate boundaries, which impedes downstream tasks. This deficiency has two causes: (i) the decreased resolution of high-level features compared with the original image, and (ii) the mixture of adjacent normal and anomalous pixels during feature extraction. To address these issues, we propose a novel unified optimization framework (F2PAD) that leverages the Feature-level information to guide the optimization process for Pixel-level Anomaly Detection in the inference stage. The proposed framework is universal and plug-and-play, and can enhance various feature-based methods with limited assumptions. Case studies are provided to demonstrate the effectiveness of our strategy, particularly when applied to three popular backbone methods: PaDiM, CFLOW-AD, and PatchCore.
Submitted 8 July, 2024;
originally announced July 2024.
-
FairDiff: Fair Segmentation with Point-Image Diffusion
Authors:
Wenyi Li,
Haoran Xu,
Guiyu Zhang,
Huan-ang Gao,
Mingju Gao,
Mengyu Wang,
Hao Zhao
Abstract:
Fairness is an important topic for medical image analysis, driven by the challenge of unbalanced training data among diverse target groups and the societal demand for equitable medical quality. In response to this issue, our research adopts a data-driven strategy: enhancing data balance by integrating synthetic images. However, in terms of generating synthetic images, previous works either lack paired labels or fail to precisely control the boundaries of synthetic images to be aligned with those labels. To address this, we formulate the problem in a joint optimization manner, in which three networks are optimized towards the goal of empirical risk minimization and fairness maximization. On the implementation side, our solution features an innovative Point-Image Diffusion architecture, which leverages 3D point clouds for improved control over mask boundaries through a point-mask-image synthesis pipeline. This method significantly outperforms existing techniques in synthesizing scanning laser ophthalmoscopy (SLO) fundus images. By combining synthetic data with real data during the training phase using a proposed Equal Scale approach, our model achieves superior fairness segmentation performance compared to the state-of-the-art fairness learning models. Code is available at https://github.com/wenyi-li/FairDiff.
Submitted 8 July, 2024;
originally announced July 2024.
-
Joint Beamforming and Antenna Design for Near-Field Fluid Antenna System
Authors:
Yixuan Chen,
Mingzhe Chen,
Hao Xu,
Zhaohui Yang,
Kai-Kit Wong,
Zhaoyang Zhang
Abstract:
In this letter, we study the energy efficiency maximization problem for a fluid antenna system (FAS) in near field communications. Specifically, we consider a point-to-point near-field system where the base station (BS) transmitter has multiple fixed-position antennas and the user receives the signals with multiple fluid antennas. Our objective is to jointly optimize the transmit beamforming of the BS and the fluid antenna positions at the user for maximizing the energy efficiency. Our scheme is based on an alternating optimization algorithm that iteratively solves the beamforming and antenna position subproblems. Our simulation results validate the performance improvement of the proposed algorithm and confirm the effectiveness of FAS.
Submitted 8 July, 2024;
originally announced July 2024.
-
LGRNet: Local-Global Reciprocal Network for Uterine Fibroid Segmentation in Ultrasound Videos
Authors:
Huihui Xu,
Yijun Yang,
Angelica I Aviles-Rivero,
Guang Yang,
Jing Qin,
Lei Zhu
Abstract:
Regular screening and early discovery of uterine fibroid are crucial for preventing potential malignant transformations and ensuring timely, life-saving interventions. To this end, we collect and annotate the first ultrasound video dataset with 100 videos for uterine fibroid segmentation (UFUV). We also present Local-Global Reciprocal Network (LGRNet) to efficiently and effectively propagate the long-term temporal context which is crucial to help distinguish between uninformative noisy surrounding tissues and target lesion regions. Specifically, the Cyclic Neighborhood Propagation (CNP) is introduced to propagate the inter-frame local temporal context in a cyclic manner. Moreover, to aggregate global temporal context, we first condense each frame into a set of frame bottleneck queries and devise Hilbert Selective Scan (HilbertSS) to both efficiently path connect each frame and preserve the locality bias. A distribute layer is then utilized to disseminate back the global context for reciprocal refinement. Extensive experiments on UFUV and three public Video Polyp Segmentation (VPS) datasets demonstrate consistent improvements compared to state-of-the-art segmentation methods, indicating the effectiveness and versatility of LGRNet. Code, checkpoints, and dataset are available at https://github.com/bio-mlhui/LGRNet
Submitted 8 July, 2024;
originally announced July 2024.
-
SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention
Authors:
Yunzhong Si,
Huiying Xu,
Xinzhong Zhu,
Wenhao Zhang,
Yao Dong,
Yuxing Chen,
Hongbo Li
Abstract:
Channel and spatial attentions have respectively brought significant improvements in extracting feature dependencies and spatial structure relations for various downstream vision tasks. While their combination is more beneficial for leveraging their individual strengths, the synergy between channel and spatial attentions has not been fully explored, and the synergistic potential of multi-semantic information for feature guidance and mitigation of semantic disparities has not been fully harnessed. Our study attempts to reveal the synergistic relationship between spatial and channel attention at multiple semantic levels, proposing a novel Spatial and Channel Synergistic Attention module (SCSA). Our SCSA consists of two parts: the Shareable Multi-Semantic Spatial Attention (SMSA) and the Progressive Channel-wise Self-Attention (PCSA). SMSA integrates multi-semantic information and utilizes a progressive compression strategy to inject discriminative spatial priors into PCSA's channel self-attention, effectively guiding channel recalibration. Additionally, the robust feature interactions based on the self-attention mechanism in PCSA further mitigate the disparities in multi-semantic information among different sub-features within SMSA. We conduct extensive experiments on seven benchmark datasets, including classification on ImageNet-1K, object detection on MSCOCO 2017, segmentation on ADE20K, and four other complex scene detection datasets. Our results demonstrate that our proposed SCSA not only surpasses the current state-of-the-art attention mechanisms but also exhibits enhanced generalization capabilities across various task scenarios. The code and models are available at: https://github.com/HZAI-ZJNU/SCSA.
Submitted 6 July, 2024;
originally announced July 2024.
-
Distributional stability of sparse inverse covariance matrix estimators
Authors:
Renjie Chen,
Huifu Xu,
Henryk Zähle
Abstract:
Finding an approximation of the inverse of the covariance matrix, also known as the precision matrix, of a random vector with empirical data is widely discussed in finance and engineering. In data-driven problems, empirical data may be "contaminated". This raises the question as to whether the approximate precision matrix is reliable from a statistical point of view. In this paper, we concentrate on a much-noticed sparse estimator of the precision matrix and investigate the issue from the perspective of distributional stability. Specifically, we derive an explicit local Lipschitz bound for the distance between the distributions of the sparse estimator under two different distributions (regarded as the true data distribution and the distribution of "contaminated" data). The distance is measured by the Kantorovich metric on the set of all probability measures on a matrix space. We also present analogous results for the standard estimators of the covariance matrix and its eigenvalues. Furthermore, we discuss two applications and conduct some numerical experiments.
Submitted 6 July, 2024;
originally announced July 2024.
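The abstract above measures distances between distributions with the Kantorovich metric. As a rough illustration of what that metric computes, here is a sketch for the simplest case only: two empirical distributions on the real line with the same number of samples, where the Kantorovich (Wasserstein-1) distance reduces to the mean absolute difference of the sorted samples. The paper works on a matrix space, which this scalar sketch does not cover; the function name `kantorovich_1d` is hypothetical.

```python
def kantorovich_1d(xs, ys):
    """Kantorovich (Wasserstein-1) distance between two equal-size
    empirical distributions on the real line.

    In one dimension the optimal coupling pairs the i-th smallest sample
    of xs with the i-th smallest sample of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


if __name__ == "__main__":
    clean = [0.0, 1.0, 2.0]
    shifted = [1.0, 2.0, 3.0]  # every sample moved by one unit
    print(kantorovich_1d(clean, shifted))
```

Because the distance depends only on the sorted samples, reordering either input leaves the result unchanged.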
-
The Solution for Language-Enhanced Image New Category Discovery
Authors:
Haonan Xu,
Dian Chao,
Xiangyu Wu,
Zhonghua Wan,
Yang Yang
Abstract:
Treating texts as images, combining prompts with textual labels for prompt tuning, and leveraging the alignment properties of CLIP have been successfully applied in zero-shot multi-label image recognition. Nonetheless, relying solely on textual labels to store visual information is insufficient for representing the diversity of visual objects. In this paper, we propose reversing the training process of CLIP and introducing the concept of Pseudo Visual Prompts. These prompts are initialized for each object category and pre-trained on large-scale, low-cost sentence data generated by large language models. This process mines the aligned visual information in CLIP and stores it in class-specific visual prompts. We then employ contrastive learning to transfer the stored visual information to the textual labels, enhancing their visual representation capacity. Additionally, we introduce a dual-adapter module that simultaneously leverages knowledge from the original CLIP and new learning knowledge derived from downstream datasets. Benefiting from the pseudo visual prompts, our method surpasses the state-of-the-art not only on clean annotated text data but also on pseudo text data generated by large language models.
Submitted 6 July, 2024;
originally announced July 2024.
-
The Solution for the AIGC Inference Performance Optimization Competition
Authors:
Sishun Pan,
Haonan Xu,
Zhonghua Wan,
Yang Yang
Abstract:
In recent years, the rapid advancement of large-scale pre-trained language models based on transformer architectures has revolutionized natural language processing tasks. Among these, ChatGPT has gained widespread popularity, demonstrating human-level conversational abilities and attracting over 100 million monthly users by late 2022. Concurrently, Baidu's commercial deployment of the Ernie Wenxin model has significantly enhanced marketing effectiveness through AI-driven technologies. This paper focuses on optimizing high-performance inference for Ernie models, emphasizing GPU acceleration and leveraging the Paddle inference framework. We employ techniques such as Faster Transformer for efficient model processing, embedding layer pruning to reduce computational overhead, and FP16 half-precision inference for enhanced computational efficiency. Additionally, our approach integrates efficient data handling strategies using multi-process parallel processing to minimize latency. Experimental results demonstrate that our optimized solution achieves up to an 8.96x improvement in inference speed compared to standard methods, while maintaining competitive performance.
Submitted 6 July, 2024;
originally announced July 2024.
-
LaRa: Efficient Large-Baseline Radiance Fields
Authors:
Anpei Chen,
Haofei Xu,
Stefano Esposito,
Siyu Tang,
Andreas Geiger
Abstract:
Radiance field methods have achieved photorealistic novel view synthesis and geometry reconstruction, but they are mostly applied in per-scene optimization or small-baseline settings. While several recent works investigate feed-forward reconstruction with large baselines by utilizing transformers, they all operate with a standard global attention mechanism and hence ignore the local nature of 3D reconstruction. We propose a method that unifies local and global reasoning in transformer layers, resulting in improved quality and faster convergence. Our model represents scenes as Gaussian Volumes and combines this with an image encoder and Group Attention Layers for efficient feed-forward reconstruction. Experimental results show that our model, trained for two days on four GPUs, achieves high fidelity in reconstructing 360° radiance fields and is robust to zero-shot and out-of-domain testing.
Submitted 5 July, 2024;
originally announced July 2024.
-
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition
Authors:
Ye Bai,
Jingping Chen,
Jitong Chen,
Wei Chen,
Zhuo Chen,
Chuang Ding,
Linhao Dong,
Qianqian Dong,
Yujiao Du,
Kepan Gao,
Lu Gao,
Yi Guo,
Minglun Han,
Ting Han,
Wenchao Hu,
Xinying Hu,
Yuxiang Hu,
Deyu Hua,
Lu Huang,
Mingkun Huang,
Youjia Huang,
Jishuo Jin,
Fanliu Kong,
Zongwei Lan,
Tianyu Li
, et al. (30 additional authors not shown)
Abstract:
A modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this work, we introduce Seed-ASR, a large language model (LLM) based speech recognition model. Seed-ASR is developed based on the framework of audio conditioned LLM (AcLLM), leveraging the capabilities of LLMs by inputting continuous speech representations together with contextual information into the LLM. Through stage-wise large-scale training and the elicitation of context-aware capabilities in LLM, Seed-ASR demonstrates significant improvement over end-to-end models on comprehensive evaluation sets, including multiple domains, accents/dialects and languages. Additionally, Seed-ASR can be further deployed to support specific needs in various scenarios without requiring extra language models. Compared to recently released large ASR models, Seed-ASR achieves a 10%-40% reduction in word (or character, for Chinese) error rates on Chinese and English public test sets, further demonstrating its powerful performance.
Submitted 10 July, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
Romanization Encoding For Multilingual ASR
Authors:
Wen Ding,
Fei Jia,
Hainan Xu,
Yu Xi,
Junjie Lai,
Boris Ginsburg
Abstract:
We introduce romanization encoding for script-heavy languages to optimize multilingual and code-switching Automatic Speech Recognition (ASR) systems. By adopting romanization encoding alongside a balanced concatenated tokenizer within a FastConformer-RNNT framework equipped with a Roman2Char module, we significantly reduce vocabulary and output dimensions, enabling larger training batches and reduced memory consumption. Our method decouples acoustic modeling and language modeling, enhancing the flexibility and adaptability of the system. In our study, applying this method to Mandarin-English ASR resulted in a remarkable 63.51% vocabulary reduction and notable performance gains of 13.72% and 15.03% on SEAME code-switching benchmarks. Ablation studies on Mandarin-Korean and Mandarin-Japanese highlight our method's strong capability to address the complexities of other script-heavy languages, paving the way for more versatile and effective multilingual ASR systems.
Submitted 5 July, 2024;
originally announced July 2024.
-
MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss
Authors:
Yangyang Shu,
Haiming Xu,
Ziqin Zhou,
Anton van den Hengel,
Lingqiao Liu
Abstract:
Automatically generating symbolic music (music scores tailored to specific human needs) can be highly beneficial for musicians and enthusiasts. Recent studies have shown promising results using extensive datasets and advanced transformer architectures. However, these state-of-the-art models generally offer only basic control over aspects like tempo and style for the entire composition, lacking the ability to manage finer details, such as control at the level of individual bars. While fine-tuning a pre-trained symbolic music generation model might seem like a straightforward method for achieving this finer control, our research indicates challenges in this approach. The model often fails to respond adequately to new, fine-grained bar-level control signals. To address this, we propose two innovative solutions. First, we introduce a pre-training task designed to link control signals directly with corresponding musical tokens, which helps in achieving a more effective initialization for subsequent fine-tuning. Second, we implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts. Together, these techniques significantly enhance our ability to control music generation at the bar level, showing a 13.06% improvement over conventional methods. Our subjective evaluations also confirm that this enhanced control does not compromise the musical quality of the original pre-trained generative model.
Submitted 5 July, 2024;
originally announced July 2024.
-
How high-status women promote repeated collaboration among women in male-dominated contexts
Authors:
Huimin Xu,
Jamie Strassman,
Ying Ding,
Steven Gray,
Maytal Saar-Tsechansky
Abstract:
Male-dominated contexts pose a dilemma: they increase the benefits of repeated collaboration among women, yet at the same time, make such collaborations less likely. This paper seeks to understand the conditions that foster repeated collaboration among women versus men in male-dominated settings by examining the critical role of status hierarchies. Using collaboration data on 8,232,769 computer science research teams, we found that when a woman holds the top-ranking position in a steep status hierarchy, other women on that team are more likely than men to collaborate again, as compared to when the hierarchy is flat, and compared to when men occupy the top-ranking position. In steep hierarchies, top-ranking women but not top-ranking men foster conditions in which junior women are more likely to collaborate again than junior men of similar status levels. Our research suggests that whereas status hierarchies are especially detrimental to repeated collaboration among underrepresented individuals, top-ranking women in steep status hierarchies mitigate these negative impacts between women in male-dominated settings.
Submitted 3 July, 2024;
originally announced July 2024.
-
A Tutorial on Fluid Antenna System for 6G Networks: Encompassing Communication Theory, Optimization Methods and Hardware Designs
Authors:
Wee Kiat New,
Kai-Kit Wong,
Hao Xu,
Chao Wang,
Farshad Rostami Ghadi,
Jichen Zhang,
Junhui Rao,
Ross Murch,
Pablo Ramírez-Espinosa,
David Morales-Jimenez,
Chan-Byoung Chae,
Kin-Fai Tong
Abstract:
The advent of the sixth-generation (6G) networks presents another round of revolution for the mobile communication landscape, promising an immersive experience, robust reliability, minimal latency, extreme connectivity, ubiquitous coverage, and capabilities beyond communication, including intelligence and sensing. To achieve these ambitious goals, it is apparent that 6G networks need to incorporate the state-of-the-art technologies. One of the technologies that has garnered rising interest is fluid antenna system (FAS) which represents any software-controllable fluidic, conductive, or dielectric structure capable of dynamically changing its shape and position to reconfigure essential radio-frequency (RF) characteristics. Compared to traditional antenna systems (TASs) with fixed-position radiating elements, the core idea of FAS revolves around the unique flexibility of reconfiguring the radiating elements within a given space. One recent driver of FAS is the recognition of its position-flexibility as a new degree of freedom (dof) to harness diversity and multiplexing gains. In this paper, we provide a comprehensive tutorial, covering channel modeling, signal processing and estimation methods, information-theoretic insights, new multiple access techniques, and hardware designs. Moreover, we delineate the challenges of FAS and explore the potential of using FAS to improve the performance of other contemporary technologies. By providing insights and guidance, this tutorial paper serves to inspire researchers to explore new horizons and fully unleash the potential of FAS.
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion
Authors:
Hang Xu,
Chen Long,
Wenxiao Zhang,
Yuan Liu,
Zhen Cao,
Zhen Dong,
Bisheng Yang
Abstract:
In this paper, we explore a novel framework, EGIInet (Explicitly Guided Information Interaction Network), a model for the View-guided Point cloud Completion (ViPC) task, which aims to restore a complete point cloud from a partial one with a single view image. In comparison with previous methods that relied on the global semantics of input images, EGIInet efficiently combines the information from two modalities by leveraging the geometric nature of the completion task. Specifically, we propose an explicitly guided information interaction strategy supported by modal alignment for point cloud completion. First, in contrast to previous methods which simply use 2D and 3D backbones to encode features respectively, we unify the encoding process to promote modal alignment. Second, we propose a novel explicitly guided information interaction strategy that helps the network identify critical information within images, thus achieving better guidance for completion. Extensive experiments demonstrate the effectiveness of our framework, and we achieve a new state-of-the-art (+16% CD over XMFnet) on benchmark datasets despite using fewer parameters than the previous methods. The pre-trained model and code are available at https://github.com/WHU-USI3DV/EGIInet.
Submitted 4 July, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
OpenVNA: A Framework for Analyzing the Behavior of Multimodal Language Understanding System under Noisy Scenarios
Authors:
Ziqi Yuan,
Baozheng Zhang,
Hua Xu,
Zhiyun Liang,
Kai Gao
Abstract:
We present OpenVNA, an open-source framework designed for analyzing the behavior of multimodal language understanding systems under noisy conditions. OpenVNA serves as an intuitive toolkit tailored for researchers, facilitating convenient batch-level robustness evaluation and on-the-fly instance-level demonstration. It primarily features a benchmark Python library for assessing global model robustness, offering high flexibility and extensibility, thereby enabling customization with user-defined noise types and models. Additionally, a GUI-based interface has been developed to intuitively analyze local model behavior. In this paper, we delineate the design principles and utilization of the created library and GUI-based web platform. Currently, OpenVNA is publicly accessible at https://github.com/thuiar/OpenVNA, with a demonstration video available at https://youtu.be/0Z9cW7RGct4.
Submitted 2 July, 2024;
originally announced July 2024.
-
A New Achievable Region of the $K$-User MAC Wiretap Channel with Confidential and Open Messages Under Strong Secrecy
Authors:
Hao Xu,
Kai-Kit Wong,
Giuseppe Caire
Abstract:
This paper investigates the achievable region of a $K$-user discrete memoryless (DM) multiple access wiretap (MAC-WT) channel, where each user transmits both secret and open messages. All these messages are intended for Bob, while Eve is only interested in the secret messages. In the achievable coding strategy, the confidential information is protected by open messages and also by the introduction of auxiliary messages. When introducing an auxiliary message, one has to ensure that, on one hand, its rate is large enough for protecting the secret message from Eve and, on the other hand, the resulting sum rate (together with the secret and open message rate) does not exceed Bob's decoding capability. This yields an inequality structure involving the rates of all users' secret, open, and auxiliary messages. To obtain the rate region, the auxiliary message rates must be eliminated from the system of inequalities. A direct application of the Fourier-Motzkin elimination procedure is elusive since a) it requires that the number of users $K$ is explicitly given, and b) even for small $K = 3, 4, \ldots$, the number of inequalities becomes extremely large. We prove the result for general $K$ through the combined use of Fourier-Motzkin elimination procedure and mathematical induction. This paper adopts the strong secrecy metric, characterized by information leakage. To prove the achievability under this criterion, we analyze the resolvability region of a $K$-user DM-MAC channel. In addition, we show that users with zero secrecy rate can play different roles and use different strategies in encoding their messages. These strategies yield non-redundant rate inequalities. By considering all possible coding strategies, we provide a new achievable region for the considered channel, and show that it strictly improves those already known in the existing literature by considering a specific example.
Submitted 2 July, 2024;
originally announced July 2024.
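The abstract above describes eliminating auxiliary message rates from a system of rate inequalities via Fourier-Motzkin elimination. The sketch below shows the generic elimination step on a toy single-user system; it is not the paper's $K$-user region. The function name `fm_eliminate` and the numeric values (a decoding limit of 3 for Bob and a protection threshold of 1 against Eve) are assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import product


def fm_eliminate(rows, k):
    """Eliminate variable k from a system of inequalities.

    Each row is (coeffs, bound) and means sum(coeffs[i] * x[i]) <= bound.
    Returns the projected system over the remaining variables.
    """
    pos, neg, zero = [], [], []
    for coeffs, bound in rows:
        coeffs = [Fraction(v) for v in coeffs]
        bound = Fraction(bound)
        c = coeffs[k]
        if c > 0:    # scale so the coefficient of x_k becomes +1
            pos.append(([v / c for v in coeffs], bound / c))
        elif c < 0:  # scale so the coefficient of x_k becomes -1
            neg.append(([v / -c for v in coeffs], bound / -c))
        else:
            zero.append((coeffs, bound))
    combined = list(zero)
    # Adding an upper bound on x_k to a lower bound on x_k cancels x_k.
    for (pc, pb), (nc, nb) in product(pos, neg):
        combined.append(([a + b for a, b in zip(pc, nc)], pb + nb))
    # Drop the (now all-zero) column of the eliminated variable.
    return [(c[:k] + c[k + 1:], b) for c, b in combined]


if __name__ == "__main__":
    # Variables: x[0] = secret rate R_s, x[1] = auxiliary rate R_a.
    # Toy constraints (illustrative numbers, not from the paper):
    #   R_s + R_a <= 3   (sum rate within Bob's decoding capability)
    #   -R_a      <= -1  (auxiliary rate large enough to confuse Eve)
    rows = [([1, 1], 3), ([0, -1], -1)]
    print(fm_eliminate(rows, 1))
```

In this toy case the projection leaves a single inequality on the secret rate, bounding it by the difference of the two assumed limits. Each elimination can multiply the number of inequalities (every upper bound is paired with every lower bound), which is consistent with the abstract's remark that a direct application becomes unwieldy as $K$ grows.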
-
Coding-Enhanced Cooperative Jamming for Secret Communication in Fluid Antenna Systems
Authors:
Hao Xu,
Kai-Kit Wong,
Wee Kiat New,
Guyue Li,
Farshad Rostami Ghadi,
Yongxu Zhu,
Shi Jin,
Chan-Byoung Chae,
Yangyang Zhang
Abstract:
This letter investigates the secret communication problem for a fluid antenna system (FAS)-assisted wiretap channel, where the legitimate transmitter transmits an information-bearing signal to the legitimate receiver, and at the same time, transmits a jamming signal to interfere with the eavesdropper (Eve). Unlike the conventional jamming scheme, which usually transmits Gaussian noise that interferes not only with Eve but also with the legitimate receiver, in this letter, we consider that encoded codewords are transmitted to jam Eve. Then, by employing appropriate coding schemes, the legitimate receiver can successfully decode the jamming signal and then cancel the interference, while Eve cannot, even if it knows the codebooks. We aim to maximize the secrecy rate through port selection and power control. Although the problem is non-convex, we show that the optimal solution can be found. Simulation results show that by using the FAS technique and the proposed jamming scheme, the secrecy rate of the system can be significantly increased.
Submitted 2 July, 2024;
originally announced July 2024.
-
Machine Learning-Enhanced Ant Colony Optimization for Column Generation
Authors:
Hongjie Xu,
Yunzhuang Shen,
Yuan Sun,
Xiaodong Li
Abstract:
Column generation (CG) is a powerful technique for solving optimization problems that involve a large number of variables or columns. This technique begins by solving a smaller problem with a subset of columns and gradually generates additional columns as needed. However, the generation of columns often requires solving difficult subproblems repeatedly, which can be a bottleneck for CG. To address this challenge, we propose a novel method called machine learning enhanced ant colony optimization (MLACO), to efficiently generate multiple high-quality columns from a subproblem. Specifically, we train a ML model to predict the optimal solution of a subproblem, and then integrate this ML prediction into the probabilistic model of ACO to sample multiple high-quality columns. Our experimental results on the bin packing problem with conflicts show that the MLACO method significantly improves the performance of CG compared to several state-of-the-art methods. Furthermore, when our method is incorporated into a Branch-and-Price method, it leads to a significant reduction in solution time.
Submitted 22 April, 2024;
originally announced July 2024.
-
Contractual Reinforcement Learning: Pulling Arms with Invisible Hands
Authors:
Jibang Wu,
Siyu Chen,
Mengdi Wang,
Huazheng Wang,
Haifeng Xu
Abstract:
The agency problem emerges in today's large-scale machine learning tasks, where the learners are unable to direct content creation or enforce data collection. In this work, we propose a theoretical framework for aligning economic interests of different stakeholders in online learning problems through contract design. The problem, termed contractual reinforcement learning, naturally arises from the classic model of Markov decision processes, where a learning principal seeks to optimally influence the agent's action policy for their common interests through a set of payment rules contingent on the realization of the next state. For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent. For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation, reducing the complexity analysis to the construction of efficient search algorithms. For several natural classes of problems, we design tailored search algorithms that provably achieve $\tilde{O}(\sqrt{T})$ regret. We also present an algorithm with $\tilde{O}(T^{2/3})$ regret for the general problem that improves the existing analysis in online contract design with mild technical assumptions.
Submitted 2 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
Evaluating Knowledge-based Cross-lingual Inconsistency in Large Language Models
Authors:
Xiaolin Xing,
Zhiwei He,
Haoyu Xu,
Xing Wang,
Rui Wang,
Yu Hong
Abstract:
This paper investigates the cross-lingual inconsistencies observed in Large Language Models (LLMs), such as ChatGPT, Llama, and Baichuan, which have shown exceptional performance in various Natural Language Processing (NLP) tasks. Despite their successes, these models often exhibit significant inconsistencies when processing the same concepts across different languages. This study focuses on three primary questions: the existence of cross-lingual inconsistencies in LLMs, the specific aspects in which these inconsistencies manifest, and the correlation between cross-lingual consistency and multilingual capabilities of LLMs. To address these questions, we propose an innovative evaluation method for Cross-lingual Semantic Consistency (xSC) using the LaBSE model. We further introduce metrics for Cross-lingual Accuracy Consistency (xAC) and Cross-lingual Timeliness Consistency (xTC) to comprehensively assess the models' performance regarding semantic, accuracy, and timeliness inconsistencies. By harmonizing these metrics, we provide a holistic measurement of LLMs' cross-lingual consistency. Our findings aim to enhance the understanding and improvement of multilingual capabilities and interpretability in LLMs, contributing to the development of more robust and reliable multilingual language models.
Submitted 1 July, 2024;
originally announced July 2024.
-
Three-Nucleon Correlations in Light Nuclei Yields Ratios from AMPT Model for QCD Critical Point Investigation
Authors:
Ning Yu,
Zuman Zhang,
Hongge Xu,
Zhong Zhu
Abstract:
This research uses the AMPT model in Au+Au collisions to study the influence of the three-nucleon correlation $C_{n2p}$ on the light nuclei yield ratios. It is found that neglecting $C_{n2p}$ leads to an overestimated relative neutron density fluctuation extraction. Including $C_{n2p}$ enhances the agreement with experimental results with higher yield ratios, yet it does not change the energy dependence of the yield ratio. Since there is no first-order phase transition or critical physics in the AMPT model, our work fails to reproduce the experimental energy-dependent peak around $\sqrt{s_\text{NN}} = 20$-30 GeV. Our work might offer a baseline for investigating critical physics phenomena using light nuclei production as a probe.
Submitted 30 June, 2024;
originally announced July 2024.
-
TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets
Authors:
Jintai Chen,
Yaojun Hu,
Yue Wang,
Yingzhou Lu,
Xu Cao,
Miao Lin,
Hongxia Xu,
Jian Wu,
Cao Xiao,
Jimeng Sun,
Lucas Glass,
Kexin Huang,
Marinka Zitnik,
Tianfan Fu
Abstract:
Clinical trials are pivotal for developing new medical treatments, yet they typically pose some risks such as patient mortality, adverse events, and enrollment failure that waste immense efforts spanning over a decade. Applying artificial intelligence (AI) to forecast or simulate key events in clinical trials holds great potential for providing insights to guide trial designs. However, complex data collection and question definition requiring medical expertise and a deep understanding of trial designs have hindered the involvement of AI thus far. This paper tackles these challenges by presenting a comprehensive suite of meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design, encompassing prediction of trial duration, patient dropout rate, serious adverse event, mortality rate, trial approval outcome, trial failure reason, drug dose finding, and design of eligibility criteria. Furthermore, we provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design, ultimately advancing clinical trial research and accelerating medical solution development. The curated dataset, metrics, and basic models are publicly available at https://github.com/ML2Health/ML2ClinicalTrials/tree/main/AI4Trial.
Submitted 30 June, 2024;
originally announced July 2024.
-
Subspaces spanned by eigenforms with nonvanishing twisted central $L$-values
Authors:
June Kayath,
Connor Lane,
Ben Neifeld,
Tianyu Ni,
Hui Xue
Abstract:
In this paper, we construct explicit spanning sets for two spaces. One is the subspace generated by integral-weight Hecke eigenforms with nonvanishing quadratic twisted central $L$-values. The other is a subspace generated by half-integral weight Hecke eigenforms with certain nonvanishing Fourier coefficients. Along the way, we show that these subspaces are isomorphic via the Shimura lift.
Submitted 29 June, 2024;
originally announced July 2024.
-
Three-dimensional non-reciprocal transport in photonic topological heterostructure of arbitrary shape
Authors:
Mudi Wang,
Ruo-Yang Zhang,
Chenyu Zhang,
Haoran Xue,
Hongwei Jia,
Jing Hu,
Dongyang Wang,
Tianshu Jiang,
C. T. Chan
Abstract:
Electromagnetic wave propagation in three-dimensional space typically suffers omnidirectional scattering when encountering obstacles. In this study, we employed Chern vectors to construct a topological heterostructure, where large-volume non-reciprocal topological transport in three dimensions is achieved. The shape of the cross-section in the heterostructure can be arbitrarily designed, and we experimentally observed the distinctive cross-shaped field pattern transport, non-reciprocal energy harvesting, and most importantly, the remarkable ability of electromagnetic waves to traverse obstacles and abrupt structure changes without encountering reflections in 3D space.
Submitted 29 June, 2024;
originally announced July 2024.
-
Teola: Towards End-to-End Optimization of LLM-based Applications
Authors:
Xin Tan,
Yimin Jiang,
Yitao Yang,
Hong Xu
Abstract:
Large language model (LLM)-based applications consist of both LLM and non-LLM components, each contributing to the end-to-end latency. Despite great efforts to optimize LLM inference, end-to-end workflow optimization has been overlooked. Existing frameworks employ coarse-grained orchestration with task modules, which confines optimizations to within each module and yields suboptimal scheduling decisions. We propose fine-grained end-to-end orchestration, which utilizes task primitives as the basic units and represents each query's workflow as a primitive-level dataflow graph. This explicitly exposes a much larger design space, enables optimizations in parallelization and pipelining across primitives of different modules, and enhances scheduling to improve application-level performance. We build Teola, a novel orchestration framework for LLM-based applications that implements this scheme. Comprehensive experiments show that Teola can achieve up to 2.09x speedup over existing systems across various popular LLM applications.
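As a rough illustration of what a primitive-level dataflow graph exposes, the sketch below groups primitives into stages whose members have no mutual dependencies and could therefore run in parallel. It is not Teola's scheduler: the primitive names and the `parallel_stages` helper are hypothetical, and the grouping relies only on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def parallel_stages(deps):
    """Group nodes of a dependency graph into stages that can run concurrently.

    deps maps each primitive to the primitives it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        # Every node returned here has all of its predecessors finished.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical primitives for one retrieval-augmented query (names are assumptions).
workflow = {
    "embed_query": [],
    "prefill_system_prompt": [],
    "retrieve": ["embed_query"],
    "prefill_context": ["retrieve", "prefill_system_prompt"],
    "decode": ["prefill_context"],
}

for index, stage in enumerate(parallel_stages(workflow)):
    print(index, stage)
```

In this toy graph the system-prompt prefill does not wait for retrieval, which is the kind of cross-module overlap that module-level orchestration would hide.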
Submitted 29 June, 2024;
originally announced July 2024.
-
Imaging of single barium atoms in a second matrix site in solid xenon for barium tagging in a $^{136}$Xe double beta decay experiment
Authors:
M. Yvaine,
D. Fairbank,
J. Soderstrom,
C. Taylor,
J. Stanley,
T. Walton,
C. Chambers,
A. Iverson,
W. Fairbank,
S. Al Kharusi,
A. Amy,
E. Angelico,
A. Anker,
I. J. Arnquist,
A. Atencio,
J. Bane,
V. Belov,
E. P. Bernard,
T. Bhatta,
A. Bolotnikov,
J. Breslin,
P. A. Breur,
J. P. Brodsky,
E. Brown,
T. Brunner,
et al. (112 additional authors not shown)
Abstract:
Neutrinoless double beta decay is one of the most sensitive probes for new physics beyond the Standard Model of particle physics. One of the isotopes under investigation is $^{136}$Xe, which would double beta decay into $^{136}$Ba. Detecting the single $^{136}$Ba daughter provides a sort of ultimate tool in the discrimination against backgrounds. Previous work demonstrated the ability to perform single atom imaging of Ba atoms in a single-vacancy site of a solid xenon matrix. In this paper, the effort to identify signal from individual barium atoms is extended to Ba atoms in a hexa-vacancy site in the matrix and is achieved despite increased photobleaching in this site. Abrupt fluorescence turn-off of a single Ba atom is also observed. Significant recovery of fluorescence signal lost through photobleaching is demonstrated upon annealing of Ba deposits in the Xe ice. Following annealing, it is observed that Ba atoms in the hexa-vacancy site exhibit antibleaching while Ba atoms in the tetra-vacancy site exhibit bleaching. This may be evidence for a matrix site transfer upon laser excitation. Our findings offer a path of continued research toward tagging of Ba daughters in all significant sites in solid xenon.
Submitted 28 June, 2024;
originally announced July 2024.
-
Imaging nanomechanical vibrations and manipulating parametric mode coupling via scanning microwave microscopy
Authors:
Hao Xu,
Srisaran Venkatachalam,
Toky-Harrison Rabenimanana,
Christophe Boyaval,
Sophie Eliet,
Flavie Braud,
Eddy Collin,
Didier Theron,
Xin Zhou
Abstract:
In this study, we present a novel platform based on scanning microwave microscopy for manipulating and detecting tiny vibrations of nanoelectromechanical resonators using a single metallic tip. The tip is placed on the top of a grounded silicon nitride membrane, acting as a movable top gate of the coupled resonator. We demonstrate its ability to map mechanical modes and investigate mechanical damping effects in a capacitive coupling scheme, based on its spatial resolution. We also manipulate the energy transfer coherently between the mode of the scanning tip and the underlying silicon nitride membrane, via parametric coupling. Typical features of optomechanics, such as anti-damping and electromechanically induced transparency, have been observed. Since the microwave optomechanical technology is fully compatible with quantum electronics and very low temperature conditions, it should provide a powerful tool for studying phonon tunnelling between two spatially separated vibrating elements, which could potentially be applied to quantum sensing.
Submitted 28 June, 2024;
originally announced July 2024.
-
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Authors:
Jiazheng Li,
Hainiu Xu,
Zhaoyue Sun,
Yuxiang Zhou,
David West,
Cesare Aloisi,
Yulan He
Abstract:
Generating rationales that justify scoring decisions has been a promising way to facilitate explainability in automated scoring systems. However, existing methods do not match the accuracy of classifier-based methods. Plus, the generated rationales often contain hallucinated information. To address these issues, we propose a novel framework capable of generating more faithful rationales and, more importantly, matching performance with classifier-based black-box scoring systems. We first mimic the human assessment process by querying Large Language Models (LLMs) to generate a thought tree. We then summarise intermediate assessment decisions from each thought tree path for creating synthetic rationale data and rationale preference data. Finally, we utilise the generated synthetic data to calibrate LLMs through a two-step training process: supervised fine-tuning and preference optimization. Extensive experimental results demonstrate that our framework achieves a 38% assessment performance improvement in the QWK score compared to prior work while producing higher-quality rationales, as recognised by human evaluators and LLMs. Our work sheds light on the effectiveness of performing preference optimization using synthetic preference data obtained from thought tree paths.
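For readers unfamiliar with the QWK score, it is commonly defined as quadratic weighted kappa, an agreement measure between two sets of ordinal labels. The sketch below follows that standard definition in plain Python; it is not the authors' evaluation code, and the function name and toy labels are assumptions made for illustration.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Quadratic weighted kappa for integer labels in range(num_classes)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("label lists must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("need at least two classes")

    # Observed confusion matrix between the two label sequences.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_classes))
              for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Disagreement is penalised by the squared distance between labels.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two label sequences.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0.0:
        return 1.0  # both sequences use one identical label
    return 1.0 - weighted_observed / weighted_expected


# Toy scores on a 0-2 scale (illustrative values only).
human = [0, 1, 2, 2, 1]
model = [0, 1, 2, 1, 1]
print(quadratic_weighted_kappa(human, model, num_classes=3))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.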
Submitted 28 June, 2024;
originally announced June 2024.
-
Self-Supervised Spatial-Temporal Normality Learning for Time Series Anomaly Detection
Authors:
Yutong Chen,
Hongzuo Xu,
Guansong Pang,
Hezhe Qiao,
Yuan Zhou,
Mingsheng Shang
Abstract:
Time Series Anomaly Detection (TSAD) finds widespread applications across various domains such as financial markets, industrial production, and healthcare. Its primary objective is to learn the normal patterns of time series data, thereby identifying deviations in test samples. Most existing TSAD methods focus on modeling data from the temporal dimension, while ignoring the semantic information in the spatial dimension. To address this issue, we introduce a novel approach, called Spatial-Temporal Normality learning (STEN). STEN is composed of a sequence Order prediction-based Temporal Normality learning (OTN) module that captures the temporal correlations within sequences, and a Distance prediction-based Spatial Normality learning (DSN) module that learns the relative spatial relations between sequences in a feature space. By synthesizing these two modules, STEN learns expressive spatial-temporal representations for the normal patterns hidden in the time series data. Extensive experiments on five popular TSAD benchmarks show that STEN substantially outperforms state-of-the-art competing methods. Our code is available at https://github.com/mala-lab/STEN.
Submitted 28 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
Hyper-sampling imaging
Authors:
Ze Zhang,
Hemeng Xue,
Mingtao Shang,
Hongfei Yu,
Jinchao Liang,
Meiling Guan,
Chengming Sun,
Huahua Wang,
Shufeng Wang,
Zhengyu Ye,
Feng Gao,
Lu Gao
Abstract:
In our research, we have developed a novel mechanism that allows for a significant reduction in the smallest sampling unit of digital image sensors (DIS) to as small as 1/16th of a pixel, through measuring the intra-pixel quantum efficiency for the first time and recomputing the image. Employing our method, the physical sampling resolution of DIS can be enhanced by 16 times. The method has undergone rigorous testing in real-world imaging scenarios.
Submitted 27 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
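The asymmetry defined in this abstract is a simple ratio, and a short sketch can show how it and a first-order uncertainty follow from two branching fractions. The code below is illustrative only: it assumes the two inputs are uncorrelated and that statistical and systematic uncertainties may be added in quadrature, which is not necessarily how the collaboration treated them. The $K_S^0$ input is a hypothetical placeholder, since the abstract does not quote it.

```python
import math


def ks_kl_asymmetry(b_s, b_l, sigma_s=0.0, sigma_l=0.0):
    """Return R = (B_S - B_L) / (B_S + B_L) and its propagated uncertainty.

    Assumes B_S and B_L are uncorrelated (an assumption of this sketch).
    """
    total = b_s + b_l
    if total <= 0:
        raise ValueError("sum of branching fractions must be positive")
    r = (b_s - b_l) / total
    # dR/dB_S = 2*B_L / total^2 and dR/dB_L = -2*B_S / total^2
    sigma_r = 2.0 / total ** 2 * math.hypot(b_l * sigma_s, b_s * sigma_l)
    return r, sigma_r


# B(Lambda_c+ -> p K_L0) in percent, taken from the abstract; the statistical
# and systematic parts are combined in quadrature (assumption of this sketch).
b_l = 1.67
sigma_l = math.hypot(0.06, 0.04)

# HYPOTHETICAL placeholder for the K_S0 branching fraction (not from the abstract).
b_s, sigma_s = 1.60, 0.07

r, sigma_r = ks_kl_asymmetry(b_s, b_l, sigma_s, sigma_l)
print(f"R = {r:+.3f} +/- {sigma_r:.3f}")
```

Because the placeholder input is invented, the printed value should not be compared with the measured asymmetries quoted in the abstract.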
Submitted 26 June, 2024;
originally announced June 2024.
-
Improving Entity Recognition Using Ensembles of Deep Learning and Fine-tuned Large Language Models: A Case Study on Adverse Event Extraction from Multiple Sources
Authors:
Yiming Li,
Deepthi Viswaroopan,
William He,
Jianfu Li,
Xu Zuo,
Hua Xu,
Cui Tao
Abstract:
Adverse event (AE) extraction following COVID-19 vaccines from text data is crucial for monitoring and analyzing the safety profiles of immunizations. Traditional deep learning models are adept at learning intricate feature representations and dependencies in sequential data, but often require extensive labeled data. In contrast, large language models (LLMs) excel in understanding contextual information, but exhibit unstable performance on named entity recognition tasks, possibly due to their broad but unspecific training. This study aims to evaluate the effectiveness of LLMs and traditional deep learning models in AE extraction, and to assess the impact of ensembling these models on performance. In this study, we utilized reports and posts from the VAERS (n=621), Twitter (n=9,133), and Reddit (n=131) as our corpora. Our goal was to extract three types of entities: "vaccine", "shot", and "ae". We explored and fine-tuned (except GPT-4) multiple LLMs, including GPT-2, GPT-3.5, GPT-4, and Llama-2, as well as traditional deep learning models like RNN and BioBERT. To enhance performance, we created ensembles of the three models with the best performance. For evaluation, we used strict and relaxed F1 scores to evaluate the performance for each entity type, and micro-average F1 was used to assess the overall performance. The ensemble model achieved the highest performance in "vaccine", "shot", and "ae" with strict F1-scores of 0.878, 0.930, and 0.925, respectively, along with a micro-average score of 0.903. In conclusion, this study demonstrates the effectiveness and robustness of ensembling fine-tuned traditional deep learning models and LLMs, for extracting AE-related information. This study contributes to the advancement of biomedical natural language processing, providing valuable insights into improving AE extraction from text data for pharmacovigilance and public health surveillance.
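The difference between strict and relaxed F1, and the meaning of a micro-average, can be made concrete with a small sketch. It assumes one common convention: a strict match requires identical span boundaries and entity type, while a relaxed match only requires overlapping spans of the same type. The study's exact matching rules may differ, and the helper names and toy spans below are hypothetical.

```python
def count_matches(gold, pred, relaxed=False):
    """Return (tp, fp, fn) for entity spans given as (start, end, label).

    Span ends are exclusive. Each gold span can be matched at most once.
    """
    def is_hit(p, g):
        if p[2] != g[2]:
            return False
        if relaxed:
            return p[0] < g[1] and g[0] < p[1]  # any character overlap
        return p[0] == g[0] and p[1] == g[1]    # exact boundaries

    unmatched = list(gold)
    tp = 0
    for p in pred:
        for g in unmatched:
            if is_hit(p, g):
                unmatched.remove(g)
                tp += 1
                break
    return tp, len(pred) - tp, len(gold) - tp


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy annotations (illustrative only): the predicted "ae" span starts one
# character early, so it counts under relaxed matching but not strict matching.
gold = [(0, 6, "vaccine"), (10, 14, "ae")]
pred = [(0, 6, "vaccine"), (9, 14, "ae")]

# Pooling counts over all entity types before computing F1 is the micro-average.
strict_micro = f1_from_counts(*count_matches(gold, pred))
relaxed_micro = f1_from_counts(*count_matches(gold, pred, relaxed=True))
print(strict_micro, relaxed_micro)
```

Per-entity scores follow from the same functions by filtering both lists to a single label before counting.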
Submitted 25 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
Submitted 25 June, 2024;
originally announced June 2024.