subscribe to arXiv mailings

DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding

Authors: Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu, Meili Wang, Lizhuang Ma, Jian Chang, Jian Jun Zhang

Abstract: Recent point cloud understanding research suffers from performance drops on unseen data, due to the distribution shifts across different domains. While recent studies use Domain Generalization (DG) techniques to mitigate this by learning domain-invariant features, most are designed for a single task and neglect the potential of testing data. Despite In-Context Learning (ICL) showcasing multi-task… ▽ More Recent point cloud understanding research suffers from performance drops on unseen data, due to the distribution shifts across different domains. While recent studies use Domain Generalization (DG) techniques to mitigate this by learning domain-invariant features, most are designed for a single task and neglect the potential of testing data. Despite In-Context Learning (ICL) showcasing multi-task learning capability, it usually relies on high-quality context-rich data and considers a single dataset, and has rarely been studied in point cloud understanding. In this paper, we introduce a novel, practical, multi-domain multi-task setting, handling multiple domains and multiple tasks within one unified model for domain generalized point cloud understanding. To this end, we propose Domain Generalized Point-In-Context Learning (DG-PIC) that boosts the generalizability across various tasks and domains at testing time. In particular, we develop dual-level source prototype estimation that considers both global-level shape contextual and local-level geometrical structures for representing source domains and a dual-level test-time feature shifting mechanism that leverages both macro-level domain semantic information and micro-level patch positional relationships to pull the target data closer to the source ones during the testing. Our DG-PIC does not require any model updates during the testing and can handle unseen domains and multiple tasks, \textit{i.e.,} point cloud reconstruction, denoising, and registration, within one unified model. We also introduce a benchmark for this new setting. Comprehensive experiments demonstrate that DG-PIC outperforms state-of-the-art techniques significantly. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2407.08528 [pdf, other]

doi 10.1109/LSP.2024.3426918

Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction

Authors: Chang Sun, Hui Yuan, Xiaolong Mao, Xin Lu, Raouf Hamzaoui

Abstract: In point cloud geometry compression, most octreebased context models use the cross-entropy between the onehot encoding of node occupancy and the probability distribution predicted by the context model as the loss. This approach converts the problem of predicting the number (a regression problem) and the position (a classification problem) of occupied child nodes into a 255-dimensional classificati… ▽ More In point cloud geometry compression, most octreebased context models use the cross-entropy between the onehot encoding of node occupancy and the probability distribution predicted by the context model as the loss. This approach converts the problem of predicting the number (a regression problem) and the position (a classification problem) of occupied child nodes into a 255-dimensional classification problem. As a result, it fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution. We first analyze why the cross-entropy loss function fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution. Then, we propose an attention-based child node number prediction (ACNP) module to enhance the context models. The proposed module can predict the number of occupied child nodes and map it into an 8- dimensional vector to assist the context model in predicting the probability distribution of the occupancy of the current node for efficient entropy coding. Experimental results demonstrate that the proposed module enhances the coding efficiency of octree-based context models. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 2 figures and 2 tables

Journal ref: IEEE Signal Processing Letters, 2024

arXiv:2407.08520 [pdf, other]

doi 10.1109/JETCAS.2024.3367729

Enhancing context models for point cloud geometry compression with context feature residuals and multi-loss

Authors: Chang Sun, Hui Yuan, Shuai Li, Xin Lu, Raouf Hamzaoui

Abstract: In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficul… ▽ More In point cloud geometry compression, context models usually use the one-hot encoding of node occupancy as the label, and the cross-entropy between the one-hot encoding and the probability distribution predicted by the context model as the loss function. However, this approach has two main weaknesses. First, the differences between contexts of different nodes are not significant, making it difficult for the context model to accurately predict the probability distribution of node occupancy. Second, as the one-hot encoding is not the actual probability distribution of node occupancy, the cross-entropy loss function is inaccurate. To address these problems, we propose a general structure that can enhance existing context models. We introduce the context feature residuals into the context model to amplify the differences between contexts. We also add a multi-layer perception branch, that uses the mean squared error between its output and node occupancy as a loss function to provide accurate gradients in backpropagation. We validate our method by showing that it can improve the performance of an octree-based model (OctAttention) and a voxel-based model (VoxelDNN) on the object point cloud datasets MPEG 8i and MVUB, as well as the LiDAR point cloud dataset SemanticKITTI. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 11 pages, 8 figures

Journal ref: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 14, no. 2, pp. 224-234, Jun. 2024

arXiv:2407.08183 [pdf, other]

The white-light superflares from cool stars in GWAC triggers

Authors: Guang-Wei Li, Liang Wang, Hai-Long Yuan, Li-Ping Xin, Jing Wang, Chao Wu, Hua-Li Li, Hasitieer Haerken, Wei-Hua Wang, Hong-Bo Cai, Xu-Hui Han, Yang Xu, Lei Huang, Xiao-Meng Lu, Jian-Ying Bai, Xiang-Yu Wang, Zi-Gao Dai, En-Wei Liang, Jian-Yan Wei

Abstract: M-type stars are the ones that flare most frequently, but how big their maximum flare energy can reach is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg . The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temper… ▽ More M-type stars are the ones that flare most frequently, but how big their maximum flare energy can reach is still unknown. We present 163 flares from 162 individual M2 through L1-type stars that triggered the GWAC, with flare energies ranging from $10^{32.2}$ to $10^{36.4}$ erg . The flare amplitudes range from $\triangle G = 0.84$ to $\sim 10$ mag. Flare energy increases with stellar surface temperature ($T_{\rm eff}$) but both $\triangle G$ and equivalent duration $\log_{10}(ED)$ seem to be independent of $T_{\rm eff}$. Combining periods detected from light curves of TESS and K2, spectra from LAMOST, SDSS and the 2.16 m Telescope, and the Gaia DR3 data, we found that these GWAC flare stars are young. For the stars that have spectra, we found that these stars are in or very near to the saturation region, and $\log_{10}(L_{\rm Hα}/L_{\rm bol})$ is lower for M7-L1 stars than for M2-M6 stars. We also studied the relation between GWAC flare bolometric energy $E_{\rm bol}$ and stellar hemispherical area $S$, and found that $\log_{10}E_{\rm bol}$ (in erg) increases with increasing $S$ (in cm$^2$), and the maximum flare energy $\log_{10}E_{\rm bol, max} \geqslant \log_{10}S + 14.25$. For M7-L1 stars, there seem to be other factors limiting their maximum flare energies in addition to stellar hemispherical area. △ Less

Submitted 11 July, 2024; originally announced July 2024.

Comments: 18 pages, 11 figures, 4 tables

arXiv:2407.07789 [pdf, other]

Raising the Ceiling: Conflict-Free Local Feature Matching with Dynamic View Switching

Authors: Xiaoyong Lu, Songlin Du

Abstract: Current feature matching methods prioritize improving modeling capabilities to better align outputs with ground-truth matches, which are the theoretical upper bound on matching results, metaphorically depicted as the "ceiling". However, these enhancements fail to address the underlying issues that directly hinder ground-truth matches, including the scarcity of matchable points in small scale image… ▽ More Current feature matching methods prioritize improving modeling capabilities to better align outputs with ground-truth matches, which are the theoretical upper bound on matching results, metaphorically depicted as the "ceiling". However, these enhancements fail to address the underlying issues that directly hinder ground-truth matches, including the scarcity of matchable points in small scale images, matching conflicts in dense methods, and the keypoint-repeatability reliance in sparse methods. We propose a novel feature matching method named RCM, which Raises the Ceiling of Matching from three aspects. 1) RCM introduces a dynamic view switching mechanism to address the scarcity of matchable points in source images by strategically switching image pairs. 2) RCM proposes a conflict-free coarse matching module, addressing matching conflicts in the target image through a many-to-one matching strategy. 3) By integrating the semi-sparse paradigm and the coarse-to-fine architecture, RCM preserves the benefits of both high efficiency and global search, mitigating the reliance on keypoint repeatability. As a result, RCM enables more matchable points in the source image to be matched in an exhaustive and conflict-free manner in the target image, leading to a substantial 260% increase in ground-truth matches. Comprehensive experiments show that RCM exhibits remarkable performance and efficiency in comparison to state-of-the-art methods. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: Accepted at ECCV 2024

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be… ▽ More The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07447 [pdf]

Spin Splitting in Altermagnetic RuO$_2$ Enables Field-free Spin-Orbit Torque Switching via Dominant Out-of-Plane Spin Polarization

Authors: Zhuoyi Li, Zhe Zhang, Xianyang Lu, Yongbing Xu

Abstract: Researchers have recently identified a novel class of magnetism, termed "altermagnetism", which exhibits characteristics of both ferromagnetism and antiferromagnetism. Here, we report a groundbreaking discovery of efficient field-free spin-orbit torque (SOT) switching in a RuO$_2$ (101)/Co/Pt/Co/Pt/Ta structure. Our results demonstrate that the spin current flows along the [100] axis, induced by t… ▽ More Researchers have recently identified a novel class of magnetism, termed "altermagnetism", which exhibits characteristics of both ferromagnetism and antiferromagnetism. Here, we report a groundbreaking discovery of efficient field-free spin-orbit torque (SOT) switching in a RuO$_2$ (101)/Co/Pt/Co/Pt/Ta structure. Our results demonstrate that the spin current flows along the [100] axis, induced by the in-plane charge current, with the spin polarization direction aligned parallel to the Néel vector. These z-polarized spins generate an out-of-plane anti-damping torque, enabling deterministic switching of the Co/Pt layer without the necessity of an external magnetic field. The altermagnetic spin splitting effect (ASSE) in RuO$_2$ promotes the generation of spin currents with pronounced anisotropic behavior, maximized when the charge current flows along the [010] direction. This unique capability yields the highest field-free switching ratio, maintaining stable SOT switching within an external field range of approximately 400 Oe. Notably, ASSE dominates the spin current, especially when the current is aligned with the [010] direction (θ = 90°). Here, the spin polarization component creates a substantial field-like effective field, surpassing the damping-like field from . This highlights the crucial role of in enhancing spin-torque efficiency and elucidating spin flow modulation mechanics in this crystalline context. Our study highlights the potential of RuO$_2$ as a powerful spin current generator, paving the way for practical applications in spin-torque switching technologies and other cutting-edge spintronic devices. △ Less

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07427 [pdf, other]

Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation

Authors: Hao Fang, Peng Wu, Yawei Li, Xinxin Zhang, Xiankai Lu

Abstract: Open-Vocabulary Video Instance Segmentation (VIS) is attracting increasing attention due to its ability to segment and track arbitrary objects. However, the recent Open-Vocabulary VIS attempts obtained unsatisfactory results, especially in terms of generalization ability of novel categories. We discover that the domain gap between the VLM features (e.g., CLIP) and the instance queries and the unde… ▽ More Open-Vocabulary Video Instance Segmentation (VIS) is attracting increasing attention due to its ability to segment and track arbitrary objects. However, the recent Open-Vocabulary VIS attempts obtained unsatisfactory results, especially in terms of generalization ability of novel categories. We discover that the domain gap between the VLM features (e.g., CLIP) and the instance queries and the underutilization of temporal consistency are two central causes. To mitigate these issues, we design and train a novel Open-Vocabulary VIS baseline called OVFormer. OVFormer utilizes a lightweight module for unified embedding alignment between query embeddings and CLIP image embeddings to remedy the domain gap. Unlike previous image-based training methods, we conduct video-based model training and deploy a semi-online inference scheme to fully mine the temporal consistency in the video. Without bells and whistles, OVFormer achieves 21.9 mAP with a ResNet-50 backbone on LV-VIS, exceeding the previous state-of-the-art performance by 7.7. Extensive experiments on some Close-Vocabulary VIS datasets also demonstrate the strong zero-shot generalization ability of OVFormer (+ 7.6 mAP on YouTube-VIS 2019, + 3.9 mAP on OVIS). Code is available at https://github.com/fanghaook/OVFormer. △ Less

Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: ECCV 2024

arXiv:2407.07324 [pdf, other]

Event-Aided Time-to-Collision Estimation for Autonomous Driving

Authors: Jinghang Li, Bangyan Liao, Xiuyuan LU, Peidong Liu, Shaojie Shen, Yi Zhou

Abstract: Predicting a potential collision with leading vehicles is an essential functionality of any autonomous/assisted driving system. One bottleneck of existing vision-based solutions is that their updating rate is limited to the frame rate of standard cameras used. In this paper, we present a novel method that estimates the time to collision using a neuromorphic event-based camera, a biologically inspi… ▽ More Predicting a potential collision with leading vehicles is an essential functionality of any autonomous/assisted driving system. One bottleneck of existing vision-based solutions is that their updating rate is limited to the frame rate of standard cameras used. In this paper, we present a novel method that estimates the time to collision using a neuromorphic event-based camera, a biologically inspired visual sensor that can sense at exactly the same rate as scene dynamics. The core of the proposed algorithm consists of a two-step approach for efficient and accurate geometric model fitting on event data in a coarse-to-fine manner. The first step is a robust linear solver based on a novel geometric measurement that overcomes the partial observability of event-based normal flow. The second step further refines the resulting model via a spatio-temporal registration process formulated as a nonlinear optimization problem. Experiments on both synthetic and real data demonstrate the effectiveness of the proposed method, outperforming other alternative methods in terms of efficiency and accuracy. △ Less

Submitted 9 July, 2024; originally announced July 2024.

Comments: Accepted to European Conference on Computer Vision 2024, dataset used in this paper can be found at https://nail-hnu.github.io/EventAidedTTC

arXiv:2407.06845 [pdf, other]

Digging into the Interior of Hot Cores with ALMA (DIHCA). IV. Fragmentation in High-mass Star-Forming Clumps

Authors: Kosuke Ishihara, Patricio Sanhueza, Fumitaka Nakamura, Masao Saito, Huei-Ru V. Chen, Shanghuo Li, Fernando Olguin, Kotomi Taniguchi, Kaho Morii, Xing Lu, Qiuyi Luo, Takeshi Sakai, Qizhou Zhang

Abstract: Fragmentation contributes to the formation and evolution of stars. Observationally, high-mass stars are known to form multiple-star systems, preferentially in cluster environments. Theoretically, Jeans instability has been suggested to determine characteristic fragmentation scales, and thermal or turbulent motion in the parental gas clump mainly contributes to the instability. To search for such a… ▽ More Fragmentation contributes to the formation and evolution of stars. Observationally, high-mass stars are known to form multiple-star systems, preferentially in cluster environments. Theoretically, Jeans instability has been suggested to determine characteristic fragmentation scales, and thermal or turbulent motion in the parental gas clump mainly contributes to the instability. To search for such a characteristic fragmentation scale, we have analyzed ALMA 1.33 mm continuum observations toward 30 high-mass star-forming clumps taken by the Digging into the Interior of Hot Cores with ALMA (DIHCA) survey. We have identified 573 cores using the dendrogram algorithm and measured the separation of cores by using the Minimum Spanning Tree (MST) technique. The core separation corrected by projection effects has a distribution peaked around 5800 au. In order to remove biases produced by different distances and sensitivities, we further smooth the images to a common physical scale and perform completeness tests. Our careful analysis finds a characteristic fragmentation scale of $\sim$7000 au, comparable to the thermal Jeans length of the clumps. We conclude that thermal Jeans fragmentation plays a dominant role in determining the clump fragmentation in high-mass star-forming regions, without the need of invoking turbulent Jeans fragmentation. △ Less

Submitted 9 July, 2024; originally announced July 2024.

Comments: 30 pages, 18 figures, Accepted in ApJ

arXiv:2407.05677 [pdf, other]

PCAC-GAN:ASparse-Tensor-Based Generative Adversarial Network for 3D Point Cloud Attribute Compression

Authors: Xiaolong Mao, Hui Yuan, Xin Lu, Raouf Hamzaoui, Wei Gao

Abstract: Learning-based methods have proven successful in compressing geometric information for point clouds. For attribute compression, however, they still lag behind non-learning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers.… ▽ More Learning-based methods have proven successful in compressing geometric information for point clouds. For attribute compression, however, they still lag behind non-learning-based methods such as the MPEG G-PCC standard. To bridge this gap, we propose a novel deep learning-based point cloud attribute compression method that uses a generative adversarial network (GAN) with sparse convolution layers. Our method also includes a module that adaptively selects the resolution of the voxels used to voxelize the input point cloud. Sparse vectors are used to represent the voxelized point cloud, and sparse convolutions process the sparse tensors, ensuring computational efficiency. To the best of our knowledge, this is the first application of GANs to compress point cloud attributes. Our experimental results show that our method outperforms existing learning-based techniques and rivals the latest G-PCC test model (TMC13v23) in terms of visual quality. △ Less

Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: 14 pages, 5 figures

MSC Class: 94J20 ACM Class: I.4.2

arXiv:2407.04981 [pdf, other]

TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs

Authors: Cheng Wang, Xinyang Lu, See-Kiong Ng, Bryan Kian Hsiang Low

Abstract: The rapid evolution of large language models (LLMs) represents a substantial leap forward in natural language understanding and generation. However, alongside these advancements come significant challenges related to the accountability and transparency of LLM responses. Reliable source attribution is essential to adhering to stringent legal and regulatory standards, including those set forth by th… ▽ More The rapid evolution of large language models (LLMs) represents a substantial leap forward in natural language understanding and generation. However, alongside these advancements come significant challenges related to the accountability and transparency of LLM responses. Reliable source attribution is essential to adhering to stringent legal and regulatory standards, including those set forth by the General Data Protection Regulation. Despite the well-established methods in source attribution within the computer vision domain, the application of robust attribution frameworks to natural language processing remains underexplored. To bridge this gap, we propose a novel and versatile TRansformer-based Attribution framework using Contrastive Embeddings called TRACE that, in particular, exploits contrastive learning for source attribution. We perform an extensive empirical evaluation to demonstrate the performance and efficiency of TRACE in various settings and show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of LLMs. △ Less

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04661 [pdf, other]

MIRI MRS Observations of Beta Pictoris II. The Spectroscopic Case for a Recent Giant Collision

Authors: Christine H. Chen, Cicero X. Lu, Kadin Worthen, David R. Law, B. A. Sargent, Amaya Moro-Martin, G. C. Sloan, Carey M. Lisse, Dan M. Watson, Julien H. Girard, Yiwei Chai, Dean C. Hines, Jens Kammerer, Alexis Li, Marshall Perrin, Laurent Pueyo, Isabel Rebollido, Karl R. Stapelfeldt, Christopher Stark, Michael W. Werner

Abstract: Modeling observations of the archetypal debris disk around $β$ Pic, obtained in 2023 January with the MIRI MRS on board JWST, reveals significant differences compared with that obtained with the IRS on board Spitzer. The bright 5 - 15 $μ$m continuum excess modeled using a $\sim$600 K black body has disappeared. The previously prominent 18 and 23 $μ$m crystalline forsterite emission features, arisi… ▽ More Modeling observations of the archetypal debris disk around $β$ Pic, obtained in 2023 January with the MIRI MRS on board JWST, reveals significant differences compared with that obtained with the IRS on board Spitzer. The bright 5 - 15 $μ$m continuum excess modeled using a $\sim$600 K black body has disappeared. The previously prominent 18 and 23 $μ$m crystalline forsterite emission features, arising from cold dust ($\sim$100 K) in the Rayleigh limit, have disappeared and been replaced by very weak features arising from the hotter 500 K dust population. Finally, the shape of the 10 $μ$m silicate feature has changed, consistent with a shift in the temperature of the warm dust population from $\sim$300 K to $\sim$500 K and an increase in the crystalline fraction of the warm, silicate dust. Stellar radiation pressure may have blown both the hot and the cold crystalline dust particles observed in the Spitzer spectra out of the planetary system during the intervening 20 years between the Spitzer and JWST observations. These results indicate that the $β$ Pic system has a dynamic circumstellar environment, and that periods of enhanced collisions can create large clouds of dust that sweep through the planetary system. △ Less

Submitted 5 July, 2024; originally announced July 2024.

Comments: 15 pages, 8 figures, ApJ in press

arXiv:2407.03676 [pdf]

Out-of-Plane Polarization from Spin Reflection Induces Field-Free Spin-Orbit Torque Switching in Structures with Canted NiO Interfacial Moments

Authors: Zhe Zhang, Zhuoyi Li, Yuzhe Chen, Fangyuan Zhu, Yu Yan, Yao Li, Liang He, Jun Du, Rong Zhang, Jing Wu, Xianyang Lu, Yongbing Xu

Abstract: Realizing deterministic current-induced spin-orbit torque (SOT) magnetization switching, especially in systems exhibiting perpendicular magnetic anisotropy (PMA), typically requires the application of a collinear in-plane field, posing a challenging problem. In this study, we successfully achieve field-free SOT switching in the CoFeB/MgO system. In a Ta/CoFeB/MgO/NiO/Ta structure, spin reflection… ▽ More Realizing deterministic current-induced spin-orbit torque (SOT) magnetization switching, especially in systems exhibiting perpendicular magnetic anisotropy (PMA), typically requires the application of a collinear in-plane field, posing a challenging problem. In this study, we successfully achieve field-free SOT switching in the CoFeB/MgO system. In a Ta/CoFeB/MgO/NiO/Ta structure, spin reflection at the NiO interface, characterized by noncollinear spin structures with canted magnetization, generates a spin current with an out-of-plane spin polarization σz. We confirm the contribution of σz to the field-free SOT switching through measurements of the shift effect in the out-of-plane magnetization hysteresis loops under different currents. The incorporation of NiO as an antiferromagnetic insulator, mitigates the current shunting effect and ensures excellent thermal stability of the device. The sample with 0.8 nm MgO and 2 nm NiO demonstrates an impressive optimal switching ratio approaching 100% without an in-plane field. This breakthrough in the CoFeB/MgO system promises significant applications in spintronics, advancing us closer to realizing innovative technologies. △ Less

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03618 [pdf, other]

BM25S: Orders of magnitude faster lexical search via eager sparse scoring

Authors: Xing Han Lù

Abstract: We introduce BM25S, an efficient Python-based implementation of BM25 that only depends on Numpy and Scipy. BM25S achieves up to a 500x speedup compared to the most popular Python-based framework by eagerly computing BM25 scores during indexing and storing them into sparse matrices. It also achieves considerable speedups compared to highly optimized Java-based implementations, which are used by pop… ▽ More We introduce BM25S, an efficient Python-based implementation of BM25 that only depends on Numpy and Scipy. BM25S achieves up to a 500x speedup compared to the most popular Python-based framework by eagerly computing BM25 scores during indexing and storing them into sparse matrices. It also achieves considerable speedups compared to highly optimized Java-based implementations, which are used by popular commercial products. Finally, BM25S reproduces the exact implementation of five BM25 variants based on Kamphuis et al. (2020) by extending eager scoring to non-sparse variants using a novel score shifting method. The code can be found at https://github.com/xhluca/bm25s △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: Technical Report

arXiv:2407.03165 [pdf, other]

Consistent Point Orientation for Manifold Surfaces via Boundary Integration

Authors: Weizhou Liu, Xingce Wang, Haichuan Zhao, Xingfei Xue, Zhongke Wu, Xuequan Lu, Ying He

Abstract: This paper introduces a new approach for generating globally consistent normals for point clouds sampled from manifold surfaces. Given that the generalized winding number (GWN) field generated by a point cloud with globally consistent normals is a solution to a PDE with jump boundary conditions and possesses harmonic properties, and the Dirichlet energy of the GWN field can be defined as an integr… ▽ More This paper introduces a new approach for generating globally consistent normals for point clouds sampled from manifold surfaces. Given that the generalized winding number (GWN) field generated by a point cloud with globally consistent normals is a solution to a PDE with jump boundary conditions and possesses harmonic properties, and the Dirichlet energy of the GWN field can be defined as an integral over the boundary surface, we formulate a boundary energy derived from the Dirichlet energy of the GWN. Taking as input a point cloud with randomly oriented normals, we optimize this energy to restore the global harmonicity of the GWN field, thereby recovering the globally consistent normals. Experiments show that our method outperforms state-of-the-art approaches, exhibiting enhanced robustness to noise, outliers, complex topologies, and thin structures. Our code can be found at \url{https://github.com/liuweizhou319/BIM}. △ Less

Submitted 3 July, 2024; originally announced July 2024.

Comments: accepted in siggraph2024

arXiv:2407.02899 [pdf, other]

Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be… ▽ More A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.01942 [pdf, other]

Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness

Authors: Khyathi Raghavi Chandu, Linjie Li, Anas Awadalla, Ximing Lu, Jae Sung Park, Jack Hessel, Lijuan Wang, Yejin Choi

Abstract: The ability to acknowledge the inevitable uncertainty in their knowledge and reasoning is a prerequisite for AI systems to be truly truthful and reliable. In this paper, we present a taxonomy of uncertainty specific to vision-language AI systems, distinguishing between epistemic uncertainty (arising from a lack of information) and aleatoric uncertainty (due to inherent unpredictability), and furth… ▽ More The ability to acknowledge the inevitable uncertainty in their knowledge and reasoning is a prerequisite for AI systems to be truly truthful and reliable. In this paper, we present a taxonomy of uncertainty specific to vision-language AI systems, distinguishing between epistemic uncertainty (arising from a lack of information) and aleatoric uncertainty (due to inherent unpredictability), and further explore finer categories within. Based on this taxonomy, we synthesize a benchmark dataset, CertainlyUncertain, featuring 178K visual question answering (VQA) samples as contrastive pairs. This is achieved by 1) inpainting images to make previously answerable questions into unanswerable ones; and 2) using image captions to prompt large language models for both answerable and unanswerable questions. Additionally, we introduce a new metric confidence-weighted accuracy, that is well correlated with both accuracy and calibration error, to address the shortcomings of existing metrics. △ Less

Submitted 2 July, 2024; originally announced July 2024.

Comments: 26 pages

arXiv:2407.01902 [pdf, other]

SoP: Unlock the Power of Social Facilitation for Automatic Jailbreak Attack

Authors: Yan Yang, Zeguan Xiao, Xin Lu, Hongru Wang, Hailiang Huang, Guanhua Chen, Yun Chen

Abstract: The widespread applications of large language models (LLMs) have brought about concerns regarding their potential misuse. Although aligned with human preference data before release, LLMs remain vulnerable to various malicious attacks. In this paper, we adopt a red-teaming strategy to enhance LLM safety and introduce SoP, a simple yet effective framework to design jailbreak prompts automatically. I… ▽ More The widespread applications of large language models (LLMs) have brought about concerns regarding their potential misuse. Although aligned with human preference data before release, LLMs remain vulnerable to various malicious attacks. In this paper, we adopt a red-teaming strategy to enhance LLM safety and introduce SoP, a simple yet effective framework to design jailbreak prompts automatically. Inspired by the social facilitation concept, SoP generates and optimizes multiple jailbreak characters to bypass the guardrails of the target LLM. Different from previous work which relies on proprietary LLMs or seed jailbreak templates crafted by human expertise, SoP can generate and optimize the jailbreak prompt in a cold-start scenario using open-sourced LLMs without any seed jailbreak templates. Experimental results show that SoP achieves attack success rates of 88% and 60% in bypassing the safety alignment of GPT-3.5-1106 and GPT-4, respectively. Furthermore, we extensively evaluate the transferability of the generated templates across different LLMs and held-out malicious requests, while also exploring defense strategies against the jailbreak attack designed by SoP. Code is available at https://github.com/Yang-Yan-Yang-Yan/SoP. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.01523 [pdf, other]

MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations

Authors: Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun

Abstract: Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark co… ▽ More Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. Distinct from previous datasets, it is constructed upon 130 lengthy PDF-formatted documents with an average of 49.4 pages and 20,971 textual tokens. Towards comprehensive evaluation, answers to these questions rely on pieces of evidence from (1) different sources (text, image, chart, table, and layout structure) and (2) various locations (i.e. page number). Moreover, 33.2% of the questions are cross-page questions requiring evidence across multiple pages. 22.8% of the questions are designed to be unanswerable for detecting potential hallucinations. Experiments on 14 LVLMs demonstrate that long-context DU greatly challenges current models. Notably, the best-performing model, GPT-4o, achieves an F1 score of only 42.7%, while the second-best, GPT-4V, scores 31.4%. Furthermore, 12 LVLMs (all except GPT-4o and GPT-4V) even present worse performance than their LLM counterparts which are fed with lossy-parsed OCR documents. These results validate the necessity of future research toward more capable long-context LVLMs. Project Page: https://mayubo2333.github.io/MMLongBench-Doc △ Less

Submitted 10 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.00369 [pdf, other]

How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models

Authors: Jaeyoung Lee, Ximing Lu, Jack Hessel, Faeze Brahman, Youngjae Yu, Yonatan Bisk, Yejin Choi, Saadia Gabriel

Abstract: Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating spread of false and harmful content. While these can potentially reduce burden on human fact-check… ▽ More Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating spread of false and harmful content. While these can potentially reduce burden on human fact-checkers, such efforts may be hampered by foundation model training data becoming outdated. In this work, we test the limits of improving foundation model performance without continual updating through an initial study of knowledge transfer using either existing intra- and inter- domain benchmarks or explanations generated from large language models (LLMs). We evaluate on 12 public benchmarks for fact-checking and misinformation detection as well as two other tasks relevant to content moderation -- toxicity and stance detection. Our results on two recent multi-modal fact-checking benchmarks, Mocheg and Fakeddit, indicate that knowledge transfer strategies can improve Fakeddit performance over the state-of-the-art by up to 1.7% and Mocheg performance by up to 2.9%. △ Less

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions… ▽ More Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components. △ Less

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.19190 [pdf, ps, other]

Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec… ▽ More Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively. △ Less

Submitted 27 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2406.18510 [pdf, other]

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Authors: Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri

Abstract: We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with… ▽ More We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with LLMs, our work investigates jailbreaks from chatbot users who were not specifically instructed to break the system. WildTeaming reveals previously unidentified vulnerabilities of frontier LLMs, resulting in up to 4.6x more diverse and successful adversarial attacks compared to state-of-the-art jailbreak methods. While many datasets exist for jailbreak evaluation, very few open-source datasets exist for jailbreak training, as safety training data has been closed even when model weights are open. With WildTeaming we create WildJailbreak, a large-scale open-source synthetic safety dataset with 262K vanilla (direct request) and adversarial (complex jailbreak) prompt-response pairs. To mitigate exaggerated safety behaviors, WildJailbreak provides two contrastive types of queries: 1) harmful queries (vanilla & adversarial) and 2) benign queries that resemble harmful queries in form but contain no harm. As WildJailbreak considerably upgrades the quality and scale of existing safety resources, it uniquely enables us to examine the scaling effects of data and the interplay of data properties and model capabilities during safety training. Through extensive experiments, we identify the training properties that enable an ideal balance of safety behaviors: appropriate safeguarding without over-refusal, effective handling of vanilla and adversarial queries, and minimal, if any, decrease in general capabilities. All components of WildJailbeak contribute to achieving balanced safety behaviors of models. △ Less

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.18259 [pdf, other]

Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated

Authors: Jiazhou Ji, Ruizhe Li, Shujun Li, Jie Guo, Weidong Qiu, Zheng Huang, Chiyu Chen, Xiaoyu Jiang, Xinru Lu

Abstract: As LLMs rapidly advance, increasing concerns arise regarding risks about actual authorship of texts we see online and in real world. The task of distinguishing LLM-authored texts is complicated by the nuanced and overlapping behaviors of both machines and humans. In this paper, we challenge the current practice of considering LLM-generated text detection a binary classification task of differentia… ▽ More As LLMs rapidly advance, increasing concerns arise regarding risks about actual authorship of texts we see online and in real world. The task of distinguishing LLM-authored texts is complicated by the nuanced and overlapping behaviors of both machines and humans. In this paper, we challenge the current practice of considering LLM-generated text detection a binary classification task of differentiating human from AI. Instead, we introduce a novel ternary text classification scheme, adding an "undecided" category for texts that could be attributed to either source, and we show that this new category is crucial to understand how to make the detection result more explainable to lay users. This research shifts the paradigm from merely classifying to explaining machine-generated texts, emphasizing need for detectors to provide clear and understandable explanations to users. Our study involves creating four new datasets comprised of texts from various LLMs and human authors. Based on new datasets, we performed binary classification tests to ascertain the most effective SOTA detection methods and identified SOTA LLMs capable of producing harder-to-detect texts. We constructed a new dataset of texts generated by two top-performing LLMs and human authors, and asked three human annotators to produce ternary labels with explanation notes. This dataset was used to investigate how three top-performing SOTA detectors behave in new ternary classification context. Our results highlight why "undecided" category is much needed from the viewpoint of explainability. Additionally, we conducted an analysis of explainability of the three best-performing detectors and the explanation notes of the human annotators, revealing insights about the complexity of explainable detection of machine-generated texts. Finally, we propose guidelines for developing future detection systems with improved explanatory power. △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: 19 pages, 2 figures

arXiv:2406.18183 [pdf, other]

Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of… ▽ More Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined. △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: 26 pages,5 tables, 4 figures

arXiv:2406.18083 [pdf, other]

Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an… ▽ More Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed. △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: 19 pages, 2 figures

arXiv:2406.17452 [pdf, ps, other]

Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (649 additional authors not shown)

Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and… ▽ More We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$. △ Less

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17005 [pdf, other]

PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

Authors: Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo , et al. (12 additional authors not shown)

Abstract: Pixel-level Video Understanding in the Wild Challenge (PVUW) focus on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as… ▽ More Pixel-level Video Understanding in the Wild Challenge (PVUW) focus on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as the disappearance and reappearance of objects, inconspicuous small objects, heavy occlusions, and crowded environments in MOSE. Moreover, we provide a new motion expression guided video segmentation dataset MeViS to study the natural language-guided video understanding in complex environments. These new videos, sentences, and annotations enable us to foster the development of a more comprehensive and robust pixel-level understanding of video scenes in complex environments and realistic scenarios. The MOSE challenge had 140 registered teams in total, 65 teams participated the validation phase and 12 teams made valid submissions in the final challenge phase. The MeViS challenge had 225 registered teams in total, 50 teams participated the validation phase and 5 teams made valid submissions in the final challenge phase. △ Less

Submitted 24 June, 2024; originally announced June 2024.

Comments: MOSE Challenge: https://henghuiding.github.io/MOSE/ChallengeCVPR2024, MeViS Challenge: https://henghuiding.github.io/MeViS/ChallengeCVPR2024

arXiv:2406.15720 [pdf, other]

Scaling Laws for Fact Memorization of Large Language Models

Authors: Xingyu Lu, Xiaonan Li, Qinyuan Cheng, Kai Ding, Xuanjing Huang, Xipeng Qiu

Abstract: Fact knowledge memorization is crucial for Large Language Models (LLM) to generate factual and reliable responses. However, the behaviors of LLM fact memorization remain under-explored. In this paper, we analyze the scaling laws for LLM's fact knowledge and LLMs' behaviors of memorizing different types of facts. We find that LLMs' fact knowledge capacity has a linear and negative exponential law r… ▽ More Fact knowledge memorization is crucial for Large Language Models (LLM) to generate factual and reliable responses. However, the behaviors of LLM fact memorization remain under-explored. In this paper, we analyze the scaling laws for LLM's fact knowledge and LLMs' behaviors of memorizing different types of facts. We find that LLMs' fact knowledge capacity has a linear and negative exponential law relationship with model size and training epochs, respectively. Estimated by the built scaling law, memorizing the whole Wikidata's facts requires training an LLM with 1000B non-embed parameters for 100 epochs, suggesting that using LLMs to memorize all public facts is almost implausible for a general pre-training setting. Meanwhile, we find that LLMs can generalize on unseen fact knowledge and its scaling law is similar to general pre-training. Additionally, we analyze the compatibility and preference of LLMs' fact memorization. For compatibility, we find LLMs struggle with memorizing redundant facts in a unified way. Only when correlated facts have the same direction and structure, the LLM can compatibly memorize them. This shows the inefficiency of LLM memorization for redundant facts. For preference, the LLM pays more attention to memorizing more frequent and difficult facts, and the subsequent facts can overwrite prior facts' memorization, which significantly hinders low-frequency facts memorization. Our findings reveal the capacity and characteristics of LLMs' fact knowledge learning, which provide directions for LLMs' fact knowledge augmentation. △ Less

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.15564 [pdf, other]

Entangelment Entropy on Generalized Brillouin Zone

Authors: Zhenghao Yang, Chaoze Lu, Xiancong Lu

Abstract: We investigate the entanglement properties of non-Hermitian Su-Schrieffer-Heeger (SSH) model from the perspective of the Generalized Brillouin Zone (GBZ). The non-Bloch entanglement entropy is defined on a quasi-reciprocal lattice, obtained by performing an ordinary Fourier transformation on the non-Bloch Hamiltonian. We demonstrate that the broken bulk-boundary correspondence is recovered in term… ▽ More We investigate the entanglement properties of non-Hermitian Su-Schrieffer-Heeger (SSH) model from the perspective of the Generalized Brillouin Zone (GBZ). The non-Bloch entanglement entropy is defined on a quasi-reciprocal lattice, obtained by performing an ordinary Fourier transformation on the non-Bloch Hamiltonian. We demonstrate that the broken bulk-boundary correspondence is recovered in terms of the non-Bloch entanglement entropy. When the GBZ is circular, we show that the non-Bloch entanglement entropy is well-defined (real and positive-definite) in large parameter regions, except close to the exceptional points (EPs). In the critical region, we found that each Fermi point contributes precisely 1 to the central charge $c$ of the logarithmic scaling. At the EP, the central charge becomes negative due to the presence of the exceptional bound state. For the case of non-circular GBZ, long-range hopping emerges in the quasi-reciprocal lattice, and the von Neumann entropy on the GBZ is no longer real. However, the non-Bloch edge entanglement entropy remains real, which serves as a reliable topological indicator and respects the bulk-boundary correspondence. We compute the topological phase diagram, and reveal the critical behavior along the exceptional phase boundaries. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: 10 pages, 5 figures

arXiv:2406.15030 [pdf, ps, other]

Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction… ▽ More Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information for the production of the $χ_{c1}(3872)$ at $e^+e^-$ collider and deepen our understanding about the nature of this particle. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: 11 pages, 3 figures

arXiv:2406.14874 [pdf, other]

TraceNet: Segment one thing efficiently

Authors: Mingyuan Wu, Zichuan Liu, Haozhen Zheng, Hongpeng Guo, Bo Chen, Xin Lu, Klara Nahrstedt

Abstract: Efficient single instance segmentation is essential for unlocking features in the mobile imaging applications, such as capture or editing. Existing on-the-fly mobile imaging applications scope the segmentation task to portraits or the salient subject due to the computational constraints. Instance segmentation, despite its recent developments towards efficient networks, is still heavy due to the co… ▽ More Efficient single instance segmentation is essential for unlocking features in the mobile imaging applications, such as capture or editing. Existing on-the-fly mobile imaging applications scope the segmentation task to portraits or the salient subject due to the computational constraints. Instance segmentation, despite its recent developments towards efficient networks, is still heavy due to the cost of computation on the entire image to identify all instances. To address this, we propose and formulate a one tap driven single instance segmentation task that segments a single instance selected by a user via a positive tap. This task, in contrast to the broader task of segmenting anything as suggested in the Segment Anything Model \cite{sam}, focuses on efficient segmentation of a single instance specified by the user. To solve this problem, we present TraceNet, which explicitly locates the selected instance by way of receptive field tracing. TraceNet identifies image regions that are related to the user tap and heavy computations are only performed on selected regions of the image. Therefore overall computation cost and memory consumption are reduced during inference. We evaluate the performance of TraceNet on instance IoU average over taps and the proportion of the region that a user tap can fall into for a high-quality single-instance mask. Experimental results on MS-COCO and LVIS demonstrate the effectiveness and efficiency of the proposed approach. TraceNet can jointly achieve the efficiency and interactivity, filling in the gap between needs for efficient mobile inference and recent research trend towards multimodal and interactive segmentation models. △ Less

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.14811 [pdf, other]

Non-Markovian Collective Emission of Giant emitters in the Zeno Regime

Authors: Qing-Yang Qiu, Xin-You Lü

Abstract: We explore the collective Zeno dynamics of giant artificial atoms that are coupled, via multiple coupling points, to a common photonic or acoustic reservoir. In this regime, the establishment of atomic cooperativity and the revivification of exponential decay, are highly intertwined, which is utterly beyond the non-Markovian regime with only retarded backaction. We reveal that giant atoms build up… ▽ More We explore the collective Zeno dynamics of giant artificial atoms that are coupled, via multiple coupling points, to a common photonic or acoustic reservoir. In this regime, the establishment of atomic cooperativity and the revivification of exponential decay, are highly intertwined, which is utterly beyond the non-Markovian regime with only retarded backaction. We reveal that giant atoms build up their collective emission smoothly from the decay rate of zero to that predicted by Markovian approximation, and show great disparity between different waveguide QED setups. As a comparison, the step-like growth of instantaneous decay rates in the retardation-only picture has also been shown. All of these theoretical pictures predict the same collective behavior in the long time limit. From a phenomenological standpoint, we observe that the atomic superradiance exhabits significant directional property. In addition, the subradiant photons feature prolonged oscillation in the early stage of collective radiance, where the energy is exchanged remarkably between giant emitters and the field. Our results might be probed in state-of-art waveguide QED experiments, and fundamentally broaden the fields of collective emission in systems with giant atoms. △ Less

Submitted 20 June, 2024; originally announced June 2024.

Comments: 14 pages, 6 figures

arXiv:2406.14507 [pdf, other]

On Newton's Method to Unlearn Neural Networks

Authors: Nhung Bui, Xinyang Lu, See-Kiong Ng, Bryan Kian Hsian Low

Abstract: Machine unlearning facilitates personal data ownership, including the ``right to be forgotten''. The proliferation of applications of \emph{neural networks} (NNs) trained on users' personal data calls for the need to develop algorithms to unlearn an NN. Since retraining is costly, efficiency is often achieved through approximate unlearning which aims to unlearn a trained NN to be close to the retr… ▽ More Machine unlearning facilitates personal data ownership, including the ``right to be forgotten''. The proliferation of applications of \emph{neural networks} (NNs) trained on users' personal data calls for the need to develop algorithms to unlearn an NN. Since retraining is costly, efficiency is often achieved through approximate unlearning which aims to unlearn a trained NN to be close to the retrained one (in distribution). Though the Newton's method has been used by previous works to approximately unlearn linear models, adapting it for unlearning an NN often encounters degenerate Hessians that make computing the Newton's update impossible. In this paper, we will first show that when coupled with naive yet often effective solutions to mitigate the degeneracy issue for unlearning, the Newton's method surprisingly suffers from catastrophic forgetting. To overcome this difficulty, we revise the Newton's method to include a theoretically justified regularizer and propose a cubic-regularized Newton's method for unlearning an NN. The cubic regularizer comes with the benefits of not requiring manual finetuning and affording a natural interpretation. Empirical evaluation on several models and real-world datasets shows that our method is more resilient to catastrophic forgetting and performs better than the baselines, especially in sequential unlearning. △ Less

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.13577 [pdf, other]

Genuine Multipartite Entanglement induced by a Thermal Acoustic Reservoir

Authors: Qing-Yang Qiu, Zhi-Guang Lu, Qiongyi He, Ying Wu, Xin-You Lü

Abstract: Genuine multipartite entanglement (GME) is not only fundamental interesting for the study of quantum-to-classical transition, but also is essential for realizing universal quantum computing and quantum networks. Here we investigate the multipartite entanglement (ME) dynamics in a linear chain of N LC resonators interacting optomechanically with a common thermal acoustic reservoir. By presenting th… ▽ More Genuine multipartite entanglement (GME) is not only fundamental interesting for the study of quantum-to-classical transition, but also is essential for realizing universal quantum computing and quantum networks. Here we investigate the multipartite entanglement (ME) dynamics in a linear chain of N LC resonators interacting optomechanically with a common thermal acoustic reservoir. By presenting the exact analytical solutions of system evolution, we predict the periodic generation of non-Gaussian ME, including the discrete and continuous variables entanglement. Interestingly, the GME is obtained even though the system is in a heat bath. The mechanism relies on the special acoustic environment featuring frequency comb structure. More importantly, our proposed model also allows the periodic generation of entangled multipartite cat states (MCSs), i.e., a typical GHZ state, with high fidelity. This work fundamentally broadens the fields of ME, and have wide applications in implementing thermal-noise-resistant quantum information processing and many-body quantum simulation. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: 25 pages, 9 figures

arXiv:2406.12631 [pdf, other]

Nonreciprocal Bundle Emissions of Quantum Entangled Pairs

Authors: Qian Bin, Hui Jing, Ying Wu, Franco Nori, Xin-You Lü

Abstract: Realizing precise control over multiquanta emission is crucial for quantum information processing, especially when integrated with advanced techniques of manipulating quantum states. Here, by spinning the resonator to induce the Sagnac effect, we can obtain nonreciprocal photon-phonon and photon-magnon super-Rabi oscillations under conditions of optically driving resonance transitions. Opening dis… ▽ More Realizing precise control over multiquanta emission is crucial for quantum information processing, especially when integrated with advanced techniques of manipulating quantum states. Here, by spinning the resonator to induce the Sagnac effect, we can obtain nonreciprocal photon-phonon and photon-magnon super-Rabi oscillations under conditions of optically driving resonance transitions. Opening dissipative channels for such super-Rabi oscillations enables the realization of directional bundle emissions of entangled photon-phonon pairs and photon-magnon pairs by transferring pure multiquanta state to bundled multiquanta outside of the system. This nonreciprocal emission is a flexible switch that can be controlled with precision, and simultaneous emissions of different entangled pairs (such as photon-phonon or photon-magnon pairs) can even emerge but in opposite directions by driving the resonator from different directions. This ability to flexibly manipulate the system allows us to achieve directional entangled multiquanta emitters, and has also potential applications for building hybrid quantum networks and on-chip quantum communications. △ Less

Submitted 18 June, 2024; originally announced June 2024.

Comments: 16 pages, 4 figures, accepted by Physical Review Letters

arXiv:2406.12221 [pdf, other]

On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation

Authors: Xueru Wen, Xinyu Lu, Xinyan Guan, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun

Abstract: Hallucination occurs when large language models (LLMs) exhibit behavior that deviates from the boundaries of their knowledge during the response generation process. Previous learning-based methods focus on detecting knowledge boundaries and finetuning models with instance-level feedback, but they suffer from inaccurate signals due to off-policy data sampling and coarse-grained feedback. In this pa… ▽ More Hallucination occurs when large language models (LLMs) exhibit behavior that deviates from the boundaries of their knowledge during the response generation process. Previous learning-based methods focus on detecting knowledge boundaries and finetuning models with instance-level feedback, but they suffer from inaccurate signals due to off-policy data sampling and coarse-grained feedback. In this paper, we introduce \textit{\b{R}einforcement \b{L}earning \b{f}or \b{H}allucination} (RLFH), a fine-grained feedback-based online reinforcement learning method for hallucination mitigation. Unlike previous learning-based methods, RLFH enables LLMs to explore the boundaries of their internal knowledge and provide on-policy, fine-grained feedback on these explorations. To construct fine-grained feedback for learning reliable generation behavior, RLFH decomposes the outcomes of large models into atomic facts, provides statement-level evaluation signals, and traces back the signals to the tokens of the original responses. Finally, RLFH adopts the online reinforcement algorithm with these token-level rewards to adjust model behavior for hallucination mitigation. For effective on-policy optimization, RLFH also introduces an LLM-based fact assessment framework to verify the truthfulness and helpfulness of atomic facts without human intervention. Experiments on HotpotQA, SQuADv2, and Biography benchmarks demonstrate that RLFH can balance their usage of internal knowledge during the generation process to eliminate the hallucination behavior of LLMs. △ Less

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.12195 [pdf, other]

Quantum Compiling with Reinforcement Learning on a Superconducting Processor

Authors: Z. T. Wang, Qiuhao Chen, Yuxuan Du, Z. H. Yang, Xiaoxia Cai, Kaixuan Huang, Jingning Zhang, Kai Xu, Jun Du, Yinan Li, Yuling Jiao, Xingyao Wu, Wu Liu, Xiliang Lu, Huikai Xu, Yirong Jin, Ruixia Wang, Haifeng Yu, S. P. Zhao

Abstract: To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcemen… ▽ More To effectively implement quantum algorithms on noisy intermediate-scale quantum (NISQ) processors is a central task in modern quantum technology. NISQ processors feature tens to a few hundreds of noisy qubits with limited coherence times and gate operations with errors, so NISQ algorithms naturally require employing circuits of short lengths via quantum compilation. Here, we develop a reinforcement learning (RL)-based quantum compiler for a superconducting processor and demonstrate its capability of discovering novel and hardware-amenable circuits with short lengths. We show that for the three-qubit quantum Fourier transformation, a compiled circuit using only seven CZ gates with unity circuit fidelity can be achieved. The compiler is also able to find optimal circuits under device topological constraints, with lengths considerably shorter than those by the conventional method. Our study exemplifies the codesign of the software with hardware for efficient quantum compilation, offering valuable insights for the advancement of RL-based compilers. △ Less

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11931 [pdf, other]

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Authors: DeepSeek-AI, Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu, Yukun Li, Huazuo Gao, Shirong Ma, Wangding Zeng, Xiao Bi, Zihui Gu, Hanwei Xu, Damai Dai, Kai Dong, Liyue Zhang, Yishi Piao, Zhibin Gou, Zhenda Xie, Zhewen Hao, Bingxuan Wang, Junxiao Song, Deli Chen , et al. (15 additional authors not shown)

Abstract: We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathe… ▽ More We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. △ Less

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.11800 [pdf, other]

Magnetic field in mini starburst complex Sgr B2

Authors: Xing Pan, Qizhou Zhang, Keping Qiu, Ramprasad Rao, Lingzhen Zeng, Xing Lu, Junhao Liu

Abstract: We report the first arcsecond-resolution observations of the magnetic field in the mini starburst complex Sgr B2. SMA polarization observations revealed magnetic field morphology in three dense cores of Sgr B2 N(orth), M(ain), and S(outh). The total plane-of-sky magnetic field strengths in these cores are estimated to be 4.3-10.0 mG, 6.2-14.7 mG, and 1.9-4.5 mG derived from the angular dispersion… ▽ More We report the first arcsecond-resolution observations of the magnetic field in the mini starburst complex Sgr B2. SMA polarization observations revealed magnetic field morphology in three dense cores of Sgr B2 N(orth), M(ain), and S(outh). The total plane-of-sky magnetic field strengths in these cores are estimated to be 4.3-10.0 mG, 6.2-14.7 mG, and 1.9-4.5 mG derived from the angular dispersion function method after applying the correction factors of 0.21 and 0.5. Combining with analyses of the parsec-scale polarization data from SOFIA, we found that a magnetically supercritical condition is present from the cloud-scale ($\sim$10 pc) to core-scale ($\sim$0.2 pc) in Sgr B2, which is consistent with the burst of star formation activities in the region likely resulted from a multi-scale gravitational collapse from the cloud to dense cores. △ Less

Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: 17 pages, 4 figures, accepted for publication in ApJ

arXiv:2406.10956 [pdf, other]

Robust Channel Learning for Large-Scale Radio Speaker Verification

Authors: Wenhao Yang, Jianguo Wei, Wenhuan Lu, Lei Li, Xugang Lu

Abstract: Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to inherent limitations such as constrained bandwidth and pervasive noise interference. To address this issue, we present a Channel Robust Speaker Learnin… ▽ More Recent research in speaker verification has increasingly focused on achieving robust and reliable recognition under challenging channel conditions and noisy environments. Identifying speakers in radio communications is particularly difficult due to inherent limitations such as constrained bandwidth and pervasive noise interference. To address this issue, we present a Channel Robust Speaker Learning (CRSL) framework that enhances the robustness of the current speaker verification pipeline, considering data source, data augmentation, and the efficiency of model transfer processes. Our framework introduces an augmentation module that mitigates bandwidth variations in radio speech datasets by manipulating the bandwidth of training inputs. It also addresses unknown noise by introducing noise within the manifold space. Additionally, we propose an efficient fine-tuning method that reduces the need for extensive additional training time and large amounts of data. Moreover, we develop a toolkit for assembling a large-scale radio speech corpus and establish a benchmark specifically tailored for radio scenario speaker verification studies. Experimental results demonstrate that our proposed methodology effectively enhances performance and mitigates degradation caused by radio transmission in speaker verification tasks. The code will be available on Github. △ Less

Submitted 16 June, 2024; originally announced June 2024.

Comments: 12 pages, 11 figures

arXiv:2406.10744 [pdf, other]

Technique Report of CVPR 2024 PBDL Challenges

Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu , et al. (75 additional authors not shown)

Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a… ▽ More The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held in CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches. △ Less

Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

arXiv:2406.10408 [pdf, other]

Final Search for Short-Baseline Neutrino Oscillations with the PROSPECT-I Detector at HFIR

Authors: M. Andriamirado, B. Balantekin, C. D. Bass, O. Benevides Rodrigues, E. P. Bernard, N. S. Bowden, C. D. Bryan, R. Carr, T. Classen, A. J. Conant, G. Deichert, M. J. Dolinski, A. Erickson, A. Galindo-Uribarri, S. Gokhale, C. Grant, S. Hans, A. B. Hansell, K. M. Heeger, B. Heffron, D. E. Jaffe, S. Jayakumar, J. R. Koblanski, P. Kunkle, C. E. Lane , et al. (22 additional authors not shown)

Abstract: The PROSPECT experiment is designed to perform precise searches for antineutrino disappearance at short distances (7 - 9~m) from compact nuclear reactor cores. This Letter reports results from a new neutrino oscillation analysis performed using the complete data sample from the PROSPECT-I detector operated at the High Flux Isotope Reactor in 2018. The analysis uses a multi-period selection of inve… ▽ More The PROSPECT experiment is designed to perform precise searches for antineutrino disappearance at short distances (7 - 9~m) from compact nuclear reactor cores. This Letter reports results from a new neutrino oscillation analysis performed using the complete data sample from the PROSPECT-I detector operated at the High Flux Isotope Reactor in 2018. The analysis uses a multi-period selection of inverse beta decay neutrino interactions with reduced backgrounds and enhanced statistical power to set limits on electron-flavor disappearance caused by mixing with sterile neutrinos with 0.2 - 20 eV$^2$ mass splittings. Inverse beta decay positron energy spectra from six different reactor-detector distance ranges are found to be statistically consistent with one another, as would be expected in the absence of sterile neutrino oscillations. The data excludes at 95% confidence level the existence of sterile neutrinos in regions above 3~eV$^2$ previously unexplored by terrestrial experiments, including all space below 10~eV$^2$ suggested by the recently strengthened Gallium Anomaly. The best-fit point of the Neutrino-4 reactor experiment's claimed observation of short-baseline oscillation is ruled out at more than five standard deviations. △ Less

Submitted 14 June, 2024; originally announced June 2024.

Comments: 6 pages, 4 figures

arXiv:2406.09900 [pdf, other]

GEB-1.3B: Open Lightweight Large Language Model

Authors: Jie Wu, Yufeng Zhu, Lei Shen, Xuqing Lu

Abstract: Recently developed large language models (LLMs) such as ChatGPT, Claude, and Llama have demonstrated impressive abilities, and even surpass human-level performance in several tasks. Despite their success, the resource-intensive demands of these models, requiring significant computational power for both training and inference, limit their deployment to high-performance servers. Additionally, the ex… ▽ More Recently developed large language models (LLMs) such as ChatGPT, Claude, and Llama have demonstrated impressive abilities, and even surpass human-level performance in several tasks. Despite their success, the resource-intensive demands of these models, requiring significant computational power for both training and inference, limit their deployment to high-performance servers. Additionally, the extensive calculation requirements of the models often lead to increased latency in response times. With the increasing need for LLMs to operate efficiently on CPUs, research about lightweight models that are optimized for CPU inference has emerged. In this work, we introduce GEB-1.3B, a lightweight LLM trained on 550 billion tokens in both Chinese and English languages. We employ novel training techniques, including ROPE, Group-Query-Attention, and FlashAttention-2, to accelerate training while maintaining model performance. Additionally, we fine-tune the model using 10 million samples of instruction data to enhance alignment. GEB-1.3B exhibits outstanding performance on general benchmarks such as MMLU, C-Eval, and CMMLU, outperforming comparative models such as MindLLM-1.3B and TinyLLaMA-1.1B. Notably, the FP32 version of GEB-1.3B achieves commendable inference times on CPUs, with ongoing efforts to further enhance speed through advanced quantization techniques. The release of GEB-1.3B as an open-source model marks a significant contribution to the development of lightweight LLMs, promising to foster further research and innovation in the field. △ Less

Submitted 14 June, 2024; originally announced June 2024.

Comments: GEB-1.3B technical report

arXiv:2406.09475 [pdf, other]

Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching faction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$. △ Less

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08359 [pdf, other]

Reactor Antineutrino Directionality Measurement with the PROSPECT-I Detector

Authors: M. Andriamirado, B. Balantekin, C. D. Bass, O. Benevides Rodrigues, E. P. Bernard, N. S. Bowden, C. D. Bryan, R. Carr, T. Classen, A. J. Conant, G. Deichert, M. J. Dolinski, A. Erickson, A. Galindo-Uribarri, S. Gokhale, C. Grant, S. Hans, A. B. Hansell, K. M. Heeger, B. Heffron, D. E. Jaffe, S. Jayakumar, D. C. Jones, J. R. Koblanski, P. Kunkle , et al. (24 additional authors not shown)

Abstract: The PROSPECT-I detector has several features that enable measurement of the direction of a compact neutrino source. In this paper, a detailed report on the directional measurements made on electron antineutrinos emitted from the High Flux Isotope Reactor is presented. With an estimated true neutrino (reactor to detector) direction of $φ= 40.8\unicode{xB0} \pm 0.7\unicode{xB0}$ and… ▽ More The PROSPECT-I detector has several features that enable measurement of the direction of a compact neutrino source. In this paper, a detailed report on the directional measurements made on electron antineutrinos emitted from the High Flux Isotope Reactor is presented. With an estimated true neutrino (reactor to detector) direction of $φ= 40.8\unicode{xB0} \pm 0.7\unicode{xB0}$ and $θ= 98.6\unicode{xB0} \pm 0.4\unicode{xB0}$, the PROSPECT-I detector is able to reconstruct an average neutrino direction of $φ= 39.4\unicode{xB0} \pm 2.9\unicode{xB0}$ and $θ= 97.6\unicode{xB0} \pm 1.6\unicode{xB0}$. This measurement is made with approximately 48000 Inverse Beta Decay signal events and is the most precise directional reconstruction of reactor antineutrinos to date. △ Less

Submitted 11 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.08225 [pdf, ps, other]

Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (636 additional authors not shown)

Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur… ▽ More Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measured in both destructive and constructive interference scenarios for the first time. The mass and width of the $η_{c}(1S)$ are measured to be $M=(2984.14 \pm 0.13 \pm 0.38)$ MeV/$c^{2}$ and $Γ=(28.82 \pm 0.11 \pm 0.82)$ MeV, respectively. Clear signals for the decays of the $χ_{cJ}(J=0,1,2)$ and the $η_{c}(2S)$ to $2(π^{+}π^{-})η$ are also observed for the first time, and the corresponding branching fractions are measured. The ratio of the branching fractions between the $η_{c}(2S)$ and $η_{c}(1S)$ decays is significantly lower than the theoretical prediction, which might suggest different dynamics in their decays. △ Less

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07409 [pdf, other]

Accelerating Ill-conditioned Hankel Matrix Recovery via Structured Newton-like Descent

Authors: HanQin Cai, Longxiu Huang, Xiliang Lu, Juntao You

Abstract: This paper studies the robust Hankel recovery problem, which simultaneously removes the sparse outliers and fulfills missing entries from the partial observation. We propose a novel non-convex algorithm, coined Hankel Structured Newton-Like Descent (HSNLD), to tackle the robust Hankel recovery problem. HSNLD is highly efficient with linear convergence, and its convergence rate is independent of th… ▽ More This paper studies the robust Hankel recovery problem, which simultaneously removes the sparse outliers and fulfills missing entries from the partial observation. We propose a novel non-convex algorithm, coined Hankel Structured Newton-Like Descent (HSNLD), to tackle the robust Hankel recovery problem. HSNLD is highly efficient with linear convergence, and its convergence rate is independent of the condition number of the underlying Hankel matrix. The recovery guarantee has been established under some mild conditions. Numerical experiments on both synthetic and real datasets show the superior performance of HSNLD against state-of-the-art algorithms. △ Less

Submitted 11 June, 2024; originally announced June 2024.

MSC Class: 15A29; 15A83; 47B35; 90C17; 90C26; 90C53

arXiv:2406.06925 [pdf, other]

Non-autoregressive Personalized Bundle Generation

Authors: Wenchuan Yang, Cheng Yang, Jichao Li, Yuejin Tan, Xin Lu, Chuan Shi

Abstract: The personalized bundle generation problem, which aims to create a preferred bundle for user from numerous candidate items, receives increasing attention in recommendation. However, existing works ignore the order-invariant nature of the bundle and adopt sequential modeling methods as the solution, which might introduce inductive bias and cause a large latency in prediction. To address this proble… ▽ More The personalized bundle generation problem, which aims to create a preferred bundle for user from numerous candidate items, receives increasing attention in recommendation. However, existing works ignore the order-invariant nature of the bundle and adopt sequential modeling methods as the solution, which might introduce inductive bias and cause a large latency in prediction. To address this problem, we propose to perform the bundle generation via non-autoregressive mechanism and design a novel encoder-decoder framework named BundleNAT, which can effectively output the targeted bundle in one-shot without relying on any inherent order. In detail, instead of learning sequential dependency, we propose to adopt pre-training techniques and graph neural network to fully embed user-based preference and item-based compatibility information, and use a self-attention based encoder to further extract global dependency pattern. We then design a permutation-equivariant decoding architecture that is able to directly output the desired bundle in a one-shot manner. Experiments on three real-world datasets from Youshu and Netease show the proposed BundleNAT significantly outperforms the current state-of-the-art methods in average by up to 35.92%, 10.97% and 23.67% absolute improvements in Precision, Precision+, and Recall, respectively. △ Less

Submitted 10 June, 2024; originally announced June 2024.

Comments: Submitted to Information Processing & Management

Showing 1–50 of 2,400 results for author: Lu, X