Skip to main content

Showing 1–50 of 1,809 results for author: Gu, Y

  1. arXiv:2407.08555  [pdf, other

    eess.IV cs.CV

    SLoRD: Structural Low-Rank Descriptors for Shape Consistency in Vertebrae Segmentation

    Authors: Xin You, Yixin Lou, Minghui Zhang, Chuyan Zhang, Jie Yang, Yun Gu

    Abstract: Automatic and precise segmentation of vertebrae from CT images is crucial for various clinical applications. However, due to a lack of explicit and strict constraints, existing methods especially for single-stage methods, still suffer from the challenge of intra-vertebrae segmentation inconsistency, which refers to multiple label predictions inside a singular vertebra. For multi-stage methods, ver… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Under review

  2. arXiv:2407.08194  [pdf, other

    cond-mat.quant-gas cond-mat.str-el quant-ph

    Uncovering Emergent Spacetime Supersymmetry with Rydberg Atom Arrays

    Authors: Chengshu Li, Shang Liu, Hanteng Wang, Wenjun Zhang, Zi-Xiang Li, Hui Zhai, Yingfei Gu

    Abstract: In the zoo of emergent symmetries in quantum many-body physics, the previously unrealized emergent spacetime supersymmetry (SUSY) is particularly intriguing. Although it was known that spacetime SUSY could emerge at the (1+1)d tricritical Ising transition, an experimental realization is still absent. In this letter, we propose to realize the tricritical Ising transition with Rydberg atom arrays, t… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 7 pages, 3 figures

  3. arXiv:2407.07651  [pdf, other

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2407.07464  [pdf, other

    cs.SD cs.CV cs.MM eess.AS

    Video-to-Audio Generation with Hidden Alignment

    Authors: Manjie Xu, Chenxing Li, Yong Ren, Rilin Chen, Yu Gu, Wei Liang, Dong Yu

    Abstract: Generating semantically and temporally aligned audio content in accordance with video input has become a focal point for researchers, particularly following the remarkable breakthrough in text-to-video generation. In this work, we aim to offer insights into the video-to-audio generation paradigm, focusing on three crucial aspects: vision encoders, auxiliary embeddings, and data augmentation techni… ▽ More

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: https://sites.google.com/view/vta-ldm

  5. arXiv:2407.06566  [pdf, other

    cs.CV

    Robust and Explainable Framework to Address Data Scarcity in Diagnostic Imaging

    Authors: Zehui Zhao, Laith Alzubaidi, Jinglan Zhang, Ye Duan, Usman Naseem, Yuantong Gu

    Abstract: Deep learning has significantly advanced automatic medical diagnostics and released the occupation of human resources to reduce clinical pressure, yet the persistent challenge of data scarcity in this area hampers its further improvements and applications. To address this gap, we introduce a novel ensemble framework called `Efficient Transfer and Self-supervised Learning based Ensemble Framework'… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 64 pages, 20 figures

  6. arXiv:2407.05592  [pdf, other

    cs.CV

    An Experimental Comparison of Transfer Learning against Self-supervised Learning

    Authors: Zehui Zhao, Laith Alzubaidi, Jinglan Zhang, Ye Duan, Usman Naseem, Yuantong Gu

    Abstract: Recently, transfer learning and self-supervised learning have gained significant attention within the medical field due to their ability to mitigate the challenges posed by limited data availability, improve model generalisation, and reduce computational expenses. Transfer learning and self-supervised learning hold immense potential for advancing medical research. However, it is crucial to recogni… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 27 pages, 11 figures

  7. arXiv:2407.05589  [pdf, other

    quant-ph

    Improving the trainability of VQE on NISQ computers for solving portfolio optimization using convex interpolation

    Authors: Shengbin Wang, Guihui Li, Zhaoyun Chen, Peng Wang, Menghan Dou, Haiyong Zheng, Zhimin Wang, Yongjian Gu, Yu-Chun Wu, Guo-Ping Guo

    Abstract: Solving combinatorial optimization problems using variational quantum algorithms (VQAs) represents one of the most promising applications in the NISQ era. However, the limited trainability of VQAs could hinder their scalability to large problem sizes. In this paper, we improve the trainability of variational quantum eigensolver (VQE) by utilizing convex interpolation to solve portfolio optimizatio… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  8. arXiv:2407.05505  [pdf, other

    eess.IV cs.CV

    Dynamic Position Transformation and Boundary Refinement Network for Left Atrial Segmentation

    Authors: Fangqiang Xu, Wenxuan Tu, Fan Feng, Malitha Gunawardhana, Jiayuan Yang, Yun Gu, Jichao Zhao

    Abstract: Left atrial (LA) segmentation is a crucial technique for irregular heartbeat (i.e., atrial fibrillation) diagnosis. Most current methods for LA segmentation strictly assume that the input data is acquired using object-oriented center cropping, while this assumption may not always hold in practice due to the high cost of manual object annotation. Random cropping is a straightforward data pre-proces… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024 conference

  9. arXiv:2407.05407  [pdf, other

    cs.SD cs.AI eess.AS

    CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

    Authors: Zhihao Du, Qian Chen, Shiliang Zhang, Kai Hu, Heng Lu, Yexin Yang, Hangrui Hu, Siqi Zheng, Yue Gu, Ziyang Ma, Zhifu Gao, Zhijie Yan

    Abstract: Recent years have witnessed a trend that large language model (LLM) based text-to-speech (TTS) emerges into the mainstream due to their high naturalness and zero-shot capacity. In this paradigm, speech signals are discretized into token sequences, which are modeled by an LLM with text as prompts and reconstructed by a token-based vocoder to waveforms. Obviously, speech tokens play a critical role… ▽ More

    Submitted 9 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: work in progress. arXiv admin note: substantial text overlap with arXiv:2407.04051

  10. arXiv:2407.05361  [pdf, other

    eess.AS cs.CL

    Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

    Authors: Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, Zhizheng Wu

    Abstract: Recently, speech generation models have made significant progress by using large-scale training data. However, the research community struggle to produce highly spontaneous and human-like speech due to the lack of large-scale, diverse, and spontaneous speech data. This paper presents \textit{Emilia}, the first multilingual speech generation dataset from in-the-wild speech data, and Emilia-Pipe, th… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  11. arXiv:2407.04693  [pdf, other

    cs.CL cs.AI

    ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

    Authors: Yuzhe Gu, Ziwei Ji, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen

    Abstract: Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications. Current hallucination detection and mitigation datasets are limited in domains and sizes, which struggle to scale due to prohibitive labor costs and insufficient reliability of existing hallucination annotators. To facilitate the scalable oversight of LLM hallucin… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 9 pages

  12. arXiv:2407.04051  [pdf, other

    cs.SD cs.AI eess.AS

    FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

    Authors: Keyu An, Qian Chen, Chong Deng, Zhihao Du, Changfeng Gao, Zhifu Gao, Yue Gu, Ting He, Hangrui Hu, Kai Hu, Shengpeng Ji, Yabin Li, Zerui Li, Heng Lu, Haoneng Luo, Xiang Lv, Bin Ma, Ziyang Ma, Chongjia Ni, Changhe Song, Jiaqi Shi, Xian Shi, Hao Wang, Wen Wang, Yuxuan Wang , et al. (8 additional authors not shown)

    Abstract: This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs). At its core are two innovative models: SenseVoice, which handles multilingual speech recognition, emotion recognition, and audio event detection; and CosyVoice, which facilitates natural speech generation with control over multiple languages, timbre, sp… ▽ More

    Submitted 10 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Work in progress. Authors are listed in alphabetical order by family name

  13. arXiv:2407.02899  [pdf, other

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  14. arXiv:2407.02784  [pdf

    quant-ph

    Breeding the Cat Through Superposition of Two Schrodinger Kittens Based on Coupled Waveguides

    Authors: Nuo Wang, Xinchen Zhang, Qi Liu, Fengxiao Sun, Qiongyi He, Ying Gu

    Abstract: Optical Schrodinger's cat (SC) is highly anticipated because of the potential of realizing fault-tolerant quantum computing, but the practical merit is only shown when the amplitude is larger than 2. However, such high-amplitude cats have not been prepared due to the limitations rooted in the existing method. Here, we demonstrate a principle that a large SC-like state can be generated by the super… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  15. arXiv:2407.01494  [pdf, other

    cs.CV cs.SD eess.AS

    FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

    Authors: Yiming Zhang, Yicheng Gu, Yanhong Zeng, Zhening Xing, Yuancheng Wang, Zhizheng Wu, Kai Chen

    Abstract: We study Neural Foley, the automatic generation of high-quality sound effects synchronizing with videos, enabling an immersive audio-visual experience. Despite its wide range of applications, existing approaches encounter limitations when it comes to simultaneously synthesizing high-quality and video-aligned (i.e.,, semantic relevant and temporal synchronized) sounds. To overcome these limitations… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Project page: https://foleycrafter.github.io/

  16. arXiv:2407.00569  [pdf, other

    cs.CV cs.AI cs.CL

    Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models

    Authors: Weihong Zhong, Xiaocheng Feng, Liang Zhao, Qiming Li, Lei Huang, Yuxuan Gu, Weitao Ma, Yuan Xu, Bing Qin

    Abstract: Though advanced in understanding visual information with human languages, Large Vision-Language Models (LVLMs) still suffer from multimodal hallucinations. A natural concern is that during multimodal interaction, the generated hallucinations could influence the LVLMs' subsequent generation. Thus, we raise a question: When presented with a query relevant to the previously generated hallucination, w… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: Accepted to ACL 2024 Main Conference. 21 pages, 20 figures

  17. arXiv:2407.00136  [pdf, other

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions… ▽ More

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

  18. arXiv:2406.19774  [pdf, other

    cs.CL

    Direct Preference Knowledge Distillation for Large Language Models

    Authors: Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei

    Abstract: In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in distillation of LLMs, including efficiency and insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  19. arXiv:2406.19310  [pdf

    cond-mat.mes-hall cond-mat.mtrl-sci

    Imaging semiconductor-to-metal transition and topological flat bands of twisted bilayer MoTe2

    Authors: Yufeng Liu, Yu Gu, Ting Bao, Ning Mao, Can Li, Shudan Jiang, Liang Liu, Dandan Guan, Yaoyi Li, Hao Zheng, Canhua Liu, Kenji Watanabe, Takashi Taniguchi, Wenhui Duan, Jinfeng Jia, Xiaoxue Liu, Yang Zhang, Tingxin Li, Shiyong Wang

    Abstract: Two-dimensional (2D) moiré materials have emerged as a highly tunable platform for investigating novel quantum states of matter arising from strong electronic correlations and nontrivial band topology. Recently, topological flat bands formed in 2D semiconducting moiré superlattices have attracted great interests. In particular, a series of topological quantum phases, including the long-sought frac… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  20. arXiv:2406.19190  [pdf, ps, other

    hep-ex

    Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  21. arXiv:2406.18463  [pdf, ps, other

    econ.TH

    Complexity Aversion

    Authors: Yuan Gu, Chao Hung Chan

    Abstract: This paper proposes a model of decision-making under uncertainty in which an agent is constrained in her cognitive ability to consider complex acts. We identify the complexity of an act according to the corresponding partition of state space. The agent ranks acts according to the expected utility net of complexity cost. A key feature of this model is that the agent is able to update her complexity… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  22. arXiv:2406.18183  [pdf, other

    hep-ex

    Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

    Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages,5 tables, 4 figures

  23. arXiv:2406.18083  [pdf, other

    hep-ex

    Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 19 pages, 2 figures

  24. arXiv:2406.17452  [pdf, ps, other

    hep-ex

    Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (649 additional authors not shown)

    Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  25. arXiv:2406.17421  [pdf

    cond-mat.str-el

    Phonon Heat Transport and Anisotropic Tuning of Quantum Fluctuations in a Frustrated Honeycomb Magnet

    Authors: Haoran Fan, Yue Chen, Yuchen Gu, Yuan Li, Xi Lin

    Abstract: Honeycomb cobalt oxides containing 3$\it{d}$ Co$^{2+}$ ions might realize frustrated magnetism and novel quantum phases. Among candidate materials, Na$_{3}$Co$_{2}$SbO$_{6}$ stands out for its distorted honeycomb lattice and significant in-plane anisotropy, motivating vector-field tuning inside the honeycomb plane. Here we use thermal transport down to the mK regime to study twin-free crystals of… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 6 pages with 4 figures and 5 pages with 5 figures

  26. arXiv:2406.17244  [pdf, other

    eess.SP eess.SY

    A Near-Field Super-Resolution Network for Accelerating Antenna Characterization

    Authors: Yuchen Gu, Hai-Han Sun, Daniel W. van der Weide

    Abstract: We present a deep neural network-enabled method to accelerate near-field (NF) antenna measurement. We develop a Near-field Super-resolution Network (NFS-Net) to reconstruct significantly undersampled near-field data as high-resolution data, which considerably reduces the number of sampling points required for NF measurement and thus improves measurement efficiency. The high-resolution near-field d… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  27. arXiv:2406.16062  [pdf, other

    cs.NE

    Towards Biologically Plausible Computing: A Comprehensive Comparison

    Authors: Changze Lv, Yufei Gu, Zhengkang Guo, Zhibo Xu, Yixin Wu, Feiran Zhang, Tianyuan Shi, Zhenghua Wang, Ruicheng Yin, Yu Shang, Siqi Zhong, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Jianhao Zhu, Cenyuan Zhang, Zixuan Ling, Xiaoqing Zheng

    Abstract: Backpropagation is a cornerstone algorithm in training neural networks for supervised learning, which uses a gradient descent method to update network weights by minimizing the discrepancy between actual and desired outputs. Despite its pivotal role in propelling deep learning advancements, the biological plausibility of backpropagation is questioned due to its requirements for weight symmetry, gl… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

  28. arXiv:2406.15030  [pdf, ps, other

    hep-ex

    Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  29. arXiv:2406.14491  [pdf, other

    cs.CL

    Instruction Pre-Training: Language Models are Supervised Multitask Learners

    Authors: Daixuan Cheng, Yuxian Gu, Shaohan Huang, Junyu Bi, Minlie Huang, Furu Wei

    Abstract: Unsupervised multitask pre-training has been the critical method behind the recent success of language models (LMs). However, supervised multitask learning still holds significant promise, as scaling it in the post-training stage trends towards better generalization. In this paper, we explore supervised multitask pre-training by proposing Instruction Pre-Training, a framework that scalably augment… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  30. arXiv:2406.11115  [pdf, other

    cs.CL

    Text Grafting: Near-Distribution Weak Supervision for Minority Classes in Text Classification

    Authors: Letian Peng, Yi Gu, Chengyu Dong, Zihan Wang, Jingbo Shang

    Abstract: For extremely weak-supervised text classification, pioneer research generates pseudo labels by mining texts similar to the class names from the raw corpus, which may end up with very limited or even no samples for the minority classes. Recent works have started to generate the relevant texts by prompting LLMs using the class names or definitions; however, there is a high risk that LLMs cannot gene… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  31. arXiv:2406.10563  [pdf, other

    cs.LG cs.AI cs.CR

    Privacy-Preserving Heterogeneous Federated Learning for Sensitive Healthcare Data

    Authors: Yukai Xu, Jingfeng Zhang, Yujie Gu

    Abstract: In the realm of healthcare where decentralized facilities are prevalent, machine learning faces two major challenges concerning the protection of data and models. The data-level challenge concerns the data privacy leakage when centralizing data with sensitive personal information. While the model-level challenge arises from the heterogeneity of local models, which need to be collaboratively traine… ▽ More

    Submitted 4 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted to the 2024 IEEE Conference on Artificial Intelligence (IEEE CAI 2024)

  32. arXiv:2406.09618  [pdf, other

    cs.CL cs.AI cs.IR cs.SD eess.AS

    Multi-Modal Retrieval For Large Language Model Based Speech Recognition

    Authors: Jari Kolehmainen, Aditya Gourav, Prashanth Gurunath Shivakumar, Yile Gu, Ankur Gandhe, Ariya Rastrow, Grant Strimel, Ivan Bulyko

    Abstract: Retrieval is a widely adopted approach for improving language models leveraging external information. As the field moves towards multi-modal large language models, it is important to extend the pure text based methods to incorporate other modalities in retrieval as well for applications across the wide spectrum of machine learning tasks and data types. In this work, we propose multi-modal retrieva… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  33. arXiv:2406.09475  [pdf, other

    hep-ex

    Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  34. arXiv:2406.09455  [pdf, other

    cs.CV cs.AI cs.CL

    Pandora: Towards General World Model with Natural Language Actions and Video States

    Authors: Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

    Abstract: World models simulate future states of the world in response to different actions. They facilitate interactive content creation and provides a foundation for grounded, long-horizon reasoning. Current foundation models do not fully meet the capabilities of general world models: large language models (LLMs) are constrained by their reliance on language modality and their limited understanding of the… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Website: https://world-model.maitrix.org/

  35. arXiv:2406.08756  [pdf, other

    cs.DC cs.LG

    Optimizing Large Model Training through Overlapped Activation Recomputation

    Authors: Ping Chen, Wenjie Zhang, Shuibing He, Yingjie Gu, Zhuwei Peng, Kexin Huang, Xuan Zhan, Weijian Chen, Yi Zheng, Zhefeng Wang, Yanlong Yin, Gang Chen

    Abstract: Large model training has been using recomputation to alleviate the memory pressure and pipelining to exploit the parallelism of data, tensor, and devices. The existing recomputation approaches may incur up to 40% overhead when training real-world models, e.g., the GPT model with 22B parameters. This is because they are executed on demand in the critical training path. In this paper, we design a ne… ▽ More

    Submitted 27 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages

  36. arXiv:2406.08446  [pdf, other

    cs.CL cs.AI

    OLMES: A Standard for Language Model Evaluations

    Authors: Yuling Gu, Oyvind Tafjord, Bailey Kuehl, Dany Haddad, Jesse Dodge, Hannaneh Hajishirzi

    Abstract: Progress in AI is often demonstrated by new models claiming improved performance on tasks measuring model capabilities. Evaluating language models in particular is challenging, as small changes to how a model is evaluated on a task can lead to large changes in measured performance. There is no common standard setup, so different models are evaluated on the same tasks in different ways, leading to… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  37. arXiv:2406.08225  [pdf, ps, other

    hep-ex

    Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (636 additional authors not shown)

    Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  38. arXiv:2406.08081  [pdf

    eess.SP

    CLDTA: Contrastive Learning based on Diagonal Transformer Autoencoder for Cross-Dataset EEG Emotion Recognition

    Authors: Yuan Liao, Yuhong Zhang, Shenghuan Wang, Xiruo Zhang, Yiling Zhang, Wei Chen, Yuzhe Gu, Liya Huang

    Abstract: Recent advances in non-invasive EEG technology have broadened its application in emotion recognition, yielding a multitude of related datasets. Yet, deep learning models struggle to generalize across these datasets due to variations in acquisition equipment and emotional stimulus materials. To address the pressing need for a universal model that fluidly accommodates diverse EEG dataset formats and… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  39. arXiv:2406.07430  [pdf, other

    cs.CL cs.MM

    Learning Domain-Invariant Features for Out-of-Context News Detection

    Authors: Yimeng Gu, Mengqi Zhang, Ignacio Castro, Shu Wu, Gareth Tyson

    Abstract: Multimodal out-of-context news is a common type of misinformation on online media platforms. This involves posting a caption, alongside an invalid out-of-context news image. Reflecting its importance, researchers have developed models to detect such misinformation. However, a common limitation of these models is that they only consider the scenario where pre-labeled data is available for each doma… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  40. arXiv:2406.06959  [pdf, other

    cs.LG cs.AI

    Unleashing the Denoising Capability of Diffusion Prior for Solving Inverse Problems

    Authors: Jiawei Zhang, Jiaxin Zhuang, Cheng Jin, Gen Li, Yuantao Gu

    Abstract: The recent emergence of diffusion models has significantly advanced the precision of learnable priors, presenting innovative avenues for addressing inverse problems. Since inverse problems inherently entail maximum a posteriori estimation, previous works have endeavored to integrate diffusion priors into the optimization frameworks. However, prevailing optimization-based inverse algorithms primari… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  41. arXiv:2406.06382  [pdf, other

    cs.CV cs.CL cs.LG

    Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization

    Authors: Yi Gu, Zhendong Wang, Yueqin Yin, Yujia Xie, Mingyuan Zhou

    Abstract: Aligning large language models with human preferences has emerged as a critical focus in language modeling research. Yet, integrating preference learning into Text-to-Image (T2I) generative models is still relatively uncharted territory. The Diffusion-DPO technique made initial strides by employing pairwise preference learning in diffusion models tailored for specific text prompts. We introduce Di… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  42. arXiv:2406.06118  [pdf, other

    hep-ex

    Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  43. arXiv:2406.05827  [pdf, ps, other

    hep-ex

    Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

    Abstract: We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$,… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  44. arXiv:2406.05325  [pdf, other

    eess.AS cs.SD

    LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance

    Authors: Shihao Chen, Yu Gu, Jie Zhang, Na Li, Rilin Chen, Liping Chen, Lirong Dai

    Abstract: Any-to-any singing voice conversion (SVC) is an interesting audio editing technique, aiming to convert the singing voice of one singer into that of another, given only a few seconds of singing data. However, during the conversion process, the issue of timbre leakage is inevitable: the converted singing voice still sounds like the original singer's voice. To tackle this, we propose a latent diffusi… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  45. arXiv:2406.02931  [pdf, other

    hep-ex

    Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 9 pages, 7 figures

  46. arXiv:2406.02744  [pdf, other

    cs.CR cs.LG

    DPDR: Gradient Decomposition and Reconstruction for Differentially Private Deep Learning

    Authors: Yixuan Liu, Li Xiong, Yuhan Liu, Yujie Gu, Ruixuan Liu, Hong Chen

    Abstract: Differentially Private Stochastic Gradients Descent (DP-SGD) is a prominent paradigm for preserving privacy in deep learning. It ensures privacy by perturbing gradients with random noise calibrated to their entire norm at each training step. However, this perturbation suffers from a sub-optimal performance: it repeatedly wastes privacy budget on the general converging direction shared among gradie… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 14 pages

  47. arXiv:2406.02097  [pdf, ps, other

    hep-ph

    The Benchmark Mode $Ω_c \to Ω^-π^+$ and Its Related Processes

    Authors: Shuge Zeng, Fanrong Xu, Yu Gu

    Abstract: The benchmark mode $Ω_c^0\to Ω^- π^+$, which receives purely factorization contribution, is of great importance among all the decay channels of $Ω_c^0$ decays. In this work, within the framework of non-relativistic quark model (NRQM), we calculate all the 6 baryon transition form factors involving $\frac12 ^+\to \frac32 ^+$ decays. The absolute branching fractions of non-leptonic decays… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 19 pages, 2 figures, 4 tables

  48. arXiv:2406.01549  [pdf, other

    cs.CL cs.AI

    An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation

    Authors: Kun Zhu, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin

    Abstract: Retrieval-augmented generation integrates the capabilities of large language models with relevant information retrieved from an extensive corpus, yet encounters challenges when confronted with real-world noisy data. One recent solution is to train a filter module to find relevant content but only achieve suboptimal noise compression. In this paper, we propose to introduce the information bottlenec… ▽ More

    Submitted 4 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024

  49. arXiv:2406.01332  [pdf, ps, other

    hep-ex

    Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

    Abstract: We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are… ▽ More

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 14 pages, 3 figures

  50. arXiv:2405.20936  [pdf, other

    stat.ME

    Bayesian Deep Generative Models for Replicated Networks with Multiscale Overlapping Clusters

    Authors: Yuren Zhou, Yuqi Gu, David B. Dunson

    Abstract: Our interest is in replicated network data with multiple networks observed across the same set of nodes. Examples include brain connection networks, in which nodes corresponds to brain regions and replicates to different individuals, and ecological networks, in which nodes correspond to species and replicates to samples collected at different locations and/or times. Our goal is to infer a hierarch… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.