Showing 1–50 of 4,102 results for author: Zhu, J

  1. arXiv:2407.09053  [pdf, other]

    cs.RO

    Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing

    Authors: Jun Zhu, Zihao Du, Haotian Xu, Fengbo Lan, Zilong Zheng, Bo Ma, Shengjie Wang, Tao Zhang

    Abstract: Task-aware navigation continues to be a challenging area of research, especially in scenarios involving open vocabulary. Previous studies primarily focus on finding suitable locations for task completion, often overlooking the importance of the robot's pose. However, the robot's orientation is crucial for successfully completing tasks because of how objects are arranged (e.g., to open a refrigerat…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.09024  [pdf, other]

    cs.LG

    Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control

    Authors: Huayu Chen, Kaiwen Zheng, Hang Su, Jun Zhu

    Abstract: Drawing upon recent advances in language model alignment, we formulate offline Reinforcement Learning as a two-stage optimization problem: First pretraining expressive generative policies on reward-free behavior datasets, then fine-tuning these policies to align with task-specific annotations like Q-values. This strategy allows us to leverage abundant and diverse behavior data to enhance generaliz…

    Submitted 12 July, 2024; originally announced July 2024.

  3. arXiv:2407.08952  [pdf, other]

    cs.CL cs.AI

    Detect, Investigate, Judge and Determine: A Novel LLM-based Framework for Few-shot Fake News Detection

    Authors: Ye Liu, Jiajun Zhu, Kai Zhang, Haoyu Tang, Yanghai Zhang, Xukai Liu, Qi Liu, Enhong Chen

    Abstract: Few-Shot Fake News Detection (FS-FND) aims to distinguish inaccurate news from real ones in extremely low-resource scenarios. This task has garnered increased attention due to the widespread dissemination and harmful impact of fake news on social media. Large Language Models (LLMs) have demonstrated competitive performance with the help of their rich prior knowledge and excellent in-context learni…

    Submitted 11 July, 2024; originally announced July 2024.

  4. arXiv:2407.08764  [pdf, other]

    physics.optics

    The Radiation Gauge: When is it Valid?

    Authors: Jie Zhu, Christopher J. Ryu, Dong-Yeop Na, Weng Cho Chew

    Abstract: In this paper, we shall show that the vector-scalar potential ($\mathbf{A}$-$Φ$) formulation, for many problems, can be further simplified by ignoring the scalar potential contribution and setting it to zero.

    Submitted 10 July, 2024; originally announced July 2024.

  5. arXiv:2407.08682  [pdf, other]

    hep-ph

    Jet Tagging with More-Interaction Particle Transformer

    Authors: Yifan Wu, Kun Wang, Jingya Zhu

    Abstract: In this study, we introduce the More-Interaction Particle Transformer (MIParT), a novel deep learning neural network designed for jet tagging. This framework incorporates our own design, the More-Interaction Attention (MIA) mechanism, which increases the dimensionality of particle interaction embeddings. We tested MIParT using the top tagging and quark-gluon datasets. Our results show that MIParT…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 10 pages, 6 figures, 4 tables

  6. arXiv:2407.08199  [pdf, other]

    cs.CV

    SRPose: Two-view Relative Pose Estimation with Sparse Keypoints

    Authors: Rui Yin, Yulun Zhang, Zherong Pan, Jianjun Zhu, Cheng Wang, Biao Jia

    Abstract: Two-view pose estimation is essential for map-free visual relocalization and object pose tracking tasks. However, traditional matching methods suffer from time-consuming robust estimators, while deep learning-based pose regressors only cater to camera-to-world pose estimation, lacking generalizability to different image sizes and camera intrinsics. In this paper, we propose SRPose, a sparse keypoi…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 30 pages, 11 figures, to be published in ECCV 2024

  7. arXiv:2407.08174  [pdf, other]

    cs.HC q-bio.NC

    An Adaptively Weighted Averaging Method for Regional Time Series Extraction of fMRI-based Brain Decoding

    Authors: Jianfei Zhu, Baichun Wei, Jiaru Tian, Feng Jiang, Chunzhi Yi

    Abstract: Brain decoding that classifies cognitive states using the functional fluctuations of the brain can provide insightful information for understanding the brain mechanisms of cognitive functions. Among the common procedures of decoding the brain cognitive states with functional magnetic resonance imaging (fMRI), extracting the time series of each brain region after brain parcellation traditionally av…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 17 pages, 4 figures

    ACM Class: J.3

  8. arXiv:2407.08033  [pdf, other]

    physics.ins-det

    Studies of Cherenkov Photon Production in PbF$_2$ Crystals using Proton Beams at Fermilab

    Authors: Thomas Anderson, Alberto Belloni, Grace Cummings, Sarah Eno, Nora Fischer, Liang Guan, Yuxiang Guo, Robert Hirosky, James Hirschauer, Yihui Lai, Daniel Levin, Hui-Chi Lin, Mekhala Paranjpe, Jianming Qian, Bing Zhou, Junjie Zhu, Ren-Yuan Zhu

    Abstract: Future lepton colliders such as the FCC-ee, CEPC, ILC, or a muon collider will collect large data samples that allow precision physics studies with unprecedented accuracy, especially when the data is collected by innovative state-of-the-art detectors. An electromagnetic calorimeter based on scintillating crystals, designed to separately record Cherenkov and scintillation light, can achieve precisi…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 10 pages

  9. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946 GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  10. arXiv:2407.07525  [pdf, other]

    cs.CV cs.RO

    Incremental Multiview Point Cloud Registration with Two-stage Candidate Retrieval

    Authors: Shiqi Li, Jihua Zhu, Yifan Xie, Mingchen Zhu

    Abstract: Multiview point cloud registration serves as a cornerstone of various computer vision tasks. Previous approaches typically adhere to a global paradigm, where a pose graph is initially constructed followed by motion synchronization to determine the absolute pose. However, this separated approach may not fully leverage the characteristics of multiview registration and might struggle with low-overlap…

    Submitted 10 July, 2024; originally announced July 2024.

  11. arXiv:2407.06939  [pdf, other]

    cs.RO cs.CV

    Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge

    Authors: Sriram Yenamandra, Arun Ramachandran, Mukul Khanna, Karmesh Yadav, Jay Vakil, Andrew Melnik, Michael Büttner, Leon Harz, Lyon Brown, Gora Chand Nandi, Arjun PS, Gaurav Kumar Yadav, Rahul Kala, Robert Haschke, Yang Luo, Jinxin Zhu, Yansen Han, Bingyi Lu, Xuan Gu, Qinyuan Liu, Yaping Zhao, Qiting Ye, Chenxiao Dou, Yansong Chua, Volodymyr Kuzma, et al. (20 additional authors not shown)

    Abstract: In order to develop robots that can effectively serve as versatile and capable home assistants, it is crucial for them to reliably perceive and interact with a wide variety of objects across diverse environments. To this end, we proposed Open Vocabulary Mobile Manipulation as a key benchmark task for robotics: finding any object in a novel environment and placing it on any receptacle surface withi…

    Submitted 9 July, 2024; originally announced July 2024.

  12. arXiv:2407.05965  [pdf, other]

    cs.CV cs.AI cs.CL cs.CR cs.LG

    T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

    Authors: Yibo Miao, Yifan Zhu, Yinpeng Dong, Lijia Yu, Jun Zhu, Xiao-Shan Gao

    Abstract: The recent development of Sora leads to a new era in text-to-video (T2V) generation. Along with this comes the rising concern about its security risks. The generated videos may contain illegal or unethical content, and there is a lack of comprehensive quantitative understanding of their safety, posing a challenge to their reliability and practical deployment. Previous evaluations primarily focus o…

    Submitted 8 July, 2024; originally announced July 2024.

  13. arXiv:2407.05680  [pdf, other]

    cs.CV cs.AI

    Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering

    Authors: Qijun Gan, Wentong Li, Jinwei Ren, Jianke Zhu

    Abstract: Reconstructing high-fidelity hand models with intricate textures plays a crucial role in enhancing human-object interaction and advancing real-world applications. Despite the state-of-the-art methods excelling in texture generation and image rendering, they often face challenges in accurately capturing geometric details. Learning-based approaches usually offer better robustness and faster inferenc…

    Submitted 8 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by AAAI 2024

  14. arXiv:2407.05229  [pdf, other]

    cs.LG

    HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning

    Authors: Liyuan Wang, Jingyi Xie, Xingxing Zhang, Hang Su, Jun Zhu

    Abstract: The deployment of pre-trained models (PTMs) has greatly advanced the field of continual learning (CL), enabling positive knowledge transfer and resilience to catastrophic forgetting. To sustain these advantages for sequentially arriving tasks, a promising direction involves keeping the pre-trained backbone frozen while employing parameter-efficient tuning (PET) techniques to instruct representatio…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: This is a generalized version of our HiDe-Prompt (NeurIPS 2023, Spotlight)

  15. arXiv:2407.04346  [pdf]

    cs.CV

    MobileFlow: A Multimodal LLM For Mobile GUI Agent

    Authors: Songqin Nong, Jiali Zhu, Rui Wu, Jiongchao Jin, Shuo Shan, Xiutian Huang, Wenhao Xu

    Abstract: Currently, the integration of mobile Graphical User Interfaces (GUIs) is ubiquitous in most people's daily lives. The ongoing evolution of multimodal large-scale models, such as GPT-4v and Qwen-VL-Max, has significantly bolstered the capabilities of GUI comprehension and user action analysis, showcasing the potential of intelligent GUI assistants. However, current GUI Agents often need to acce…

    Submitted 5 July, 2024; originally announced July 2024.

  16. arXiv:2407.03162  [pdf, other]

    cs.RO cs.CV cs.LG

    Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

    Authors: Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

    Abstract: Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-bas…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: project page: https://dingry.github.io/projects/bunny_visionpro.html

  17. arXiv:2407.03151  [pdf, other]

    nlin.AO

    Inverse stochastic resonance in adaptive small-world neural networks

    Authors: Marius E. Yamakou, Jinjie Zhu, Erik A. Martens

    Abstract: Inverse stochastic resonance (ISR) is a phenomenon where noise reduces rather than increases the firing rate of a neuron, sometimes leading to complete quiescence. ISR was first experimentally verified with cerebellar Purkinje neurons. These experiments showed that ISR enables optimal information transfer between the input and output spike train of neurons. Subsequent studies demonstrated the effi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 16 pages, 59 references, 10 figures

  18. arXiv:2407.02899  [pdf, other]

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…

    Submitted 3 July, 2024; originally announced July 2024.

  19. arXiv:2407.02392  [pdf, other]

    cs.CV

    TokenPacker: Efficient Visual Projector for Multimodal LLM

    Authors: Wentong Li, Yuqian Yuan, Jian Liu, Dongqi Tang, Song Wang, Jianke Zhu, Lei Zhang

    Abstract: The visual projector serves as an essential bridge between the visual encoder and the Large Language Model (LLM) in a Multimodal LLM (MLLM). Typically, MLLMs adopt a simple MLP to preserve all visual contexts via one-to-one transformation. However, the visual tokens are redundant and can be considerably increased when dealing with high-resolution images, impairing the efficiency of MLLMs significa…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 16 pages, Code: https://github.com/CircleRadon/TokenPacker

  20. arXiv:2407.02362  [pdf, other]

    cs.AR cs.AI cs.LG

    Fast, Scalable, Energy-Efficient Non-element-wise Matrix Multiplication on FPGA

    Authors: Xuqi Zhu, Huaizhi Zhang, JunKyu Lee, Jiacheng Zhu, Chandrajit Pal, Sangeet Saha, Klaus D. McDonald-Maier, Xiaojun Zhai

    Abstract: Modern Neural Network (NN) architectures heavily rely on vast numbers of multiply-accumulate arithmetic operations, constituting the predominant computational cost. Therefore, this paper proposes a high-throughput, scalable and energy efficient non-element-wise matrix multiplication unit on FPGAs as a basic component of the NNs. We firstly streamline inter-layer and intra-layer redundancies of MAD…

    Submitted 7 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  21. arXiv:2407.02079  [pdf, other]

    cs.AR

    Theseus: Towards High-Efficiency Wafer-Scale Chip Design Space Exploration for Large Language Models

    Authors: Jingchen Zhu, Chenhao Xue, Yiqi Chen, Zhao Wang, Guangyu Sun

    Abstract: The emergence of the large language model (LLM) poses an exponential growth of demand for computation throughput, memory capacity, and communication bandwidth. Such a demand growth has significantly surpassed the improvement of corresponding chip designs. With the advancement of fabrication and integration technologies, designers have been developing Wafer-Scale Chips (WSCs) to scale up and exploit…

    Submitted 2 July, 2024; originally announced July 2024.

  22. arXiv:2407.01842  [pdf, other]

    cs.CV

    CLIP the Divergence: Language-guided Unsupervised Domain Adaptation

    Authors: Jinjing Zhu, Yucheng Chen, Lin Wang

    Abstract: Unsupervised domain adaption (UDA) has emerged as a popular solution to tackle the divergence between the labeled source and unlabeled target domains. Recently, some research efforts have been made to leverage large vision-language models, such as CLIP, and then fine-tune or learn prompts from them for addressing the challenging UDA task. In this work, we shift the gear to a new direction by direc…

    Submitted 1 July, 2024; originally announced July 2024.

  23. arXiv:2407.00945  [pdf, other]

    cs.LG

    Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs

    Authors: Enshu Liu, Junyi Zhu, Zinan Lin, Xuefei Ning, Matthew B. Blaschko, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang

    Abstract: The rapid advancement of large language models (LLMs) has led to architectures with billions to trillions of parameters, posing significant deployment challenges due to their substantial demands on memory, processing power, and energy consumption. Sparse Mixture-of-Experts (SMoE) architectures have emerged as a solution, activating only a subset of parameters per token, thereby achieving faster in…

    Submitted 30 June, 2024; originally announced July 2024.

  24. arXiv:2407.00849  [pdf, other]

    cs.LG

    Towards Understanding Sensitive and Decisive Patterns in Explainable AI: A Case Study of Model Interpretation in Geometric Deep Learning

    Authors: Jiajun Zhu, Siqi Miao, Rex Ying, Pan Li

    Abstract: The interpretability of machine learning models has gained increasing attention, particularly in scientific domains where high precision and accountability are crucial. This research focuses on distinguishing between two critical data patterns -- sensitive patterns (model-related) and decisive patterns (task-related) -- which are commonly used as model interpretations but often lead to confusion.…

    Submitted 30 June, 2024; originally announced July 2024.

  25. arXiv:2407.00136  [pdf, other]

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780 GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

  26. arXiv:2407.00066  [pdf, other]

    cs.DC cs.AI cs.CL cs.LG

    Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead

    Authors: Rickard Brüel-Gabrielsson, Jiacheng Zhu, Onkar Bhardwaj, Leshem Choshen, Kristjan Greenewald, Mikhail Yurochkin, Justin Solomon

    Abstract: Fine-tuning large language models (LLMs) with low-rank adapters (LoRAs) has become common practice, often yielding numerous copies of the same LLM differing only in their LoRA updates. This paradigm presents challenges for systems that serve real-time responses to queries that each involve a different LoRA. Prior works optimize the design of such systems but still require continuous loading and of…

    Submitted 17 June, 2024; originally announced July 2024.

  27. arXiv:2406.19348  [pdf, other]

    cond-mat.supr-con cond-mat.mes-hall cond-mat.str-el

    Theory of superconductivity in twisted transition metal dichalcogenide homobilayers

    Authors: Jihang Zhu, Yang-Zhi Chou, Ming Xie, Sankar Das Sarma

    Abstract: For the first time, robust superconductivity has been independently observed in twisted WSe$_2$ bilayers by two separate groups [Y. Xia et al., arXiv:2405.14784; Y. Guo et al., arXiv:2406.03418.]. In light of this, we explore the possibility of a universal superconducting pairing mechanism in twisted WSe$_2$ bilayers. Using a continuum band structure model and a phenomenological boson-mediated eff…

    Submitted 27 June, 2024; originally announced June 2024.

  28. arXiv:2406.19190  [pdf, ps, other]

    hep-ex

    Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  29. arXiv:2406.18343  [pdf, ps, other]

    math.DG

    Optimal volume bound and volume growth for Ricci-nonnegative manifolds with positive Bi-Ricci curvature

    Authors: Jie Zhou, Jintian Zhu

    Abstract: In this paper, we prove the optimal volume growth for complete Riemannian manifolds $(M^n,g)$ with nonnegative Ricci curvature everywhere and bi-Ricci curvature bounded from below by $n-2$ outside a compact set when the dimension is less than eight. This answers a question [AX24, Question 1] proposed by Antonelli-Xu in dimensions six and seven. As a by-product, we also prove an analogy of Gromov's…

    Submitted 30 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 14 pages, no figures

  30. arXiv:2406.18284  [pdf, other]

    cs.CV

    RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

    Authors: Xiaozhong Ji, Chuming Lin, Zhonggan Ding, Ying Tai, Jian Yang, Junwei Zhu, Xiaobin Hu, Jiangning Zhang, Donghao Luo, Chengjie Wang

    Abstract: Person-generic audio-driven face generation is a challenging task in computer vision. Previous methods have achieved remarkable progress in audio-visual synchronization, but there is still a significant gap between current results and practical applications. The challenges are two-fold: 1) Preserving unique individual traits for achieving high-precision lip synchronization. 2) Generating high-qual…

    Submitted 26 June, 2024; originally announced June 2024.

  31. arXiv:2406.18183  [pdf, other]

    hep-ex

    Measurement of the cross sections of $e^+e^-\to K^{-}\bar{Ξ}^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\bar{Ξ}^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages, 5 tables, 4 figures

  32. arXiv:2406.18083  [pdf, other]

    hep-ex

    Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 19 pages, 2 figures

  33. arXiv:2406.18007  [pdf, other]

    cs.MM

    Deep Mamba Multi-modal Learning

    Authors: Jian Zhu, Xin Zou, Yu Cui, Zhangmin Huang, Chenshu Hu, Bo Lyu

    Abstract: Inspired by the excellent performance of Mamba networks, we propose a novel Deep Mamba Multi-modal Learning (DMML). It can be used to achieve the fusion of multi-modal features. We apply DMML to the field of multimedia retrieval and propose an innovative Deep Mamba Multi-modal Hashing (DMMH) method. It combines the advantages of algorithm accuracy and inference speed. We validated the effectivenes…

    Submitted 9 April, 2024; originally announced June 2024.

    Comments: Deep Mamba Multi-modal Learning; Deep Mamba Multi-modal Hashing

  34. arXiv:2406.17507  [pdf, other]

    cs.IR

    ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling

    Authors: Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao

    Abstract: Generative retrieval, which has demonstrated effectiveness in text-to-text retrieval, utilizes a sequence-to-sequence model to directly generate candidate identifiers based on natural language queries. Without explicitly computing the similarity between queries and candidates, generative retrieval surpasses dual-tower models in both speed and accuracy on large-scale corpora, providing new insights…

    Submitted 25 June, 2024; originally announced June 2024.

  35. arXiv:2406.17452  [pdf, ps, other]

    hep-ex

    Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (649 additional authors not shown)

    Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33 fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and…

    Submitted 25 June, 2024; originally announced June 2024.

  36. arXiv:2406.16815  [pdf, other]

    cs.CV

    ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians

    Authors: Yufei Liu, Junshu Tang, Chu Zheng, Shijie Zhang, Jinkun Hao, Junwei Zhu, Dongjin Huang

    Abstract: High-fidelity 3D garment synthesis from text is desirable yet challenging for digital avatar creation. Recent diffusion-based approaches via Score Distillation Sampling (SDS) have enabled new possibilities but either intricately couple with human body or struggle to reuse. We introduce ClotheDreamer, a 3D Gaussian-based method for generating wearable, production-ready 3D garment assets from text p…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Project Page: https://ggxxii.github.io/clothedreamer

  37. arXiv:2406.16360  [pdf, other]

    cs.CV cs.GR

    MIRReS: Multi-bounce Inverse Rendering using Reservoir Sampling

    Authors: Yuxin Dai, Qi Wang, Jingsen Zhu, Dianbing Xi, Yuchi Huo, Chen Qian, Ying He

    Abstract: We present MIRReS, a novel two-stage inverse rendering framework that jointly reconstructs and optimizes the explicit geometry, material, and lighting from multi-view images. Unlike previous methods that rely on implicit irradiance fields or simplified path tracing algorithms, our method extracts an explicit geometry (triangular mesh) in stage one, and introduces a more realistic physically-based…

    Submitted 24 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 16 pages, 14 figures

  38. arXiv:2406.16321  [pdf, other]

    cs.LG cs.AI

    Multimodal Graph Benchmark

    Authors: Jing Zhu, Yuhang Zhou, Shengyi Qian, Zhongmou He, Tong Zhao, Neil Shah, Danai Koutra

    Abstract: Associating unstructured data with structured information is crucial for real-world tasks that require relevance search. However, existing graph learning benchmarks often overlook the rich semantic information associated with each node. To bridge this gap, we introduce the Multimodal Graph Benchmark (MM-GRAPH), the first comprehensive multi-modal graph benchmark that incorporates both textual and v…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: https://mm-graph-benchmark.github.io/

  39. arXiv:2406.16069  [pdf, other]

    cs.CL cs.AI

    FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models

    Authors: Junyi Zhu, Shuochen Liu, Yu Yu, Bo Tang, Yibo Yan, Zhiyu Li, Feiyu Xiong, Tong Xu, Matthew B. Blaschko

    Abstract: Large language models (LLMs) excel in generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method designed to enhance instruction fine-tuned LLMs' context awareness through fast memorization of the prompt. FastMem maximizes the likelihood of the prompt before in…

    Submitted 23 June, 2024; originally announced June 2024.

  40. arXiv:2406.16062  [pdf, other]

    cs.NE

    Towards Biologically Plausible Computing: A Comprehensive Comparison

    Authors: Changze Lv, Yufei Gu, Zhengkang Guo, Zhibo Xu, Yixin Wu, Feiran Zhang, Tianyuan Shi, Zhenghua Wang, Ruicheng Yin, Yu Shang, Siqi Zhong, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Jianhao Zhu, Cenyuan Zhang, Zixuan Ling, Xiaoqing Zheng

    Abstract: Backpropagation is a cornerstone algorithm in training neural networks for supervised learning, which uses a gradient descent method to update network weights by minimizing the discrepancy between actual and desired outputs. Despite its pivotal role in propelling deep learning advancements, the biological plausibility of backpropagation is questioned due to its requirements for weight symmetry, gl…

    Submitted 23 June, 2024; originally announced June 2024.

  41. arXiv:2406.15939  [pdf, other]

    hep-ph

    Investigating higgsino dark matter in the semi-constrained NMSSM

    Authors: Kun Wang, Jingya Zhu

    Abstract: In this study, we explore the characteristics of higgsino-dominated dark matter (DM) within the semi-constrained Next-to-Minimal Supersymmetric Standard Model (scNMSSM), covering a mass range from hundreds of GeV to several TeV. We carefully analyzed the parameter space under existing theoretical and experimental constraints to confirm the viability of higgsino-dominated lightest supersymmetric pa…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 13 pages, 5 figures, 1 table

  42. arXiv:2406.15846  [pdf, other]

    cs.CL eess.AS

    Revisiting Interpolation Augmentation for Speech-to-Text Generation

    Authors: Chen Xu, Jie Wang, Xiaoqian Liu, Qianqian Dong, Chunliang Zhang, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang

    Abstract: Speech-to-text (S2T) generation systems frequently face challenges in low-resource scenarios, primarily due to the lack of extensive labeled datasets. One emerging solution is constructing virtual training samples by interpolating inputs and labels, which has notably enhanced system generalization in other domains. Despite its potential, this technique's application in S2T tasks has remained under…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  43. arXiv:2406.15735  [pdf, other]

    cs.CV cs.AI

    Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model

    Authors: Min Zhao, Hongzhou Zhu, Chendong Xiang, Kaiwen Zheng, Chongxuan Li, Jun Zhu

    Abstract: Diffusion models have achieved substantial progress in image-to-video (I2V) generation. However, such models are not fully understood. In this paper, we report a significant but previously overlooked issue in I2V diffusion models (I2V-DMs), namely, conditional image leakage. I2V-DMs tend to over-rely on the conditional image at large time steps, neglecting the crucial task of predicting the clean…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Project page: https://cond-image-leak.github.io/

  44. arXiv:2406.15407  [pdf]

    physics.ins-det

    Preliminary Design of a General Electronics Platform for Accelerator Facilities

    Authors: Jinfu Zhu, Hongli Ding, Haokui Li, Qiaoye Ran, Xiwen Dai, Wei Li, Jiawei Han, Yue Li, Zhiyuan Zhang, Weixin Qiu, Weiqing Zhang

    Abstract: Many accelerators require considerable electronic systems for tests, verification, and operation. In Shenzhen Superconducting Soft X-ray Free Electron Laser (S3FEL), to meet the needs of early tests and verification of various systems, save development expenses, and improve the reusability of hardware, firmware, and software systems, we have considered the needs of each system and preliminarily designed a…

    Submitted 11 May, 2024; originally announced June 2024.

    Comments: 3 pages, 4 figures, 2024 IEEE Real-Time Conference

  45. arXiv:2406.15178  [pdf, other]

    cs.CL

    Hybrid Alignment Training for Large Language Models

    Authors: Chenglong Wang, Hang Zhou, Kaiyan Chang, Bei Li, Yongyu Mu, Tong Xiao, Tongran Liu, Jingbo Zhu

    Abstract: Alignment training is crucial for enabling large language models (LLMs) to cater to human intentions and preferences. It is typically performed in two stages with different objectives: instruction-following alignment and human-preference alignment. However, aligning LLMs with these objectives in sequence suffers from an inherent problem: the objectives may conflict, and the LLMs cannot guara…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: accepted by ACL (Findings) 2024

  46. arXiv:2406.15030  [pdf, ps, other]

    hep-ex

    Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  47. arXiv:2406.14964  [pdf, other]

    cs.CV

    VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation

    Authors: Zixuan Chen, Ruijie Su, Jiahao Zhu, Lingxiao Yang, Jian-Huang Lai, Xiaohua Xie

    Abstract: Text-to-3D generation aims to create 3D assets from text-to-image diffusion models. However, existing methods face an inherent bottleneck in generation quality because the widely-used objectives such as Score Distillation Sampling (SDS) inappropriately omit U-Net Jacobians for swift generation, leading to significant bias compared to the "true" gradient obtained by full denoising sampling. This bi…

    Submitted 21 June, 2024; originally announced June 2024.

  48. arXiv:2406.14017  [pdf, other]

    cs.IR

    EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration

    Authors: Ye Wang, Jiahao Xun, Minjie Hong, Jieming Zhu, Tao Jin, Wang Lin, Haoyuan Li, Linjun Li, Yan Xia, Zhou Zhao, Zhenhua Dong

    Abstract: Generative retrieval has recently emerged as a promising approach to sequential recommendation, framing candidate item retrieval as an autoregressive sequence generation problem. However, existing generative methods typically focus solely on either behavioral or semantic aspects of item information, neglecting their complementary nature and thus resulting in limited effectiveness. To address this…

    Submitted 3 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. Code available at https://reczoo.github.io/EAGER

  49. arXiv:2406.13977  [pdf, other]

    eess.IV cs.CV

    Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning

    Authors: Tingyi Lin, Pengju Lyu, Jie Zhang, Yuqing Wang, Cheng Wang, Jianjun Zhu

    Abstract: Non-contrast CT (NCCT) imaging may reduce image contrast and anatomical visibility, potentially increasing diagnostic uncertainty. In contrast, contrast-enhanced CT (CECT) facilitates the observation of regions of interest (ROI). Leading generative models, especially the conditional diffusion model, demonstrate remarkable capabilities in medical image modality transformation. Typical conditional d…

    Submitted 19 June, 2024; originally announced June 2024.

  50. arXiv:2406.13798  [pdf, ps, other]

    math.OC

    Aubin Property and Strong Regularity Are Equivalent for Nonlinear Second-Order Cone Programming

    Authors: Liang Chen, Ruoning Chen, Defeng Sun, Junyuan Zhu

    Abstract: This paper solves a fundamental open problem in variational analysis on the equivalence between the Aubin property and the strong regularity for nonlinear second-order cone programming (SOCP) at a locally optimal solution. We achieve this by introducing a reduction approach to the Aubin property characterized by the Mordukhovich criterion and a lemma of alternative choices on cones to replace the…

    Submitted 19 June, 2024; originally announced June 2024.

    MSC Class: 90C; 90C31; 90C46