Showing 1–50 of 2,430 results for author: Hu, X

  1. arXiv:2407.09072  [pdf, other]

    cs.CL

    New Desiderata for Direct Preference Optimization

    Authors: Xiangkun Hu, Tong He, David Wipf

    Abstract: Large language models in the past have typically relied on some form of reinforcement learning with human feedback (RLHF) to better align model responses with human preferences. However, because of oft-observed instabilities when implementing these RLHF pipelines, various reparameterization techniques have recently been introduced to sidestep the need for separately learning an RL reward model. In…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.08903  [pdf, other]

    cs.CR cs.AI cs.AR

    TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing

    Authors: Husheng Han, Xinyao Zheng, Yuanbo Wen, Yifan Hao, Erhu Feng, Ling Liang, Jianan Mu, Xiaqing Li, Tianyun Ma, Pengwei Jin, Xinkai Song, Zidong Du, Qi Guo, Xing Hu

    Abstract: Heterogeneous collaborative computing with NPU and CPU has received widespread attention due to its substantial performance benefits. To ensure data confidentiality and integrity during computing, Trusted Execution Environments (TEE) is considered a promising solution because of its comparatively lower overhead. However, existing heterogeneous TEE designs are inefficient for collaborative computin…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by ASPLOS 2024

  3. arXiv:2407.08032  [pdf, other]

    astro-ph.EP

    Rossby Wave Instability and Substructure Formation in 3D Non-Ideal MHD Wind-Launching Disks

    Authors: Chun-Yen Hsu, Zhi-Yun Li, Yisheng Tu, Xiao Hu, Min-Kai Lin

    Abstract: Rings and gaps are routinely observed in the dust continuum emission of protoplanetary discs (PPDs). How they form and evolve remains debated. Previous studies have demonstrated the possibility of spontaneous gas rings and gaps formation in wind-launching disks. Here, we show that such gas substructures are unstable to the Rossby Wave Instability (RWI) through numerical simulations. Specifically,…

    Submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2407.07433  [pdf, other]

    cs.CV cs.AI

    Controllable Navigation Instruction Generation with Chain of Thought Prompting

    Authors: Xianghao Kong, Jinyu Chen, Wenguan Wang, Hang Su, Xiaolin Hu, Yi Yang, Si Liu

    Abstract: Instruction generation is a vital and multidisciplinary research area with broad applications. Existing instruction generation models are limited to generating instructions in a single style from a particular dataset, and the style and content of generated instructions cannot be controlled. Moreover, most existing instruction generation methods also disregard the spatial modeling of the navigation…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  5. arXiv:2407.05908  [pdf, other]

    physics.optics cond-mat.mes-hall

    Realization of Z$_2$ topological photonic insulators made from multilayer transition metal dichalcogenides

    Authors: Tommi Isoniemi, Paul Bouteyre, Xuerong Hu, Fedor Benimetskiy, Yue Wang, Maurice S. Skolnick, Dmitry N. Krizhanovskii, Alexander I. Tartakovskii

    Abstract: Monolayers of semiconducting transition metal dichalcogenides (TMDs) have long attracted interest for their intriguing optical and electronic properties. Recently TMDs in their quasi-bulk form have started to show considerable promise for nanophotonics thanks to their high refractive indices, large optical anisotropy, wide transparency windows reaching to the visible, and robust room temperature e…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 34 pages, 18 figures

  6. arXiv:2407.05700  [pdf, other]

    cs.CL cs.AI cs.SE

    InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct

    Authors: Yutong Wu, Di Huang, Wenxuan Shi, Wei Wang, Lingzhe Gao, Shihao Liu, Ziyuan Nan, Kaizhao Yuan, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Yewen Pu, Dawei Yin, Xing Hu, Yunji Chen

    Abstract: Recent advancements in open-source code large language models (LLMs) have demonstrated remarkable coding abilities by fine-tuning on the data generated from powerful closed-source LLMs such as GPT-3.5 and GPT-4 for instruction tuning. This paper explores how to further improve an instruction-tuned code LLM by generating data from itself rather than querying closed-source LLMs. Our key observation…

    Submitted 8 July, 2024; originally announced July 2024.

  7. arXiv:2407.05293  [pdf, other]

    cs.CE

    Wideband Beamforming with RIS: A Unified Framework via Space-Frequency Transformation

    Authors: Xiaowei Qian, Xiaoling Hu, Chenxi Liu, Mugen Peng

    Abstract: The spectrum shift from the sub-6G band to the high-frequency band has posed an ever-increasing demand on the paradigm shift from narrowband beamforming to wideband beamforming. Despite recent research efforts, the problem of wideband beamforming design is particularly challenging in reconfigurable intelligent surface (RIS)-assisted systems, due to that RIS is not capable of performing frequency-d…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 13 pages, 16 figures

  8. arXiv:2407.05272  [pdf, other]

    gr-qc

    Quasinormal modes and greybody factor of Schwarzschild Black Hole in the Cold Dark Matter Halo

    Authors: Shi-Jie Ma, Rui-Bo Wang, Tian-Chi Ma, He-Xu Zhang, Jian-Bo Deng, Xian-Ru Hu

    Abstract: In this article, we firstly studied wave function in static spherically symmetric spacetime and obtained effective potential of perturbed fields with spin. Then we applied $6^{\rm{th}}$ order WKB approximation to analyze quasinormal modes of Schwarzschild black hole in the Cold Dark Matter halo in perturbed fields with different spins and derived quasinormal frequencies. Further, to study the rela…

    Submitted 10 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: 22 pages, 5 figures, 4 tables

  9. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) model is required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  10. arXiv:2407.03926  [pdf, ps, other]

    cs.IT eess.SP

    Rethinking the fundamental performance limits of integrated sensing and communication systems

    Authors: Zhouyuan Yu, Xiaoling Hu, Chenxi Liu, Mugen Peng

    Abstract: Integrated sensing and communication (ISAC) has been recognized as a key enabler and feature of future wireless networks. In the existing works analyzing the performances of ISAC, discrete-time systems were commonly assumed, which, however, overlooked the impacts of temporal, spectral, and spatial properties. To address this issue, we establish a unified information model for the band-limited cont…

    Submitted 4 July, 2024; originally announced July 2024.

  11. arXiv:2407.03902  [pdf, ps, other]

    eess.SP

    Detection and Multi-Parameter Estimation for NLOS Targets: An IRS-assisted Framework

    Authors: Zhouyuan Yu, Xiaoling Hu, Chenxi Liu, Qin Tao, Mugen Peng

    Abstract: Intelligent reflecting surface (IRS) has the potential to enhance sensing performance, due to its capability of reshaping the echo signals. Different from the existing literature, which has commonly focused on IRS beamforming optimization, in this paper, we pay special attention to designing effective signal processing approaches to extract sensing information from IRS-reshaped echo signals. To th…

    Submitted 4 July, 2024; originally announced July 2024.

  12. arXiv:2407.03804  [pdf, other]

    cs.LG cs.NI

    Multi-Time Scale Service Caching and Pricing in MEC Systems with Dynamic Program Popularity

    Authors: Yiming Chen, Xingyuan Hu, Bo Gu, Shimin Gong, Zhou Su

    Abstract: In mobile edge computing systems, base stations (BSs) equipped with edge servers can provide computing services to users to reduce their task execution time. However, there is always a conflict of interest between the BS and users. The BS prices the service programs based on user demand to maximize its own profit, while the users determine their offloading strategies based on the prices to minimiz…

    Submitted 4 July, 2024; originally announced July 2024.

  13. arXiv:2407.01527  [pdf, other]

    cs.CL

    KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches

    Authors: Jiayi Yuan, Hongyi Liu, Shaochen Zhong, Yu-Neng Chuang, Songchen Li, Guanchu Wang, Duy Le, Hongye Jin, Vipin Chaudhary, Zhaozhuo Xu, Zirui Liu, Xia Hu

    Abstract: Long context capability is a crucial competency for large language models (LLMs) as it mitigates the human struggle to digest long-form texts. This capability enables complex task-solving scenarios such as book summarization, code assistance, and many more tasks that are traditionally manpower-intensive. However, transformer-based LLMs face significant challenges with long context input due to the…

    Submitted 1 July, 2024; originally announced July 2024.

  14. arXiv:2407.00952  [pdf, other]

    cs.LG cs.CL cs.DC

    SplitLoRA: A Split Parameter-Efficient Fine-Tuning Framework for Large Language Models

    Authors: Zheng Lin, Xuanjie Hu, Yuxin Zhang, Zhe Chen, Zihan Fang, Xianhao Chen, Ang Li, Praneeth Vepakomma, Yue Gao

    Abstract: The scalability of large language models (LLMs) in handling high-complexity models and large-scale datasets has led to tremendous successes in pivotal domains. While there is an urgent need to acquire more training data for LLMs, a concerning reality is the depletion of high-quality public datasets within a few years. In view of this, the federated learning (FL) LLM fine-tuning paradigm recently h…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 9 pages, 3 figures

  15. arXiv:2407.00466  [pdf, other]

    cs.CL cs.AI

    BioKGBench: A Knowledge Graph Checking Benchmark of AI Agent for Biomedical Science

    Authors: Xinna Lin, Siqi Ma, Junjie Shan, Xiaojing Zhang, Shell Xu Hu, Tiannan Guo, Stan Z. Li, Kaicheng Yu

    Abstract: Pursuing artificial intelligence for biomedical science, a.k.a. AI Scientist, draws increasing attention, where one common approach is to build a copilot agent driven by Large Language Models (LLMs). However, to evaluate such systems, people either rely on direct Question-Answering (QA) to the LLM itself, or in a biomedical experimental manner. How to precisely benchmark biomedical agents from an…

    Submitted 29 June, 2024; originally announced July 2024.

  16. arXiv:2407.00082  [pdf, other]

    cs.IR cs.AI cs.LG

    Adapting Job Recommendations to User Preference Drift with Behavioral-Semantic Fusion Learning

    Authors: Xiao Han, Chen Zhu, Xiao Hu, Chuan Qin, Xiangyu Zhao, Hengshu Zhu

    Abstract: Job recommender systems are crucial for aligning job opportunities with job-seekers in online job-seeking. However, users tend to adjust their job preferences to secure employment opportunities continually, which limits the performance of job recommendations. The inherent frequency of preference drift poses a challenge to promptly and precisely capture user preferences. To address this issue, we p…

    Submitted 24 June, 2024; originally announced July 2024.

    Comments: Accepted by KDD 24 Research Track

  17. arXiv:2406.19783  [pdf, other]

    cs.SE cs.CL

    NLPerturbator: Studying the Robustness of Code LLMs to Natural Language Variations

    Authors: Junkai Chen, Zhenhao Li, Xing Hu, Xin Xia

    Abstract: Large language models (LLMs) achieve promising results in code generation based on a given natural language description. They have been integrated into open-source projects and commercial products to facilitate daily coding activities. The natural language description in the prompt is crucial for LLMs to comprehend users' requirements. Prior studies uncover that LLMs are sensitive to the changes i…

    Submitted 28 June, 2024; originally announced June 2024.

  18. arXiv:2406.19651  [pdf, other]

    cs.DB cs.AI

    CANDY: A Benchmark for Continuous Approximate Nearest Neighbor Search with Dynamic Data Ingestion

    Authors: Xianzhi Zeng, Zhuoyan Wu, Xinjing Hu, Xuanhua Shi, Shixuan Sun, Shuhao Zhang

    Abstract: Approximate K Nearest Neighbor (AKNN) algorithms play a pivotal role in various AI applications, including information retrieval, computer vision, and natural language processing. Although numerous AKNN algorithms and benchmarks have been developed recently to evaluate their effectiveness, the dynamic nature of real-world data presents significant challenges that existing benchmarks fail to addres…

    Submitted 28 June, 2024; originally announced June 2024.

  19. arXiv:2406.19544  [pdf, other]

    cs.SE

    Where Are Large Language Models for Code Generation on GitHub?

    Authors: Xiao Yu, Lei Liu, Xing Hu, Jacky Wai Keung, Jin Liu, Xin Xia

    Abstract: The increasing use of Large Language Models (LLMs) in software development has garnered significant attention from researchers assessing the quality of the code they generate. However, much of the research focuses on controlled datasets such as HumanEval, which fail to adequately represent how developers actually utilize LLMs' code generation capabilities or clarify the characteristics of LLM-gene…

    Submitted 27 June, 2024; originally announced June 2024.

  20. arXiv:2406.18365  [pdf, other]

    cs.CL

    Themis: Towards Flexible and Interpretable NLG Evaluation

    Authors: Xinyu Hu, Li Lin, Mingqi Gao, Xunjian Yin, Xiaojun Wan

    Abstract: The evaluation of natural language generation (NLG) tasks is a significant and longstanding research issue. With the recent emergence of powerful large language models (LLMs), some studies have turned to LLM-based automatic evaluation methods, which demonstrate great potential to become a new evaluation paradigm following traditional string-based and model-based metrics. However, despite the impro…

    Submitted 26 June, 2024; originally announced June 2024.

  21. arXiv:2406.18287  [pdf, other]

    math.OC

    Learning-rate-free Momentum SGD with Reshuffling Converges in Nonsmooth Nonconvex Optimization

    Authors: Xiaoyin Hu, Nachuan Xiao, Xin Liu, Kim-Chuan Toh

    Abstract: In this paper, we propose a generalized framework for developing learning-rate-free momentum stochastic gradient descent (SGD) methods in the minimization of nonsmooth nonconvex functions, especially in training nonsmooth neural networks. Our framework adaptively generates learning rates based on the historical data of stochastic subgradients and iterates. Under mild conditions, we prove that our…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages

  22. arXiv:2406.18284  [pdf, other]

    cs.CV

    RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network

    Authors: Xiaozhong Ji, Chuming Lin, Zhonggan Ding, Ying Tai, Jian Yang, Junwei Zhu, Xiaobin Hu, Jiangning Zhang, Donghao Luo, Chengjie Wang

    Abstract: Person-generic audio-driven face generation is a challenging task in computer vision. Previous methods have achieved remarkable progress in audio-visual synchronization, but there is still a significant gap between current results and practical applications. The challenges are two-fold: 1) Preserving unique individual traits for achieving high-precision lip synchronization. 2) Generating high-qual…

    Submitted 26 June, 2024; originally announced June 2024.

  23. arXiv:2406.18021  [pdf, other]

    cs.SD cs.LG eess.AS

    SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR

    Authors: Shuaishuai Ye, Shunfei Chen, Xinhui Hu, Xinkang Xu

    Abstract: In this work, we propose a Switch-Conformer-based MoE system named SC-MoE for unified streaming and non-streaming code-switching (CS) automatic speech recognition (ASR), where we design a streaming MoE layer consisting of three language experts, which correspond to Mandarin, English, and blank, respectively, and equipped with a language identification (LID) network with a Connectionist Temporal Cl…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by InterSpeech 2024; 5 pages, 2 figures

  24. arXiv:2406.17006  [pdf, other]

    hep-ex

    Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1094 additional authors not shown)

    Abstract: The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the nature of the $χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an integrated luminosity of 9 fb$^{-1}$. Using the $B^+\rightarrow χ_{c1}(3872)K^+$ decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-015, CERN-EP-2024-157

  25. arXiv:2406.16786  [pdf, other]

    cs.CE

    Generalized and high-efficiency arbitrary-positioned buffer for smoothed particle hydrodynamics

    Authors: Shuoguo Zhang, Yu Fan, Yaru Ren, Bin Qian, Xiangyu Hu

    Abstract: This paper develops an arbitrary-positioned buffer for the smoothed particle hydrodynamics (SPH) method, whose generality and high efficiency are achieved through two techniques. First, with the local coordinate system established at each arbitrary-positioned in-/outlet, particle positions in the global coordinate system are transformed into those in it via coordinate transformation. Since one loc…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 34 pages and 17 figures

  26. arXiv:2406.16743  [pdf, other]

    cs.CL

    Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization

    Authors: Zhengyue Zhao, Xiaoyun Zhang, Kaidi Xu, Xing Hu, Rui Zhang, Zidong Du, Qi Guo, Yunji Chen

    Abstract: With the widespread application of Large Language Models (LLMs), it has become a significant concern to ensure their safety and prevent harmful responses. While current safe-alignment methods based on instruction fine-tuning and Reinforcement Learning from Human Feedback (RLHF) can effectively reduce harmful responses from LLMs, they often require high-quality datasets and heavy computational over…

    Submitted 24 June, 2024; originally announced June 2024.

  27. arXiv:2406.16694  [pdf, other]

    cs.CL

    Task Oriented In-Domain Data Augmentation

    Authors: Xiao Liang, Xinyu Hu, Simiao Zuo, Yeyun Gong, Qiang Lou, Yi Liu, Shao-Lun Huang, Jian Jiao

    Abstract: Large Language Models (LLMs) have shown superior performance in various applications and fields. To achieve better performance on specialized domains such as law and advertisement, LLMs are often continue pre-trained on in-domain data. However, existing approaches suffer from two major issues. First, in-domain data are scarce compared with general domain-agnostic data. Second, data used for contin…

    Submitted 24 June, 2024; originally announced June 2024.

  28. arXiv:2406.16583  [pdf, other]

    cs.LG cs.CV

    Personalized federated learning based on feature fusion

    Authors: Wolong Xing, Zhenkui Shi, Hongyan Peng, Xiantao Hu, Xianxian Li

    Abstract: Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy. However, due to the heterogeneity of data, models, and devices, the final global model may need to perform better for tasks on each client. Communication bottlenecks, data heterogeneity, and model heterogeneity have been common challenges in federated learning. In t…

    Submitted 24 June, 2024; originally announced June 2024.

  29. arXiv:2406.16002  [pdf, ps, other]

    cond-mat.quant-gas quant-ph

    Photon-assisted tunneling resonantly controlling spin current of a spin-orbit-coupled atom in a toroidal trap

    Authors: Zhiqiang Li, Xiaoxiao Hu, Zhao-Yun Zeng, Ai-Xi Chen, Xiaobing Luo

    Abstract: The periodic flashing potential has proven to be a powerful tool for investigating directed atomic currents. By applying the flashing ring-shaped potential to spin-orbit (SO) coupled, noninteracting Bose-Einstein condensate (BEC) systems, through photon-assisted tunneling (resonance) techniques, we demonstrate the generation of tunable alternating (AC) spin and atomic mass currents that can be pre…

    Submitted 27 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 10 pages, 6 figures

  30. arXiv:2406.15956  [pdf]

    cond-mat.mtrl-sci

    Decoupling Many-Body Interactions in CeO2 (111) Oxygen Vacancy Structure: Insights from Machine-Learning and Cluster Expansion

    Authors: Yujing Zhang, Zhong-Kang Han, Beien Zhu, Xiaojuan Hu, Maria Troppenz, Santiago Rigamonti, Hui Li, Claudia Draxl, M. Verónica Ganduglia-Pirovano, Yi Gao

    Abstract: Oxygen vacancies (VO's) are of paramount importance in influencing the properties and applications of ceria (CeO2). Yet, comprehending the distribution and nature of the VO's poses a significant challenge due to the vast number of electronic configurations and intricate many-body interactions among VO's and polarons (Ce3+'s). In this study, we employed a combination of LASSO regression in machine…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 22 pages, 1 scheme, 5 figures

  31. arXiv:2406.15485  [pdf, other]

    cs.CL cs.CV

    SegHist: A General Segmentation-based Framework for Chinese Historical Document Text Line Detection

    Authors: Xingjian Hu, Baole Wei, Liangcai Gao, Jun Wang

    Abstract: Text line detection is a key task in historical document analysis facing many challenges of arbitrary-shaped text lines, dense texts, and text lines with high aspect ratios, etc. In this paper, we propose a general framework for historical document text detection (SegHist), enabling existing segmentation-based text detection methods to effectively address the challenges, especially text lines with…

    Submitted 8 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by ICDAR2024 (poster)

  32. arXiv:2406.15477  [pdf]

    cs.CL cs.AI

    CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics

    Authors: Kai Yin, Chengkai Liu, Ali Mostafavi, Xia Hu

    Abstract: In the field of crisis/disaster informatics, social media is increasingly being used for improving situational awareness to inform response and relief efforts. Efficient and accurate text classification tools have been a focal area of investigation in crisis informatics. However, current methods mostly rely on single-label text classification models, which fails to capture different insights embed…

    Submitted 16 June, 2024; originally announced June 2024.

  33. arXiv:2406.15245  [pdf, other]

    cs.CL cs.LG

    Unsupervised Morphological Tree Tokenizer

    Authors: Qingyang Zhu, Xiang Hu, Pengyu Ji, Wei Wu, Kewei Tu

    Abstract: As a cornerstone in language modeling, tokenization involves segmenting text inputs into pre-defined atomic units. Conventional statistical tokenizers often disrupt constituent boundaries within words, thereby corrupting semantic information. To address this drawback, we introduce morphological structure guidance to tokenization and propose a deep model to induce character-level structures of word…

    Submitted 21 June, 2024; originally announced June 2024.

  34. arXiv:2406.14558  [pdf, other]

    cs.RO cs.AI

    CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics

    Authors: Jiawei Gao, Ziqin Wang, Zeqi Xiao, Jingbo Wang, Tai Wang, Jinkun Cao, Xiaolin Hu, Si Liu, Jifeng Dai, Jiangmiao Pang

    Abstract: Recent years have seen significant advancements in humanoid control, largely due to the availability of large-scale motion capture data and the application of reinforcement learning methodologies. However, many real-world tasks, such as moving large and heavy furniture, require multi-character collaboration. Given the scarcity of data on multi-character collaboration and the efficiency challenges…

    Submitted 20 June, 2024; originally announced June 2024.

  35. arXiv:2406.14045  [pdf, other]

    cs.LG cs.AI

    Understanding Different Design Choices in Training Large Time Series Models

    Authors: Yu-Neng Chuang, Songchen Li, Jiayi Yuan, Guanchu Wang, Kwei-Herng Lai, Leisheng Yu, Sirui Ding, Chia-Yuan Chang, Qiaoyu Tan, Daochen Zha, Xia Hu

    Abstract: Inspired by Large Language Models (LLMs), Time Series Forecasting (TSF), a long-standing task in time series analysis, is undergoing a transition towards Large Time Series Models (LTSMs), aiming to train universal transformer-based models for TSF. However, training LTSMs on heterogeneous time series data poses unique challenges, including diverse frequencies, dimensions, and patterns across datase…

    Submitted 20 June, 2024; originally announced June 2024.

  36. arXiv:2406.13919  [pdf, other]

    cs.AI

    SPL: A Socratic Playground for Learning Powered by Large Language Model

    Authors: Liang Zhang, Jionghao Lin, Ziyi Kuang, Sheng Xu, Mohammed Yeasin, Xiangen Hu

    Abstract: Dialogue-based Intelligent Tutoring Systems (ITSs) have significantly advanced adaptive and personalized learning by automating sophisticated human tutoring strategies within interactive dialogues. However, replicating the nuanced patterns of expert human communication remains a challenge in Natural Language Processing (NLP). Recent advancements in NLP, particularly Large Language Models (LLMs) su…

    Submitted 20 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  37. arXiv:2406.13219  [pdf, other]

    cs.CV cs.CL

    MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency

    Authors: Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Baizhou Huang, Xu Zhang, Xinyu Hu, Xiaojun Wan

    Abstract: Multimodal large language models (MLLMs) are prone to non-factual or outdated knowledge issues, which can manifest as misreading and misrecognition errors due to the complexity of multimodal knowledge. Previous benchmarks have not systematically analyzed the performance of editing methods in correcting these two error types. To better represent and correct these errors, we decompose multimodal kno…

    Submitted 19 June, 2024; originally announced June 2024.

  38. arXiv:2406.13185  [pdf, other]

    cs.CL

    Learnable In-Context Vector for Visual Question Answering

    Authors: Yingzhe Peng, Chenduo Hao, Xu Yang, Jiawei Peng, Xinting Hu, Xin Geng

    Abstract: As language models continue to scale, Large Language Models (LLMs) have exhibited emerging capabilities in In-Context Learning (ICL), enabling them to solve language tasks by prefixing a few in-context demonstrations (ICDs) as context. Inspired by these advancements, researchers have extended these techniques to develop Large Multimodal Models (LMMs) with ICL capabilities. However, applying ICL us…

    Submitted 18 June, 2024; originally announced June 2024.

  39. arXiv:2406.12757  [pdf, other]

    cs.CV

    MAC: A Benchmark for Multiple Attributes Compositional Zero-Shot Learning

    Authors: Shuo Xu, Sai Wang, Xinyue Hu, Yutian Lin, Bo Du, Yu Wu

    Abstract: Compositional Zero-Shot Learning (CZSL) aims to learn semantic primitives (attributes and objects) from seen compositions and recognize unseen attribute-object compositions. Existing CZSL datasets focus on single attributes, neglecting the fact that objects naturally exhibit multiple interrelated attributes. Real-world objects often possess multiple interrelated attributes, and current datasets' n…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 13 pages, 5 figures

  40. arXiv:2406.12111  [pdf, other]

    hep-ex

    Precision measurement of the $Ξ^-_b$ baryon lifetime

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1064 additional authors not shown)

    Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-010.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139

  41. arXiv:2406.11643  [pdf, other]

    cs.CV

    AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

    Authors: Lingjie Kong, Kai Wu, Xiaobin Hu, Wenhui Han, Jinlong Peng, Chengming Xu, Donghao Luo, Jiangning Zhang, Chengjie Wang, Yanwei Fu

    Abstract: Text-to-image based object customization, aiming to generate images with the same identity (ID) as objects of interest in accordance with text prompts and reference images, has made significant progress. However, recent customizing research is dominated by specialized tasks, such as human customization or virtual try-on, leaving a gap in general object customization. To this end, we introduce AnyM…

    Submitted 5 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  42. arXiv:2406.11357  [pdf, other]

    cs.CL cs.AI

    Refiner: Restructure Retrieval Content Efficiently to Advance Question-Answering Capabilities

    Authors: Zhonghao Li, Xuming Hu, Aiwei Liu, Kening Zheng, Sirui Huang, Hui Xiong

    Abstract: Large Language Models (LLMs) are limited by their parametric knowledge, leading to hallucinations in knowledge-extensive tasks. To address this, Retrieval-Augmented Generation (RAG) incorporates external document chunks to expand LLM knowledge. Furthermore, compressing information from document chunks through extraction or summarization can improve LLM performance. Nonetheless, LLMs still struggle…

    Submitted 17 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages

  43. arXiv:2406.11345  [pdf, other]

    cs.CL cs.AI

    Full-ECE: A Metric For Token-level Calibration on Large Language Models

    Authors: Han Liu, Yupeng Zhang, Bingning Wang, Weipeng Chen, Xiaolin Hu

    Abstract: Deep Neural Networks (DNNs) excel in various domains but face challenges in providing accurate uncertainty estimates, which are crucial for high-stakes applications. Large Language Models (LLMs) have recently emerged as powerful tools, demonstrating exceptional performance in language tasks. However, traditional calibration metrics such as Expected Calibration Error (ECE) and classwise-ECE (cw-ECE…

    Submitted 17 June, 2024; originally announced June 2024.

  44. arXiv:2406.11309  [pdf, other]

    cs.CV

    BaFTA: Backprop-Free Test-Time Adaptation For Zero-Shot Vision-Language Models

    Authors: Xuefeng Hu, Ke Zhang, Min Sun, Albert Chen, Cheng-Hao Kuo, Ram Nevatia

    Abstract: Large-scale pretrained vision-language models like CLIP have demonstrated remarkable zero-shot image classification capabilities across diverse domains. To enhance CLIP's performance while preserving the zero-shot paradigm, various test-time prompt tuning methods have been introduced to refine class embeddings through unsupervised learning objectives during inference. However, these methods often…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Preprint updated from our earlier manuscript submitted to ICLR 2024 (https://openreview.net/forum?id=KNtcoAM5Gy)

  45. arXiv:2406.11213  [pdf, other]

    cs.SE

    A Survey of AIOps for Failure Management in the Era of Large Language Models

    Authors: Lingzhe Zhang, Tong Jia, Mengxi Jia, Yifan Wu, Aiwei Liu, Yong Yang, Zhonghai Wu, Xuming Hu, Philip S. Yu, Ying Li

    Abstract: As software systems grow increasingly intricate, Artificial Intelligence for IT Operations (AIOps) methods have been widely used in software system failure management to ensure the high availability and reliability of large-scale distributed software systems. However, these methods still face several challenges, such as lack of cross-platform generality and cross-task flexibility. Fortunately, rec…

    Submitted 23 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 35 pages

  46. arXiv:2406.11193  [pdf, other]

    cs.CL

    MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model

    Authors: Jiahao Huo, Yibo Yan, Boren Hu, Yutao Yue, Xuming Hu

    Abstract: Projecting visual features into word embedding space has become a significant fusion strategy adopted by Multimodal Large Language Models (MLLMs). However, its internal mechanisms have yet to be explored. Inspired by multilingual research, we identify domain-specific neurons in multimodal large language models. Specifically, we investigate the distribution of domain-specific neurons and the mechan…

    Submitted 16 June, 2024; originally announced June 2024.

  47. arXiv:2406.10340  [pdf, other]

    cond-mat.str-el

    Spin waves in Dirac semimetal Ca$_{0.6}$Sr$_{0.4}$MnSb$_2$ investigated with neutrons by the diffraction method

    Authors: Xiao Hu, Yan Wu, Matthias D. Frontzek, Zhixiang Hu, Cedomir Petrovic, John M. Tranquada, Igor A. Zaliznyak

    Abstract: We report neutron diffraction measurements of Ca$_{0.6}$Sr$_{0.4}$MnSb$_2$, a low-carrier-density Dirac semimetal in which the antiferromagnetic Mn layers are interleaved with Sb layers that host Dirac fermions. We have discovered that we can detect a good quality inelastic spin wave signal from a small (m ~ 0.28 g) single crystal sample by the diffraction method, without energy analysis, using a…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 6 pages including 4 figures and bibliography plus 13-page supplementary with figures S1-S11

  48. arXiv:2406.09781  [pdf, other]

    cs.CV

    GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding

    Authors: Yiqi Wu, Xiaodan Hu, Ziming Fu, Siling Zhou, Jiangong Li

    Abstract: Animal ethology is a crucial aspect of animal research, and animal behavior labeling is the foundation for studying animal behavior. This process typically involves labeling video clips with behavioral semantic tags, a task that is complex, subjective, and multimodal. With the rapid development of multimodal large language models (LLMs), new applications have emerged for animal behavior understandi…

    Submitted 14 June, 2024; originally announced June 2024.

  49. arXiv:2406.09723  [pdf, other]

    cs.LG cs.AI

    When Will Gradient Regularization Be Harmful?

    Authors: Yang Zhao, Hao Zhang, Xiuyuan Hu

    Abstract: Gradient regularization (GR), which aims to penalize the gradient norm atop the loss function, has shown promising results in training modern over-parameterized deep neural networks. However, can we trust this powerful technique? This paper reveals that GR can cause performance degeneration in adaptive optimization scenarios, particularly with learning rate warmup. Our empirical and theoretical an…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: ICML 2024 paper

    MSC Class: 55N31 ACM Class: I.4.0

  50. arXiv:2406.09701  [pdf, other]

    cs.SE

    Towards Effectively Detecting and Explaining Vulnerabilities Using Large Language Models

    Authors: Qiheng Mao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia, Jianling Sun

    Abstract: Software vulnerabilities pose significant risks to the security and integrity of software systems. Prior studies have proposed a series of approaches to vulnerability detection using deep learning or pre-trained models. However, detailed explanations of vulnerabilities, beyond detection of their occurrence, are still lacking. Recently, large language models (LLMs) have shown a re…

    Submitted 14 June, 2024; originally announced June 2024.