Showing 1–50 of 1,315 results for author: Liang, X

  1. arXiv:2407.08706  [pdf, other]

    cs.CV

    HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

    Authors: Runhui Huang, Xinpeng Ding, Chunwei Wang, Jianhua Han, Yulong Liu, Hengshuang Zhao, Hang Xu, Lu Hou, Wei Zhang, Xiaodan Liang

    Abstract: High-resolution inputs enable Large Vision-Language Models (LVLMs) to discern finer visual details, enhancing their comprehension capabilities. To reduce the training and computation costs caused by high-resolution input, one promising direction is to use sliding windows to slice the input into uniform patches, each matching the input size of the well-trained vision encoder. Although efficient, th…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2407.08554  [pdf, other]

    cs.AI cs.HC

    Establishing Rigorous and Cost-effective Clinical Trials for Artificial Intelligence Models

    Authors: Wanling Gao, Yunyou Huang, Dandan Cui, Zhuoming Yu, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Gangyuan Zhao, Chongrong Jiang, Fan Huang, Tianyi Wei, Suqin Tang, Bingjie Xia, Zhifei Zhang, Jianfeng Zhan

    Abstract: A profound gap persists between artificial intelligence (AI) and clinical practice in medicine, primarily due to the lack of rigorous and cost-effective evaluation methodologies. State-of-the-art and state-of-the-practice AI model evaluations are limited to laboratory studies on medical datasets or direct clinical trials with no or solely patient-centered controls. Moreover, the crucial role of cl…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 23 pages

  3. arXiv:2407.07844  [pdf, other]

    cs.CV

    OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

    Authors: Hao Wang, Pengzhen Ren, Zequn Jie, Xiao Dong, Chengjian Feng, Yinlong Qian, Lin Ma, Dongmei Jiang, Yaowei Wang, Xiangyuan Lan, Xiaodan Liang

    Abstract: Open-vocabulary detection is a challenging task due to the requirement of detecting objects based on class names, including those not encountered during training. Existing methods have shown strong zero-shot detection capabilities through pre-training on diverse large-scale datasets. However, these approaches still face two primary challenges: (i) how to universally integrate diverse data sources…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Technical Report

  4. arXiv:2407.06937  [pdf, other]

    cs.CV

    HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

    Authors: Guian Fang, Wenbiao Yan, Yuanfan Guo, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang

    Abstract: Text-to-image diffusion models have significantly advanced in conditional image generation. However, these models usually struggle with accurately rendering images featuring humans, resulting in distorted limbs and other anomalies. This issue primarily stems from the insufficient recognition and evaluation of limb qualities in diffusion models. To address this issue, we introduce AbHuman, the firs…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  5. arXiv:2407.05890  [pdf, other]

    cs.RO cs.CL

    Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation

    Authors: Jiaqi Chen, Bingqian Lin, Xinmin Liu, Xiaodan Liang, Kwan-Yee K. Wong

    Abstract: LLM-based agents have demonstrated impressive zero-shot performance in the vision-language navigation (VLN) task. However, these zero-shot methods focus only on solving high-level task planning by selecting nodes in predefined navigation graphs for movements, overlooking low-level control in realistic navigation scenarios. To bridge this gap, we propose AO-Planner, a novel affordances-oriented pla…

    Submitted 8 July, 2024; originally announced July 2024.

  6. arXiv:2407.05578  [pdf, other]

    cs.CV

    FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance

    Authors: Jiedong Zhuang, Jiaqi Hu, Lianrui Mu, Rui Hu, Xiaoyu Liang, Jiangnan Ye, Haoji Hu

    Abstract: CLIP has achieved impressive zero-shot performance after pre-training on a large-scale dataset consisting of paired image-text data. Previous works have utilized CLIP by incorporating manually designed visual prompts like colored circles and blur masks into the images to guide the model's attention, showing enhanced zero-shot performance in downstream tasks. Although these methods have achieved pr…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: accepted by ECCV2024

  7. arXiv:2407.04213  [pdf]

    cs.CR cs.NI

    Pathfinder: Exploring Path Diversity for Assessing Internet Censorship Inconsistency

    Authors: Xiaoqin Liang, Guannan Liu, Lin Jin, Shuai Hao, Haining Wang

    Abstract: Internet censorship is typically enforced by authorities to achieve information control for a certain group of Internet users. So far existing censorship studies have primarily focused on country-level characterization because (1) in many cases, censorship is enabled by governments with nationwide policies and (2) it is usually hard to control how the probing packets are routed to trigger censorsh…

    Submitted 4 July, 2024; originally announced July 2024.

  8. arXiv:2407.03361  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    PianoBART: Symbolic Piano Music Generation and Understanding with Large-Scale Pre-Training

    Authors: Xiao Liang, Zijian Zhao, Weichao Zeng, Yutong He, Fupeng He, Yiyi Wang, Chengying Gao

    Abstract: Learning musical structures and composition patterns is necessary for both music generation and understanding, but current methods do not make uniform use of learned features to generate and comprehend music simultaneously. In this paper, we propose PianoBART, a pre-trained model that uses BART for both symbolic piano music generation and understanding. We devise a multi-level object selection str…

    Submitted 25 June, 2024; originally announced July 2024.

  9. arXiv:2407.03244  [pdf]

    cond-mat.mtrl-sci physics.chem-ph

    Picosecond lifetimes of hydrogen bonds in the halide perovskite CH$_3$NH$_3$PbBr$_3$

    Authors: Alejandro Garrote-Márquez, Lucas Lodeiro, Norge Cruz Hernández, Xia Liang, Aron Walsh, Eduardo Menéndez-Proupin

    Abstract: The structures and properties of organic-inorganic perovskites are influenced by the hydrogen bonding between the organic cations and the inorganic octahedral networks. This study explores the dynamics of hydrogen bonds in CH$_3$NH$_3$PbBr$_3$ across a temperature range from 70 K to 350 K, using molecular dynamics simulations with machine-learning force fields. The results indicate that the lifeti…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 26 pages, 14 figures

  10. arXiv:2407.02911  [pdf, other]

    eess.IV cs.CV

    Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI

    Authors: Luyi Han, Tao Tan, Tianyu Zhang, Xin Wang, Yuan Gao, Chunyao Lu, Xinglong Liang, Haoran Dou, Yunzhi Huang, Ritse Mann

    Abstract: Adversarial learning helps generative models translate MRI from source to target sequence when lacking paired samples. However, implementing MRI synthesis with adversarial learning in clinical settings is challenging due to training instability and mode collapse. To address this issue, we leverage intermediate sequences to estimate the common latent space among multi-sequence MRI, enabling the rec…

    Submitted 3 July, 2024; originally announced July 2024.

  11. arXiv:2407.02902  [pdf, other]

    stat.ME

    Instrumental Variable methods to target Hypothetical Estimands with longitudinal repeated measures data: Application to the STEP 1 trial

    Authors: Jack Bowden, Jesper Madsen, Bryan Goldman, Aske Thorn Iversen, Xiaoran Liang, Stijn Vansteelandt

    Abstract: The STEP 1 randomized trial evaluated the effect of taking semaglutide vs placebo on body weight over a 68 week duration. As with any study evaluating an intervention delivered over a sustained period, non-adherence was observed. This was addressed in the original trial analysis within the Estimand Framework by viewing non-adherence as an intercurrent event. The primary analysis applied a treatmen…

    Submitted 3 July, 2024; originally announced July 2024.

  12. arXiv:2407.01920  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MM

    To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models

    Authors: Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen, Huajun Chen, Ningyu Zhang

    Abstract: Large Language Models (LLMs) trained on extensive corpora inevitably retain sensitive data, such as personal privacy information and copyrighted material. Recent advancements in knowledge unlearning involve updating LLM parameters to erase specific knowledge. However, current unlearning paradigms are mired in vague forgetting boundaries, often erasing knowledge indiscriminately. In this work, we i…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Work in progress

  13. arXiv:2407.00687  [pdf, ps, other]

    cs.LO math.LO

    Field Knowledge as a Dual to Distributed Knowledge: A Characterization by Weighted Modal Logic

    Authors: Xiaolong Liang, Yì N. Wáng

    Abstract: The study of group knowledge concepts such as mutual, common, and distributed knowledge is well established within the discipline of epistemic logic. In this work, we incorporate epistemic abilities of agents to refine the formal definition of distributed knowledge and introduce a formal characterization of field knowledge. We propose that field knowledge serves as a dual to distributed knowledge.…

    Submitted 30 June, 2024; originally announced July 2024.

    Journal ref: Liao et al. (eds.) Fourth International Workshop on Logics for New-Generation Artificial Intelligence (LNGAI 2024), pp. 9–31, College Publications, 24 June 2024

  14. arXiv:2406.20098  [pdf, other]

    cs.CV cs.AI cs.CL

    Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

    Authors: Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen

    Abstract: Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding webpage screenshots and generating their corresponding HTML code. To address this problem, we propose Web2Code, a benchmark consisting of a new large-scale webpage-t…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Website at https://mbzuai-llm.github.io/webpage2code/

  15. arXiv:2406.19735  [pdf, other]

    hep-ph hep-ex

    Pseudoscalar heavy quarkonium production in heavy ion ultraperipheral collision

    Authors: Jun Jiang, Shi-Yuan Li, Xiao Liang, Yan-Rui Liu, Cong-Feng Qiao, Zong-Guo Si, Hao Yang

    Abstract: The inclusive production of pseudoscalar heavy quarkonia ($η_c,~η_b$ and $B_c$) via photon-photon fusion in heavy ion ultraperipheral collision (UPC) is calculated to QCD next-to-leading order in the framework of non-relativistic QCD (NRQCD). The total cross section of $η_c$ produced in Pb-Pb UPC is 194 $\mathrm{nb}^{-1}$ and 1052 $\mathrm{nb}^{-1}$ at nucleon-nucleon c.m. energies…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 16 pages, 1 figure, 4 tables

  16. arXiv:2406.18231  [pdf, ps, other]

    math.DS

    Return time sets and product recurrence

    Authors: Jian Li, Xianjuan Liang, Yini Yang

    Abstract: Let $G$ be a countable infinite discrete group. We show that a subset $F$ of $G$ contains a return time set of some piecewise syndetic recurrent point $x$ in a compact Hausdorff space $X$ with a $G$-action if and only if $F$ is a quasi-central set. As an application, we show that if a nonempty closed subsemigroup $S$ of the Stone-Čech compactification $βG$ contains the smallest ideal $K(βG)$ of…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 22 pages

  17. arXiv:2406.17946  [pdf, other]

    astro-ph.CO astro-ph.HE

    The Glow of Axion Quark Nugget Dark Matter: (II) Galaxy Clusters

    Authors: Julian S. Sommer, Klaus Dolag, Ludwig M. Böss, Ildar Khabibullin, Xunyu Liang, Ludovic Van Waerbeke, Ariel Zhitnitsky, Fereshteh Majidi, Jenny G. Sorce, Benjamin Seidel, Elena Hernández-Martínez

    Abstract: (abridged) We analyze the emission of axion quark nuggets in a large sample of 161 simulated galaxy clusters using the SLOW simulation. These clusters are divided into a sub-sample of 150 galaxy clusters, ordered in five mass bins ranging from $0.8$ to $31.7 \times 10^{14} \,M_\odot$, along with 11 cross-identified galaxy clusters from observations. We investigate dark matter-baryonic matter inter…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 22 pages, 14 figures

  18. arXiv:2406.17807  [pdf, other]

    cs.CL cs.AI

    Enhancing Commentary Strategies for Imperfect Information Card Games: A Study of Large Language Models in Guandan Commentary

    Authors: Meiling Tao, Xuechen Liang, Yiling Tao, Tianyu Shi

    Abstract: Recent advancements in large language models (LLMs) have unlocked the potential for generating high-quality game commentary. However, producing insightful and engaging commentary for complex games with incomplete information remains a significant challenge. In this paper, we introduce a novel commentary method that combines Reinforcement Learning (RL) and LLMs, tailored specifically for the Chinese…

    Submitted 4 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  19. arXiv:2406.17006  [pdf, other]

    hep-ex

    Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1094 additional authors not shown)

    Abstract: The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the nature of the $χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an integrated luminosity of 9 fb$^{-1}$. Using the $B^+\rightarrow χ_{c1}(3872)K^+$ decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-015, CERN-EP-2024-157

  20. arXiv:2406.16694  [pdf, other]

    cs.CL

    Task Oriented In-Domain Data Augmentation

    Authors: Xiao Liang, Xinyu Hu, Simiao Zuo, Yeyun Gong, Qiang Lou, Yi Liu, Shao-Lun Huang, Jian Jiao

    Abstract: Large Language Models (LLMs) have shown superior performance in various applications and fields. To achieve better performance on specialized domains such as law and advertisement, LLMs often undergo continued pre-training on in-domain data. However, existing approaches suffer from two major issues. First, in-domain data are scarce compared with general domain-agnostic data. Second, data used for contin…

    Submitted 24 June, 2024; originally announced June 2024.

  21. A Review of Spatial Network Insights and Methods in the Context of Planning: Applications, Challenges, and Opportunities

    Authors: Xiaofan Liang, Yuhao Kang

    Abstract: With the rise of geospatial big data, new narratives of cities based on spatial networks and flows have replaced the traditional focus on locations. While plenty of research has empirically analyzed network structures, there is no state-of-the-art synthesis of applicable insights and methods of spatial networks in the planning context. In this chapter, we reviewed the theories, concepts, method…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: peer-reviewed book chapter

    Journal ref: Urban Informatics and Future Cities, 71-91 (2021)

  22. Intercity Connectivity and Innovation

    Authors: Xiaofan Liang, César A. Hidalgo, Pierre-Alexandre Balland, Siqi Zheng, Jianghao Wang

    Abstract: Urban outputs, from economy to innovation, are known to grow as a power of a city's population. But, since large cities tend to be central in transportation and communication networks, the effects attributed to city size may be confounded with those of intercity connectivity. Here, we map intercity networks for the world's two largest economies (the United States and China) to explore whether a ci…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: peer-reviewed journal article; An interactive visualization and data are available at: https://github.com/xiaofanliang/intercity_connectivity

    Journal ref: Computers, Environment and Urban Systems, 109, 102092 (2024)

  23. arXiv:2406.14408  [pdf, other]

    cs.AI cs.CL cs.LG

    FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving

    Authors: Xiaohan Lin, Qingxing Cao, Yinya Huang, Haiming Wang, Jianqiao Lu, Zhengying Liu, Linqi Song, Xiaodan Liang

    Abstract: Formal verification (FV) has witnessed growing significance with current emerging program synthesis by the evolving large language models (LLMs). However, current formal verification mainly resorts to symbolic verifiers or hand-crafted rules, resulting in limitations for extensive and flexible verification. On the other hand, formal languages for automated theorem proving, such as Isabelle, as anoth…

    Submitted 20 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  24. arXiv:2406.14180  [pdf, other]

    cs.NE

    RTFormer: Re-parameter TSBN Spiking Transformer

    Authors: Hongzhi Wang, Xiubo Liang, Mengjian Li, Tao Zhang

    Abstract: Spiking Neural Networks (SNNs), renowned for their bio-inspired operational mechanism and energy efficiency, mirror the human brain's neural activity. Yet, SNNs face challenges in balancing energy efficiency with the computational demands of advanced tasks. Our research introduces the RTFormer, a novel architecture that embeds Re-parameterized Temporal Sliding Batch Normalization (TSBN) within…

    Submitted 20 June, 2024; originally announced June 2024.

  25. arXiv:2406.13173  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Biomedical Visual Instruction Tuning with Clinician Preference Alignment

    Authors: Hejie Cui, Lingjun Mao, Xin Liang, Jieyu Zhang, Hui Ren, Quanzheng Li, Xiang Li, Carl Yang

    Abstract: Recent advancements in multimodal foundation models have showcased impressive capabilities in understanding and reasoning with visual and textual information. Adapting these foundation models trained for general usage to specialized domains like biomedicine requires large-scale domain-specific instruction datasets. While existing works have explored curating such datasets automatically, the result…

    Submitted 29 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    MSC Class: 68T50; 68T45; 68T37; 68T05; 68T07; 68T09; ACM Class: I.2.7; I.2.6; I.2.10

  26. arXiv:2406.12122  [pdf, other]

    astro-ph.CO

    The Glow of Axion Quark Nugget Dark Matter: (I) Large Scale Structures

    Authors: Fereshteh Majidi, Xunyu Liang, Ludovic Van Waerbeke, Ariel Zhitnitsky, Michael Sekatchev, Julian S. Sommer, Klaus Dolag, Tiago Castro

    Abstract: Axion quark nuggets (AQNs) are hypothetical objects with a mass greater than a few grams and sub-micrometer size, formed during the quark-hadron transition. Originating from the axion field, they offer a possible resolution of the similarity between visible and dark components of the Universe. These composite objects behave as cold dark matter, interacting with ordinary matter and resulting in per…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 38 pages, 12 figures, submitted to JCAP

  27. arXiv:2406.12111  [pdf, other]

    hep-ex

    Precision measurement of the $Ξ^-_b$ baryon lifetime

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1064 additional authors not shown)

    Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-010.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139

  28. arXiv:2406.10536  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci

    Universal materials model of deep-learning density functional theory Hamiltonian

    Authors: Yuxiang Wang, Yang Li, Zechen Tang, He Li, Zilong Yuan, Honggeng Tao, Nianlong Zou, Ting Bao, Xinghao Liang, Zezhou Chen, Shanghua Xu, Ce Bian, Zhiming Xu, Chong Wang, Chen Si, Wenhui Duan, Yong Xu

    Abstract: Realizing large materials models has emerged as a critical endeavor for materials research in the new era of artificial intelligence, but how to achieve this fantastic and challenging objective remains elusive. Here, we propose a feasible pathway to address this paramount pursuit by developing universal materials models of deep-learning density functional theory Hamiltonian (DeepH), enabling compu…

    Submitted 15 June, 2024; originally announced June 2024.

  29. arXiv:2406.09423  [pdf, other]

    cs.DC cs.GR

    MSz: An Efficient Parallel Algorithm for Correcting Morse-Smale Segmentations in Error-Bounded Lossy Compressors

    Authors: Yuxiao Li, Xin Liang, Bei Wang, Yongfeng Qiu, Lin Yan, Hanqi Guo

    Abstract: This research explores a novel paradigm for preserving topological segmentations in existing error-bounded lossy compressors. Today's lossy compressors rarely consider preserving topologies such as Morse-Smale complexes, and the discrepancies in topology between original and decompressed datasets could potentially result in erroneous interpretations or even incorrect scientific conclusions. In thi…

    Submitted 5 July, 2024; v1 submitted 5 April, 2024; originally announced June 2024.

  30. arXiv:2406.08283  [pdf, other]

    cs.RO eess.SY

    A Hybrid Task-Constrained Motion Planning for Collaborative Robots in Intelligent Remanufacturing

    Authors: Wansong Liu, Chang Liu, Xiao Liang, Minghui Zheng

    Abstract: Industrial manipulators have extensively collaborated with human operators to execute tasks, e.g., disassembly of end-of-use products, in intelligent remanufacturing. Safe task execution requires real-time path planning for the manipulator's end-effector to autonomously avoid human operators. This is even more challenging when the end-effector needs to follow a planned path while avoiding the…

    Submitted 12 June, 2024; originally announced June 2024.

  31. arXiv:2406.07362  [pdf, other]

    cs.HC

    AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database

    Authors: Wanling Gao, Yuan Liu, Zhuoming Yu, Dandan Cui, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Fan Huang, Gangyuan Zhao, Chongrong Jiang, Tianyi Wei, Zhifei Zhang, Yunyou Huang, Jianfeng Zhan

    Abstract: Artificial Intelligence (AI) plays a crucial role in the medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI f…

    Submitted 15 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 12 pages

  32. arXiv:2406.03387  [pdf, other]

    hep-ex

    Measurement of the branching fraction ratios $R(D^{+})$ and $R(D^{*+})$ using muonic $τ$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1063 additional authors not shown)

    Abstract: The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/LHCbProjectPublic/LHCb-PAPER-2024-007.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-007, CERN-EP-2024-125

  33. arXiv:2406.03156  [pdf, other]

    hep-ex

    Observation of new charmonium(-like) states in $B^+ \to D^{*\pm} D^{\mp} K^+$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1062 additional authors not shown)

    Abstract: A study of resonant structures in $B^{+}\rightarrow{D^{\ast+}D^{-}K^{+}}$ and $B^{+}\rightarrow{D^{\ast-}D^{+}K^{+}}$ decays is performed, using proton-proton collision data at centre-of-mass energies of $\sqrt{s}=7, 8$, and $13$ TeV recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. A simultaneous amplitude fit is performed to the two channels with contribu…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-047.html (LHCb public pages)

    Report number: LHCb-PAPER-2023-047, CERN-EP-2024-096

  34. arXiv:2406.02990  [pdf, other]

    cs.CV

    Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification

    Authors: Gexin Huang, Chenfei Wu, Mingjie Li, Xiaojun Chang, Ling Chen, Ying Sun, Shen Zhao, Xiaodan Liang, Liang Lin

    Abstract: Predicting genetic mutations from whole slide images is indispensable for cancer diagnosis. However, existing work training multiple binary classification models faces two challenges: (a) Training multiple binary classifiers is inefficient and would inevitably lead to a class imbalance problem. (b) The biological relationships among genes are overlooked, which limits the prediction performance. To…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 16 pages, 8 figures, and 3 tables

  35. arXiv:2406.02147  [pdf, other]

    cs.CV

    UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking

    Authors: Lijun Zhou, Tao Tang, Pengkun Hao, Zihang He, Kalok Ho, Shuo Gu, Wenbo Hou, Zhihui Hao, Haiyang Sun, Kun Zhan, Peng Jia, Xianpeng Lang, Xiaodan Liang

    Abstract: 3D multiple object tracking (MOT) plays a crucial role in autonomous driving perception. Recent end-to-end query-based trackers simultaneously detect and track objects, which have shown promising potential for the 3D MOT task. However, existing methods overlook the uncertainty issue, which refers to the lack of precise confidence about the state and location of tracked objects. Uncertainty arises…

    Submitted 4 June, 2024; originally announced June 2024.

  36. arXiv:2406.01873  [pdf, other]

    cs.CL cs.CR cs.LG

    CR-UTP: Certified Robustness against Universal Text Perturbations on Large Language Models

    Authors: Qian Lou, Xin Liang, Jiaqi Xue, Yancheng Zhang, Rui Xie, Mengxin Zheng

    Abstract: It is imperative to ensure the stability of every prediction made by a language model; that is, a language model's prediction should remain consistent despite minor input variations, like word substitutions. In this paper, we investigate the problem of certifying a language model's robustness against Universal Text Perturbations (UTPs), which have been widely used in universal adversarial attacks and ba…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL Findings 2024

  37. arXiv:2406.01388  [pdf, other]

    cs.CV

    AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

    Authors: Junhao Cheng, Xi Lu, Hanhui Li, Khun Loun Zai, Baiqiao Yin, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

    Abstract: As cutting-edge Text-to-Image (T2I) generation models already excel at producing remarkable single images, an even more challenging task, i.e., multi-turn interactive image generation, begins to attract the attention of related research communities. This task requires models to interact with users over multiple turns to generate a coherent sequence of images. However, since users may switch subject…

    Submitted 10 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Multi-turn interactive image generation

  38. arXiv:2406.00235  [pdf, other]

    hep-ex

    Amplitude analysis of the radiative decay $B^0_s\to K^+K^-γ$

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1061 additional authors not shown)

    Abstract: A search for radiative decay of $B^0_s$ mesons to orbitally excited $K^+K^-$ states is performed using proton-proton collisions recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. The dikaon spectrum in the mass range $m_{KK}<2400$ MeV/$c^2$ is dominated by the $φ(1020)$ resonance that accounts for alm…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-002.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-002, CERN-EP-2024-115

  39. arXiv:2406.00046  [pdf, other]

    cs.CL cs.LG

    Hate Speech Detection with Generalizable Target-aware Fairness

    Authors: Tong Chen, Danny Wang, Xurong Liang, Marten Risius, Gianluca Demartini, Hongzhi Yin

    Abstract: To counter the side effect brought by the proliferation of social media platforms, hate speech detection (HSD) plays a vital role in halting the dissemination of toxic online posts at an early stage. However, given the ubiquitous topical communities on social media, a trained HSD classifier easily becomes biased towards specific targeted groups (e.g., female and black people), where a high rate of…

    Submitted 11 June, 2024; v1 submitted 28 May, 2024; originally announced June 2024.

    Comments: To appear in KDD 2024

  40. arXiv:2405.19465  [pdf, other]

    cs.CV

    RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter

    Authors: Meng Cao, Haoran Tang, Jinfa Huang, Peng Jin, Can Zhang, Ruyang Liu, Long Chen, Xiaodan Liang, Li Yuan, Ge Li

    Abstract: Text-Video Retrieval (TVR) aims to align relevant video content with natural language queries. To date, most state-of-the-art TVR methods learn image-to-video transfer learning based on large-scale pre-trained vision-language models (e.g., CLIP). However, fully fine-tuning these pre-trained models for TVR incurs prohibitively expensive computation costs. To this end, we propose to conduct efficient…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 Findings

  41. Correctable Landmark Discovery via Large Models for Vision-Language Navigation

    Authors: Bingqian Lin, Yunshuang Nie, Ziming Wei, Yi Zhu, Hang Xu, Shikui Ma, Jianzhuang Liu, Xiaodan Liang

    Abstract: Vision-Language Navigation (VLN) requires the agent to follow language instructions to reach a target position. A key factor for successful navigation is to align the landmarks implied in the instruction with diverse visual observations. However, previous VLN agents fail to perform accurate modality alignment especially in unexplored scenes, since they learn from limited navigation data and lack s…

    Submitted 5 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted by TPAMI 2024

  42. arXiv:2405.18326  [pdf, other]

    cs.CV

    VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers

    Authors: Jun Zheng, Fuwei Zhao, Youjiang Xu, Xin Dong, Xiaodan Liang

    Abstract: Video try-on stands as a promising area for its tremendous real-world potential. Prior works are limited to transferring product clothing images onto person videos with simple poses and backgrounds, while underperforming on casually captured videos. Recently, Sora revealed the scalability of Diffusion Transformer (DiT) in generating lifelike videos featuring real-world scenarios. Inspired by this,…

    Submitted 7 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Project Page: https://zhengjun-ai.github.io/viton-dit-page/

  43. arXiv:2405.17347  [pdf, other]

    hep-ex

    Comprehensive analysis of local and nonlocal amplitudes in the $B^0\rightarrow K^{*0}μ^+μ^-$ decay

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1070 additional authors not shown)

    Abstract: A comprehensive study of the local and nonlocal amplitudes contributing to the decay $B^0\rightarrow K^{*0}(\to K^+π^-) μ^+μ^-$ is performed by analysing the phase-space distribution of the decay products. The analysis is based on proton-proton collision data corresponding to an integrated luminosity of 8.4 fb$^{-1}$ collected by the LHCb experiment. This measurement employs for the first time a m…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-011.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-011, CERN-EP-2024-122

  44. arXiv:2405.17188  [pdf, other]

    cs.CV

    The SkatingVerse Workshop & Challenge: Methods and Results

    Authors: Jian Zhao, Lei Jin, Jianshu Li, Zheng Zhu, Yinglei Teng, Jiaojiao Zhao, Sadaf Gulshad, Zheng Wang, Bo Zhao, Xiangbo Shu, Yunchao Wei, Xuecheng Nie, Xiaojie Jin, Xiaodan Liang, Shin'ichi Satoh, Yandong Guo, Cewu Lu, Junliang Xing, Jane Shen Shengmei

    Abstract: The SkatingVerse Workshop & Challenge aims to encourage research in developing novel and accurate methods for human action understanding. The SkatingVerse dataset used for the SkatingVerse Challenge has been publicly released. There are two subsets in the dataset, i.e., the training subset and testing subset. The training subset consists of 19,993 RGB video sequences, and the testing subset cons…

    Submitted 27 May, 2024; originally announced May 2024.

  45. arXiv:2405.16933  [pdf, other]

    cs.CL cs.IR

    Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning

    Authors: Xun Liang, Simin Niu, Zhiyu li, Sensen Zhang, Shichao Song, Hanyu Wang, Jiawei Yang, Feiyu Xiong, Bo Tang, Chenyang Xi

    Abstract: Retrieval-Augmented Generation (RAG) offers a cost-effective approach to injecting real-time knowledge into large language models (LLMs). Nevertheless, constructing and validating high-quality knowledge repositories require considerable effort. We propose a pre-retrieval framework named Pseudo-Graph Retrieval-Augmented Generation (PG-RAG), which conceptualizes LLMs as students by providing them wi…

    Submitted 27 May, 2024; originally announced May 2024.

  46. arXiv:2405.14414  [pdf, other]

    cs.AI

    Proving Theorems Recursively

    Authors: Haiming Wang, Huajian Xin, Zhengying Liu, Wenda Li, Yinya Huang, Jianqiao Lu, Zhicheng Yang, Jing Tang, Jian Yin, Zhenguo Li, Xiaodan Liang

    Abstract: Recent advances in automated theorem proving leverage language models to explore expanded search spaces by step-by-step proof generation. However, such approaches are usually based on short-sighted heuristics (e.g., log probability or value function scores) that potentially lead to suboptimal or even distracting subgoals, preventing us from finding longer proofs. To address this challenge, we pro…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 21 pages, 5 figures, 3 tables

  47. arXiv:2405.14333  [pdf, other]

    cs.AI

    DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

    Authors: Huajian Xin, Daya Guo, Zhihong Shao, Zhizhou Ren, Qihao Zhu, Bo Liu, Chong Ruan, Wenda Li, Xiaodan Liang

    Abstract: Proof assistants like Lean have revolutionized mathematical proof verification, ensuring high accuracy and reliability. Although large language models (LLMs) show promise in mathematical reasoning, their advancement in formal theorem proving is hindered by a lack of training data. To address this issue, we introduce an approach to generate extensive Lean 4 proof data derived from high-school and u…

    Submitted 23 May, 2024; originally announced May 2024.

  48. arXiv:2405.14029  [pdf, ps, other]

    cs.IT eess.SP

    Analog Beamforming Enabled Multicasting: Finite-Alphabet Inputs and Statistical CSI

    Authors: Yanjun Wu, Zhong Xie, Zhuochen Xie, Chongjun Ouyang, Xuwen Liang

    Abstract: The average multicast rate (AMR) is analyzed in a multicast channel utilizing analog beamforming with finite-alphabet inputs, considering statistical channel state information (CSI). New expressions for the AMR are derived for non-cooperative and cooperative multicasting scenarios. Asymptotic analyses are conducted in the high signal-to-noise ratio regime to derive the array gain and diversity ord…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 5 pages

  49. arXiv:2405.13103  [pdf, other]

    hep-ex

    Search for the lepton-flavor violating decay $B^0_s\toφμ^\pmτ^\mp$

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1062 additional authors not shown)

    Abstract: A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper l…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-006.html (LHCb public pages)

    Report number: LHCb-PAPER-2024-006, CERN-EP-2024-114

  50. arXiv:2405.12688  [pdf, other]

    hep-ex

    Study of $b$-hadron decays to $Λ_c^+ h^- h^{\prime -}$ final states

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1072 additional authors not shown)

    Abstract: Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and…

    Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-013.html

    Report number: CERN-EP-2024-116, LHCb-PAPER-2024-013