Showing 1–50 of 3,952 results for author: Zhang, T

  1. arXiv:2407.09295  [pdf, other]

    cs.CR

    Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study

    Authors: Yulong Yang, Xinshan Yang, Shuaidong Li, Chenhao Lin, Zhengyu Zhao, Chao Shen, Tianwei Zhang

    Abstract: The rapid progress in the reasoning capability of the Multi-modal Large Language Models (MLLMs) has triggered the development of autonomous agent systems on mobile devices. MLLM-based mobile agent systems consist of perception, reasoning, memory, and multi-agent collaboration modules, enabling automatic analysis of user instructions and the design of task pipelines with only natural language and d…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Preprint. Work in progress

  2. arXiv:2407.09164  [pdf, other]

    cs.CR cs.AI

    TAPI: Towards Target-Specific and Adversarial Prompt Injection against Code LLMs

    Authors: Yuchen Yang, Hongwei Yao, Bingrun Yang, Yiling He, Yiming Li, Tianwei Zhang, Zhan Qin, Kui Ren

    Abstract: Recently, code-oriented large language models (Code LLMs) have been widely and successfully used to simplify and facilitate code programming. With these tools, developers can easily generate desired complete functional codes based on incomplete code and natural language prompts. However, a few pioneering works revealed that these Code LLMs are also vulnerable, e.g., against backdoor and adversaria…

    Submitted 12 July, 2024; originally announced July 2024.

  3. arXiv:2407.09053  [pdf, other]

    cs.RO

    Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing

    Authors: Jun Zhu, Zihao Du, Haotian Xu, Fengbo Lan, Zilong Zheng, Bo Ma, Shengjie Wang, Tao Zhang

    Abstract: Task-aware navigation continues to be a challenging area of research, especially in scenarios involving open vocabulary. Previous studies primarily focus on finding suitable locations for task completion, often overlooking the importance of the robot's pose. However, the robot's orientation is crucial for successfully completing tasks because of how objects are arranged (e.g., to open a refrigerat…

    Submitted 12 July, 2024; originally announced July 2024.

  4. arXiv:2407.08982  [pdf, other]

    cond-mat.mtrl-sci physics.comp-ph

    Understanding chiral charge-density wave by frozen chiral phonon

    Authors: Shuai Zhang, Kaifa Luo, Tiantian Zhang

    Abstract: Charge density wave (CDW) is discovered within a wide interval in solids, however, its microscopic nature is still not transparent in most realistic materials, and the recently studied chiral ones with chiral structural distortion remain unclear. In this paper, we try to understand the driving forces of chiral CDW transition by chiral phonons from the electron-phonon coupling scenario. We use the…

    Submitted 12 July, 2024; originally announced July 2024.

  5. arXiv:2407.08970  [pdf, other]

    cs.CR cs.AI cs.LG

    Soft Prompts Go Hard: Steering Visual Language Models with Hidden Meta-Instructions

    Authors: Tingwei Zhang, Collin Zhang, John X. Morris, Eugene Bagdasaryan, Vitaly Shmatikov

    Abstract: We introduce a new type of indirect injection vulnerabilities in language models that operate on images: hidden "meta-instructions" that influence how the model interprets the image and steer the model's outputs to express an adversary-chosen style, sentiment, or point of view. We explain how to create meta-instructions by generating images that act as soft prompts. Unlike jailbreaking attacks a…

    Submitted 11 July, 2024; originally announced July 2024.

  6. arXiv:2407.08659  [pdf, other]

    cs.LG cs.CV

    Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density

    Authors: Shuangqi Li, Chen Liu, Tong Zhang, Hieu Le, Sabine Süsstrunk, Mathieu Salzmann

    Abstract: We introduce an approach to bias deep generative models, such as GANs and diffusion models, towards generating data with either enhanced fidelity or increased diversity. Our approach involves manipulating the distribution of training and generated data through a novel metric for individual samples, named pseudo density, which is based on the nearest-neighbor information from real samples. Our appr…

    Submitted 11 July, 2024; originally announced July 2024.

  7. arXiv:2407.08240  [pdf, other]

    cs.HC cs.AI

    Leveraging LLMs to Predict Affective States via Smartphone Sensor Features

    Authors: Tianyi Zhang, Songyan Teng, Hong Jia, Simon D'Alfonso

    Abstract: As mental health issues for young adults present a pressing public health concern, daily digital mood monitoring for early detection has become an important prospect. An active research area, digital phenotyping, involves collecting and analysing data from personal digital devices such as smartphones (usage and sensors) and wearables to infer behaviours and mental health. Whilst this data is stand…

    Submitted 11 July, 2024; originally announced July 2024.

  8. arXiv:2407.08187  [pdf, other]

    cs.CV

    ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

    Authors: Ruijie Zhu, Chuxin Wang, Ziyang Song, Li Liu, Tianzhu Zhang, Yongdong Zhang

    Abstract: Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, facing challenges in generalizing acros…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures, 13 tables

  9. arXiv:2407.08047  [pdf, other]

    cs.LG cs.AI

    Spatial-Temporal Attention Model for Traffic State Estimation with Sparse Internet of Vehicles

    Authors: Jianzhe Xue, Dongcheng Yuan, Yu Sun, Tianqi Zhang, Wenchao Xu, Haibo Zhou, Xuemin Shen

    Abstract: The growing number of connected vehicles offers an opportunity to leverage internet of vehicles (IoV) data for traffic state estimation (TSE) which plays a crucial role in intelligent transportation systems (ITS). By utilizing only a portion of IoV data instead of the entire dataset, the significant overheads associated with collecting and processing large amounts of data can be avoided. In this p…

    Submitted 10 July, 2024; originally announced July 2024.

  10. arXiv:2407.08019  [pdf, other]

    cs.CV

    Coherent and Multi-modality Image Inpainting via Latent Space Optimization

    Authors: Lingzhi Pan, Tong Zhang, Bingyuan Chen, Qi Zhou, Wei Ke, Sabine Süsstrunk, Mathieu Salzmann

    Abstract: With the advancements in denoising diffusion probabilistic models (DDPMs), image inpainting has significantly evolved from merely filling information based on nearby regions to generating content conditioned on various prompts such as text, exemplar images, and sketches. However, existing methods, such as model fine-tuning and simple concatenation of latent vectors, often result in generation fail…

    Submitted 10 July, 2024; originally announced July 2024.

  11. arXiv:2407.07775  [pdf, other]

    cs.RO cs.AI

    Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs

    Authors: Hao-Tien Lewis Chiang, Zhuo Xu, Zipeng Fu, Mithun George Jacob, Tingnan Zhang, Tsang-Wei Edward Lee, Wenhao Yu, Connor Schenck, David Rendleman, Dhruv Shah, Fei Xia, Jasmine Hsu, Jonathan Hoech, Pete Florence, Sean Kirmani, Sumeet Singh, Vikas Sindhwani, Carolina Parada, Chelsea Finn, Peng Xu, Sergey Levine, Jie Tan

    Abstract: An elusive goal in navigation research is to build an intelligent agent that can understand multimodal instructions including natural language and image, and perform useful navigation. To achieve this, we study a widely useful category of navigation tasks we call Multimodal Instruction Navigation with demonstration Tours (MINT), in which the environment prior is provided through a previously recor…

    Submitted 12 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

  12. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  13. arXiv:2407.07631  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning

    Authors: Dake Zhang, Boxiang Lyu, Shuang Qiu, Mladen Kolar, Tong Zhang

    Abstract: We study risk-sensitive reinforcement learning (RL), a crucial field due to its ability to enhance decision-making in scenarios where it is essential to manage uncertainty and minimize potential adverse outcomes. Particularly, our work focuses on applying the entropic risk measure to RL problems. While existing literature primarily investigates the online setting, there remains a large gap in unde…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  14. arXiv:2407.07589  [pdf]

    cs.RO

    MSC-LIO: An MSCKF-Based LiDAR-Inertial Odometry with Same-Plane-Point Tracking

    Authors: Tisheng Zhang, Man Yuan, Linfu Wei, Hailiang Tang, Xiaoji Niu

    Abstract: The multi-state constraint Kalman filter (MSCKF) has been proven to be more efficient than graph optimization for visual-based odometry while with similar accuracy. However, it has not yet been properly considered and studied for LiDAR-based odometry. In this paper, we propose a novel tightly coupled LiDAR-inertial odometry based on the MSCKF framework, named MSC-LIO. An efficient LiDAR same-plane…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 9 pages

  15. arXiv:2407.06964  [pdf, other]

    cs.CV

    Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach

    Authors: Taolin Zhang, Jiawang Bai, Zhihe Lu, Dongze Lian, Genping Wang, Xinchao Wang, Shu-Tao Xia

    Abstract: Recent works on parameter-efficient transfer learning (PETL) show the potential to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters. However, since they usually insert new structures into the pre-trained model, entire intermediate features of that model are changed and thus need to be stored to be involved in back-propagation, resulting in…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  16. arXiv:2407.06938  [pdf, other]

    cs.CV

    RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

    Authors: Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, Baining Guo

    Abstract: We present RodinHD, which can generate high-fidelity 3D avatars from a portrait image. Existing methods fail to capture intricate details such as hairstyles which we tackle in this paper. We first identify an overlooked problem of catastrophic forgetting that arises when fitting triplanes sequentially on many avatars, caused by the MLP decoder sharing scheme. To overcome this issue, we raise a nov…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024; project page: https://rodinhd.github.io/

  17. arXiv:2407.06494  [pdf, other]

    cs.LG cs.AI

    A Generative Approach to Control Complex Physical Systems

    Authors: Long Wei, Peiyan Hu, Ruiqi Feng, Haodong Feng, Yixuan Du, Tao Zhang, Rui Wang, Yue Wang, Zhi-Ming Ma, Tailin Wu

    Abstract: Controlling the evolution of complex physical systems is a fundamental task across science and engineering. Classical techniques suffer from limited applicability or huge computational costs. On the other hand, recent deep learning and reinforcement learning-based approaches often struggle to optimize long-term control sequences under the constraints of system dynamics. In this work, we introduce…

    Submitted 8 July, 2024; originally announced July 2024.

  18. arXiv:2407.06469  [pdf, other]

    cs.CV cs.GR

    Sketch-Guided Scene Image Generation

    Authors: Tianyu Zhang, Xiaoxuan Xie, Xusheng Du, Haoran Xie

    Abstract: Text-to-image models are showcasing the impressive ability to create high-quality and diverse generative images. Nevertheless, the transition from freehand sketches to complex scene images remains challenging using diffusion models. In this study, we propose a novel sketch-guided scene image generation framework, decomposing the task of scene image generation from sketch inputs into object-l…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 12 pages, 8 figures

  19. arXiv:2407.06431  [pdf, other]

    cs.DS

    A Lossless Deamortization for Dynamic Greedy Set Cover

    Authors: Shay Solomon, Amitai Uzrad, Tianyi Zhang

    Abstract: The dynamic set cover problem has been subject to growing research attention in recent years. In this problem, we are given as input a dynamic universe of at most $n$ elements and a fixed collection of $m$ sets, where each element appears in at most $f$ sets and the cost of each set is in $[1/C, 1]$, and the goal is to efficiently maintain an approximate minimum set cover under element updates. T…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted to FOCS 2024

  20. arXiv:2407.06172  [pdf, other]

    cs.AI cs.CL

    On Speeding Up Language Model Evaluation

    Authors: Jin Peng Zhou, Christian K. Belardi, Ruihan Wu, Travis Zhang, Carla P. Gomes, Wen Sun, Kilian Q. Weinberger

    Abstract: Large language models (LLMs) currently dominate the field of natural language processing (NLP), representing the state-of-the-art across a diverse array of tasks. Developing a model of this nature, from training to inference, requires making numerous decisions which define a combinatorial search problem. For example, selecting the optimal pre-trained LLM, prompt, or hyperparameters to attain the b…

    Submitted 8 July, 2024; originally announced July 2024.

  21. arXiv:2407.06027  [pdf, other]

    cs.CL

    PAS: Data-Efficient Plug-and-Play Prompt Augmentation System

    Authors: Miao Zheng, Hao Liang, Fan Yang, Haoze Sun, Tianpeng Li, Lingchu Xiong, Yan Zhang, Youzhen Wu, Kun Li, Yanjun Shen, Mingan Lin, Tao Zhang, Guosheng Dong, Yujing Qiao, Kun Fang, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou

    Abstract: In recent years, the rise of Large Language Models (LLMs) has spurred a growing demand for plug-and-play AI systems. Among the various AI techniques, prompt engineering stands out as particularly significant. However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficul…

    Submitted 12 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  22. arXiv:2407.05781  [pdf, other]

    cs.LG eess.SY

    Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control

    Authors: Bruce D. Lee, Leonardo F. Toso, Thomas T. Zhang, James Anderson, Nikolai Matni

    Abstract: Representation learning is a powerful tool that enables learning over large multitudes of agents or domains by enforcing that all agents operate on a shared set of learned features. However, many robotics or controls applications that would benefit from collaboration operate in settings with changing environments and goals, whereas most guarantees for representation learning are stated for static…

    Submitted 8 July, 2024; originally announced July 2024.

  23. arXiv:2407.05758  [pdf, other]

    eess.IV cs.AI cs.CV

    Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports

    Authors: Yutong Zhang, Yi Pan, Tianyang Zhong, Peixin Dong, Kangni Xie, Yuxiao Liu, Hanqi Jiang, Zhengliang Liu, Shijie Zhao, Tuo Zhang, Xi Jiang, Dinggang Shen, Tianming Liu, Xin Zhang

    Abstract: Medical images and radiology reports are crucial for diagnosing medical conditions, highlighting the importance of quantitative analysis for clinical decision-making. However, the diversity and cross-source heterogeneity of these data challenge the generalizability of current data-mining methods. Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecti…

    Submitted 8 July, 2024; originally announced July 2024.

  24. arXiv:2407.05413  [pdf, other]

    cs.AI cs.CL cs.LG

    SBoRA: Low-Rank Adaptation with Regional Weight Updates

    Authors: Lai-Man Po, Yuyang Liu, Haoxuan Wu, Tianqi Zhang, Wing-Yin Yu, Zeyu Jiang, Kun Li

    Abstract: This paper introduces Standard Basis LoRA (SBoRA), a novel parameter-efficient fine-tuning approach for Large Language Models that builds upon the pioneering works of Low-Rank Adaptation (LoRA) and Orthogonal Adaptation. SBoRA further reduces the computational and memory requirements of LoRA while enhancing learning performance. By leveraging orthogonal standard basis vectors to initialize one of…

    Submitted 10 July, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

    Comments: 15 pages, 2 figures

  25. arXiv:2407.04476  [pdf, other]

    cs.CV cs.LG

    Rethinking Data Input for Point Cloud Upsampling

    Authors: Tongxu Zhang

    Abstract: In recent years, point cloud upsampling has been widely applied in fields such as 3D reconstruction and surface generation. However, existing point cloud upsampling inputs are all patch based, and there is no research discussing the differences and principles between point cloud model full input and patch based input. In order to compare with patch based point cloud input, this article proposes a…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 16 pages, 6 figures

  26. arXiv:2407.03888  [pdf, other]

    math.OC cs.LG

    Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy

    Authors: Lijun Bo, Yijie Huang, Xiang Yu, Tingting Zhang

    Abstract: This paper studies continuous-time reinforcement learning for controlled jump-diffusion models by featuring the q-function (the continuous-time counterpart of Q-function) and the q-learning algorithms under the Tsallis entropy regularization. Contrary to the conventional Shannon entropy, the general form of Tsallis entropy renders the optimal policy not necessarily a Gibbs measure, where some Lagran…

    Submitted 4 July, 2024; originally announced July 2024.

  27. arXiv:2407.03772  [pdf, other]

    eess.IV cs.CV q-bio.QM

    CS3: Cascade SAM for Sperm Segmentation

    Authors: Yi Shi, Xu-Peng Tian, Yun-Kai Wang, Tie-Yi Zhang, Bin Yao, Hui Wang, Yong Shao, Cen-Cen Wang, Rong Zeng, De-Chuan Zhan

    Abstract: Automated sperm morphology analysis plays a crucial role in the assessment of male fertility, yet its efficacy is often compromised by the challenges in accurately segmenting sperm images. Existing segmentation techniques, including the Segment Anything Model (SAM), are notably inadequate in addressing the complex issue of sperm overlap, a frequent occurrence in clinical samples. Our exploratory stu…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI 2024

  28. arXiv:2407.03558  [pdf, ps, other]

    stat.ME

    Aggregated Sure Independence Screening for Variable Selection with Interaction Structures

    Authors: Tonglin Zhang

    Abstract: A new method called the aggregated sure independence screening is proposed for the computational challenges in variable selection of interactions when the number of explanatory variables is much higher than the number of observations (i.e., $p\gg n$). In this problem, the two main challenges are the strong hierarchical restriction and the number of candidates for the main effects and interactions.…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

    MSC Class: 62J07; 62J05

  29. arXiv:2407.03203  [pdf, other]

    cs.FL cs.AI

    TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

    Authors: Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

    Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance…

    Submitted 3 July, 2024; originally announced July 2024.

  30. arXiv:2407.03188  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation

    Authors: Zihao Wang, Haoxuan Liu, Jiaxing Yu, Tao Zhang, Yan Liu, Kejun Zhang

    Abstract: Amid the rising intersection of generative AI and human artistic processes, this study probes the critical yet less-explored terrain of alignment in human-centric automatic song composition. We propose a novel task of Colloquial Description-to-Song Generation, which focuses on aligning the generated content with colloquial human expressions. This task is aimed at bridging the gap between colloquia…

    Submitted 10 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: 19 pages, 5 figures

    MSC Class: 68Txx (Primary) 14F05; 91Fxx (Secondary)

    ACM Class: I.2.7; J.5

  31. arXiv:2407.02911  [pdf, other]

    eess.IV cs.CV

    Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI

    Authors: Luyi Han, Tao Tan, Tianyu Zhang, Xin Wang, Yuan Gao, Chunyao Lu, Xinglong Liang, Haoran Dou, Yunzhi Huang, Ritse Mann

    Abstract: Adversarial learning helps generative models translate MRI from source to target sequence when lacking paired samples. However, implementing MRI synthesis with adversarial learning in clinical settings is challenging due to training instability and mode collapse. To address this issue, we leverage intermediate sequences to estimate the common latent space among multi-sequence MRI, enabling the rec…

    Submitted 3 July, 2024; originally announced July 2024.

  32. arXiv:2407.02899  [pdf, other]

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…

    Submitted 3 July, 2024; originally announced July 2024.

  33. arXiv:2407.01908  [pdf, other]

    eess.IV cs.CV

    Efficient Stochastic Differential Equation for DEM Super Resolution with Void Filling

    Authors: Tongtong Zhang, Zongcheng Zuo, Yuanxiang Li

    Abstract: Digital Elevation Model (DEM) plays a fundamental role in remote sensing and photogrammetry. Enhancing the quality of DEM is crucial for various applications. Although multiple types of defects may appear simultaneously in the same DEM, they are commonly addressed separately. Most existing approaches only aim to fill the DEM voids, or apply super-resolution to the intact DEM. This paper introduces…

    Submitted 1 July, 2024; originally announced July 2024.

  34. arXiv:2407.00141  [pdf, other]

    cs.LG cs.AI

    Towards Secure and Efficient Data Scheduling for Vehicular Social Networks

    Authors: Youhua Xia, Tiehua Zhang, Jiong Jin, Ying He, Fei Yu

    Abstract: Efficient data transmission scheduling within vehicular environments poses a significant challenge due to the high mobility of such networks. Contemporary research predominantly centers on crafting cooperative scheduling algorithms tailored for vehicular networks. Notwithstanding, the intricacies of orchestrating scheduling in vehicular social networks both effectively and efficiently remain formi…

    Submitted 28 June, 2024; originally announced July 2024.

  35. arXiv:2407.00136  [pdf, other]

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

  36. arXiv:2406.19976  [pdf, other]

    cs.LG math.OC

    ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

    Authors: Rui Pan, Jipeng Zhang, Xingyuan Pan, Renjie Pi, Xiaoyu Wang, Tong Zhang

    Abstract: Bilevel optimization has shown its utility across various machine learning settings, yet most algorithms in practice require second-order information, making it challenging to scale them up. Only recently, a paradigm of first-order algorithms emerged, capable of effectively addressing bilevel optimization problems. Nevertheless, the practical efficiency of this paradigm remains unverified, particu…

    Submitted 28 June, 2024; originally announced June 2024.

  37. arXiv:2406.19791  [pdf, other]

    cs.RO

    Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding

    Authors: Yifan Tang, Cong Tai, Fangxing Chen, Wanting Zhang, Tao Zhang, Xueping Liu, Yongjin Liu, Long Zeng

    Abstract: Most existing robotic datasets capture static scene data and thus are limited in evaluating robots' dynamic performance. To address this, we present a mobile robot oriented large-scale indoor dataset, denoted as THUD (Tsinghua University Dynamic) robotic dataset, for training and evaluating their dynamic scene understanding algorithms. Specifically, the THUD dataset construction is first detailed,…

    Submitted 30 June, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: This version has been accepted by ICRA 2024, and the dataset has been published; the link can be found in the paper

    Journal ref: IEEE International Conference on Robotics & Automation, 2024

  38. arXiv:2406.19711  [pdf, other]

    cs.LG

    CHASE: A Causal Heterogeneous Graph based Framework for Root Cause Analysis in Multimodal Microservice Systems

    Authors: Ziming Zhao, Tiehua Zhang, Zhishu Shen, Hai Dong, Xingjun Ma, Xianhui Liu, Yun Yang

    Abstract: In recent years, the widespread adoption of distributed microservice architectures within the industry has significantly increased the demand for enhanced system availability and robustness. Due to the complex service invocation paths and dependencies at enterprise-level microservice systems, it is challenging to locate the anomalies promptly during service invocations, thus causing intractable is…

    Submitted 28 June, 2024; originally announced June 2024.

  39. arXiv:2406.19708  [pdf, other]

    cs.NE cs.AI cs.CE q-bio.NC

    A Differentiable Approach to Multi-scale Brain Modeling

    Authors: Chaoming Wang, Muyang Lyu, Tianqiu Zhang, Sichao He, Si Wu

    Abstract: We present a multi-scale differentiable brain modeling workflow utilizing BrainPy, a unique differentiable brain simulator that combines accurate brain simulation with powerful gradient-based optimization. We leverage this capability of BrainPy across different brain scales. At the single-neuron level, we implement differentiable neuron models and employ gradient methods to optimize their fit to e…

    Submitted 1 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: 2nd Differentiable Almost Everything Workshop at ICML 2024

  40. arXiv:2406.19389  [pdf, other]

    cs.CV

    OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

    Authors: Tao Zhang, Xiangtai Li, Hao Fei, Haobo Yuan, Shengqiong Wu, Shunping Ji, Chen Change Loy, Shuicheng Yan

    Abstract: Current universal segmentation methods demonstrate strong capabilities in pixel-level image and video understanding. However, they lack reasoning abilities and cannot be controlled via text instructions. In contrast, large vision-language multimodal models exhibit powerful vision-based conversation and reasoning capabilities but lack pixel-level understanding and have difficulty accepting visual p…

    Submitted 27 June, 2024; originally announced June 2024.

  41. arXiv:2406.19369  [pdf, other]

    cs.CV

    Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model

    Authors: Haobo Yuan, Xiangtai Li, Lu Qi, Tao Zhang, Ming-Hsuan Yang, Shuicheng Yan, Chen Change Loy

    Abstract: Transformer-based segmentation methods face the challenge of efficient inference when dealing with high-resolution images. Recently, several linear attention architectures, such as Mamba and RWKV, have attracted much attention as they can process long sequences efficiently. In this work, we focus on designing an efficient segment-anything model by exploring these different architectures. Specifica…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 16 pages; 8 figures

  42. arXiv:2406.19190  [pdf, ps, other]

    hep-ex

    Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  43. arXiv:2406.18485  [pdf, other]

    cs.DC

    LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism

    Authors: Diandian Gu, Peng Sun, Qinghao Hu, Ting Huang, Xun Chen, Yingtong Xiong, Guoteng Wang, Qiaoling Chen, Shangchun Zhao, Jiarui Fang, Yonggang Wen, Tianwei Zhang, Xin Jin, Xuanzhe Liu

    Abstract: Efficiently training LLMs with long sequences is important yet challenged by the massive computation and memory requirements. Sequence parallelism has been proposed to tackle these problems, but existing methods suffer from scalability or efficiency issues. We propose LoongTrain, a novel system to efficiently train LLMs with long sequences at scale. The core of LoongTrain is the 2D-Attention mecha…

    Submitted 26 June, 2024; originally announced June 2024.

  44. arXiv:2406.18183  [pdf, other]

    hep-ex

    Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages, 5 tables, 4 figures

  45. arXiv:2406.18083  [pdf, other]

    hep-ex

    Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0.04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 19 pages, 2 figures

  46. arXiv:2406.17452  [pdf, ps, other]

    hep-ex

    Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H.-R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (649 additional authors not shown)

    Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and…

    Submitted 25 June, 2024; originally announced June 2024.

  47. arXiv:2406.17442  [pdf, other]

    cs.CV

    Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model

    Authors: Zhuoyuan Li, Yubo Ai, Jiahao Lu, ChuXin Wang, Jiacheng Deng, Hanzhi Chang, Yanzhe Liang, Wenfei Yang, Shifeng Zhang, Tianzhu Zhang

    Abstract: Transformers have demonstrated impressive results for 3D point cloud semantic segmentation. However, the quadratic complexity of transformer makes computation cost high, limiting the number of points that can be processed simultaneously and impeding the modeling of long-range dependencies. Drawing inspiration from the great potential of recent state space models (SSM) for long sequence modeling, w…

    Submitted 25 June, 2024; originally announced June 2024.

  48. A unifying separability criterion based on extended correlation tensor

    Authors: Xiaofen Huang, Tinggui Zhang, Naihuan Jing

    Abstract: Entanglement is fundamental inasmuch as it rephrases the quest for the classical-quantum demarcation line, and it also has potentially enormous practical applications in modern information technology. In this work, employing the approach of matrix decomposition, we introduce and formulate a practicable criterion for separability based on the correlation tensor. It is interesting that this cri…

    Submitted 24 June, 2024; originally announced June 2024.

    Journal ref: Quant. Inf. Process.(2024) 23:233 (15pp)

  49. arXiv:2406.16850  [pdf, other]

    cs.CV cs.RO

    From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking

    Authors: Xiaohao Xu, Tianyi Zhang, Sibo Wang, Xiang Li, Yongqi Chen, Ye Li, Bhiksha Raj, Matthew Johnson-Roberson, Xiaonan Huang

    Abstract: Embodied agents require robust navigation systems to operate in unstructured environments, making the robustness of Simultaneous Localization and Mapping (SLAM) models critical to embodied agent autonomy. While real-world datasets are invaluable, simulation-based benchmarks offer a scalable approach for robustness evaluations. However, the creation of a challenging and controllable noisy world wit…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 50 pages. arXiv admin note: substantial text overlap with arXiv:2402.08125

  50. arXiv:2406.16500  [pdf, other]

    cs.NE

    A Dual-Channel Particle Swarm Optimization Algorithm Based on Adaptive Balance Search

    Authors: Zhenxing Zhang, Tianxian Zhang, Xiangliang Xu, Lingjiang Kong, Yi Han, Zicheng Wang

    Abstract: The balance between exploration (Er) and exploitation (Ei) determines the generalization performance of the particle swarm optimization (PSO) algorithm on different problems. Although the insufficient balance caused by global best being located near a local minimum has been widely researched, few scholars have systematically paid attention to two behaviors about personal best position (P) and glob…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.