Showing 1–50 of 3,998 results for author: Zhang, M

  1. arXiv:2407.09036  [pdf, ps, other]

    math.AG

    On the structure of the complement of skeleton

    Authors: Morgan Brown, Jiachang Xu, Muyuan Zhang

    Abstract: We study the higher dimensional geometry of Berkovich spaces using open fiber disks, which are given by open disks in a relative dimension $1$ fibration. Inspired by birational geometry, we conjecture that the Berkovich skeleton is the complement of the union of all open fiber disks, and prove this conjecture for $\mathcal{X}$ admitting a strictly semistable model with semiample canonical class.

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Comments are welcome!

    MSC Class: 14G22; 14E30

  2. arXiv:2407.08555  [pdf, other]

    eess.IV cs.CV

    SLoRD: Structural Low-Rank Descriptors for Shape Consistency in Vertebrae Segmentation

    Authors: Xin You, Yixin Lou, Minghui Zhang, Chuyan Zhang, Jie Yang, Yun Gu

    Abstract: Automatic and precise segmentation of vertebrae from CT images is crucial for various clinical applications. However, due to a lack of explicit and strict constraints, existing methods, especially single-stage methods, still suffer from the challenge of intra-vertebrae segmentation inconsistency, which refers to multiple label predictions inside a singular vertebra. For multi-stage methods, ver…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Under review

  3. arXiv:2407.08454  [pdf, other]

    cs.CL

    Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks

    Authors: Zheng Wang, Boxiao Jin, Zhongzhi Yu, Minjia Zhang

    Abstract: How to efficiently serve Large Language Models (LLMs) has become a pressing issue because of their huge computational cost in their autoregressive generation process. To mitigate computational costs, LLMs often employ the KV Cache technique to improve the generation speed. While improving the computational efficiency, the storage requirements of the KV cache are substantial, particularly in long-c…

    Submitted 11 July, 2024; originally announced July 2024.

  4. Sulphur dioxide in the mid-infrared transmission spectrum of WASP-39b

    Authors: Diana Powell, Adina D. Feinstein, Elspeth K. H. Lee, Michael Zhang, Shang-Min Tsai, Jake Taylor, James Kirk, Taylor Bell, Joanna K. Barstow, Peter Gao, Jacob L. Bean, Jasmina Blecic, Katy L. Chubb, Ian J. M. Crossfield, Sean Jordan, Daniel Kitzmann, Sarah E. Moran, Giuseppe Morello, Julianne I. Moses, Luis Welbanks, Jeehyun Yang, Xi Zhang, Eva-Maria Ahrer, Aaron Bello-Arufe, Jonathan Brande, et al. (48 additional authors not shown)

    Abstract: The recent inference of sulphur dioxide (SO$_2$) in the atmosphere of the hot ($\sim$1100 K), Saturn-mass exoplanet WASP-39b from near-infrared JWST observations suggests that photochemistry is a key process in high temperature exoplanet atmospheres. This is due to the low ($<$1 ppb) abundance of SO$_2$ under thermochemical equilibrium, compared to that produced from the photochemistry of H$_2$O a…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Published in Nature

    Journal ref: Nature 626, 979-983 (2024)

  5. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  6. arXiv:2407.07520  [pdf, other]

    cs.CV

    IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection

    Authors: Mingjin Zhang, Yuchun Wang, Jie Guo, Yunsong Li, Xinbo Gao, Jing Zhang

    Abstract: The recent Segment Anything Model (SAM) is a significant advancement in natural image segmentation, exhibiting potent zero-shot performance suitable for various downstream image segmentation tasks. However, directly utilizing the pretrained SAM for Infrared Small Target Detection (IRSTD) task falls short in achieving satisfying performance due to a notable domain gap between natural and infrared i…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 18 pages, 8 figures, to be published in ECCV2024

  7. arXiv:2407.07501  [pdf]

    cond-mat.supr-con

    Electronic Correlation and Pseudogap-like Behavior of High-Temperature Superconductor La3Ni2O7

    Authors: Yidian Li, Xian Du, Yantao Cao, Cuiying Pei, Mingxin Zhang, Wenxuan Zhao, Kaiyi Zhai, Runzhe Xu, Zhongkai Liu, Zhiwei Li, Jinkui Zhao, Gang Li, Yanpeng Qi, Hanjie Guo, Yulin Chen, Lexian Yang

    Abstract: High-temperature superconductivity (HTSC) remains one of the most challenging and fascinating mysteries in condensed matter physics. Recently, superconductivity with transition temperature exceeding liquid-nitrogen temperature is discovered in La3Ni2O7 at high pressure, which provides a new platform to explore the unconventional HTSC. In this work, using high-resolution angle-resolved photoemissio…

    Submitted 10 July, 2024; originally announced July 2024.

  8. arXiv:2407.07357  [pdf, ps, other]

    cs.LG q-bio.MN

    A deep graph model for the signed interaction prediction in biological network

    Authors: Shuyi Jin, Mengji Zhang, Meijie Wang, Lun Yu

    Abstract: In pharmaceutical research, the strategy of drug repurposing accelerates the development of new therapies while reducing R&D costs. Network pharmacology lays the theoretical groundwork for identifying new drug indications, and deep graph models have become essential for their precision in mapping complex biological networks. Our study introduces an advanced graph model that utilizes graph convolut…

    Submitted 10 July, 2024; originally announced July 2024.

  9. arXiv:2407.07327  [pdf, other]

    cs.AI

    Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram

    Authors: Ming-Liang Zhang, Zhong-Zhi Li, Fei Yin, Liang Lin, Cheng-Lin Liu

    Abstract: Geometry problem solving (GPS) requires capacities of multi-modal understanding, multi-hop reasoning and theorem knowledge application. In this paper, we propose a neural-symbolic model for plane geometry problem solving (PGPS), named PGPSNet-v2, with three key steps: modal fusion, reasoning process and knowledge verification. In modal fusion, we leverage textual clauses to express fine-grained st…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: under review by journal

  10. arXiv:2407.07306  [pdf]

    physics.med-ph eess.SY

    Electrical Impedance Tomography Based Closed-loop Tumor Treating Fields in Dynamic Lung Tumors

    Authors: Minmin Wang, Xu Xie, Yuxi Guo, Liying Zhu, Yue Lan, Haitang Yang, Yun Pan, Guangdi Chen, Shaomin Zhang, Maomao Zhang

    Abstract: Tumor Treating Fields (TTFields) is a non-invasive anticancer modality that utilizes alternating electric fields to disrupt cancer cell division and growth. While generally well-tolerated with minimal side effects, traditional TTFields therapy for lung tumors faces challenges due to the influence of respiratory motion. We design a novel closed-loop TTFields strategy for lung tumors by incorporatin…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 7 pages, 5 figures

  11. arXiv:2407.07017  [pdf, other]

    gr-qc hep-th

    Shadows, greybody factors, emission rate, topological charge, and phase transitions for a charged black hole with a Kalb-Ramond field background

    Authors: F. Hosseinifar, A. A. Araújo Filho, M. Y. Zhang, H. Chen, H. Hassanabadi

    Abstract: In this work, we investigate a spherically symmetric charged black hole in the presence of a Kalb--Ramond field background. We calculate the photon sphere and shadow radii and, corroborating our results, we constrain them from observational data from the Event Horizon Telescope (EHT), particularly focusing on the shadow images of Sagittarius $A^{*}$. Additionally, we analyze the greybody factors,…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 7 pages in two column, 10 figures and 2 tables

  12. arXiv:2407.06628  [pdf, other]

    cs.CV

    Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition

    Authors: Mingfang Zhang, Yifei Huang, Ruicong Liu, Yoichi Sato

    Abstract: Compared with visual signals, Inertial Measurement Units (IMUs) placed on human limbs can capture accurate motion signals while being robust to lighting variation and occlusion. While these characteristics are intuitively valuable to help egocentric action recognition, the potential of IMUs remains under-explored. In this work, we present a novel method for action recognition that integrates motio…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  13. arXiv:2407.06191  [pdf, other]

    cs.CV

    Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images

    Authors: Zhangyang Qi, Yunhan Yang, Mengchen Zhang, Long Xing, Xiaoyang Wu, Tong Wu, Dahua Lin, Xihui Liu, Jiaqi Wang, Hengshuang Zhao

    Abstract: Recent advances in 3D AIGC have shown promise in directly creating 3D objects from text and images, offering significant cost savings in animation and product design. However, detailed edit and customization of 3D assets remains a long-standing challenge. Specifically, 3D Generation methods lack the ability to follow finely detailed instructions as precisely as their 2D image creation counterparts…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Project Page: https://tailor3d-2024.github.io/

  14. arXiv:2407.06188  [pdf, other]

    cs.CV

    CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation

    Authors: Xinying Guo, Mingyuan Zhang, Haozhe Xie, Chenyang Gu, Ziwei Liu

    Abstract: Crowd Motion Generation is essential in entertainment industries such as animation and games as well as in strategic fields like urban simulation and planning. This new task requires an intricate integration of control and generation to realistically synthesize crowd dynamics under specific spatial and semantic constraints, whose challenges are yet to be fully explored. On the one hand, existing h…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Project page: https://gxyes.github.io/projects/CrowdMoGen.html

  15. arXiv:2407.06163  [pdf, other]

    astro-ph.EP

    Hydrogen sulfide and metal-enriched atmosphere for a Jupiter-mass exoplanet

    Authors: Guangwei Fu, Luis Welbanks, Drake Deming, Julie Inglis, Michael Zhang, Joshua Lothringer, Jegug Ih, Julianne I. Moses, Everett Schlawin, Heather A. Knutson, Gregory Henry, Thomas Greene, David K. Sing, Arjun B. Savel, Eliza M. -R. Kempton, Dana R. Louie, Michael Line, Matt Nixon

    Abstract: As the closest transiting hot Jupiter to Earth, HD 189733b has been the benchmark planet for atmospheric characterization. It has also been the anchor point for much of our theoretical understanding of exoplanet atmospheres from composition, chemistry, aerosols to atmospheric dynamics, escape, and modeling techniques. Prior studies of HD 189733b have detected carbon and oxygen-bearing molecules H2…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Published online in Nature on July 8th, 2024

  16. arXiv:2407.06153  [pdf, other]

    cs.SE cs.CL

    What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

    Authors: Shihan Dou, Haoxiang Jia, Shenxi Wu, Huiyuan Zheng, Weikang Zhou, Muling Wu, Mingxu Chai, Jessica Fan, Caishuang Huang, Yunbo Tao, Yan Liu, Enyu Zhou, Ming Zhang, Yuhao Zhou, Yueming Wu, Rui Zheng, Ming Wen, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Xipeng Qiu, Qi Zhang, Xuanjing Huang

    Abstract: The increasing development of large language models (LLMs) in code generation has drawn significant attention among researchers. To enhance LLM-based code generation ability, current efforts are predominantly directed towards collecting high-quality datasets and leveraging diverse training technologies. However, there is a notable lack of comprehensive studies examining the limitations and boundar…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 17 pages, 7 figures

  17. arXiv:2407.06136  [pdf, other]

    cs.CV

    Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning

    Authors: Xiaojie Li, Yibo Yang, Jianlong Wu, Bernard Ghanem, Liqiang Nie, Min Zhang

    Abstract: Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples while preserving the knowledge of previously learned classes. Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially, prone to overfitting to the current session. Existing dynamic strateg…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Code: https://github.com/xiaojieli0903/Mamba-FSCIL

  18. arXiv:2407.06048  [pdf, other]

    cs.CL cs.CV

    Vision-Braille: An End-to-End Tool for Chinese Braille Image-to-Text Translation

    Authors: Alan Wu, Ye Yuan, Ming Zhang

    Abstract: Visually impaired people are a large group who can only use braille for reading and writing. However, the lack of special educational resources is the bottleneck for educating them. Educational equity is a reflection of the level of social civilization, cultural equality, and individual dignity. Facilitating and improving lifelong learning channels for the visually impaired is of great significanc…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: This paper is submitted to NeurIPS 2024 High School Project Track

  19. arXiv:2407.05510  [pdf, other]

    cs.AR cs.ET cs.LG

    SCATTER: Algorithm-Circuit Co-Sparse Photonic Accelerator with Thermal-Tolerant, Power-Efficient In-situ Light Redistribution

    Authors: Ziang Yin, Nicholas Gangi, Meng Zhang, Jeff Zhang, Rena Huang, Jiaqi Gu

    Abstract: Photonic computing has emerged as a promising solution for accelerating computation-intensive artificial intelligence (AI) workloads. However, limited reconfigurability, high electrical-optical conversion cost, and thermal sensitivity limit the deployment of current optical analog computing engines to support power-restricted, performance-sensitive AI workloads at scale. Sparsity provides a great…

    Submitted 7 July, 2024; originally announced July 2024.

  20. arXiv:2407.05310  [pdf, other]

    eess.SP cs.NE cs.SD eess.AS

    Ternary Spike-based Neuromorphic Signal Processing System

    Authors: Shuai Wang, Dehao Zhang, Ammar Belatreche, Yichen Xiao, Hongyu Qing, Wenjie We, Malu Zhang, Yang Yang

    Abstract: Deep Neural Networks (DNNs) have been successfully implemented across various signal processing fields, resulting in significant enhancements in performance. However, DNNs generally require substantial computational resources, leading to significant economic costs and posing challenges for their deployment on resource-constrained edge devices. In this study, we take advantage of spiking neural net…

    Submitted 7 July, 2024; originally announced July 2024.

  21. arXiv:2407.05282  [pdf, other]

    cs.CV

    UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

    Authors: Haozhe Zhao, Xiaojian Ma, Liang Chen, Shuzheng Si, Rujie Wu, Kaikai An, Peiyu Yu, Minjia Zhang, Qing Li, Baobao Chang

    Abstract: This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples. UltraEdit offers several distinct a…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 32 pages, 14 figures

  22. arXiv:2407.05047  [pdf, other]

    cs.AI

    MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning

    Authors: Min Zhang, Jianye Hao, Xian Fu, Peilong Han, Hao Zhang, Lei Shi, Hongyao Tang, Yan Zheng

    Abstract: In recent years, Multi-modal Foundation Models (MFMs) and Embodied Artificial Intelligence (EAI) have been advancing side by side at an unprecedented pace. The integration of the two has garnered significant attention from the AI research community. In this work, we attempt to provide an in-depth and comprehensive evaluation of the performance of MFMs on embodied task planning, aiming to shed lig…

    Submitted 6 July, 2024; originally announced July 2024.

  23. arXiv:2407.05021  [pdf, other]

    cs.CV

    Incremental Multiview Point Cloud Registration

    Authors: Xiaoya Cheng, Yu Liu, Maojun Zhang, Shen Yan

    Abstract: In this paper, we present a novel approach for multiview point cloud registration. Different from previous researches that typically employ a global scheme for multiview registration, we propose to adopt an incremental pipeline to progressively align scans into a canonical coordinate system. Specifically, drawing inspiration from image-based 3D reconstruction, our approach first builds a sparse sc…

    Submitted 6 July, 2024; originally announced July 2024.

  24. arXiv:2407.04248  [pdf, other]

    stat.ML cs.LG

    Machine Learning for Complex Systems with Abnormal Pattern by Exception Maximization Outlier Detection Method

    Authors: Zhikun Zhang, Yiting Duan, Xiangjun Wang, Mingyuan Zhang

    Abstract: This paper proposes a novel fast online methodology for outlier detection called the exception maximization outlier detection method (EMODM), which employs probabilistic models and statistical algorithms to detect abnormal patterns from the outputs of complex systems. The EMODM is based on a two-state Gaussian mixture model and demonstrates strong performance in probability anomaly detection workin…

    Submitted 5 July, 2024; originally announced July 2024.

  25. arXiv:2407.03884  [pdf, other]

    cs.CL cs.AI

    Planning with Large Language Models for Conversational Agents

    Authors: Zhigen Li, Jianxiang Peng, Yanmeng Wang, Tianhao Shen, Minghui Zhang, Linxi Su, Shang Wu, Yihang Wu, Yuqian Wang, Ye Wang, Wei Hu, Jianfeng Li, Shaojun Wang, Jing Xiao, Deyi Xiong

    Abstract: Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). Controllability requires the CAs to follow the standard operating procedures (SOPs), such as verifying identity before activating credit cards. Proactivity requires the CAs to guide the conversation towards the goal during user uncooperation, such as persuasive dialogue. Existing research cannot be un…

    Submitted 4 July, 2024; originally announced July 2024.

  26. arXiv:2407.03515  [pdf, other]

    stat.ML cs.LG

    Feature-Specific Coefficients of Determination in Tree Ensembles

    Authors: Zhongli Jiang, Dabao Zhang, Min Zhang

    Abstract: Tree ensemble methods provide promising predictions with models difficult to interpret. Recent introduction of Shapley values for individualized feature contributions, accompanied with several fast computing algorithms for predicted values, shows intriguing results. However, individualizing coefficients of determination, aka $R^2$, for each feature is challenged by the underlying quadratic losses,…

    Submitted 3 July, 2024; originally announced July 2024.

  27. arXiv:2407.03125  [pdf, other]

    cs.LG cs.AI

    Foundations and Frontiers of Graph Learning Theory

    Authors: Yu Huang, Min Zhou, Menglin Yang, Zhen Wang, Muhan Zhang, Jie Wang, Hong Xie, Hao Wang, Defu Lian, Enhong Chen

    Abstract: Recent advancements in graph learning have revolutionized the way to understand and analyze data with complex structures. Notably, Graph Neural Networks (GNNs), i.e. neural network architectures designed for learning graph representations, have become a popular paradigm. With these models being usually characterized by intuition-driven design or highly intricate components, placing them within the…

    Submitted 7 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: 35 pages, 273 references. GitHub link: https://github.com/minehly/awesome-paper-for-graph-learning-theory

  28. arXiv:2407.02899  [pdf, other]

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…

    Submitted 3 July, 2024; originally announced July 2024.

  29. arXiv:2407.02894  [pdf, other]

    cs.CL cs.AI

    Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation

    Authors: Zhibin Lan, Liqiang Niu, Fandong Meng, Jie Zhou, Min Zhang, Jinsong Su

    Abstract: In-image machine translation (IIMT) aims to translate an image containing texts in source language into an image containing translations in target language. In this regard, conventional cascaded methods suffer from issues such as error propagation, massive parameters, and difficulties in deployment and retaining visual characteristics of the input image. Thus, constructing end-to-end models has be…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL 2024 Findings

  30. arXiv:2407.02614  [pdf, other]

    cs.HC

    AcuVR: Enhancing Acupuncture Training Workflow with Virtual Reality

    Authors: Menghe Zhang, Chen Chen, Matin Yarmand, Anish Rajeshkumar, Nadir Weibel

    Abstract: Acupuncture is a widely adopted medical practice that involves inserting thin needles into specific points on the body to alleviate pain and treat various health conditions. Current learning practices heavily rely on 2D atlases and practice on peers, which are notably less intuitive and pose risks, particularly in sensitive areas such as the eyes. To address these challenges, we introduce AcuVR, a…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 10 pages

    ACM Class: J.3; J.4; H.5

  31. arXiv:2407.02229  [pdf, other]

    cs.CV

    LaMoD: Latent Motion Diffusion Model For Myocardial Strain Generation

    Authors: Jiarui Xing, Nivetha Jayakumar, Nian Wu, Yu Wang, Frederick H. Epstein, Miaomiao Zhang

    Abstract: Motion and deformation analysis of cardiac magnetic resonance (CMR) imaging videos is crucial for assessing myocardial strain of patients with abnormal heart functions. Recent advances in deep learning-based image registration algorithms have shown promising results in predicting motion fields from routinely acquired CMR sequences. However, their accuracy often diminishes in regions with subtle ap…

    Submitted 2 July, 2024; originally announced July 2024.

  32. arXiv:2407.02043  [pdf, other]

    cs.CL

    Concise and Precise Context Compression for Tool-Using Language Models

    Authors: Yang Xu, Yunlong Feng, Honglin Mu, Yutai Hou, Yitong Li, Xinghao Wang, Wanjun Zhong, Zhongyang Li, Dandan Tu, Qingfu Zhu, Min Zhang, Wanxiang Che

    Abstract: Through reading the documentation in the context, tool-using language models can dynamically extend their capability using external tools. The cost is that we have to input lengthy documentation every time the model needs to use the tool, occupying the input window as well as slowing down the decoding process. Given the progress in general-purpose compression, soft context compression is a suita…

    Submitted 2 July, 2024; originally announced July 2024.

  33. arXiv:2407.01284  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.SC

    We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?

    Authors: Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Chong Sun, Xiaoshuai Song, Zhuoma GongQue, Shanglin Lei, Zhe Wei, Miaoxuan Zhang, Runfeng Qiao, Yifan Zhang, Xiao Zong, Yida Xu, Muxi Diao, Zhimin Bao, Chen Li, Honggang Zhang

    Abstract: Visual mathematical reasoning, as a fundamental visual reasoning ability, has received widespread attention from the Large Multimodal Models (LMMs) community. Existing benchmarks, such as MathVista and MathVerse, focus more on the result-oriented performance but neglect the underlying principles in knowledge acquisition and generalization. Inspired by human-like mathematical reasoning, we introduc…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Work in progress

  34. arXiv:2407.01274  [pdf]

    cs.CY cs.CL

    Leveraging Large Language Models for Actionable Course Evaluation Student Feedback to Lecturers

    Authors: Mike Zhang, Euan D Lindsay, Frederik Bode Thorbensen, Danny Bøgsted Poulsen, Johannes Bjerva

    Abstract: End of semester student evaluations of teaching are the dominant mechanism for providing feedback to academics on their teaching practice. For large classes, however, the volume of feedback makes these tools impractical for this purpose. This paper explores the use of open-source generative AI to synthesise factual, actionable and appropriate summaries of student feedback from these survey respons…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted to SEFI 2024

  35. arXiv:2407.01213  [pdf, other]

    cs.SI

    EMIF: Evidence-aware Multi-source Information Fusion Network for Explainable Fake News Detection

    Authors: Qingxing Dong, Mengyi Zhang, Shiyuan Wu, Xiaozhen Wu

    Abstract: Extensive research on automatic fake news detection has been conducted due to the significant detrimental effects of fake news proliferation. Most existing approaches rely on a single source of evidence, such as comments or relevant news, to derive explanatory evidence for decision-making, demonstrating exceptional performance. However, their single evidence source suffers from two critical drawba…

    Submitted 1 July, 2024; originally announced July 2024.

  36. arXiv:2407.01035  [pdf]

    physics.plasm-ph

    Off-site production of plasma-activated water for efficient sterilization: the crucial role of high-valence NOx and new chemical pathways

    Authors: Zifeng Wang, Xiangyu Wang, Shenghang Xu, Renwu Zhou, Mingyan Zhang, Wanchun Li, Zizhu Zhang, Luge Wang, Jinkun Chen, Jishen Zhang, Li Guo, Dandan Pei, Dingxin Liu, Mingzhe Rong

    Abstract: Efficient sterilization of pathogens with cleaner methods is a critical concern for environmental disinfection and clinical anti-infective treatment. Plasma-activated water (PAW) is a promising alternative to chemical disinfectants and antibiotics for its strong sterilization ability and not inducing any acute toxicity, and only water and air are consumed during production. For more efficient wate…

    Submitted 1 July, 2024; originally announced July 2024.

  37. arXiv:2407.00914  [pdf, ps, other]

    math.DS

    Multifractal analysis of the convergence exponents for the digits in $d$-decaying Gauss like dynamical systems

    Authors: Kunkun Song, Mengjie Zhang

    Abstract: Let $\{a_n(x)\}_{n\geq1}$ be the sequence of digits of $x\in(0,1)$ in infinite iterated function systems with polynomial decay of the derivative. We first study the multifractal spectrum of the convergence exponent defined by the sequence of the digits $\{a_n(x)\}_{n\geq1}$ and the weighted products of distinct digits with finite numbers respectively, and then calculate the Hausdorff dimensions of…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 17 pages

    MSC Class: 11K55; 28A80

  38. arXiv:2407.00882  [pdf, other]

    stat.ME

    Subgroup Identification with Latent Factor Structure

    Authors: Yong He, Dong Liu, Fuxin Wang, Mingjuan Zhang, Wen-Xin Zhou

    Abstract: Subgroup analysis has attracted growing attention due to its ability to identify meaningful subgroups from a heterogeneous population and thereby improving predictive power. However, in many scenarios such as social science and biology, the covariates are possibly highly correlated due to the existence of common factors, which brings great challenges for group identification and is neglected in th…

    Submitted 30 June, 2024; originally announced July 2024.

  39. arXiv:2407.00468  [pdf, other]

    cs.CV cs.AI cs.CL

    MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation

    Authors: Jinsheng Huang, Liang Chen, Taian Guo, Fu Zeng, Yusheng Zhao, Bohan Wu, Ye Yuan, Haozhe Zhao, Zhihui Guo, Yichi Zhang, Jingyang Yuan, Wei Ju, Luchen Liu, Tianyu Liu, Baobao Chang, Ming Zhang

    Abstract: Large Multimodal Models (LMMs) exhibit impressive cross-modal understanding and reasoning abilities, often assessed through multiple-choice questions (MCQs) that include an image, a question, and several options. However, many benchmarks used for such evaluations suffer from systematic biases. Remarkably, Large Language Models (LLMs) without any visual perception capabilities achieve non-trivial p…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 21 pages, code released at https://github.com/chenllliang/MMEvalPro, Homepage at https://mmevalpro.github.io/

  40. arXiv:2407.00382  [pdf, other]

    math.NA cs.AI cs.CE cs.LG

    Towards Universal Mesh Movement Networks

    Authors: Mingrui Zhang, Chunyang Wang, Stephan Kramer, Joseph G. Wallwork, Siyi Li, Jiancheng Liu, Xiang Chen, Matthew D. Piggott

    Abstract: Solving complex Partial Differential Equations (PDEs) accurately and efficiently is an essential and challenging problem in all scientific and engineering disciplines. Mesh movement methods provide the capability to improve the accuracy of the numerical solution without increasing the overall mesh degree of freedom count. Conventional sophisticated mesh movement methods are extremely expensive and…

    Submitted 1 July, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

  41. arXiv:2407.00136  [pdf, other]

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

  42. arXiv:2407.00079  [pdf, other]

    cs.DC cs.AI cs.AR

    Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

    Authors: Ruoyu Qin, Zheming Li, Weiran He, Mingxing Zhang, Yongwei Wu, Weimin Zheng, Xinran Xu

    Abstract: Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. It features a KVCache-centric disaggregated architecture that separates the prefill and decoding clusters. It also leverages the underutilized CPU, DRAM, and SSD resources of the GPU cluster to implement a disaggregated cache of KVCache. The core of Mooncake is its KVCache-centric scheduler, which balances ma…

    Submitted 9 July, 2024; v1 submitted 23 June, 2024; originally announced July 2024.

    Comments: 23 pages, 13 figures

  43. Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment

    Authors: Hao Fei, Shengqiong Wu, Meishan Zhang, Min Zhang, Tat-Seng Chua, Shuicheng Yan

    Abstract: While pre-training large-scale video-language models (VLMs) has shown remarkable potential for various downstream video-language tasks, existing VLMs can still suffer from certain commonly seen limitations, e.g., coarse-grained cross-modal aligning, under-modeling of temporal dynamics, detached video-language view. In this work, we target enhancing VLMs with a fine-grained structural spatio-tempo…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE TPAMI 2024

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024

  44. arXiv:2406.19190  [pdf, ps, other]

    hep-ex

    Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (643 additional authors not shown)

    Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  45. arXiv:2406.18846  [pdf, other]

    cs.CE

    AFBench: A Large-scale Benchmark for Airfoil Design

    Authors: Jian Liu, Jianyu Wu, Hairun Xie, Guoqing Zhang, Jing Wang, Wei Liu, Wanli Ouyang, Junjun Jiang, Xianming Liu, Shixiang Tang, Miao Zhang

    Abstract: Data-driven generative models have emerged as promising approaches towards achieving efficient mechanical inverse design. However, due to prohibitively high cost in time and money, there is still lack of open-source and large-scale benchmarks in this field. It is mainly the case for airfoil inverse design, which requires to generate and edit diverse geometric-qualified and aerodynamic-qualified ai…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Dataset & Benchmark Track

  46. arXiv:2406.18820  [pdf, other]

    cs.DC cs.LG

    Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training

    Authors: Xinyu Lian, Sam Ade Jacobs, Lev Kurilenko, Masahiro Tanaka, Stas Bekman, Olatunji Ruwase, Minjia Zhang

    Abstract: Existing checkpointing approaches seem ill-suited for distributed training even though hardware limitations make model parallelism, i.e., sharding model state across multiple accelerators, a requirement for model scaling. Consolidating distributed model state into a single checkpoint unacceptably slows down training, and is impractical at extreme scales. Distributed checkpoints, in contrast, are t…

    Submitted 27 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  47. arXiv:2406.18414  [pdf, other]

    cs.CV cs.AI

    BiTrack: Bidirectional Offline 3D Multi-Object Tracking Using Camera-LiDAR Data

    Authors: Kemiao Huang, Meiying Zhang, Qi Hao

    Abstract: Compared with real-time multi-object tracking (MOT), offline multi-object tracking (OMOT) has the advantages to perform 2D-3D detection fusion, erroneous link correction, and full track optimization but has to deal with the challenges from bounding box misalignment and track evaluation, editing, and refinement. This paper proposes "BiTrack", a 3D OMOT framework that includes modules of 2D-3D detec…

    Submitted 26 June, 2024; originally announced June 2024.

  48. arXiv:2406.18183  [pdf, other]

    hep-ex

    Measurement of the cross sections of $e^+e^-\to K^{-}\bar{Ξ}^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\bar{Ξ}^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages, 5 tables, 4 figures

  49. arXiv:2406.18129  [pdf, other]

    cs.CV cs.LG

    CTS: Sim-to-Real Unsupervised Domain Adaptation on 3D Detection

    Authors: Meiying Zhang, Weiyuan Peng, Guangyao Ding, Chenyang Lei, Chunlin Ji, Qi Hao

    Abstract: Simulation data can be accurately labeled and have been expected to improve the performance of data-driven algorithms, including object detection. However, due to the various domain inconsistencies from simulation to reality (sim-to-real), cross-domain object detection algorithms usually suffer from dramatic performance drops. While numerous unsupervised domain adaptation (UDA) methods have been d…

    Submitted 26 June, 2024; originally announced June 2024.

  50. arXiv:2406.18088  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    LLM-Driven Multimodal Opinion Expression Identification

    Authors: Bonian Jia, Huiyao Chen, Yueheng Sun, Meishan Zhang, Min Zhang

    Abstract: Opinion Expression Identification (OEI) is essential in NLP for applications ranging from voice assistants to depression diagnosis. This study extends OEI to encompass multimodal inputs, underlining the significance of auditory cues in delivering emotional subtleties beyond the capabilities of text. We introduce a novel multimodal OEI (MOEI) task, integrating text and speech to mirror real-world s…

    Submitted 29 June, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures, accepted by Interspeech 2024