Showing 1–50 of 8,396 results for author: Zhang, L

  1. arXiv:2407.09163  [pdf, ps, other]

    math-ph

    Mean eigenvector self-overlap in deformed complex Ginibre ensemble

    Authors: Lu Zhang

    Abstract: Consider a random matrix of size $N$ as an additive deformation of the complex Ginibre ensemble under a deterministic matrix $X_0$ with a finite rank, independent of $N$. We prove that microscopic statistics for the mean diagonal overlap, near the edge point, are characterized by the iterative erfc integrals, which only depend on the geometric multiplicity of a certain eigenvalue of $X_0$. Further w…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.09129  [pdf, other]

    cond-mat.quant-gas nlin.PS

    Rotating dipole and quadrupole quantum droplets in binary Bose-Einstein condensates

    Authors: Dongshuai Liu, Yanxia Gao, Dianyuan Fan, Boris A. Malomed, Lifu Zhang

    Abstract: Quantum droplets (QDs) are self-trapped modes stabilized by the Lee-Huang-Yang correction to the mean-field Hamiltonian of binary atomic Bose-Einstein condensates. The existence and stability of quiescent and rotating dipole-shaped and vortex QDs with vorticity $S=1$ (DQDs and VQDs, respectively) are numerically studied in the framework of the accordingly modified two-component system. The rotatin…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 7 pages, 8 figures; to be published in Physical Review Research

  3. arXiv:2407.08966  [pdf, other]

    cs.CV cs.AI cs.LG

    LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models

    Authors: Yabin Zhang, Wenjie Zhu, Chenhang He, Lei Zhang

    Abstract: Out-of-distribution (OOD) detection is crucial for model reliability, as it identifies samples from unknown classes and reduces errors due to unexpected inputs. Vision-Language Models (VLMs) such as CLIP are emerging as powerful tools for OOD detection by integrating multi-modal information. However, the practical application of such systems is challenged by manual prompt engineering, which demand…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ECCV2024; Codes and Supp. are available at: https://github.com/YBZh/LAPT

  4. arXiv:2407.08353  [pdf]

    cond-mat.mtrl-sci

    One-dimensional flat bands in phosphorene nanoribbons with pentagonal nature

    Authors: Shuo Sun, Jing-Yang You, Zhihao Cai, Jie Su, Tong Yang, Xinnan Peng, Yihe Wang, Daiyu Geng, Jian Gou, Yuli Huang, Sisheng Duan, Lan Chen, Kehui Wu, Andrew T. S. Wee, Yuan Ping Feng, Jia Lin Zhang, Jiong Lu, Baojie Feng, Wei Chen

    Abstract: Materials with topological flat bands can serve as a promising platform to investigate strongly interacting phenomena. However, experimental realization of ideal flat bands is mostly limited to artificial lattices or moiré systems. Here we report a general way to construct one-dimensional (1D) flat bands in phosphorene nanoribbons (PNRs) with pentagonal nature: penta-hexa-PNRs and penta-dodeca-PNR…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 4 figures

  5. arXiv:2407.08208  [pdf, other]

    gr-qc

    Scalarization of Taub-NUT Black Holes in Extended scalar-tensor-Gauss-Bonnet Theory

    Authors: Hai-Shan Liu, Lei Zhang

    Abstract: Recently, the scalarization of Schwarzschild black holes has been extensively studied. In this work, we explore the scalarization of Taub-NUT black holes. The theory we consider is the extended scalar-tensor-Gauss-Bonnet theory, which admits the Ricci-flat Taub-NUT black hole as a solution. An analysis of a probe scalar field is carried out to identify the mass parameter and NUT parameter (m,n) where the hairy bla…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 19 pages, 10 figures

  6. arXiv:2407.08195  [pdf]

    cs.AI cs.CL cs.MA

    A Text-to-Game Engine for UGC-Based Role-Playing Games

    Authors: Lei Zhang, Xuezheng Peng, Shuyi Yang, Feiyang Wang

    Abstract: The shift from professionally generated content (PGC) to user-generated content (UGC) has revolutionized various media formats, from text to video. With the rapid advancements in generative AI, a similar shift is set to transform the game industry, particularly in the realm of role-playing games (RPGs). This paper introduces a new framework for a text-to-game engine that utilizes foundation models…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 11 figures

  7. arXiv:2407.08109  [pdf, other]

    cs.CV cs.AI cs.LG

    Urban Waterlogging Detection: A Challenging Benchmark and Large-Small Model Co-Adapter

    Authors: Suqi Song, Chenxu Zhang, Peng Zhang, Pengkun Li, Fenglong Song, Lei Zhang

    Abstract: Urban waterlogging poses a major risk to public safety and infrastructure. Conventional methods using water-level sensors require high maintenance and can hardly achieve full coverage. Recent advances employ surveillance camera imagery and deep learning for detection, yet these struggle amidst scarce data and adverse environmental conditions. In this paper, we establish a challenging Urban Waterlogging Be…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  8. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  9. arXiv:2407.07626  [pdf, other]

    quant-ph

    Fusion of atomic W-like states in cavity QED systems

    Authors: Cheng-Yun Ding, Wan-Fang Liu, Li-Hua Zhang

    Abstract: It is well known that maximally entangled GHZ states can achieve perfect teleportation and superdense coding, whereas maximally entangled W states cannot. However, it has been demonstrated that there exists a special class of non-maximally entangled W states, called \textit{W-like} states, which can overcome this limitation. Therefore, it is of great significance to prepare such W-like states f…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures, 1 table

  10. arXiv:2407.07614  [pdf, other]

    cs.CV

    MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

    Authors: Wanggui He, Siming Fu, Mushui Liu, Xierui Wang, Wenyi Xiao, Fangxun Shu, Yi Wang, Lei Zhang, Zhelun Yu, Haoyuan Li, Ziwei Huang, LeiLei Gan, Hao Jiang

    Abstract: Auto-regressive models have made significant progress in the realm of language generation, yet they do not perform on par with diffusion models in the domain of image synthesis. In this work, we introduce MARS, a novel framework for T2I generation that incorporates a specially designed Semantic Vision-Language Integration Expert (SemVIE). This innovative component integrates pre-trained LLMs by in…

    Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: 14 pages, 9 figures

  11. arXiv:2407.07132  [pdf, other]

    astro-ph.HE astro-ph.SR nucl-th

    The neutron star mass, distance, and inclination from precision timing of the brilliant millisecond pulsar J0437$-$4715

    Authors: Daniel J. Reardon, Matthew Bailes, Ryan M. Shannon, Chris Flynn, Jacob Askew, N. D. Ramesh Bhat, Zu-Cheng Chen, Małgorzata Curyło, Yi Feng, George B. Hobbs, Agastya Kapur, Matthew Kerr, Xiaojin Liu, Richard N. Manchester, Rami Mandow, Saurav Mishra, Christopher J. Russell, Mohsen Shamohammadi, Lei Zhang, Andrew Zic

    Abstract: The observation of neutron stars enables the otherwise impossible study of fundamental physical processes. Timing of binary radio pulsars is particularly powerful, as it enables precise characterization of their (three-dimensional) positions and orbits. PSR J0437$-$4715 is an important millisecond pulsar for timing array experiments and is also a primary target for the Neutron Star Interior Compos…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 13 pages, 3 figures, accepted for publication in Astrophysical Journal Letters

  12. arXiv:2407.06975  [pdf]

    cond-mat.mtrl-sci

    Optimization of noncollinear magnetic ordering temperature in Y-type hexaferrite by machine learning

    Authors: Yonghong Li, Jing Zhang, Linfeng Jiang, Long Zhang, Yugang Zhang, Xueliang Wu, Yisheng Chai, Xiaoyuan Zhou, Zizhen Zhou

    Abstract: Searching for the optimal doping compositions of the Y-type hexaferrite Ba2Mg2Fe12O22 remains a long-standing challenge for enhanced non-collinear magnetic transition temperature (TNC). Instead of the conventional trial-and-error approach, the composition-property descriptor is established via a data-driven machine learning method named SISSO (sure independence screening and sparsifying operator). Bas…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: accepted by Applied Physics Letters in 2024

  13. arXiv:2407.06807  [pdf, other]

    cs.AI

    A Hybrid Training-time and Run-time Defense Against Adversarial Attacks in Modulation Classification

    Authors: Lu Zhang, Sangarapillai Lambotharan, Gan Zheng, Guisheng Liao, Ambra Demontis, Fabio Roli

    Abstract: Motivated by the superior performance of deep learning in many applications including computer vision and natural language processing, several recent studies have focused on applying deep neural networks for devising future generations of wireless networks. However, several recent works have pointed out that imperceptible and carefully designed adversarial examples (attacks) can significantly deter…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Published in IEEE Wireless Communications Letters, vol. 11, no. 6, pp. 1161-1165, June 2022

  14. arXiv:2407.06796  [pdf, other]

    cs.AI

    Countermeasures Against Adversarial Examples in Radio Signal Classification

    Authors: Lu Zhang, Sangarapillai Lambotharan, Gan Zheng, Basil AsSadhan, Fabio Roli

    Abstract: Deep learning algorithms have been shown to be powerful in many communication network design problems, including that in automatic modulation classification. However, they are vulnerable to carefully crafted attacks called adversarial examples. Hence, the reliance of wireless networks on deep learning algorithms poses a serious threat to the security and operation of wireless networks. In this let…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Published in IEEE Wireless Communications Letters, vol. 10, no. 8, pp. 1830-1834, Aug. 2021

  15. arXiv:2407.06778  [pdf, other]

    cs.CR cs.AI

    A BERT-based Empirical Study of Privacy Policies' Compliance with GDPR

    Authors: Lu Zhang, Nabil Moukafih, Hamad Alamri, Gregory Epiphaniou, Carsten Maple

    Abstract: Since its implementation in May 2018, the General Data Protection Regulation (GDPR) has prompted businesses to revisit and revise their data handling practices to ensure compliance. The privacy policy, which serves as the primary means of informing users about their privacy rights and the data practices of companies, has been significantly updated by numerous businesses post-GDPR implementation. H…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Published in IEEE Conference on Communications and Network Security (CNS), 2023

  16. arXiv:2407.06612  [pdf]

    eess.IV cs.CV cs.LG

    AI-based Automatic Segmentation of Prostate on Multi-modality Images: A Review

    Authors: Rui Jin, Derun Li, Dehui Xiang, Lei Zhang, Hailing Zhou, Fei Shi, Weifang Zhu, Jing Cai, Tao Peng, Xinjian Chen

    Abstract: Prostate cancer represents a major threat to health. Early detection is vital in reducing the mortality rate among prostate cancer patients. One approach involves using multi-modality (CT, MRI, US, etc.) computer-aided diagnosis (CAD) systems for the prostate region. However, prostate segmentation is challenging due to imperfections in the images and the prostate's complex tissue structure. The ad…

    Submitted 9 July, 2024; originally announced July 2024.

  17. arXiv:2407.06507  [pdf]

    cs.LG cs.AI

    Economic span selection of bridge based on deep reinforcement learning

    Authors: Leye Zhang, Xiangxiang Tian, Chengli Zhang, Hongjun Zhang

    Abstract: Deep Q-network algorithm is used to select economic span of bridge. Selection of bridge span has a significant impact on the total cost of bridge, and a reasonable selection of span can reduce engineering cost. Economic span of bridge is theoretically analyzed, and the theoretical solution formula of economic span is deduced. Construction process of bridge simulation environment is described in de…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 7 pages, 6 figures

  18. arXiv:2407.06152  [pdf, other]

    physics.chem-ph cs.AI

    Uni-ELF: A Multi-Level Representation Learning Framework for Electrolyte Formulation Design

    Authors: Boshen Zeng, Sian Chen, Xinxin Liu, Changhong Chen, Bin Deng, Xiaoxu Wang, Zhifeng Gao, Yuzhi Zhang, Weinan E, Linfeng Zhang

    Abstract: Advancements in lithium battery technology heavily rely on the design and engineering of electrolytes. However, current schemes for molecular design and recipe optimization of electrolytes lack an effective computational-experimental closed loop and often fall short in accurately predicting diverse electrolyte formulation properties. In this work, we introduce Uni-ELF, a novel multi-level represen…

    Submitted 8 July, 2024; originally announced July 2024.

  19. arXiv:2407.06053  [pdf, other]

    cond-mat.mtrl-sci cs.LG quant-ph

    Learning local equivariant representations for quantum operators

    Authors: Zhanghao Zhouyin, Zixi Gan, Shishir Kumar Pandey, Linfeng Zhang, Qiangqiang Gu

    Abstract: Predicting quantum operator matrices such as Hamiltonian, overlap, and density matrices in the density functional theory (DFT) framework is crucial for understanding material properties. Current methods often focus on individual operators and struggle with efficiency and scalability for large systems. Here we introduce a novel deep learning model, SLEM (strictly localized equivariant message-passi…

    Submitted 10 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 11 pages, 5 figures and 4 tables

  20. arXiv:2407.05875  [pdf, other]

    cs.CV

    Minutes to Seconds: Speeded-up DDPM-based Image Inpainting with Coarse-to-Fine Sampling

    Authors: Lintao Zhang, Xiangcheng Du, LeoWu TomyEnrique, Yiqun Wang, Yingbin Zheng, Cheng Jin

    Abstract: For image inpainting, the existing Denoising Diffusion Probabilistic Model (DDPM) based method, i.e., RePaint, can produce high-quality images for any inpainting form. It utilizes a pre-trained DDPM as a prior and generates inpainting results by conditioning on the reverse diffusion process, namely the denoising process. However, this process is significantly time-consuming. In this paper, we propose an…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: The code is available at: https://github.com/linghuyuhangyuan/M2S

  21. arXiv:2407.05418  [pdf, other]

    cs.CV cs.AI

    EMBANet: A Flexible Efficient Multi-branch Attention Network

    Authors: Keke Zu, Hu Zhang, Jian Lu, Lei Zhang, Chen Xu

    Abstract: This work presents a novel module, namely multi-branch concat (MBC), to process the input tensor and obtain the multi-scale feature map. The proposed MBC module brings new degrees of freedom (DoF) for the design of attention networks by allowing the type of transformation operators and the number of branches to be flexibly adjusted. Two important transformation operators, multiplex and split, are…

    Submitted 7 July, 2024; originally announced July 2024.

  22. arXiv:2407.05236  [pdf, other]

    astro-ph.HE

    A timing view of the additional high-energy spectral component discovered in the black hole candidate Swift J1727.8-1613

    Authors: Zi-Xu Yang, Liang Zhang, Shuang-Nan Zhang, L. Tao, Shu Zhang, Ruican Ma, Qingcui Bu, Yue Huang, He-Xin Liu, Wei Yu, Guang C. Xiao, Peng-Ju Wang, Hua Feng, Li-Ming Song, Xiang Ma, Mingyu Ge, QingChang Zhao, J. L. Qu

    Abstract: We present an energy-dependent analysis for the type-C quasi-periodic oscillations (QPOs) observed in the black hole X-ray binary Swift J1727.8-1613 using Insight-HXMT observations. We find that the QPO fractional rms at energies above 40 keV is significantly higher than that below 20 keV. This is the first report of a high energy (HE)-rms excess in the rms spectrum of a black hole X-ray binary. I…

    Submitted 6 July, 2024; originally announced July 2024.

  23. arXiv:2407.05131  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.CY

    RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

    Authors: Peng Xia, Kangyu Zhu, Haoran Li, Hongtu Zhu, Yun Li, Gang Li, Linjun Zhang, Huaxiu Yao

    Abstract: The recent emergence of Medical Large Vision Language Models (Med-LVLMs) has enhanced medical diagnosis. However, current Med-LVLMs frequently encounter factual issues, often generating responses that do not align with established medical facts. Retrieval-Augmented Generation (RAG), which utilizes external knowledge, can improve the factual accuracy of these models but introduces two major challen…

    Submitted 6 July, 2024; originally announced July 2024.

  24. arXiv:2407.05106  [pdf, other]

    cs.CV

    DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

    Authors: Qi Wang, Zhou Xu, Yuming Lin, Jingtao Ye, Hongsheng Li, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang

    Abstract: Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based cameras. The distinctive capabilities of event cameras have ignited significant interest in the domain of event-based action recognition, recognizing thei…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  25. arXiv:2407.04967  [pdf, other]

    stat.CO

    posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms

    Authors: Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, Aki Vehtari

    Abstract: The generality and robustness of inference algorithms is critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new general-purpose inference algorithm, whether it involves Monte Carlo sampling or variational approximation, the fundamental problem arises in evaluating its accuracy and efficiency across a range of represe…

    Submitted 6 July, 2024; originally announced July 2024.

  26. arXiv:2407.04963  [pdf, other]

    cs.CV

    Towards Context-Aware Emotion Recognition Debiasing from a Causal Demystification Perspective via De-confounded Training

    Authors: Dingkang Yang, Kun Yang, Haopeng Kuang, Zhaoyu Chen, Yuzheng Wang, Lihua Zhang

    Abstract: Understanding emotions from diverse contexts has received widespread attention in computer vision communities. The core philosophy of Context-Aware Emotion Recognition (CAER) is to provide valuable semantic cues for recognizing the emotions of target persons by leveraging rich contextual information. Current approaches invariably focus on designing sophisticated structures to extract perceptually…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: TPAMI 2024

  27. arXiv:2407.04955  [pdf, other]

    cs.CV

    Asynchronous Multimodal Video Sequence Fusion via Learning Modality-Exclusive and -Agnostic Representations

    Authors: Dingkang Yang, Mingcheng Li, Linhao Qu, Kun Yang, Peng Zhai, Song Wang, Lihua Zhang

    Abstract: Understanding human intentions (e.g., emotions) from videos has received considerable attention recently. Video streams generally constitute a blend of temporal data stemming from distinct modalities, including natural language, facial expressions, and auditory clues. Despite the impressive advancements of previous works via attention-based paradigms, the inherent temporal asynchrony and modality…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: TCSVT 2024

  28. arXiv:2407.04923  [pdf, other]

    cs.CV cs.CL

    OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding

    Authors: Tiancheng Zhao, Qianqian Zhang, Kyusong Lee, Peng Liu, Lu Zhang, Chunxin Fang, Jiajia Liao, Kelei Jiang, Yibo Ma, Ruochen Xu

    Abstract: We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an ac…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 14 pages

  29. arXiv:2407.04396  [pdf, other]

    cs.CV cs.AI

    Graph-Guided Test-Time Adaptation for Glaucoma Diagnosis using Fundus Photography

    Authors: Qian Zeng, Le Zhang, Yipeng Liu, Ce Zhu, Fan Zhang

    Abstract: Glaucoma is a leading cause of irreversible blindness worldwide. While deep learning approaches using fundus images have largely improved early diagnosis of glaucoma, variations in images from different devices and locations (known as domain shifts) challenge the use of pre-trained models in real-world settings. To address this, we propose a novel Graph-guided Test-Time Adaptation (GTTA) framework…

    Submitted 9 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures, 3 tables, submitted to MICCAI

  30. arXiv:2407.04162  [pdf, other]

    eess.IV cs.CV

    Measurement Embedded Schrödinger Bridge for Inverse Problems

    Authors: Yuang Wang, Pengfei Jin, Siyeop Yoon, Matthew Tivnan, Quanzheng Li, Li Zhang, Dufan Wu

    Abstract: Score-based diffusion models are frequently employed as structural priors in inverse problems. However, their iterative denoising process, initiated from Gaussian noise, often results in slow inference speeds. The Image-to-Image Schrödinger Bridge (I$^2$SB), which begins with the corrupted image, presents a promising alternative as a prior for addressing inverse problems. In this work, we introduc…

    Submitted 22 May, 2024; originally announced July 2024.

    Comments: 14 pages, 2 figures, Neurips preprint

  31. arXiv:2407.03891  [pdf, other]

    cs.SE cs.PL

    AutoBench: Automatic Testbench Generation and Evaluation Using LLMs for HDL Design

    Authors: Ruidi Qiu, Grace Li Zhang, Rolf Drechsler, Ulf Schlichtmann, Bing Li

    Abstract: In digital circuit design, testbenches constitute the cornerstone of simulation-based hardware verification. Traditional methodologies for testbench generation during simulation-based hardware verification still remain partially manual, resulting in inefficiencies in testing various scenarios and requiring expensive time from designers. Large Language Models (LLMs) have demonstrated their potentia…

    Submitted 4 July, 2024; originally announced July 2024.

  32. arXiv:2407.03889  [pdf, other]

    eess.SY

    Automated C/C++ Program Repair for High-Level Synthesis via Large Language Models

    Authors: Kangwei Xu, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Ulf Schlichtmann, Bing Li

    Abstract: In High-Level Synthesis (HLS), converting a regular C/C++ program into its HLS-compatible counterpart (HLS-C) still requires tremendous manual effort. Various program scripts have been introduced to automate this process. But the resulting codes usually contain many issues that should be manually repaired by developers. Since Large Language Models (LLMs) have the ability to automate code generatio…

    Submitted 4 July, 2024; originally announced July 2024.

  33. arXiv:2407.03738  [pdf, other]

    eess.SY cs.LG

    BasisN: Reprogramming-Free RRAM-Based In-Memory-Computing by Basis Combination for Deep Neural Networks

    Authors: Amro Eldebiky, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Ing-Chao Lin, Ulf Schlichtmann, Bing Li

    Abstract: Deep neural networks (DNNs) have made breakthroughs in various fields including image recognition and language processing. DNNs execute hundreds of millions of multiply-and-accumulate (MAC) operations. To efficiently accelerate such computations, analog in-memory-computing platforms have emerged leveraging emerging devices such as resistive RAM (RRAM). However, such accelerators face the hurdle of…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: accepted by ICCAD2024

  34. arXiv:2407.03581  [pdf, ps, other]

    cond-mat.str-el

    Topologically nontrivial $1/3$-magnetization plateau state in a spin-1/2 trimer chain

    Authors: Y. Y. Han, B. C. Yu, Z. Du, L. S. Ling, L. Zhang, W. Tong, C. Y. Xi, J. L. Zhang, T. Shang, Li Pi, Long Ma

    Abstract: Topologically nontrivial Haldane phase is theoretically proposed to be realized in the 1/3-magnetization ($M$) plateau of spin-1/2 trimer systems. However, the spin excitation gap, typical characteristic of Haldane phase, is not yet experimentally verified. Here, we report the nuclear magnetic resonance investigations into the low-energy spin dynamics in the $S=1/2$ spin-trimer antiferromagnetic c…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 6 pages, 4 figures

  35. arXiv:2407.03469  [pdf]

    cs.SE cs.AI

    Scaling Data-Driven Building Energy Modelling using Large Language Models

    Authors: Sunil Khadka, Liang Zhang

    Abstract: Building Management System (BMS) through a data-driven method always faces data and model scalability issues. We propose a methodology to tackle the scalability challenges associated with the development of data-driven models for BMS by using Large Language Models (LLMs). LLMs' code generation adaptability can enable broader adoption of BMS by "automating the automation," particularly the data han…

    Submitted 3 July, 2024; originally announced July 2024.

  36. arXiv:2407.03331  [pdf, other]

    cs.CV cs.AI cs.DC

    Anole: Adapting Diverse Compressed Models For Cross-Scene Prediction On Mobile Devices

    Authors: Yunzhe Li, Hongzi Zhu, Zhuohong Deng, Yunlong Cheng, Liang Zhang, Shan Chang, Minyi Guo

    Abstract: Emerging Artificial Intelligence of Things (AIoT) applications desire online prediction using deep neural network (DNN) models on mobile devices. However, due to the movement of devices, unfamiliar test samples constantly appear, significantly affecting the prediction accuracy of a pre-trained DNN. In addition, unstable network connection calls for local model inference. In this paper, we propose…

    Submitted 9 May, 2024; originally announced July 2024.

  37. arXiv:2407.03008  [pdf, other]

    cs.CV

    Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering

    Authors: Zhaohe Liao, Jiangtong Li, Li Niu, Liqing Zhang

    Abstract: Despite the recent progress made in Video Question-Answering (VideoQA), these methods typically function as black-boxes, making it difficult to understand their reasoning processes and perform consistent compositional reasoning. To address these challenges, we propose a \textit{model-agnostic} Video Alignment and Answer Aggregation (VA$^{3}$) framework, which is capable of enhancing both compositi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 10 pages, CVPR

    Journal ref: CVPR (2024) 13395-13404

  38. arXiv:2407.02899  [pdf, other]

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…

    Submitted 3 July, 2024; originally announced July 2024.

  39. arXiv:2407.02505  [pdf, other]

    cs.CE cs.LG physics.flu-dyn

    A MgNO Method for Multiphase Flow in Porous Media

    Authors: Xinliang Liu, Xia Yang, Chen-Song Zhang, Lian Zhang, Li Zhao

    Abstract: This research investigates the application of Multigrid Neural Operator (MgNO), a neural operator architecture inspired by multigrid methods, in the simulation for multiphase flow within porous media. The architecture is adjusted to manage a variety of crucial factors, such as permeability and porosity heterogeneity. The study extends MgNO to time-dependent porous media flow problems and validate…

    Submitted 16 June, 2024; originally announced July 2024.

  40. arXiv:2407.02392  [pdf, other]

    cs.CV

    TokenPacker: Efficient Visual Projector for Multimodal LLM

    Authors: Wentong Li, Yuqian Yuan, Jian Liu, Dongqi Tang, Song Wang, Jianke Zhu, Lei Zhang

    Abstract: The visual projector serves as an essential bridge between the visual encoder and the Large Language Model (LLM) in a Multimodal LLM (MLLM). Typically, MLLMs adopt a simple MLP to preserve all visual contexts via one-to-one transformation. However, the visual tokens are redundant and can be considerably increased when dealing with high-resolution images, impairing the efficiency of MLLMs significa…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 16 pages, Codes: https://github.com/CircleRadon/TokenPacker

  41. arXiv:2407.02049  [pdf, other]

    eess.AS cs.CL cs.SD

    Accompanied Singing Voice Synthesis with Fully Text-controlled Melody

    Authors: Ruiqi Li, Zhiqing Hong, Yongqi Wang, Lichao Zhang, Rongjie Huang, Siqi Zheng, Zhou Zhao

    Abstract: Text-to-song (TTSong) is a music generation task that synthesizes accompanied singing voices. Current TTSong methods, inherited from singing voice synthesis (SVS), require melody-related information that can sometimes be impractical, such as music scores or MIDI sequences. We present MelodyLM, the first TTSong model that generates high-quality song pieces with fully text-controlled melodies, achie…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Work in progress

  42. arXiv:2407.02040  [pdf, other]

    cs.CV cs.AI cs.MM

    ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

    Authors: Zhiyuan Ma, Yuxiang Wei, Yabin Zhang, Xiangyu Zhu, Zhen Lei, Lei Zhang

    Abstract: By leveraging the text-to-image diffusion priors, score distillation can synthesize 3D contents without paired text-3D training data. Instead of spending hours of online optimization per text prompt, recent studies have been focused on learning a text-to-3D generative network for amortizing multiple text-3D relations, which can synthesize 3D contents in seconds. However, existing score distillatio…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024. Code available at https://github.com/theEricMa/ScaleDreamer

  43. arXiv:2407.02031  [pdf, other]

    cs.DC cs.AI cs.LG

    SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

    Authors: Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

    Abstract: This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generatin…

    Submitted 2 July, 2024; originally announced July 2024.

  44. arXiv:2407.01928  [pdf, other]

    cs.CV

    SymPoint Revolutionized: Boosting Panoptic Symbol Spotting with Layer Feature Enhancement

    Authors: Wenlong Liu, Tianyu Yang, Qizhi Yu, Lei Zhang

    Abstract: SymPoint is an initial attempt that utilizes point set representation to solve the panoptic symbol spotting task on CAD drawings. Despite its considerable success, it overlooks graphical layer information and suffers from prohibitively slow training convergence. To tackle these issues, we introduce SymPoint-V2, a robust and efficient solution featuring novel, streamlined designs that overcome these l…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Code at https://github.com/nicehuster/SymPointV2

  45. arXiv:2407.01780  [pdf]

    cond-mat.mtrl-sci

    Fracture Characteristics of Rare-earth Phosphate under Molten Calcium Magnesium Aluminosilicate Corrosion

    Authors: Subrato Sarkar, Rahul Rahul, Bishnu Pada Majee, Keith Bryce, Lucy Zhang, Liping Huang, Jie Lian, Suvranu De

    Abstract: The fracture characteristics of LuPO4 rare-earth phosphate environmental barrier coating (EBC) material under molten calcium-magnesium aluminosilicate (CMAS) corrosion are quantified. EBCs are crucial for protecting SiC-based ceramic matrix composite components in the hot section of gas turbine engines. Recent research has highlighted the potential of rare-earth phosphates as better EBC materials t…

    Submitted 1 July, 2024; originally announced July 2024.

  46. arXiv:2407.01749  [pdf, other]

    cs.LG cs.AI

    Invariant Correlation of Representation with Label

    Authors: Gaojie Jin, Ronghui Mu, Xinping Yi, Xiaowei Huang, Lijun Zhang

    Abstract: The Invariant Risk Minimization (IRM) approach aims to address the challenge of domain generalization by training a feature representation that remains invariant across multiple environments. However, in noisy environments, IRM-related techniques such as IRMv1 and VREx may be unable to achieve the optimal IRM solution, primarily due to erroneous optimization directions. To address this issue, we i…

    Submitted 1 July, 2024; originally announced July 2024.

  47. arXiv:2407.01731  [pdf, other]

    cs.CV

    Uncertainty Quantification in Table Structure Recognition

    Authors: Kehinde Ajayi, Leizhen Zhang, Yi He, Jian Wu

    Abstract: Quantifying uncertainties for machine learning models is a critical step to reduce human verification effort by detecting predictions with low confidence. This paper proposes a method for uncertainty quantification (UQ) of table structure recognition (TSR). The proposed UQ method is built upon a mixture-of-expert approach termed Test-Time Augmentation (TTA). Our key idea is to enrich and diversify…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 7 figures

  48. arXiv:2407.01636  [pdf, other]

    cs.CV

    Learning Frequency-Aware Dynamic Transformers for All-In-One Image Restoration

    Authors: Zenglin Shi, Tong Su, Pei Liu, Yunpeng Wu, Le Zhang, Meng Wang

    Abstract: This work aims to tackle the all-in-one image restoration task, which seeks to handle multiple types of degradation with a single model. The primary challenge is to extract degradation representations from the input degraded images and use them to guide the model's adaptation to specific degradation types. Recognizing that various degradations affect image content differently across frequency band…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 8 pages

  49. arXiv:2407.01489  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    Agentless: Demystifying LLM-based Software Engineering Agents

    Authors: Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, Lingming Zhang

    Abstract: Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run c…

    Submitted 1 July, 2024; originally announced July 2024.

  50. arXiv:2407.01389  [pdf, other]

    cond-mat.str-el

    Feynman diagrammatics based on discrete pole representations: A path to renormalized perturbation theories

    Authors: Daria Gazizova, Lei Zhang, Emanuel Gull, J. P. F. LeBlanc

    Abstract: By merging algorithmic Matsubara integration with discrete pole representations we present a procedure to generate fully analytic closed form results for impurity problems at fixed perturbation order. To demonstrate the utility of this approach we study the Bethe lattice and evaluate the second order self-energy for which reliable benchmarks exist. We show that, when evaluating diagrams on the Mat…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 10 pages, 8 figures