Showing 1–50 of 1,177 results for author: Zhou, B

  1. arXiv:2407.08940  [pdf, other]

    cs.CL

    Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation

    Authors: Biqing Qi, Kaiyan Zhang, Kai Tian, Haoxiang Li, Zhang-Ren Chen, Sihang Zeng, Ermo Hua, Hu Jinfang, Bowen Zhou

    Abstract: The rapid growth of biomedical knowledge has outpaced our ability to efficiently extract insights and generate novel hypotheses. Large language models (LLMs) have emerged as a promising tool to revolutionize knowledge interaction and potentially accelerate biomedical discovery. In this paper, we present a comprehensive evaluation of LLMs as biomedical hypothesis generators. We construct a dataset…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to COLM 2024. This is an extended version of the paper at arXiv:2311.05965

  2. arXiv:2407.08725  [pdf, other]

    cs.CV cs.AI cs.RO

    MetaUrban: A Simulation Platform for Embodied AI in Urban Spaces

    Authors: Wayne Wu, Honglin He, Yiran Wang, Chenda Duan, Jack He, Zhizheng Liu, Quanyi Li, Bolei Zhou

    Abstract: Public urban spaces like streetscapes and plazas serve residents and accommodate social life in all its vibrant variations. Recent advances in Robotics and Embodied AI make public urban spaces no longer exclusive to humans. Food delivery bots and electric wheelchairs have started sharing sidewalks with pedestrians, while diverse robot dogs and humanoids have recently emerged in the street. Ensurin…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Technical report. Project page: https://metadriverse.github.io/metaurban/

  3. arXiv:2407.08642  [pdf, other]

    cs.CL

    Towards Building Specialized Generalist AI with System 1 and System 2 Fusion

    Authors: Kaiyan Zhang, Biqing Qi, Bowen Zhou

    Abstract: In this perspective paper, we introduce the concept of Specialized Generalist Artificial Intelligence (SGAI or simply SGI) as a crucial milestone toward Artificial General Intelligence (AGI). Compared to directly scaling general abilities, SGI is defined as AI that specializes in at least one task, surpassing human experts, while also retaining general abilities. This fusion path enables SGI to ra…

    Submitted 11 July, 2024; originally announced July 2024.

  4. arXiv:2407.08033  [pdf, other]

    physics.ins-det

    Studies of Cherenkov Photon Production in PbF$_2$ Crystals using Proton Beams at Fermilab

    Authors: Thomas Anderson, Alberto Belloni, Grace Cummings, Sarah Eno, Nora Fischer, Liang Guan, Yuxiang Guo, Robert Hirosky, James Hirschauer, Yihui Lai, Daniel Levin, Hui-Chi Lin, Mekhala Paranjpe, Jianming Qian, Bing Zhou, Junjie Zhu, Ren-Yuan Zhu

    Abstract: Future lepton colliders such as the FCC-ee, CEPC, ILC, or a muon collider will collect large data samples that allow precision physics studies with unprecedented accuracy, especially when the data is collected by innovative state-of-the-art detectors. An electromagnetic calorimeter based on scintillating crystals, designed to separately record Cherenkov and scintillation light, can achieve precisi…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 10 pages

  5. arXiv:2407.07443  [pdf, other]

    cs.AI

    Secondary Structure-Guided Novel Protein Sequence Generation with Latent Graph Diffusion

    Authors: Yutong Hu, Yang Tan, Andi Han, Lirong Zheng, Liang Hong, Bingxin Zhou

    Abstract: The advent of deep learning has introduced efficient approaches for de novo protein sequence design, significantly improving success rates and reducing development costs compared to computational or experimental methods. However, existing methods face challenges in generating proteins with diverse lengths and shapes while maintaining key structural features. To address these challenges, we introdu…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 10 pages, 4 figures

  6. arXiv:2407.06615  [pdf]

    cond-mat.mtrl-sci

    SG-NNP: Species-separated Gaussian Neural Network Potential with Linear Elemental Scaling and Optimized Dimensions for Multi-component Materials

    Authors: Ji Wei Yoon, Bangjian Zhou, J Senthilnath

    Abstract: Accurate simulations of materials at long-time and large-length scales have increasingly been enabled by Machine-learned Interatomic Potentials (MLIPs). There has been increasing interest in improving the robustness of such models. To this end, we engineer a novel set of Gaussian-type descriptors that scale linearly with the number of atoms, reduce informational degeneracy for multi-component ato…

    Submitted 9 July, 2024; originally announced July 2024.

  7. arXiv:2407.05562  [pdf, other]

    cs.CV

    Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition

    Authors: Bangbang Zhou, Yadong Qu, Zixiao Wang, Zicheng Li, Boqiang Zhang, Hongtao Xie

    Abstract: Recently, scene text recognition (STR) models have shown significant performance improvements. However, existing models still encounter difficulties in recognizing challenging texts that involve factors such as severely distorted and perspective characters. These challenging texts mainly cause two problems: (1) Large Intra-Class Variance. (2) Small Inter-Class Variance. An extremely distorted char…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted to IJCAI2024

  8. arXiv:2407.04969  [pdf, other]

    cs.CL

    EVA-Score: Evaluation of Long-form Summarization on Informativeness through Extraction and Validation

    Authors: Yuchen Fan, Xin Zhong, Chengsi Wang, Gaoche Wu, Bowen Zhou

    Abstract: Summarization is a fundamental task in natural language processing (NLP). Since the arrival of large language models (LLMs) such as GPT-4 and Claude, increasing attention has been paid to long-form summarization, whose input sequences are much longer and therefore contain more information. The current evaluation metrics either use similarity-based metrics like ROUGE and BERTScore which rely on sim…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 16 pages, 3 figures, submitted to EMNLP

  9. arXiv:2407.04900  [pdf, other]

    cs.LG math.OC

    Closing the Gaps: Optimality of Sample Average Approximation for Data-Driven Newsvendor Problems

    Authors: Jiameng Lyu, Shilin Yuan, Bingkun Zhou, Yuan Zhou

    Abstract: We study the regret performance of Sample Average Approximation (SAA) for data-driven newsvendor problems with general convex inventory costs. In literature, the optimality of SAA has not been fully established under both α-global strong convexity and (α,β)-local strong convexity (α-strongly convex within the β-neighborhood of the optimal quantity) conditions. This paper closes the gaps between re…

    Submitted 5 July, 2024; originally announced July 2024.

  10. arXiv:2407.02867  [pdf, other]

    cs.MM cs.CL

    Contrast then Memorize: Semantic Neighbor Retrieval-Enhanced Inductive Multimodal Knowledge Graph Completion

    Authors: Yu Zhao, Ying Zhang, Baohang Zhou, Xinying Qian, Kehui Song, Xiangrui Cai

    Abstract: A large number of studies have emerged for Multimodal Knowledge Graph Completion (MKGC) to predict the missing links in MKGs. However, fewer studies have been proposed to study the inductive MKGC (IMKGC) involving emerging entities unseen during training. Existing inductive approaches focus on learning textual entity representations, which neglect rich semantic information in visual modality. More…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted by SIGIR 2024

  11. arXiv:2407.01222  [pdf, other]

    cs.RO

    Deep Learning Models for Flapping Fin Unmanned Underwater Vehicle Control System Gait Optimization

    Authors: Brian Zhou, Kamal Viswanath, Jason Geder, Alisha Sharma, Julian Lee

    Abstract: The last few decades have led to the rise of research focused on propulsion and control systems for bio-inspired unmanned underwater vehicles (UUVs), which provide more maneuverable alternatives to traditional UUVs in underwater missions. Recent work has explored the use of time-series neural network surrogate models to predict thrust and power from vehicle design and fin kinematics. We develop a…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages, 20 figures. arXiv admin note: text overlap with arXiv:2310.14135

  12. arXiv:2407.00577  [pdf, other]

    cs.RO

    FALCON: Fast Autonomous Aerial Exploration using Coverage Path Guidance

    Authors: Yichen Zhang, Xinyi Chen, Chen Feng, Boyu Zhou, Shaojie Shen

    Abstract: This paper introduces FALCON, a novel Fast Autonomous expLoration framework using COverage path guidaNce, which aims at setting a new performance benchmark in the field of autonomous aerial exploration. Despite recent advancements in the domain, existing exploration planners often suffer from inefficiencies such as frequent revisitations of previously explored regions. FALCON effectively harnesses…

    Submitted 29 June, 2024; originally announced July 2024.

  13. arXiv:2406.19755  [pdf, other]

    q-bio.QM cs.AI

    Protein Representation Learning with Sequence Information Embedding: Does it Always Lead to a Better Performance?

    Authors: Yang Tan, Lirong Zheng, Bozitao Zhong, Liang Hong, Bingxin Zhou

    Abstract: Deep learning has become a crucial tool in studying proteins. While the significance of modeling protein structure has been discussed extensively in the literature, amino acid types are typically included in the input as a default operation for many inference tasks. This study demonstrates, with a structure alignment task, that embedding amino acid types in some cases may not help a deep learning mode…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures

  14. arXiv:2406.19744  [pdf, other]

    q-bio.QM

    ProtSolM: Protein Solubility Prediction with Multi-modal Features

    Authors: Yang Tan, Jia Zheng, Liang Hong, Bingxin Zhou

    Abstract: Understanding protein solubility is essential for their functional applications. Computational methods for predicting protein solubility are crucial for reducing experimental costs and enhancing the efficiency and success rates of protein engineering. Existing methods either construct a supervised learning scheme on small-scale datasets with manually processed physicochemical properties, or blindl…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 10 pages, 7 figures, 9 tables

  15. arXiv:2406.19699  [pdf, other]

    cond-mat.mes-hall

    Tunable corner-like modes in generalized quadrupole topological insulator

    Authors: Rui Chen, Bin Zhou, Dong-Hui Xu

    Abstract: Higher-order topological insulators harbor unique corner modes that hold immense potential for applications in information storage. However, the practical manipulation of these states has been constrained by the fixed positions and energies of conventional corner modes. In this work, we present a theoretical framework for generating topologically protected corner-like modes in higher-order topolog…

    Submitted 28 June, 2024; originally announced June 2024.

  16. arXiv:2406.17820  [pdf, ps, other]

    math.CO

    Spectral conditions implying the existence of doubly chorded cycles without or with constraints

    Authors: Leyou Xu, Bo Zhou

    Abstract: What spectral conditions imply a graph contains a chorded cycle? This question was asked by R.J. Gould in 2022. We answer three modified versions of Gould's question by giving tight spectral conditions that imply the existence of a doubly chorded cycle, a doubly chorded cycle with two chords incident to a vertex, and a doubly chorded cycle on five vertices with two chords incident to one vertex, in…

    Submitted 25 June, 2024; originally announced June 2024.

  17. arXiv:2406.17766  [pdf, other]

    cond-mat.mes-hall

    Generalized anomalous Hall crystals in twisted bilayer-trilayer graphene

    Authors: Ruiheng Su, Dacen Waters, Boran Zhou, Kenji Watanabe, Takashi Taniguchi, Ya-Hui Zhang, Matthew Yankowitz, Joshua Folk

    Abstract: In a dilute two-dimensional electron gas, Coulomb interactions can stabilize the formation of a Wigner crystal. Although Wigner crystals are topologically trivial, it has been predicted that electrons in a partially-filled band can break continuous translational symmetry and time-reversal symmetry spontaneously to form an anomalous Hall crystal (AHC). Here, we report the observation of a generaliz…

    Submitted 25 June, 2024; originally announced June 2024.

  18. arXiv:2406.16928  [pdf, other]

    eess.SP cs.LG

    A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification

    Authors: Wei Huang, Ning Wang, Panpan Feng, Haiyan Wang, Zongmin Wang, Bing Zhou

    Abstract: Electrocardiograms (ECG), which record the electrophysiological activity of the heart, have become a crucial tool for diagnosing these diseases. In recent years, the application of deep learning techniques has significantly improved the performance of ECG signal classification. Multi-resolution feature analysis, which captures and processes information at different time scales, can extract subtle…

    Submitted 12 June, 2024; originally announced June 2024.

  19. arXiv:2406.16803  [pdf, other]

    hep-ph hep-ex

    Discovering neutrino tridents at the Large Hadron Collider

    Authors: Wolfgang Altmannshofer, Toni Mäkelä, Subir Sarkar, Sebastian Trojanowski, Keping Xie, Bei Zhou

    Abstract: Neutrino trident production of di-lepton pairs is well recognized as a sensitive probe of both electroweak physics and physics beyond the Standard Model. Although a rare process, it could be significantly boosted by such new physics, and it also allows the electroweak theory to be tested in a new regime. We demonstrate that the forward neutrino physics program at the Large Hadron Collider offers a…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 19 pages, 11 figures

    Report number: FERMILAB-PUB-24-0294-T, MSUHEP-24-007

  20. arXiv:2406.13228  [pdf, other]

    cs.LG cs.AI cs.CR

    AGSOA:Graph Neural Network Targeted Attack Based on Average Gradient and Structure Optimization

    Authors: Yang Chen, Bin Zhou

    Abstract: Graph Neural Networks (GNNs) are vulnerable to adversarial attacks that cause performance degradation by adding small perturbations to the graph. Gradient-based attacks are among the most commonly used methods and have achieved good performance in many attack scenarios. However, current gradient attacks tend to fall into local optima and have poor attack invisibility. Specifically,…

    Submitted 19 June, 2024; originally announced June 2024.

  21. arXiv:2406.13215  [pdf, other]

    cs.CV cs.AI

    Neural Residual Diffusion Models for Deep Scalable Vision Generation

    Authors: Zhiyuan Ma, Liangliang Zhao, Biqing Qi, Bowen Zhou

    Abstract: The most advanced diffusion models have recently adopted increasingly deep stacked networks (e.g., U-Net or Transformer) to promote the generative emergence capabilities of vision generation models similar to large language models (LLMs). However, progressively deeper stacked networks will intuitively cause numerical propagation errors and reduce noisy prediction capabilities on generative data, w…

    Submitted 19 June, 2024; originally announced June 2024.

  22. arXiv:2406.12643  [pdf, other]

    physics.plasm-ph physics.optics

    Waveshape of THz radiation produced by two-color laser-induced air plasmas

    Authors: Alexandre Stathopulos, Stefan Skupin, Binbin Zhou, Peter U. Jepsen, Luc Bergé

    Abstract: The spatial and spectral distributions of terahertz (THz) pulses emitted by two-color air plasmas are theoretically investigated for focused laser pulses and in the filamentation regime. We derive a so-called "augmented" conical emission model, which, similarly to the one originally proposed by You et al. [Phys. Rev. Lett. 109, 183902 (2012)], involves phase matching between laser harm…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 22 pages, 18 figures

  23. arXiv:2406.12295  [pdf, other]

    cs.CL

    Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding

    Authors: Kaiyan Zhang, Jianyu Wang, Ning Ding, Biqing Qi, Ermo Hua, Xingtai Lv, Bowen Zhou

    Abstract: Large Language Models (LLMs) demonstrate impressive performance in diverse applications, yet they face significant drawbacks, including high inference latency, expensive training cost, and generation of hallucination. Collaborative decoding between large and small language models (SLMs) offers a novel approach to address these challenges. Inspired by dual-process cognitive theory, we integrate the…

    Submitted 18 June, 2024; originally announced June 2024.

  24. arXiv:2406.11914  [pdf, other]

    cs.LG cs.ET eess.SP

    Initial Investigation of Kolmogorov-Arnold Networks (KANs) as Feature Extractors for IMU Based Human Activity Recognition

    Authors: Mengxi Liu, Daniel Geißler, Dominique Nshimyimana, Sizhen Bian, Bo Zhou, Paul Lukowicz

    Abstract: In this work, we explore the use of a novel neural network architecture, the Kolmogorov-Arnold Networks (KANs) as feature extractors for sensor-based (specifically IMU) Human Activity Recognition (HAR). Where conventional networks perform a parameterized weighted sum of the inputs at each node and then feed the result into a statically defined nonlinearity, KANs perform non-linear computations rep…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: This paper is under review

  25. arXiv:2406.11243  [pdf, other]

    cs.CL cs.AI

    FamiCom: Further Demystifying Prompts for Language Models with Task-Agnostic Performance Estimation

    Authors: Bangzheng Li, Ben Zhou, Xingyu Fu, Fei Wang, Dan Roth, Muhao Chen

    Abstract: Language models have shown impressive in-context-learning capabilities, which allow them to benefit from input prompts and perform better on downstream end tasks. Existing works investigate the mechanisms behind this observation, and propose label-agnostic prompt metrics that can better estimate end-task performances. One popular approach is using perplexity as a way to measure models' familiarity…

    Submitted 17 June, 2024; originally announced June 2024.

  26. arXiv:2406.10757  [pdf, other]

    gr-qc astro-ph.CO astro-ph.HE hep-ph

    Searching for cosmological stochastic backgrounds by notching out resolvable compact binary foregrounds with next-generation gravitational-wave detectors

    Authors: Haowen Zhong, Bei Zhou, Luca Reali, Emanuele Berti, Vuk Mandic

    Abstract: Stochastic gravitational-wave backgrounds can be of either cosmological or astrophysical origin. The detection of an astrophysical stochastic gravitational-wave background with ground-based interferometers is expected in the near future. Perhaps even more excitingly, the detection of stochastic backgrounds of cosmological origin by future ground-based interferometers could reveal invaluable inform…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  27. arXiv:2406.09386  [pdf, other]

    cs.CV

    SimGen: Simulator-conditioned Driving Scene Generation

    Authors: Yunsong Zhou, Michael Simon, Zhenghao Peng, Sicheng Mo, Hongzi Zhu, Minyi Guo, Bolei Zhou

    Abstract: Controllable synthetic data generation can substantially lower the annotation cost of training data in autonomous driving research and development. Prior works use diffusion models to generate driving images conditioned on the 3D object layout. However, those models are trained on small-scale datasets like nuScenes, which lack appearance and layout diversity. Moreover, the trained models can only…

    Submitted 13 June, 2024; originally announced June 2024.

  28. arXiv:2406.08887  [pdf, other]

    eess.SP

    Low-Overhead Channel Estimation via 3D Extrapolation for TDD mmWave Massive MIMO Systems Under High-Mobility Scenarios

    Authors: Binggui Zhou, Xi Yang, Shaodan Ma, Feifei Gao, Guanghua Yang

    Abstract: In TDD mmWave massive MIMO systems, the downlink CSI can be attained through uplink channel estimation thanks to the uplink-downlink channel reciprocity. However, the channel aging issue is significant under high-mobility scenarios and thus necessitates frequent uplink channel estimation. In addition, large amounts of antennas and subcarriers lead to high-dimensional CSI matrices, aggravating the…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures, 3 tables. This paper has been submitted to IEEE journal for possible publication

  29. arXiv:2406.08698  [pdf, other]

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: In this work we search for signals generated by ultra-heavy dark matter in Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  30. arXiv:2406.08374  [pdf, other]

    cs.CV cs.AI eess.IV

    2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

    Authors: Tianqi Chen, Jun Hou, Yinchi Zhou, Huidong Xie, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, James S. Duncan, Chi Liu, Bo Zhou

    Abstract: Positron Emission Tomography (PET) is an important clinical imaging tool but inevitably introduces radiation hazards to patients and healthcare providers. Reducing the tracer injection dose and eliminating the CT acquisition for attenuation correction can reduce the overall radiation dose, but often results in PET with high noise and bias. Thus, it is desirable to develop 3D methods to translate t…

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures

  31. arXiv:2406.07540  [pdf, other]

    cs.CV cs.LG

    Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance

    Authors: Kuan Heng Lin, Sicheng Mo, Ben Klingher, Fangzhou Mu, Bolei Zhou

    Abstract: Recent controllable generation approaches such as FreeControl and Diffusion Self-guidance bring fine-grained spatial and appearance control to text-to-image (T2I) diffusion models without training auxiliary modules. However, these methods optimize the latent embedding for each type of score function with longer diffusion steps, making the generation process time-consuming and limiting their flexib…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 18 pages, 11 figures, see project page at https://genforce.github.io/ctrl-x

  32. arXiv:2406.07294  [pdf, other]

    cs.RO cs.CV

    OTO Planner: An Efficient Only Travelling Once Exploration Planner for Complex and Unknown Environments

    Authors: Bo Zhou, Chuanzhao Lu, Yan Pan, Fu Chen

    Abstract: Autonomous exploration in complex and cluttered environments is essential for various applications. However, there are many challenges due to the lack of global heuristic information. Existing exploration methods suffer from repeated paths and considerable computational resource requirements in large-scale environments. To address the above issues, this letter proposes an efficient exploration…

    Submitted 11 June, 2024; originally announced June 2024.

  33. arXiv:2406.05534  [pdf, other]

    cs.AI cs.CL cs.LG

    Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

    Authors: Biqing Qi, Pengfei Li, Fangyuan Li, Junqi Gao, Kaiyan Zhang, Bowen Zhou

    Abstract: Direct Preference Optimization (DPO) improves the alignment of large language models (LLMs) with human values by training directly on human preference datasets, eliminating the need for reward models. However, due to the presence of cross-domain human preferences, direct continual training can lead to catastrophic forgetting, limiting DPO's performance and efficiency. Inspired by intraspecific com…

    Submitted 8 June, 2024; originally announced June 2024.

  34. arXiv:2406.05532  [pdf, other]

    cs.LG cs.AI

    Exploring Adversarial Robustness of Deep State Space Models

    Authors: Biqing Qi, Yang Luo, Junqi Gao, Pengfei Li, Kai Tian, Zhiyuan Ma, Bowen Zhou

    Abstract: Deep State Space Models (SSMs) have proven effective in numerous task scenarios but face significant security challenges due to Adversarial Perturbations (APs) in real-world deployments. Adversarial Training (AT) is a mainstream approach to enhancing Adversarial Robustness (AR) and has been validated on various traditional DNN architectures. However, its effectiveness in improving the AR of SSMs r…

    Submitted 8 June, 2024; originally announced June 2024.

  35. arXiv:2406.05531  [pdf, other]

    cs.LG cs.AI

    Enhancing Adversarial Transferability via Information Bottleneck Constraints

    Authors: Biqing Qi, Junqi Gao, Jianxing Liu, Ligang Wu, Bowen Zhou

    Abstract: From the perspective of information bottleneck (IB) theory, we propose a novel framework for performing black-box transferable adversarial attacks named IBTA, which leverages advancements in invariant features. Intuitively, diminishing the reliance of adversarial perturbations on the original data, under equivalent attack performance constraints, encourages a greater reliance on invariant features…

    Submitted 8 June, 2024; originally announced June 2024.

    Journal ref: IEEE Signal Processing Letters, 2024

  36. arXiv:2406.03949  [pdf, other]

    cs.CL

    UltraMedical: Building Specialized Generalists in Biomedicine

    Authors: Kaiyan Zhang, Sihang Zeng, Ermo Hua, Ning Ding, Zhang-Ren Chen, Zhiyuan Ma, Haoxin Li, Ganqu Cui, Biqing Qi, Xuekai Zhu, Xingtai Lv, Hu Jinfang, Zhiyuan Liu, Bowen Zhou

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains and are moving towards more specialized areas. Recent advanced proprietary models such as GPT-4 and Gemini have achieved significant advancements in biomedicine, which have also raised privacy and security challenges. The construction of specialized generalists hinges largely on high-quality datasets, enh…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Datasets and models are available at https://github.com/TsinghuaC3I/UltraMedical

  37. arXiv:2406.01646  [pdf, other]

    cs.LG cs.AI eess.SP

    iKAN: Global Incremental Learning with KAN for Human Activity Recognition Across Heterogeneous Datasets

    Authors: Mengxi Liu, Sizhen Bian, Bo Zhou, Paul Lukowicz

    Abstract: This work proposes an incremental learning (IL) framework for wearable sensor human activity recognition (HAR) that tackles two challenges simultaneously: catastrophic forgetting and non-uniform inputs. The scalable framework, iKAN, pioneers IL with Kolmogorov-Arnold Networks (KAN) to replace multi-layer perceptrons as the classifier that leverages the local plasticity and global stability of spli…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: This work is submitted to Ubicomp/ISWC24 and is under review

  38. arXiv:2406.00504  [pdf]

    cs.RO cs.AI

    Research on an Autonomous UAV Search and Rescue System Based on the Improved

    Authors: Haobin Chen, Junyu Tao, Bize Zhou, Xiaoyan Liu

    Abstract: There is demand for UAVs (unmanned aerial vehicles) that operate autonomously and perform practical functions such as search and rescue in complex unknown environments. This paper proposes an autonomous search and rescue UAV system based on an EGO-Planner algorithm, which is improved by innovative UAV body application and takes the methods of inverse motor backstepping to enhance t…

    Submitted 7 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: 2024 5th International Conference on Computer Engineering and Application

  39. arXiv:2405.18424  [pdf, other]

    cs.CV

    3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

    Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang

    Abstract: Scene image editing is crucial for entertainment, photography, and advertising design. Existing methods solely focus on either 2D individual object or 3D global scene editing. This results in a lack of a unified approach to effectively control and manipulate scenes at the 3D level with different levels of granularity. In this work, we propose 3DitScene, a novel and unified scene editing framework…

    Submitted 28 May, 2024; originally announced May 2024.

  40. arXiv:2405.17534  [pdf, other]

    cs.LG

    SMR: State Memory Replay for Long Sequence Modeling

    Authors: Biqing Qi, Junqi Gao, Kaiyan Zhang, Dong Li, Jianxing Liu, Ligang Wu, Bowen Zhou

    Abstract: Despite the promising performance of state space models (SSMs) in long sequence modeling, limitations still exist. Although advanced SSMs like S5 and S6 (Mamba) address non-uniform sampling, their recursive structures impede efficient SSM computation via convolution. To overcome compatibility limitations in parallel convolutional computation, this paper proposes a novel non-recursive non-uniform samp…

    Submitted 8 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Journal ref: Findings of the Association for Computational Linguistics, 2024

  41. arXiv:2405.16501  [pdf, other]

    cs.CV

    User-Friendly Customized Generation with Multi-Modal Prompts

    Authors: Linhao Zhong, Yan Hong, Wentao Chen, Binglin Zhou, Yiyi Zhang, Jianfu Zhang, Liqing Zhang

    Abstract: Text-to-image generation models have seen considerable advancement, catering to the increasing interest in personalized image creation. Current customization techniques often require users to provide multiple images (typically 3-5) for each customized object, along with the classification of these objects and descriptive textual prompts for scenes. This paper questions whether the process can…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  42. arXiv:2405.14866  [pdf, other]

    cs.CV

    Tele-Aloha: A Low-budget and High-authenticity Telepresence System Using Sparse RGB Cameras

    Authors: Hanzhang Tu, Ruizhi Shao, Xue Dong, Shunyuan Zheng, Hao Zhang, Lili Chen, Meili Wang, Wenyu Li, Siyan Ma, Shengping Zhang, Boyao Zhou, Yebin Liu

    Abstract: In this paper, we present a low-budget and high-authenticity bidirectional telepresence system, Tele-Aloha, targeting peer-to-peer communication scenarios. Compared to previous systems, Tele-Aloha utilizes only four sparse RGB cameras, one consumer-grade GPU, and one autostereoscopic screen to achieve high-resolution (2048x2048), real-time (30 fps), low-latency (less than 150ms) and robust distant…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Paper accepted by SIGGRAPH 2024. Project page: http://118.178.32.38/c/Tele-Aloha/

  43. arXiv:2405.13584  [pdf, other]

    cs.LG cs.DC

    Emulating Full Client Participation: A Long-Term Client Selection Strategy for Federated Learning

    Authors: Qingming Li, Juzheng Miao, Puning Zhao, Li Zhou, Shouling Ji, Bowen Zhou, Furui Liu

    Abstract: Client selection significantly affects the system convergence efficiency and is a crucial problem in federated learning. Existing methods often select clients by evaluating each round individually and overlook the necessity for long-term optimization, resulting in suboptimal performance and potential fairness issues. In this study, we propose a novel client selection strategy designed to emulate t…

    Submitted 22 May, 2024; originally announced May 2024.

  44. arXiv:2405.12996  [pdf, other]

    eess.IV

    Dose-aware Diffusion Model for 3D Low-dose PET: Multi-institutional Validation with Reader Study and Real Low-dose Data

    Authors: Huidong Xie, Weijie Gan, Bo Zhou, Ming-Kai Chen, Michal Kulon, Annemarie Boustani, Benjamin A. Spencer, Reimund Bayerlein, Xiongchao Chen, Qiong Liu, Xueqi Guo, Menghua Xia, Yinchi Zhou, Hui Liu, Liang Guo, Hongyu An, Ulugbek S. Kamilov, Hanzhong Wang, Biao Li, Axel Rominger, Kuangyu Shi, Ge Wang, Ramsey D. Badawi, Chi Liu

    Abstract: As PET imaging is accompanied by radiation exposure and potentially increased cancer risk, reducing radiation dose in PET scans without compromising the image quality is an important topic. Deep learning (DL) techniques have been investigated for low-dose PET imaging. However, existing models have often resulted in compromised image quality when achieving low-dose PET and have limited generalizabi…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 16 Pages, 15 Figures, 4 Tables. Paper under review. arXiv admin note: substantial text overlap with arXiv:2311.04248

  45. arXiv:2405.12223  [pdf, other]

    eess.IV cs.CV

    Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation

    Authors: Yinchi Zhou, Tianqi Chen, Jun Hou, Huidong Xie, Nicha C. Dvornek, S. Kevin Zhou, David L. Wilson, James S. Duncan, Chi Liu, Bo Zhou

    Abstract: Image-to-image translation is a vital component in medical image processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their c…

    Submitted 5 April, 2024; originally announced May 2024.

    Comments: 15 pages, 5 figures

  46. arXiv:2405.12027  [pdf, other]

    hep-th gr-qc

    Linearized gravity and soft graviton theorem in de Sitter spacetime

    Authors: Pujian Mao, Bochen Zhou

    Abstract: We study the linearized gravity theory in the Newman-Unti gauge in the near horizon region of the de Sitter spacetime. The linearized Einstein equation involves the cosmological constant. The near horizon symmetry consists of near horizon supertranslation and near horizon superrotation. We compute the near horizon supertranslation charge and find the proper near horizon fall-off conditions which u…

    Submitted 20 May, 2024; originally announced May 2024.

  47. arXiv:2405.11870  [pdf, other]

    cs.CL cs.AI

    Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process

    Authors: Ermo Hua, Biqing Qi, Kaiyan Zhang, Yue Yu, Ning Ding, Xingtai Lv, Kai Tian, Bowen Zhou

    Abstract: Supervised Fine-Tuning (SFT) and Preference Optimization (PO) are two fundamental processes for enhancing the capabilities of Language Models (LMs) after pre-training, aligning them better with human preferences. SFT offers better training efficiency while PO delivers better alignment, so the two are often combined. However, common practices simply apply them sequentially without integrating their…

    Submitted 28 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  48. arXiv:2405.11826  [pdf, other]

    astro-ph.IM hep-ex physics.ins-det

    Data quality control system and long-term performance monitor of the LHAASO-KM2A

    Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

    Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To…

    Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  49. arXiv:2405.11788  [pdf, other]

    cs.LG

    TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models

    Authors: Junlong Jia, Ying Hu, Xi Weng, Yiming Shi, Miao Li, Xingjian Zhang, Baichuan Zhou, Ziyu Liu, Jie Luo, Lei Huang, Ji Wu

    Abstract: We present TinyLLaVA Factory, an open-source modular codebase for small-scale large multimodal models (LMMs) with a focus on simplicity of code implementations, extensibility of new features, and reproducibility of training results. Following the design philosophy of the factory pattern in software engineering, TinyLLaVA Factory modularizes the entire system into interchangeable components, with e…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Our codebase is made public at https://github.com/TinyLLaVA/TinyLLaVA_Factory with documentation available at https://tinyllava-factory.readthedocs.io/en/latest/

  50. arXiv:2405.10051  [pdf, other]

    cs.CR cs.CL

    MarkLLM: An Open-Source Toolkit for LLM Watermarking

    Authors: Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King

    Abstract: LLM watermarking, which embeds imperceptible yet algorithmically detectable signals in model outputs to identify LLM-generated text, has become crucial in mitigating the potential misuse of large language models. However, the abundance of LLM watermarking algorithms, their intricate mechanisms, and the complex evaluation procedures and perspectives pose challenges for researchers and the community…

    Submitted 24 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: 16 pages, 5 figures, 6 tables

    MSC Class: 68T50; ACM Class: I.2.7