Showing 1–50 of 233 results for author: Lyu, S

  1. arXiv:2407.05108  [pdf, other]

    cs.LG stat.ML

    The Role of Depth, Width, and Tree Size in Expressiveness of Deep Forest

    Authors: Shen-Huan Lyu, Jin-Hui Wu, Qin-Cheng Zheng, Baoliu Ye

    Abstract: Random forests are classical ensemble algorithms that construct multiple randomized decision trees and aggregate their predictions using naive averaging. \citet{zhou2019deep} further propose a deep forest algorithm with multi-layer forests, which outperforms random forests in various tasks. The performance of deep forests is related to three hyperparameters in practice: depth, width, and tree size…

    Submitted 6 July, 2024; originally announced July 2024.

    Journal ref: In: Proceedings of the 27th European Conference on Artificial Intelligence, 2024

  2. arXiv:2407.03107  [pdf]

    cs.HC cs.GR cs.MM

    Design of a UE5-based digital twin platform

    Authors: Shaoqiu Lyu, Muzhi Wang, Sunrui Zhang, Shengzhi Wang

    Abstract: Because the cost of learning and building with current mainstream 3D scene engines is too high, this thesis proposes a digital twin platform design program based on Unreal Engine 5 (UE5). It aims to provide a universal platform construction design process to effectively reduce the learning cost of large-scale scene construction. Taking an actual project of a unit as an example, the overall cycle work of…

    Submitted 3 July, 2024; originally announced July 2024.

  3. arXiv:2406.16943  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    EarDA: Towards Accurate and Data-Efficient Earable Activity Sensing

    Authors: Shengzhe Lyu, Yongliang Chen, Di Duan, Renqi Jia, Weitao Xu

    Abstract: In the realm of smart sensing with the Internet of Things, earable devices are empowered with the capability of multi-modality sensing and intelligence of context-aware computing, leading to their wide usage in Human Activity Recognition (HAR). Nonetheless, unlike the movements captured by Inertial Measurement Unit (IMU) sensors placed on the upper or lower body, those motion signals obtained from e…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: accepted by 2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT)

  4. arXiv:2406.10427  [pdf, other]

    cs.LG cs.CR

    Adaptive Randomized Smoothing: Certifying Multi-Step Defences against Adversarial Examples

    Authors: Saiyue Lyu, Shadab Shaikh, Frederick Shpilevskiy, Evan Shelhamer, Mathias Lécuyer

    Abstract: We propose Adaptive Randomized Smoothing (ARS) to certify the predictions of our test-time adaptive models against adversarial examples. ARS extends the analysis of randomized smoothing using f-Differential Privacy to certify the adaptive composition of multiple steps. For the first time, our theory covers the sound adaptive composition of general and high-dimensional functions of noisy input. We…

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.04745  [pdf, other]

    cs.LG cs.CV

    Confidence-aware Contrastive Learning for Selective Classification

    Authors: Yu-Chang Wu, Shen-Huan Lyu, Haopu Shang, Xiangyu Wang, Chao Qian

    Abstract: Selective classification enables models to make predictions only when they are sufficiently confident, aiming to enhance safety and reliability, which is important in high-stakes scenarios. Previous methods mainly use deep neural networks and focus on modifying the architecture of classification layers to enable the model to estimate the confidence of its prediction. This work provides a generaliz…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  6. arXiv:2406.01112  [pdf, other]

    cs.CV

    BACON: Bayesian Optimal Condensation Framework for Dataset Distillation

    Authors: Zheng Zhou, Hongbo Zhao, Guangliang Cheng, Xiangtai Li, Shuchang Lyu, Wenquan Feng, Qi Zhao

    Abstract: Dataset Distillation (DD) aims to distill knowledge from extensive datasets into more compact ones while preserving performance on the test set, thereby reducing storage costs and training expenses. However, existing methods often suffer from computational intensity, particularly exhibiting suboptimal performance with large dataset sizes due to the lack of a robust theoretical framework for analyz…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 22 pages, 10 figures

  7. arXiv:2406.00985  [pdf, other]

    cs.CV

    MultiEdits: Simultaneous Multi-Aspect Editing with Text-to-Image Diffusion Models

    Authors: Mingzhen Huang, Jialing Cai, Shan Jia, Vishnu Suresh Lokhande, Siwei Lyu

    Abstract: Text-driven image synthesis has made significant advancements with the development of diffusion models, transforming how visual content is generated from text prompts. Despite these advances, text-driven image editing, a key area in computer graphics, faces unique challenges. A major challenge is making simultaneous edits across multiple objects or attributes. Applying these methods sequentially f…

    Submitted 3 June, 2024; originally announced June 2024.

  8. arXiv:2405.18320  [pdf, other]

    cs.CV cs.AI cs.CL

    Self-Supervised Learning Based Handwriting Verification

    Authors: Mihir Chauhan, Mohammad Abuzar Shaikh, Bina Ramamurthy, Mingchen Gao, Siwei Lyu, Sargur Srihari

    Abstract: We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND data…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages, 6 figures, 2 tables

  9. arXiv:2405.17837  [pdf, other]

    cs.HC

    Enabling Generative Design Tools with LLM Agents for Building Novel Devices: A Case Study on Fluidic Computation Interfaces

    Authors: Qiuyu Lu, Jiawei Fang, Zhihao Yao, Yue Yang, Shiqing Lyu, Haipeng Mi, Lining Yao

    Abstract: In the field of Human-Computer Interaction (HCI), the development of interactive devices represents a significant area of focus. The advent of novel hardware and advanced fabrication techniques has underscored the demand for specialized design tools that democratize the prototyping process for such cutting-edge devices. While these tools simplify the process through parametric design and simulatio…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 25 pages, 12 figures

  10. arXiv:2405.11326  [pdf, other]

    cs.LG cs.CV

    On the Trajectory Regularity of ODE-based Diffusion Sampling

    Authors: Defang Chen, Zhenyu Zhou, Can Wang, Chunhua Shen, Siwei Lyu

    Abstract: Diffusion-based generative models use stochastic differential equations (SDEs) and their equivalent ordinary differential equations (ODEs) to establish a smooth connection between a complex data distribution and a tractable prior distribution. In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models. We characterize an implicit denoi…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: ICML 2024, 30 pages

  11. arXiv:2405.08487  [pdf, other]

    cs.CV cs.CR

    Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method

    Authors: Mian Zou, Baosheng Yu, Yibing Zhan, Siwei Lyu, Kede Ma

    Abstract: In recent years, deep learning has greatly streamlined the process of generating realistic fake face images. Aware of the dangers, researchers have developed various tools to spot these counterfeits. Yet none asked the fundamental question: What digital manipulations make a real photographic face image fake, while others do not? In this paper, we put face forgery in a semantic context and define t…

    Submitted 14 May, 2024; originally announced May 2024.

  12. arXiv:2405.04051  [pdf, ps, other]

    cs.IT

    On the quantization goodness of polar lattices

    Authors: Ling Liu, Shanxiang Lyu, Cong Ling, Baoming Bai

    Abstract: In this work, we prove that polar lattices, when tailored for lossy compression, are quantization-good in the sense that their normalized second moments approach $\frac{1}{2\pi e}$ as the dimension of lattices increases. It has been predicted by Zamir et al. \cite{ZamirQZ96} that the Entropy Coded Dithered Quantization (ECDQ) system using quantization-good lattices can achieve the rate-distortion bou…

    Submitted 13 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 12 pages, 5 figures, submitted to IEEE for possible publication

  13. arXiv:2405.00135  [pdf, other]

    cs.IT eess.SP

    Improving Channel Resilience for Task-Oriented Semantic Communications: A Unified Information Bottleneck Approach

    Authors: Shuai Lyu, Yao Sun, Linke Guo, Xiaoyong Yuan, Fang Fang, Lan Zhang, Xianbin Wang

    Abstract: Task-oriented semantic communications (TSC) enhance radio resource efficiency by transmitting task-relevant semantic information. However, current research often overlooks the inherent semantic distinctions among encoded features. Due to unavoidable channel variations from time and frequency-selective fading, semantically sensitive feature units could be more susceptible to erroneous inference if…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE Communications Letters

  14. arXiv:2404.19171  [pdf, other]

    cs.CV cs.AI

    Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection

    Authors: Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han

    Abstract: With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes. While effective in their specific modalities, traditional detection methods fall short in addressing the generalizability of detection across diverse cross-modal deepfakes. This paper aims to explicitly learn potential cross-modal correlation to enhance…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: accepted by ICME 2024

  15. arXiv:2404.18033  [pdf, other]

    cs.CV

    Exposing Text-Image Inconsistency Using Diffusion Models

    Authors: Mingzhen Huang, Shan Jia, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu

    Abstract: In the battle against widespread online misinformation, a growing problem is text-image inconsistency, where images are misleadingly paired with texts with different intent or meaning. Existing classification-based methods for text-image inconsistency can identify contextual inconsistencies but fail to provide explainable justifications for their decisions that humans can understand. Although more…

    Submitted 27 April, 2024; originally announced April 2024.

  16. arXiv:2404.13146  [pdf, other]

    cs.CR cs.CV

    DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection

    Authors: Yan Ju, Chengzhe Sun, Shan Jia, Shuwei Hou, Zhaofeng Si, Soumyya Kanti Datta, Lipeng Ke, Riky Zhou, Anita Nikolich, Siwei Lyu

    Abstract: Deepfakes, as AI-generated media, have increasingly threatened media integrity and personal privacy with realistic yet fake digital content. In this work, we introduce an open-source and user-friendly online platform, DeepFake-O-Meter v2.0, that integrates state-of-the-art methods for detecting Deepfake images, videos, and audio. Built upon DeepFake-O-Meter v1.0, we have made significant upgrades…

    Submitted 27 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  17. arXiv:2403.14077  [pdf, other]

    cs.AI cs.CR

    Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

    Authors: Shan Jia, Reilin Lyu, Kangran Zhao, Yize Chen, Zhiyuan Yan, Yan Ju, Chuanbo Hu, Xin Li, Baoyuan Wu, Siwei Lyu

    Abstract: DeepFakes, which refer to AI-generated media content, have become an increasing concern due to their use as a means for disinformation. Detecting DeepFakes is currently solved with programmed machine learning algorithms. In this work, we investigate the capabilities of multimodal large language models (LLMs) in DeepFake detection. We conducted qualitative and quantitative experiments to demonstrat…

    Submitted 11 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  18. arXiv:2403.13358  [pdf, other]

    cs.RO cs.CV cs.LG

    GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot

    Authors: Wenxuan Song, Han Zhao, Pengxiang Ding, Can Cui, Shangke Lyu, Yaning Fan, Donglin Wang

    Abstract: Multi-task robot learning holds significant importance in tackling diverse and complex scenarios. However, current approaches are hindered by performance issues and difficulties in collecting training datasets. In this paper, we propose GeRM (Generalist Robotic Model). We utilize offline reinforcement learning to optimize data utilization strategies to learn from both demonstrations and sub-optima…

    Submitted 9 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  19. arXiv:2403.12631  [pdf]

    cs.RO cs.AI

    PointGrasp: Point Cloud-based Grasping for Tendon-driven Soft Robotic Glove Applications

    Authors: Chen Hu, Shirui Lyu, Eojin Rho, Daekyum Kim, Shan Luo, Letizia Gionfrida

    Abstract: Controlling hand exoskeletons to assist individuals with grasping tasks poses a challenge due to the difficulty in understanding user intentions. We propose that most daily grasping tasks during activities of daily living (ADL) can be deduced by analyzing object geometries (simple and complex) from 3D point clouds. The study introduces PointGrasp, a real-time system designed for identifying househ…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 6 pages, 8 figures, conference

    ACM Class: I.2; I.4

  20. arXiv:2403.03101  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents

    Authors: Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Ningyu Zhang, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen

    Abstract: Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories durin…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Work in progress. Project page: https://zjunlp.github.io/project/KnowAgent/ Code: https://github.com/zjunlp/KnowAgent

  21. arXiv:2402.06749  [pdf]

    physics.optics physics.bio-ph physics.med-ph

    Copper phosphate micro-flowers coated with indocyanine green and iron oxide nanoparticles for in vivo localization optoacoustic tomography and magnetic actuation

    Authors: Daniil Nozdriukhin, Shuxin Lyu, Jerome Bonvin, Michael Reiss, Daniel Razansky, Xose Luis Dean-Ben

    Abstract: Efficient drug delivery is a major challenge in modern medicine and pharmaceutical research. Micrometer-scale robots have recently been proposed as a promising venue to amplify precision of drug administration. Remotely controlled microrobots sufficiently small to navigate through microvascular networks can reach any part of the human body, yet real-time tracking is crucial for providing precise g…

    Submitted 9 February, 2024; originally announced February 2024.

  22. arXiv:2402.01154  [pdf, other]

    cs.CR

    Towards Quantum-Safe Federated Learning via Homomorphic Encryption: Learning with Gradients

    Authors: Guangfeng Yan, Shanxiang Lyu, Hanxu Hou, Zhiyong Zheng, Linqi Song

    Abstract: This paper introduces a privacy-preserving distributed learning framework via private-key homomorphic encryption. Thanks to the randomness of the quantization of gradients, our learning with error (LWE) based encryption can eliminate the error terms, thus avoiding the issue of error expansion in conventional LWE-based homomorphic encryption. The proposed system allows a large number of learning pa…

    Submitted 2 February, 2024; originally announced February 2024.

  23. arXiv:2401.17255  [pdf, other]

    quant-ph cond-mat.str-el physics.chem-ph

    Towards Quantum Simulation of Non-Markovian Open Quantum Dynamics: A Universal and Compact Theory

    Authors: Xiang Li, Su-Xiang Lyu, Yao Wang, Rui-Xue Xu, Xiao Zheng, YiJing Yan

    Abstract: Non-Markovianity, the intricate dependence of an open quantum system on its temporal evolution history, holds tremendous implications across various scientific disciplines. However, accurately characterizing the complex non-Markovian effects has posed a formidable challenge for numerical simulations. Despite the promising potential of emerging quantum computing technologies, the pursuit of a unive…

    Submitted 8 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: abstract and figures update

  24. arXiv:2401.10113  [pdf, ps, other]

    cs.CV

    Exposing Lip-syncing Deepfakes from Mouth Inconsistencies

    Authors: Soumyya Kanti Datta, Shan Jia, Siwei Lyu

    Abstract: A lip-syncing deepfake is a digitally manipulated video in which a person's lip movements are created convincingly using AI models to match altered or entirely new audio. Lip-syncing deepfakes are a dangerous type of deepfakes as the artifacts are limited to the lip region and more difficult to discern. In this paper, we describe a novel approach, LIP-syncing detection based on mouth INConsistency…

    Submitted 3 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  25. arXiv:2312.17431  [pdf, other]

    cs.CR cs.CV

    MVPatch: More Vivid Patch for Adversarial Camouflaged Attacks on Object Detectors in the Physical World

    Authors: Zheng Zhou, Hongbo Zhao, Ju Liu, Qiaosheng Zhang, Liwei Geng, Shuchang Lyu, Wenquan Feng

    Abstract: Recent investigations demonstrate that adversarial patches can be utilized to manipulate the result of object detection models. However, the conspicuous patterns on these patches may draw more attention and raise suspicions among humans. Moreover, existing works have primarily focused on enhancing the efficacy of attacks in the physical domain, rather than seeking to optimize their stealth attribu…

    Submitted 11 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: 14 pages, 8 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  26. arXiv:2312.09785  [pdf, other]

    cs.CL

    RJUA-QA: A Comprehensive QA Dataset for Urology

    Authors: Shiwei Lyu, Chenfei Chi, Hongbo Cai, Lei Shi, Xiaoyan Yang, Lei Liu, Xiang Chen, Deng Zhao, Zhiqiang Zhang, Xianguo Lyu, Ming Zhang, Fangzhou Li, Xiaowei Ma, Yue Shen, Jinjie Gu, Wei Xue, Yiran Huang

    Abstract: We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, helping to bridge the gap between general large language models (LLMs) and medical-specific LLM applications. RJUA-QA is derived from realistic clinical scenarios and aims to facilitate LLMs in generating reliable diagnoses and advice. The dataset contains 2,132 curated Question-Co…

    Submitted 7 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: An initial version

  27. Multiple Instance Learning for Uplift Modeling

    Authors: Yao Zhao, Haipeng Zhang, Shiwei Lyu, Ruiying Jiang, Jinjie Gu, Guannan Zhang

    Abstract: Uplift modeling is widely used in performance marketing to estimate effects of promotion campaigns (e.g., increase of customer retention rate). Since it is impossible to observe outcomes of a recipient in treatment (e.g., receiving a certain promotion) and control (e.g., without promotion) groups simultaneously (i.e., counter-factual), uplift models are mainly trained on instances of treatment and…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: short paper of CIKM22(full version)

    Journal ref: Proceedings of the 31st ACM International Conference on Information and Knowledge Management (2022) 4727-4731

  28. arXiv:2312.05738  [pdf, other]

    cs.CR cs.AI

    FedReverse: Multiparty Reversible Deep Neural Network Watermarking

    Authors: Junlong Mao, Huiyi Tang, Yi Zhang, Fengxia Liu, Zhiyong Zheng, Shanxiang Lyu

    Abstract: The proliferation of Deep Neural Networks (DNN) in commercial applications is expanding rapidly. Simultaneously, the increasing complexity and cost of training DNN models have intensified the urgency surrounding the protection of intellectual property associated with these trained models. In this regard, DNN watermarking has emerged as a crucial safeguarding technique. This paper presents FedRever…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 13 pages

  29. arXiv:2311.18196  [pdf, ps, other]

    math.AC

    Formal lifting of dualizing complexes and consequences

    Authors: Shiji Lyu

    Abstract: We show that for a Noetherian ring $A$ that is $I$-adically complete for an ideal $I$, if $A/I$ admits a dualizing complex, so does $A$. We discuss several consequences of this result. We also consider a generalization of the notion of dualizing complexes to infinite-dimensional rings and prove the results in this generality. In addition, we give an alternative proof of the fact that every excelle…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 19 pages. Comments welcome!

  30. arXiv:2311.11278  [pdf, other]

    cs.CV

    Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection

    Authors: Zhiyuan Yan, Yuhao Luo, Siwei Lyu, Qingshan Liu, Baoyuan Wu

    Abstract: Deepfake detection faces a critical generalization hurdle, with performance deteriorating when there is a mismatch between the distributions of training and testing data. A broadly received explanation is the tendency of these detectors to be overfitted to forgery-specific artifacts, rather than learning features that are widely applicable across various forgeries. To address this issue, we propos…

    Submitted 28 March, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

  31. arXiv:2311.06712  [pdf, other]

    eess.IV

    PuzzleTuning: Explicitly Bridge Pathological and Natural Image with Puzzles

    Authors: Tianyi Zhang, Shangqing Lyu, Yanli Lei, Sicheng Chen, Nan Ying, Yufang He, Yu Zhao, Yunlu Feng, Hwee Kuan Lee, Guanglei Zhang

    Abstract: Pathological image analysis is a crucial field in computer vision. Due to the annotation scarcity in the pathological field, pre-training with self-supervised learning (SSL) is widely applied to learn on unlabeled images. However, the current SSL-based pathological pre-training: (1) does not explicitly explore the essential focuses of the pathological field, and (2) does not effectively bridge wit…

    Submitted 22 April, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: 13 pages, 9 figures, 8 tables

  32. arXiv:2311.06015  [pdf]

    cs.RO cs.AI

    RSG: Fast Learning Adaptive Skills for Quadruped Robots by Skill Graph

    Authors: Hongyin Zhang, Diyuan Shi, Zifeng Zhuang, Han Zhao, Zhenyu Wei, Feng Zhao, Sibo Gai, Shangke Lyu, Donglin Wang

    Abstract: Developing robotic intelligent systems that can adapt quickly to unseen wild situations is one of the critical challenges in pursuing autonomous robotics. Although some impressive progress has been made in walking stability and skill learning in the field of legged robots, their ability to adapt quickly is still inferior to that of animals in nature. Animals are born with massive skills needed t…

    Submitted 10 November, 2023; originally announced November 2023.

  33. arXiv:2311.05836  [pdf, other]

    eess.IV cs.CV cs.LG

    UMedNeRF: Uncertainty-aware Single View Volumetric Rendering for Medical Neural Radiance Fields

    Authors: Jing Hu, Qinrui Fan, Shu Hu, Siwei Lyu, Xi Wu, Xin Wang

    Abstract: In the field of clinical medicine, computed tomography (CT) is an effective medical imaging modality for the diagnosis of various pathologies. Compared with X-ray images, CT images can provide more information, including multi-planar slices and three-dimensional structures for clinical diagnosis. However, CT imaging requires patients to be exposed to large doses of ionizing radiation for a long ti…

    Submitted 1 March, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

  34. arXiv:2311.04732  [pdf, other]

    cond-mat.mtrl-sci physics.comp-ph

    General-purpose machine-learned potential for 16 elemental metals and their alloys

    Authors: Keke Song, Rui Zhao, Jiahui Liu, Yanzhou Wang, Eric Lindgren, Yong Wang, Shunda Chen, Ke Xu, Ting Liang, Penghua Ying, Nan Xu, Zhiqiang Zhao, Jiuyang Shi, Junjie Wang, Shuang Lyu, Zezhu Zeng, Shirong Liang, Haikuan Dong, Ligang Sun, Yue Chen, Zhuhua Zhang, Wanlin Guo, Ping Qian, Jian Sun, Paul Erhart, et al. (3 additional authors not shown)

    Abstract: Machine-learned potentials (MLPs) have exhibited remarkable accuracy, yet the lack of general-purpose MLPs for a broad spectrum of elements and their alloys limits their applicability. Here, we present a feasible approach for constructing a unified general-purpose MLP for numerous elements, demonstrated through a model (UNEP-v1) for 16 elemental metals and their alloys. To achieve a complete repre…

    Submitted 12 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Main text with 17 pages and 8 figures; supplementary with 26 figures and 4 tables; source code and training/test data available

  35. arXiv:2311.02926  [pdf, other]

    cs.CV cs.AI

    Deep Image Semantic Communication Model for Artificial Intelligent Internet of Things

    Authors: Li Ping Qian, Yi Zhang, Sikai Lyu, Huijie Zhu, Yuan Wu, Xuemin Sherman Shen, Xiaoniu Yang

    Abstract: With the rapid development of Artificial Intelligent Internet of Things (AIoT), the volume of image data from AIoT devices has been increasing explosively. In this paper, a novel deep image semantic communication model is proposed for efficient image communication in AIoT. Particularly, at the transmitter side, a high-precision image semantic segmentation algorithm is proposed to extract th…

    Submitted 8 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  36. arXiv:2310.17902  [pdf]

    eess.IV

    CPIA Dataset: A Comprehensive Pathological Image Analysis Dataset for Self-supervised Learning Pre-training

    Authors: Nan Ying, Yanli Lei, Tianyi Zhang, Shangqing Lyu, Chunhui Li, Sicheng Chen, Zeyu Liu, Yu Zhao, Guanglei Zhang

    Abstract: Pathological image analysis is a crucial field in computer-aided diagnosis, where deep learning is widely applied. Transfer learning using pre-trained models initialized on natural images has effectively improved the downstream pathological performance. However, the lack of sophisticated domain-specific pathological initialization hinders their potential. Self-supervised learning (SSL) enables pre…

    Submitted 27 October, 2023; originally announced October 2023.

  37. arXiv:2310.14374  [pdf, other]

    cs.CV

    OV-VG: A Benchmark for Open-Vocabulary Visual Grounding

    Authors: Chunlei Wang, Wenquan Feng, Xiangtai Li, Guangliang Cheng, Shuchang Lyu, Binghao Liu, Lijiang Chen, Qi Zhao

    Abstract: Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light of the widespread adoption of vision-based foundational models. Its primary objective is to comprehend novel concepts that are not encompassed within a predefined vocabulary. One key facet of this endeavor is Visual Grounding, which entails locating a specific region within an image based on a corresponding…

    Submitted 22 October, 2023; originally announced October 2023.

  38. arXiv:2310.07525  [pdf, other]

    cs.RO

    ViT-A*: Legged Robot Path Planning using Vision Transformer A*

    Authors: Jianwei Liu, Shirui Lyu, Denis Hadjivelichkov, Valerio Modugno, Dimitrios Kanoulas

    Abstract: Legged robots, particularly quadrupeds, offer promising navigation capabilities, especially in scenarios requiring traversal over diverse terrains and obstacle avoidance. This paper addresses the challenge of enabling legged robots to navigate complex environments effectively through the integration of data-driven path-planning methods. We propose an approach that utilizes differentiable planners,…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 6 pages, 6 figures, conference

    Journal ref: IEEE-RAS International Conference on Humanoids Robots (Humanoids) 2023

  39. arXiv:2310.06750  [pdf, other]

    astro-ph.SR

    Unraveling the Thermodynamic Enigma between Fast and Slow Coronal Mass Ejections

    Authors: Soumyaranjan Khuntia, Wageesh Mishra, Sudheer K Mishra, Yuming Wang, Jie Zhang, Shaoyu Lyu

    Abstract: Coronal Mass Ejections (CMEs) are the most energetic expulsions of magnetized plasma from the Sun that play a crucial role in space weather dynamics. This study investigates the diverse kinematics and thermodynamic evolution of two CMEs (CME1: 2011 September 24 and CME2: 2018 August 20) at coronal heights where thermodynamic measurements are limited. The peak 3D propagation speed of CME1 is high (…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: 23 pages, 9 figures, accepted for publication in The Astrophysical Journal (ApJ)

  40. Efficient State Estimation with Constrained Rao-Blackwellized Particle Filter

    Authors: Shuai Li, Siwei Lyu, Jeff Trinkle

    Abstract: Due to the limitations of the robotic sensors, during a robotic manipulation task, the acquisition of the object's state can be unreliable and noisy. Combining an accurate model of multi-body dynamic system with Bayesian filtering methods has been shown to be able to filter out noise from the object's observed states. However, efficiency of these filtering methods suffers from samples that violate…

    Submitted 6 October, 2023; originally announced October 2023.

  41. arXiv:2310.03827  [pdf, other]

    cs.CV

    Integrating Audio-Visual Features for Multimodal Deepfake Detection

    Authors: Sneha Muppalla, Shan Jia, Siwei Lyu

    Abstract: Deepfakes are AI-generated media in which an image or video has been digitally modified. The advancements made in deepfake technology have led to privacy and security issues. Most deepfake detection techniques rely on the detection of a single modality. Existing methods for audio-visual detection do not always surpass analysis based on single modalities. Therefore, this paper proposes…

    Submitted 5 October, 2023; originally announced October 2023.

  42. arXiv:2310.00405  [pdf, other]

    cs.CV

    Controlling Neural Style Transfer with Deep Reinforcement Learning

    Authors: Chengming Feng, Jing Hu, Xin Wang, Shu Hu, Bin Zhu, Xi Wu, Hongtu Zhu, Siwei Lyu

    Abstract: Controlling the degree of stylization in the Neural Style Transfer (NST) is a little tricky since it usually needs hand-engineering on hyper-parameters. In this paper, we propose the first deep Reinforcement Learning (RL) based architecture that splits one-step style transfer into a step-wise process for the NST task. Our RL-based method tends to preserve more details and structures of the content…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: Accepted by IJCAI 2023. The contributions of Chengming Feng and Jing Hu to this paper were equal. arXiv admin note: text overlap with arXiv:2309.13672

  43. arXiv:2310.00359  [pdf, other]

    cs.CV

    Improving Cross-dataset Deepfake Detection with Deep Information Decomposition

    Authors: Shanmin Yang, Shu Hu, Bin Zhu, Ying Fu, Siwei Lyu, Xi Wu, Xin Wang

    Abstract: Deepfake technology poses a significant threat to security and social trust. Although existing detection methods have demonstrated high performance in identifying forgeries within datasets using the same techniques for training and testing, they suffer from sharp performance degradation when faced with cross-dataset scenarios where unseen deepfake techniques are tested. To address this challenge,…

    Submitted 30 September, 2023; originally announced October 2023.

  44. arXiv:2309.13672  [pdf, other]

    cs.CV cs.AI

    RL-I2IT: Image-to-Image Translation with Deep Reinforcement Learning

    Authors: Xin Wang, Ziwei Luo, Jing Hu, Chengming Feng, Shu Hu, Bin Zhu, Xi Wu, Hongtu Zhu, Xin Li, Siwei Lyu

    Abstract: Most existing Image-to-Image Translation (I2IT) methods generate images in a single run of a deep learning (DL) model. However, designing such a single-step model is always challenging, requiring a huge number of parameters and easily falling into bad global minimums and overfitting. In this work, we reformulate I2IT as a step-wise decision-making problem via deep reinforcement learning (DRL) and…

    Submitted 7 June, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

  45. arXiv:2309.05145  [pdf, other]

    cs.LG cs.AI stat.ML

    Outlier Robust Adversarial Training

    Authors: Shu Hu, Zhenhuan Yang, Xin Wang, Yiming Ying, Siwei Lyu

    Abstract: Supervised learning models are challenged by the intrinsic complexities of training data such as outliers and minority subpopulations and intentional attacks at inference time with adversarial samples. While traditional robust learning methods and the recent adversarial training approaches are designed to handle each of the two challenges, to date, no work has been done to develop models that are…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: Accepted by The 15th Asian Conference on Machine Learning (ACML 2023)

  46. REB: Reducing Biases in Representation for Industrial Anomaly Detection

    Authors: Shuai Lyu, Dongmei Mo, Waikeung Wong

    Abstract: Existing representation-based methods usually conduct industrial anomaly detection in two stages: obtain feature representations with a pre-trained model and perform distance measures for anomaly detection. Among them, K-nearest neighbor (KNN) retrieval-based anomaly detection methods show promising results. However, the features are not fully exploited as these methods ignore domain bias of pre-t…

    Submitted 17 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 14 pages, 7 figures, 7 tables

  47. arXiv:2308.09611  [pdf, other]

    cs.CV

    Language-guided Human Motion Synthesis with Atomic Actions

    Authors: Yuanhao Zhai, Mingzhen Huang, Tianyu Luan, Lu Dong, Ifeoma Nwogu, Siwei Lyu, David Doermann, Junsong Yuan

    Abstract: Language-guided human motion synthesis has been a challenging task due to the inherent complexity and diversity of human behaviors. Previous methods face limitations in generalization to novel actions, often resulting in unrealistic or incoherent motion sequences. In this paper, we propose ATOM (ATomic mOtion Modeling) to mitigate this problem, by decomposing actions into atomic actions, and emplo…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted to ACM MM 2023, code: https://github.com/yhZhai/ATOM

  48. arXiv:2308.08896  [pdf, other]

    cs.LG

    Optimal Resource Allocation for U-Shaped Parallel Split Learning

    Authors: Song Lyu, Zheng Lin, Guanqiao Qu, Xianhao Chen, Xiaoxia Huang, Pan Li

    Abstract: Split learning (SL) has emerged as a promising approach for model training without revealing the raw data samples from the data owners. However, traditional SL inevitably leaks label privacy as the tail model (with the last layers) should be placed on the server. To overcome this limitation, one promising solution is to utilize U-shaped architecture to leave both early layers and last layers on th…

    Submitted 8 October, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 6 pages, 6 figures

  49. arXiv:2308.01520  [pdf, other]

    cs.CV

    COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection

    Authors: Cong Zhang, Honggang Qi, Shuhui Wang, Yuezun Li, Siwei Lyu

    Abstract: DeepFakes have raised serious societal concerns, leading to a great surge in detection-based forensics methods in recent years. Face forgery recognition is a standard detection method that usually follows a two-phase pipeline. While those methods perform well in ideal experimental environment, they face challenges when dealing with DeepFakes in the wild involving complex background and multiple fa…

    Submitted 24 May, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

  50. arXiv:2308.00964  [pdf, other]

    cs.CV

    ForensicsForest Family: A Series of Multi-scale Hierarchical Cascade Forests for Detecting GAN-generated Faces

    Authors: Jiucui Lu, Jiaran Zhou, Junyu Dong, Bin Li, Siwei Lyu, Yuezun Li

    Abstract: The prominent progress in generative models has significantly improved the reality of generated faces, bringing serious concerns to society. Since recent GAN-generated faces are in high realism, the forgery traces have become more imperceptible, increasing the forensics challenge. To combat GAN-generated faces, many countermeasures based on Convolutional Neural Networks (CNNs) have been spawned du…

    Submitted 26 April, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: To Appear in IEEE TIFS 2024