Showing 1–12 of 12 results for author: Ci, H

  1. arXiv:2406.09026  [pdf, other]

    cs.CV

    Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?

    Authors: Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou

    Abstract: Digital watermarking techniques are crucial for copyright protection and source identification of images, especially in the era of generative AI models. However, many existing watermarking methods, particularly content-agnostic approaches that embed fixed patterns regardless of image content, are vulnerable to steganalysis attacks that can extract and remove the watermark with minimal perceptual d…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2406.08337  [pdf, other]

    cs.CV eess.IV

    WMAdapter: Adding WaterMark Control to Latent Diffusion Models

    Authors: Hai Ci, Yiren Song, Pei Yang, Jinheng Xie, Mike Zheng Shou

    Abstract: Watermarking is crucial for protecting the copyright of AI-generated images. We propose WMAdapter, a diffusion model watermark plugin that takes user-specified watermark information and allows for seamless watermark imprinting during the diffusion generation process. WMAdapter is efficient and robust, with a strong emphasis on high generation quality. To achieve this, we make two key designs: (1)…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 20 pages, 13 figures

  3. arXiv:2406.06062  [pdf, other]

    cs.CV cs.AI

    ProcessPainter: Learn Painting Process from Sequence Data

    Authors: Yiren Song, Shijie Huang, Chen Yao, Xiaojun Ye, Hai Ci, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou

    Abstract: The painting process of artists is inherently stepwise and varies significantly among different painters and styles. Generating detailed, step-by-step painting processes is essential for art education and research, yet remains largely underexplored. Traditional stroke-based rendering methods break down images into sequences of brushstrokes, yet they fall short of replicating the authentic processe…

    Submitted 10 June, 2024; originally announced June 2024.

  4. arXiv:2404.14055  [pdf, other]

    cs.CV

    RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification

    Authors: Hai Ci, Pei Yang, Yiren Song, Mike Zheng Shou

    Abstract: We revisit Tree-Ring Watermarking, a recent diffusion model watermarking method that demonstrates great robustness to various attacks. We conduct an in-depth study on it and reveal that the distribution shift unintentionally introduced by the watermarking process, apart from watermark pattern matching, contributes to its exceptional robustness. Our investigation further exposes inherent flaws in i…

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 25 pages, 8 figures

  5. arXiv:2404.09857  [pdf, other]

    cs.CV cs.AI cs.RO

    Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL

    Authors: Fangwei Zhong, Kui Wu, Hai Ci, Churan Wang, Hao Chen

    Abstract: Embodied visual tracking is to follow a target object in dynamic 3D environments using an agent's egocentric vision. This is a vital and challenging skill for embodied agents. However, existing methods suffer from inefficient training and poor generalization. In this paper, we propose a novel framework that combines visual foundation models (VFM) and offline reinforcement learning (offline RL) to…

    Submitted 15 April, 2024; originally announced April 2024.

  6. arXiv:2403.01543  [pdf, other]

    cs.CV

    Efficient Action Counting with Dynamic Queries

    Authors: Zishi Li, Xiaoxuan Ma, Qiuyan Shang, Wentao Zhu, Hai Ci, Yu Qiao, Yizhou Wang

    Abstract: Temporal repetition counting aims to quantify the repeated action cycles within a video. The majority of existing methods rely on the similarity correlation matrix to characterize the repetitiveness of actions, but their scalability is hindered due to the quadratic computational complexity. In this work, we introduce a novel approach that employs an action query representation to localize repeated…

    Submitted 9 June, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: code: https://github.com/lizishi/DeTRC, proj page: https://shirleymaxx.github.io/DeTRC/

  7. arXiv:2311.04726  [pdf, other]

    cs.CV

    Social Motion Prediction with Cognitive Hierarchies

    Authors: Wentao Zhu, Jason Qin, Yuke Lou, Hang Ye, Xiaoxuan Ma, Hai Ci, Yizhou Wang

    Abstract: Humans exhibit a remarkable capacity for anticipating the actions of others and planning their own actions accordingly. In this study, we strive to replicate this ability by addressing the social motion prediction problem. We introduce a new benchmark, a novel formulation, and a cognition-inspired framework. We present Wusi, a 3D multi-person motion dataset under the context of team sports, which…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023

  8. arXiv:2310.00322  [pdf, other]

    cs.CL cs.GT

    Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models

    Authors: Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yang

    Abstract: Deployable Large Language Models (LLMs) must conform to the criterion of helpfulness and harmlessness, thereby achieving consistency between LLM outputs and human values. Red-teaming techniques constitute a critical way towards this criterion. Existing work relies solely on manual red team designs and heuristic adversarial prompts for vulnerability detection and optimization. These approaches lack…

    Submitted 6 April, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

  9. arXiv:2307.10894  [pdf, other]

    cs.CV

    Human Motion Generation: A Survey

    Authors: Wentao Zhu, Xiaoxuan Ma, Dongwoo Ro, Hai Ci, Jinlu Zhang, Jiaxin Shi, Feng Gao, Qi Tian, Yizhou Wang

    Abstract: Human motion generation aims to generate natural human pose sequences and shows immense potential for real-world applications. Substantial progress has been made recently in motion data collection technologies and generation methods, laying the foundation for increasing interest in human motion generation. Most research within this field focuses on generating human motions based on conditional sig…

    Submitted 15 November, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted to TPAMI

  10. arXiv:2303.03767  [pdf, other]

    cs.CV cs.LG cs.MA

    Proactive Multi-Camera Collaboration For 3D Human Pose Estimation

    Authors: Hai Ci, Mickel Liu, Xuehai Pan, Fangwei Zhong, Yizhou Wang

    Abstract: This paper presents a multi-agent reinforcement learning (MARL) scheme for proactive Multi-Camera Collaboration in 3D Human Pose Estimation in dynamic human crowds. Traditional fixed-viewpoint multi-camera solutions for human motion capture (MoCap) are limited in capture space and susceptible to dynamic occlusions. Active camera approaches proactively control camera poses to find optimal viewpoint…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: ICLR 2023 poster

  11. arXiv:2212.08641  [pdf, other]

    cs.CV

    GFPose: Learning 3D Human Pose Prior with Gradient Fields

    Authors: Hai Ci, Mingdong Wu, Wentao Zhu, Xiaoxuan Ma, Hao Dong, Fangwei Zhong, Yizhou Wang

    Abstract: Learning 3D human pose prior is essential to human-centered AI. Here, we present GFPose, a versatile framework to model plausible 3D human poses for various applications. At the core of GFPose is a time-dependent score network, which estimates the gradient on each body joint and progressively denoises the perturbed 3D human pose to match a given task specification. During the denoising process, GF…

    Submitted 16 December, 2022; originally announced December 2022.

  12. arXiv:2103.15507  [pdf, other]

    cs.CV

    Context Modeling in 3D Human Pose Estimation: A Unified Perspective

    Authors: Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Hai Ci, Yizhou Wang

    Abstract: Estimating 3D human pose from a single image suffers from severe ambiguity since multiple 3D joint configurations may have the same 2D projection. The state-of-the-art methods often rely on context modeling methods such as pictorial structure model (PSM) or graph neural network (GNN) to reduce ambiguity. However, there is no study that rigorously compares them side by side. So we first present a g…

    Submitted 30 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.