Showing 201–250 of 887 results for author: Tao, D

  1. arXiv:2305.01443

    cs.CV cs.AI

    Scalable Mask Annotation for Video Text Spotting

    Authors: Haibin He, Jing Zhang, Mengyang Xu, Juhua Liu, Bo Du, Dacheng Tao

    Abstract: Video text spotting refers to localizing, recognizing, and tracking textual elements such as captions, logos, license plates, signs, and other forms of text within consecutive video frames. However, current datasets available for this task rely on quadrilateral ground truth annotations, which may result in including excessive background content and inaccurate text boundaries. Furthermore, methods…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Technical report. Work in progress

  2. arXiv:2305.00873

    cs.LG cs.CR cs.DC

    Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy

    Authors: Yifan Shi, Kang Wei, Li Shen, Yingqi Liu, Xueqian Wang, Bo Yuan, Dacheng Tao

    Abstract: To defend the inference attacks and mitigate the sensitive information leakages in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise. However, existing DPFL methods tend to make a sharp loss landscape and have poor weight perturbation robustness, resulting in severe performance de…

    Submitted 2 May, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: 20 pages. arXiv admin note: substantial text overlap with arXiv:2303.11242

  3. arXiv:2304.14593

    cs.CV

    Deep Graph Reprogramming

    Authors: Yongcheng Jing, Chongbin Yuan, Li Ju, Yiding Yang, Xinchao Wang, Dacheng Tao

    Abstract: In this paper, we explore a novel model reusing task tailored for graph neural networks (GNNs), termed as "deep graph reprogramming". We strive to reprogram a pre-trained GNN, without amending raw node features nor model parameters, to handle a bunch of cross-level downstream tasks in various domains. To this end, we propose an innovative Data Reprogramming paradigm alongside a Model Reprogramming…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: CVPR 2023 Highlight

  4. arXiv:2304.14070

    cs.CV cs.AI cs.LG

    Compositional 3D Human-Object Neural Animation

    Authors: Zhi Hou, Baosheng Yu, Dacheng Tao

    Abstract: Human-object interactions (HOIs) are crucial for human-centric scene understanding applications such as human-centric visual generation, AR/VR, and robotics. Since existing methods mainly explore capturing HOIs, rendering HOI remains less investigated. In this paper, we address this challenge in HOI animation from a compositional perspective, i.e., animating novel HOIs including novel interaction,…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: 14 pages, 6 figures

  5. FZ-GPU: A Fast and High-Ratio Lossy Compressor for Scientific Computing Applications on GPUs

    Authors: Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Yunhe Feng, Xin Liang, Dingwen Tao, Franck Cappello

    Abstract: Today's large-scale scientific applications running on high-performance computing (HPC) systems generate vast data volumes. Thus, data compression is becoming a critical technique to mitigate the storage burden and data-movement cost. However, existing lossy compressors for scientific data cannot achieve a high compression ratio and throughput simultaneously, hindering their adoption in many appli…

    Submitted 2 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: 14 pages, 12 figures, accepted by ACM HPDC '23

  6. arXiv:2304.11701

    cs.CV

    HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search

    Authors: Di Wang, Bo Du, Liangpei Zhang, Dacheng Tao

    Abstract: Recent neural architecture search (NAS) based approaches have made great progress in hyperspectral image (HSI) classification tasks. However, the architectures are usually optimized independently of the network weights, increasing searching time and restricting model performances. To tackle these issues, in this paper, different from previous methods that extra define structural parameters, we pro…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: Accepted by IEEE TNNLS. The code will be released at https://github.com/DotWang/HKNAS

  7. arXiv:2304.11595

    cs.CV cs.AI cs.LG

    Segment Anything in Non-Euclidean Domains: Challenges and Opportunities

    Authors: Yongcheng Jing, Xinchao Wang, Dacheng Tao

    Abstract: The recent work known as Segment Anything (SA) has made significant strides in pushing the boundaries of semantic segmentation into the era of foundation models. The impact of SA has sparked extremely active discussions and ushered in an encouraging new wave of developing foundation models for the diverse tasks in the Euclidean domain, such as object detection and image inpainting. Despite the pro…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: Work in progress

  8. DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification

    Authors: Di Wang, Jing Zhang, Bo Du, Liangpei Zhang, Dacheng Tao

    Abstract: Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions. Prior methods suffer from limited representation ability, as they train specially designed networks from scratch on limited annotated data. We propose a tri-spectral image generation pipeline that transforms HSI into high-quality tri-spectral images, enabling the use of off-the-…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: Accepted by IEEE TIP. The code will be released at https://github.com/DotWang/DCN-T

  9. arXiv:2304.09793

    cs.CV cs.RO

    Event-based Simultaneous Localization and Mapping: A Comprehensive Survey

    Authors: Kunping Huang, Sen Zhang, Jing Zhang, Dacheng Tao

    Abstract: In recent decades, visual simultaneous localization and mapping (vSLAM) has gained significant interest in both academia and industry. It estimates camera motion and reconstructs the environment concurrently using visual sensors on a moving robot. However, conventional cameras are limited by hardware, including motion blur and low dynamic range, which can negatively impact performance in challengi…

    Submitted 22 March, 2024; v1 submitted 19 April, 2023; originally announced April 2023.

  10. arXiv:2304.08393

    gr-qc astro-ph.CO astro-ph.HE

    Search for gravitational-lensing signatures in the full third observing run of the LIGO-Virgo network

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin, et al. (1670 additional authors not shown)

    Abstract: Gravitational lensing by massive objects along the line of sight to the source causes distortions of gravitational-wave signals; such distortions may reveal information about fundamental physics, cosmology and astrophysics. In this work, we have extended the search for lensing signatures to all binary black hole events from the third observing run of the LIGO–Virgo network. We search for repeated…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: 28 pages, 11 figures

    Report number: LIGO-P2200031

  11. GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs

    Authors: Boyuan Zhang, Jiannan Tian, Sheng Di, Xiaodong Yu, Martin Swany, Dingwen Tao, Franck Cappello

    Abstract: Today's graphics processing unit (GPU) applications produce vast volumes of data, which are challenging to store and transfer efficiently. Thus, data compression is becoming a critical technique to mitigate the storage burden and communication cost. LZSS is the core algorithm in many widely used compressors, such as Deflate. However, existing GPU-based LZSS compressors suffer from low throughput d…

    Submitted 2 May, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: 12 pages, 9 figures, 3 tables, accepted by ACM ICS '23

  12. HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs

    Authors: Chengming Zhang, Shaden Smith, Baixi Sun, Jiannan Tian, Jonathan Soifer, Xiaodong Yu, Shuaiwen Leon Song, Yuxiong He, Dingwen Tao

    Abstract: Collaborative filtering (CF) has been proven to be one of the most effective techniques for recommendation. Among all CF approaches, SimpleX is the state-of-the-art method that adopts a novel loss function and a proper number of negative samples. However, there is no work that optimizes SimpleX on multi-core CPUs, leading to limited performance. To this end, we perform an in-depth profiling and an…

    Submitted 3 May, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: 12 pages, 14 figures, 7 tables, accepted by ACM ICS '23

  13. arXiv:2304.06969

    cs.CV

    UVA: Towards Unified Volumetric Avatar for View Synthesis, Pose rendering, Geometry and Texture Editing

    Authors: Jinlong Fan, Jing Zhang, Dacheng Tao

    Abstract: Neural radiance field (NeRF) has become a popular 3D representation method for human avatar reconstruction due to its high-quality rendering capabilities, e.g., regarding novel views and poses. However, previous methods for editing the geometry and appearance of the avatar only allow for global editing through body shape parameters and 2D texture maps. In this paper, we propose a new approach name…

    Submitted 14 April, 2023; originally announced April 2023.

  14. arXiv:2304.04672

    cs.CV

    Deep Image Matting: A Comprehensive Survey

    Authors: Jizhizi Li, Jing Zhang, Dacheng Tao

    Abstract: Image matting refers to extracting precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. Despite being an ill-posed problem, traditional methods have been trying to solve it for decades. The emergence of deep learning has revolutionized the field of image matting and given birth to multiple new techniques, including automat…

    Submitted 10 April, 2023; originally announced April 2023.

  15. arXiv:2304.03589

    cs.LG cs.AI cs.DC

    On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

    Authors: Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao

    Abstract: The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech. The use of large-scale models trained on vast amounts of data holds immense promise for practical applications, enhancing industrial productivity and facilitating social development. With the increasing demands on computational capacity, though numerou…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: 60 pages

  16. arXiv:2304.02480

    quant-ph cs.LG

    Quantum Imitation Learning

    Authors: Zhihao Cheng, Kaining Zhang, Li Shen, Dacheng Tao

    Abstract: Despite remarkable successes in solving various complex decision-making tasks, training an imitation learning (IL) algorithm with deep neural networks (DNNs) suffers from the high computation burden. In this work, we propose quantum imitation learning (QIL) with a hope to utilize quantum advantage to speed up IL. Concretely, we develop two QIL algorithms, quantum behavioural cloning (Q-BC) and qua…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: Manuscript submitted to a journal for review on January 5, 2022

  17. arXiv:2304.00948

    cs.CV

    VTAE: Variational Transformer Autoencoder with Manifolds Learning

    Authors: Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Dacheng Tao, Xuelong Li

    Abstract: Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables and these models use a nonlinear function (generator) to map latent samples into the data space. On the other hand, the nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor…

    Submitted 3 April, 2023; originally announced April 2023.

  18. arXiv:2303.16818

    cs.CV

    SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection

    Authors: Haimei Zhao, Qiming Zhang, Shanshan Zhao, Zhe Chen, Jing Zhang, Dacheng Tao

    Abstract: Multi-view camera-based 3D object detection has become popular due to its low cost, but accurately inferring 3D geometry solely from camera data remains challenging and may lead to inferior performance. Although distilling precise 3D geometry knowledge from LiDAR data could help tackle this challenge, the benefits of LiDAR information could be greatly hindered by the significant modality gap betwe…

    Submitted 8 January, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: Accepted by AAAI 2024

  19. arXiv:2303.15105

    cs.CV

    Vision Transformer with Quadrangle Attention

    Authors: Qiming Zhang, Jing Zhang, Yufei Xu, Dacheng Tao

    Abstract: Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and less memory footprint. However, the design of hand-crafted windows, which is data-agnostic, constrains the flexibility of transformers to adapt to objects of varying sizes, shapes, and orientations. To address this issue, we propose a novel quadrangle atten…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: 15 pages, the extension of the ECCV 2022 paper (VSA: Learning Varied-Size Window Attention in Vision Transformers)

  20. arXiv:2303.13809

    cs.CL

    Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models

    Authors: Qingyu Lu, Baopu Qiu, Liang Ding, Kanjian Zhang, Tom Kocmi, Dacheng Tao

    Abstract: Generative large language models (LLMs), e.g., ChatGPT, have demonstrated remarkable proficiency across several NLP tasks, such as machine translation, text summarization. Recent research (Kocmi and Federmann, 2023) has shown that utilizing LLMs for assessing the quality of machine translation (MT) achieves state-of-the-art performance at the system level but performs poorly at the segment…

    Submitted 5 June, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Findings of ACL 2024

  21. arXiv:2303.13780

    cs.CL

    Towards Making the Most of ChatGPT for Machine Translation

    Authors: Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, Dacheng Tao

    Abstract: ChatGPT shows remarkable capabilities for machine translation (MT). Several prior studies have shown that it achieves comparable results to commercial systems for high-resource languages, but lags behind in complex tasks, e.g., low-resource and distant-language-pairs translation. However, they usually adopt simple prompts which can not fully elicit the capability of ChatGPT. In this paper, we aim…

    Submitted 20 October, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: EMNLP 2023 (findings)

  22. arXiv:2303.11242

    cs.LG cs.CR cs.CV

    Make Landscape Flatter in Differentially Private Federated Learning

    Authors: Yifan Shi, Yingqi Liu, Kang Wei, Li Shen, Xueqian Wang, Dacheng Tao

    Abstract: To defend the inference attacks and mitigate the sensitive information leakages in Federated Learning (FL), client-level Differentially Private FL (DPFL) is the de-facto standard for privacy protection by clipping local updates and adding random noise. However, existing DPFL methods tend to make a sharper loss landscape and have poorer weight perturbation robustness, resulting in severe performanc…

    Submitted 26 June, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: CVPR2023

  23. arXiv:2303.11183

    cs.LG cs.AI cs.CV

    Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning

    Authors: Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, Dacheng Tao

    Abstract: The goal of data-free meta-learning is to learn useful prior knowledge from a collection of pre-trained models without accessing their training data. However, existing works only solve the problem in parameter space, which (i) ignore the fruitful data knowledge contained in the pre-trained models; (ii) can not scale to large-scale pre-trained models; (iii) can only meta-learn pre-trained models wi…

    Submitted 19 June, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  24. arXiv:2303.10559

    cs.CV

    Deep Learning for Camera Calibration and Beyond: A Survey

    Authors: Kang Liao, Lang Nie, Shujuan Huang, Chunyu Lin, Jing Zhang, Yao Zhao, Moncef Gabbouj, Dacheng Tao

    Abstract: Camera calibration involves estimating camera parameters to infer geometric features from captured sequences, which is crucial for computer vision and robotics. However, conventional calibration is laborious and requires dedicated collection. Recent efforts show that learning-based solutions have the potential to be used in place of the repeatability works of manual calibrations. Among these solut…

    Submitted 4 June, 2024; v1 submitted 19 March, 2023; originally announced March 2023.

    Comments: Github repository: https://github.com/KangLiao929/Awesome-Deep-Camera-Calibration

  25. arXiv:2303.08678

    cs.LG cs.CV cs.DC

    Visual Prompt Based Personalized Federated Learning

    Authors: Guanghao Li, Wansen Wu, Yan Sun, Li Shen, Baoyuan Wu, Dacheng Tao

    Abstract: As a popular paradigm of distributed learning, personalized federated learning (PFL) allows personalized models to improve generalization ability and robustness by utilizing knowledge from all distributed clients. Most existing PFL algorithms tackle personalization in a model-centric way, such as personalized layer partition, model regularization, and model interpolation, which all fail to take in…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: 14 pages

  26. arXiv:2303.08566

    cs.CV cs.AI cs.LG

    Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

    Authors: Haoyu He, Jianfei Cai, Jing Zhang, Dacheng Tao, Bohan Zhuang

    Abstract: Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative for full fine-tuning so as to adapt pre-trained vision models to downstream tasks, which only tunes a small number of parameters while freezing the vast majority ones to ease storage burden and optimization difficulty. However, existing PEFT methods introduce trainable parameters to the same positions across different…

    Submitted 31 August, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: ICCV 2023 Oral

  27. arXiv:2303.07347

    cs.CV cs.AI cs.MM

    TriDet: Temporal Action Detection with Relative Boundary Modeling

    Authors: Dingfeng Shi, Yujie Zhong, Qiong Cao, Lin Ma, Jia Li, Dacheng Tao

    Abstract: In this paper, we present a one-stage framework TriDet for temporal action detection. Existing methods often suffer from imprecise boundary predictions due to the ambiguous action boundaries in videos. To alleviate this problem, we propose a novel Trident-head to model the action boundary via an estimated relative probability distribution around the boundary. In the feature pyramid of TriDet, we p…

    Submitted 16 March, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: CVPR2023; Temporal Action Detection; Temporal Action Localization

  28. arXiv:2303.07110

    cs.CV cs.AI cs.LG

    Upcycling Models under Domain and Category Shift

    Authors: Sanqing Qu, Tianpei Zou, Florian Roehrbein, Cewu Lu, Guang Chen, Dacheng Tao, Changjun Jiang

    Abstract: Deep neural networks (DNNs) often perform poorly in the presence of domain shift and category shift. How to upcycle DNNs and adapt them to the target task remains an important open problem. Unsupervised Domain Adaptation (UDA), especially recently proposed Source-free Domain Adaptation (SFDA), has become a promising technology to address this issue. Nevertheless, existing SFDA methods require that…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: To appear in CVPR 2023. The code has been made public

  29. arXiv:2303.06682

    cs.CV eess.IV

    DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration

    Authors: Yuchun Miao, Lefei Zhang, Liangpei Zhang, Dacheng Tao

    Abstract: Diffusion models have recently received a surge of interest due to their impressive performance for image restoration, especially in terms of noise robustness. However, existing diffusion-based methods are trained on a large amount of training data and perform very well in-distribution, but can be quite susceptible to distribution shift. This is especially inappropriate for data-starved hyperspect…

    Submitted 19 March, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: 11 pages, 5 figures

  30. arXiv:2303.04664

    cs.CV

    Centroid-centered Modeling for Efficient Vision Transformer Pre-training

    Authors: Xin Yan, Zuchao Li, Lefei Zhang, Bo Du, Dacheng Tao

    Abstract: Masked Image Modeling (MIM) is a new self-supervised vision pre-training paradigm using Vision Transformer (ViT). Previous works can be pixel-based or token-based, using original pixels or discrete visual tokens from parametric tokenizer models, respectively. Our proposed approach, CCViT, leverages k-means clustering to obtain centroids for image modeling without supervised training of to…

    Submitted 8 March, 2023; originally announced March 2023.

  31. arXiv:2303.03747

    cs.LG cs.AI

    Graph Decision Transformer

    Authors: Shengchao Hu, Li Shen, Ya Zhang, Dacheng Tao

    Abstract: Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies from static trajectory data without interacting with the environment. Recently, offline RL has been viewed as a sequence modeling problem, where an agent generates a sequence of subsequent actions based on a set of static transition experiences. However, existing approaches that use transformers to atte…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: 14 pages

  32. arXiv:2303.02570

    cs.LG cs.AI

    Time Associated Meta Learning for Clinical Prediction

    Authors: Hao Liu, Muhan Zhang, Zehao Dong, Lecheng Kong, Yixin Chen, Bradley Fritz, Dacheng Tao, Christopher King

    Abstract: Rich Electronic Health Records (EHR) have created opportunities to improve clinical processes using machine learning methods. Prediction of the same patient events at different time horizons can have very different applications and interpretations; however, limited number of events in each potential time window hurts the effectiveness of conventional machine learning algorithms. We propose a nove…

    Submitted 4 March, 2023; originally announced March 2023.

  33. ESceme: Vision-and-Language Navigation with Episodic Scene Memory

    Authors: Qi Zheng, Daqing Liu, Chaoyue Wang, Jing Zhang, Dadong Wang, Dacheng Tao

    Abstract: Vision-and-language navigation (VLN) simulates a visual agent that follows natural-language navigation instructions in real-world scenes. Existing approaches have made enormous progress in navigation in new environments, such as beam search, pre-exploration, and dynamic or hierarchical history encoding. To balance generalization and efficiency, we resort to memorizing visited scenarios apart from…

    Submitted 15 July, 2024; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted by IJCV

  34. arXiv:2303.00565

    cs.LG cs.DC math.OC

    AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

    Authors: Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao

    Abstract: Sharpness aware minimization (SAM) optimizer has been extensively explored as it can generalize better for training deep neural networks via introducing extra perturbation steps to flatten the landscape of deep learning models. Integrating SAM with adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically to train large-scale deep neural networks withou…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 18 pages

  35. arXiv:2303.00501

    cs.LG cs.AI

    OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System

    Authors: Chao Xue, Wei Liu, Shuai Xie, Zhenfang Wang, Jiaxing Li, Xuyang Peng, Liang Ding, Shanshan Zhao, Qiong Cao, Yibo Yang, Fengxiang He, Bohua Cai, Rongcheng Bian, Yiyan Zhao, Heliang Zheng, Xiangyang Liu, Dongkai Liu, Daqing Liu, Li Shen, Chang Li, Shijin Zhang, Yukang Zhang, Guanpu Chen, Shixiang Chen, Yibing Zhan, et al. (3 additional authors not shown)

    Abstract: Automated machine learning (AutoML) seeks to build ML models with minimal human effort. While considerable research has been conducted in the area of AutoML in general, aiming to take humans out of the loop when building artificial intelligence (AI) applications, scant literature has focused on how AutoML works well in open-environment scenarios such as the process of training and updating large m…

    Submitted 8 July, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  36. arXiv:2302.12448

    cs.LG cs.CR cs.DC math.OC

    Subspace based Federated Unlearning

    Authors: Guanghao Li, Li Shen, Yan Sun, Yue Hu, Han Hu, Dacheng Tao

    Abstract: Federated learning (FL) enables multiple clients to train a machine learning model collaboratively without exchanging their local data. Federated unlearning is an inverse FL process that aims to remove a specified target client's contribution in FL to satisfy the user's right to be forgotten. Most existing federated unlearning algorithms require the server to store the history of the parameter upd…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 12 pages

  37. arXiv:2302.11085

    cs.LG stat.ML

    Learning to Generalize Provably in Learning to Optimize

    Authors: Junjie Yang, Tianlong Chen, Mingkang Zhu, Fengxiang He, Dacheng Tao, Yingbin Liang, Zhangyang Wang

    Abstract: Learning to optimize (L2O) has gained increasing popularity, which automates the design of optimizers by data-driven approaches. However, current L2O methods often suffer from poor generalization performance in at least two folds: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or "generalizable learning of op…

    Submitted 28 March, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: This paper is accepted in AISTATS 2023

  38. arXiv:2302.11051

    cs.LG

    Fusion of Global and Local Knowledge for Personalized Federated Learning

    Authors: Tiansheng Huang, Li Shen, Yan Sun, Weiwei Lin, Dacheng Tao

    Abstract: Personalized federated learning, as a variant of federated learning, trains customized models for clients using their heterogeneously distributed data. However, it is still inconclusive about how to design personalized models with better representation of shared global knowledge and personalized pattern. To bridge the gap, we in this paper explore personalized models with low-rank and sparse decom…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted by TMLR

  39. arXiv:2302.10429

    cs.LG cs.DC math.OC

    FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

    Authors: Yan Sun, Li Shen, Tiansheng Huang, Liang Ding, Dacheng Tao

    Abstract: Federated learning is an emerging distributed machine learning framework which jointly trains a global model via a large number of local devices with data privacy protections. Its performance suffers from the non-vanishing biases introduced by the local inconsistent optimal and the rugged client-drifts by the local over-fitting. In this paper, we propose a novel and practical method, FedSpeed, to…

    Submitted 5 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: ICLR 2023

  40. arXiv:2302.10198

    cs.CL

    Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT

    Authors: Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao

    Abstract: Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries. Several prior studies have shown that ChatGPT attains remarkable generation ability compared with existing models. However, the quantitative analysis of ChatGPT's understanding ability has been given little attention. In this report, we explore the understanding ability of Chat…

    Submitted 2 March, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: Work in progress. Added results of advanced prompting strategies, e.g., CoT. (19 pages)

  41. arXiv:2302.09532

    cs.LG cs.AI

    Pseudo Contrastive Learning for Graph-based Semi-supervised Learning

    Authors: Weigang Lu, Ziyu Guan, Wei Zhao, Yaming Yang, Yuanhai Lv, Lining Xing, Baosheng Yu, Dacheng Tao

    Abstract: Pseudo Labeling is a technique used to improve the performance of semi-supervised Graph Neural Networks (GNNs) by generating additional pseudo-labels based on confident predictions. However, the quality of generated pseudo-labels has been a longstanding concern due to the sensitivity of the classification objective with respect to the given labels. To avoid the untrustworthy classification supervi…

    Submitted 18 December, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: Under Review

  42. arXiv:2302.09491

    cs.CR cs.AI cs.CV

    X-Adv: Physical Adversarial Object Attacks against X-ray Prohibited Item Detection

    Authors: Aishan Liu, Jun Guo, Jiakai Wang, Siyuan Liang, Renshuai Tao, Wenbo Zhou, Cong Liu, Xianglong Liu, Dacheng Tao

    Abstract: Adversarial attacks are valuable for evaluating the robustness of deep learning models. Existing attacks are primarily conducted on the visible light spectrum (e.g., pixel-wise texture perturbation). However, attacks targeting texture-free X-ray images remain underexplored, despite the widespread application of X-ray imaging in safety-critical scenarios such as the X-ray detection of prohibited it…

    Submitted 19 February, 2023; originally announced February 2023.

    Comments: Accepted by USENIX Security 2023

  43. arXiv:2302.09268  [pdf, other]

    cs.CL

    Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

    Authors: Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, Dacheng Tao

    Abstract: This technical report briefly describes our JDExplore d-team's submission Vega v1 on the General Language Understanding Evaluation (GLUE) leaderboard, where GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference. [Method] We investigate sever…

    Submitted 18 February, 2023; originally announced February 2023.

    Comments: Technical report. arXiv admin note: text overlap with arXiv:2212.01853

  44. arXiv:2302.08890  [pdf, other]

    cs.CV

    Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks

    Authors: Xu Zheng, Yexin Liu, Yunfan Lu, Tongyan Hua, Tianbo Pan, Weiming Zhang, Dacheng Tao, Lin Wang

    Abstract: Event cameras are bio-inspired sensors that capture the per-pixel intensity changes asynchronously and produce event streams encoding the time, pixel position, and polarity (sign) of the intensity changes. Event cameras possess a myriad of advantages over canonical frame-based cameras, such as high temporal resolution, high dynamic range, low latency, etc. Being capable of capturing information in…

    Submitted 11 April, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

  45. arXiv:2302.07450  [pdf, other]

    cs.LG cs.CR

    FedABC: Targeting Fair Competition in Personalized Federated Learning

    Authors: Dui Wang, Li Shen, Yong Luo, Han Hu, Kehua Su, Yonggang Wen, Dacheng Tao

    Abstract: Federated learning aims to collaboratively train models without accessing their client's local private data. The data may be Non-IID for different clients and thus resulting in poor performance. Recently, personalized federated learning (PFL) has achieved great success in handling Non-IID data by enforcing regularization in local optimization or improving the model aggregation scheme on the server…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: 9 pages, 5 figures

    Journal ref: AAAI2023

  46. arXiv:2302.05726  [pdf, other]

    eess.SY

    Enhance Local Consistency in Federated Learning: A Multi-Step Inertial Momentum Approach

    Authors: Yixing Liu, Yan Sun, Zhengtao Ding, Li Shen, Bo Liu, Dacheng Tao

    Abstract: Federated learning (FL), as a collaborative distributed training paradigm with several edge computing devices under the coordination of a centralized server, is plagued by inconsistent local stationary points due to the heterogeneity of the local partial participation clients, which precipitates the local client-drifts problems and sparks off the unstable and slow convergence, especially on the ag…

    Submitted 11 February, 2023; originally announced February 2023.

  47. arXiv:2302.05201  [pdf, other]

    cs.CV

    PointWavelet: Learning in Spectral Domain for 3D Point Cloud Analysis

    Authors: Cheng Wen, Jianzhi Long, Baosheng Yu, Dacheng Tao

    Abstract: With recent success of deep learning in 2D visual recognition, deep learning-based 3D point cloud analysis has received increasing attention from the community, especially due to the rapid development of autonomous driving technologies. However, most existing methods directly learn point features in the spatial domain, leaving the local structures in the spectral domain poorly investigated. In thi…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: 10 pages

  48. arXiv:2302.04083  [pdf, other]

    cs.LG cs.DC math.OC

    Improving the Model Consistency of Decentralized Federated Learning

    Authors: Yifan Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, Dacheng Tao

    Abstract: To mitigate the privacy leakages and communication burdens of Federated Learning (FL), decentralized FL (DFL) discards the central server and each client only communicates with its neighbors in a decentralized communication network. However, existing DFL suffers from high inconsistency among local clients, which results in severe distribution shift and inferior performance compared with centralize…

    Submitted 9 June, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: ICML2023

  49. Open data from the third observing run of LIGO, Virgo, KAGRA and GEO

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, A. Allocca, et al. (1719 additional authors not shown)

    Abstract: The global network of gravitational-wave observatories now includes five detectors, namely LIGO Hanford, LIGO Livingston, Virgo, KAGRA, and GEO 600. These detectors collected data during their third observing run, O3, composed of three phases: O3a starting in April of 2019 and lasting six months, O3b starting in November of 2019 and lasting five months, and O3GK starting in April of 2020 and lasti…

    Submitted 7 February, 2023; originally announced February 2023.

    Comments: 27 pages, 3 figures

    Report number: LIGO-P2200316

  50. arXiv:2302.03397  [pdf, other]

    cs.CV

    AniPixel: Towards Animatable Pixel-Aligned Human Avatar

    Authors: Jinlong Fan, Jing Zhang, Zhi Hou, Dacheng Tao

    Abstract: Although human reconstruction typically results in human-specific avatars, recent 3D scene reconstruction techniques utilizing pixel-aligned features show promise in generalizing to new scenes. Applying these techniques to human avatar reconstruction can result in a volumetric avatar with generalizability but limited animatability due to rendering only being possible for static representations. In…

    Submitted 17 October, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: Accepted by MM'23, code will be released at https://github.com/loong8888/AniPixel