Showing 1–50 of 203 results for author: Duan, L

  1. arXiv:2407.11494  [pdf, other]

    cs.CV

    Learning Semantic Latent Directions for Accurate and Controllable Human Motion Prediction

    Authors: Guowei Xu, Jiale Tao, Wen Li, Lixin Duan

    Abstract: In the realm of stochastic human motion prediction (SHMP), researchers have often turned to generative models like GANs, VAEs and diffusion models. However, most previous approaches have struggled to accurately predict motions that are both realistic and coherent with past motion due to a lack of guidance on the latent distribution. In this paper, we introduce Semantic Latent Directions (SLD) as a…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  2. arXiv:2407.08303  [pdf, other]

    cs.CV cs.AI

    DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

    Authors: Xiaotong Li, Fan Zhang, Haiwen Diao, Yueze Wang, Xinlong Wang, Ling-Yu Duan

    Abstract: Existing Multimodal Large Language Models (MLLMs) increasingly emphasize complex understanding of various visual elements, including multiple objects, text information, and spatial relations. Their development for comprehensive visual perception hinges on the availability of high-quality image-text datasets that offer diverse visual elements and thorough image descriptions. However, the scarcity…

    Submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2407.06642  [pdf, other]

    cs.CV

    Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning

    Authors: Fanyue Wei, Wei Zeng, Zhenyang Li, Dawei Yin, Lixin Duan, Wen Li

    Abstract: Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images). While remarkable results have been achieved using diffusion-based generation models, the visual structure and details of the object are often unexpectedly changed during the diffusion process. One major reason is that these diffusio…

    Submitted 18 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  4. arXiv:2407.03842  [pdf, other]

    cs.CV

    Beyond Viewpoint: Robust 3D Object Recognition under Arbitrary Views through Joint Multi-Part Representation

    Authors: Linlong Fan, Ye Huang, Yanqi Ge, Wen Li, Lixin Duan

    Abstract: Existing view-based methods excel at recognizing 3D objects from predefined viewpoints, but their exploration of recognition under arbitrary views is limited. This is a challenging and realistic setting because each object has different viewpoint positions and quantities, and their poses are not aligned. However, most view-based methods, which aggregate multiple view features to obtain a global fe…

    Submitted 17 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: ECCV 2024 camera ready

  5. arXiv:2407.01017  [pdf, other]

    cs.CV

    Coding for Intelligence from the Perspective of Category

    Authors: Wenhan Yang, Zixuan Hu, Lilang Lin, Jiaying Liu, Ling-Yu Duan

    Abstract: Coding, which targets compressing and reconstructing data, and intelligence, often regarded at an abstract computational level as being centered around model learning and prediction, interweave recently to give birth to a series of significant progress. The recent trends demonstrate the potential homogeneity of these two fields, especially when deep-learning models aid these two categories for bet…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  6. arXiv:2406.16437  [pdf, other]

    cs.LG cs.AI

    Theory on Mixture-of-Experts in Continual Learning

    Authors: Hongbo Li, Sen Lin, Lingjie Duan, Yingbin Liang, Ness B. Shroff

    Abstract: Continual learning (CL) has garnered significant attention because of its ability to adapt to new tasks that arrive over time. Catastrophic forgetting (of old tasks) has been identified as a major issue in CL, as the model adapts to new tasks. The Mixture-of-Experts (MoE) model has recently been shown to effectively mitigate catastrophic forgetting in CL, by employing a gating network to sparsify…

    Submitted 24 June, 2024; originally announced June 2024.

  7. arXiv:2406.14941  [pdf, other]

    cs.CV

    Brightearth roads: Towards fully automatic road network extraction from satellite imagery

    Authors: Liuyun Duan, Willard Mapurisa, Maxime Leras, Leigh Lotter, Yuliya Tarabalka

    Abstract: The modern road network topology comprises intricately designed structures that introduce complexity when automatically reconstructing road networks. While open resources like OpenStreetMap (OSM) offer road networks with well-defined topology, they may not always be up to date worldwide. In this paper, we propose a fully automated pipeline for extracting road networks from very-high-resolution (VH…

    Submitted 21 June, 2024; originally announced June 2024.

    Journal ref: IGARSS 2024, Jul 2024, Athens, Greece

  8. arXiv:2405.16447  [pdf, other]

    cs.LG

    Fast Asymmetric Factorization for Large Scale Multiple Kernel Clustering

    Authors: Yan Chen, Liang Du, Lei Duan

    Abstract: Kernel methods are extensively employed for nonlinear data clustering, yet their effectiveness heavily relies on selecting suitable kernels and associated parameters, posing challenges in advance determination. In response, Multiple Kernel Clustering (MKC) has emerged as a solution, allowing the fusion of information from multiple base kernels for clustering. However, both early fusion and late fu…

    Submitted 26 May, 2024; originally announced May 2024.

  9. arXiv:2405.03031  [pdf, other]

    cs.GT

    Distributed Learning for Dynamic Congestion Games

    Authors: Hongbo Li, Lingjie Duan

    Abstract: Today mobile users learn and share their traffic observations via crowdsourcing platforms (e.g., Google Maps and Waze). Yet such platforms myopically recommend the currently shortest path to users, and selfish users are unwilling to travel to longer paths of varying traffic conditions to explore. Prior studies focus on one-shot congestion games without information learning, while our work studies…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by IEEE ISIT 2024. arXiv admin note: substantial text overlap with arXiv:2404.15599

  10. Human-in-the-loop Learning for Dynamic Congestion Games

    Authors: Hongbo Li, Lingjie Duan

    Abstract: Today mobile users learn and share their traffic observations via crowdsourcing platforms (e.g., Waze). Yet such platforms simply cater to selfish users' myopic interests to recommend the shortest path, and do not encourage enough users to travel and learn other paths for future others. Prior studies focus on one-shot congestion games without considering users' information learning, while our work…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted by IEEE Transactions on Mobile Computing (2024)

  11. arXiv:2404.15159  [pdf, other]

    cs.CL cs.AI

    MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts

    Authors: Dengchun Li, Yingzi Ma, Naizheng Wang, Zhengmao Ye, Zhiyuan Cheng, Yinghao Tang, Yan Zhang, Lei Duan, Jie Zuo, Cal Yang, Mingjie Tang

    Abstract: Fine-tuning Large Language Models (LLMs) is a common practice to adapt pre-trained models for specific applications. While methods like LoRA have effectively addressed GPU memory constraints during fine-tuning, their performance often falls short, especially in multi-task scenarios. In contrast, Mixture-of-Expert (MoE) models, such as Mixtral 8x7B, demonstrate remarkable performance in multi-task…

    Submitted 23 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures

  12. arXiv:2404.11903  [pdf, other]

    cs.CV

    Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

    Authors: Xunsong Li, Pengzhan Sun, Yangcen Liu, Lixin Duan, Wen Li

    Abstract: The interactions between humans and objects are important for recognizing object-centric actions. Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector, and then are fed to an action recognition model for extracting video features and learning the object relations for action recognition. However, since the action prior is unknown…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 12 pages, 5 figures, submitted to IEEE Transactions on Multimedia

  13. arXiv:2404.06835  [pdf, other]

    cs.CV

    Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer

    Authors: Yanqi Ge, Jiaqi Liu, Qingnan Fan, Xi Jiang, Ye Huang, Shuai Qin, Hong Gu, Wen Li, Lixin Duan

    Abstract: In this work, we target the task of text-driven style transfer in the context of text-to-image (T2I) diffusion models. The main challenge is consistent structure preservation while enabling effective style transfer effects. The past approaches in this field directly concatenate the content and style prompts for a prompt-level style injection, leading to unavoidable structure distortions. In this w…

    Submitted 10 April, 2024; originally announced April 2024.

  14. arXiv:2403.13831  [pdf]

    cs.ET cs.HC physics.optics

    Dual-sided transparent display

    Authors: Suman Halder, Yunho Shin, Yidan Peng, Long Wang, Liye Duan, Paul Schmalenberg, Guangkui Qin, Yuxi Gao, Ercan M. Dede, Deng-Ke Yang, Sean P. Rodrigues

    Abstract: In the past decade, display technology has been reimagined to meet the needs of the virtual world. By mapping information onto a scene through a transparent display, users can simultaneously visualize both the real world and layers of virtual elements. However, advances in augmented reality (AR) technology have primarily focused on wearable gear or personal devices. Here we present a single displa…

    Submitted 7 March, 2024; originally announced March 2024.

  15. arXiv:2403.11113  [pdf, other]

    cs.CV

    Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis

    Authors: Yiyang Chen, Lunhao Duan, Shanshan Zhao, Changxing Ding, Dacheng Tao

    Abstract: Rotation invariance is an important requirement for point shape analysis. To achieve this, current state-of-the-art methods attempt to construct the local rotation-invariant representation through learning or defining the local reference frame (LRF). Although efficient, these LRF-based methods suffer from perturbation of local geometric relations, resulting in suboptimal local rotation invariance.…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  16. arXiv:2401.14686  [pdf, other]

    cs.CV

    SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation

    Authors: Yanqi Ge, Ye Huang, Wen Li, Lixin Duan

    Abstract: We introduced SSR, which utilizes SAM (segment-anything) as a strong regularizer during training, to greatly enhance the robustness of the image encoder for handling various domains. Specifically, given the fact that SAM is pre-trained with a large number of images over the internet, which cover a diverse variety of domains, the feature encoding extracted by the SAM is obviously less dependent on…

    Submitted 26 January, 2024; originally announced January 2024.

  17. arXiv:2401.08061  [pdf, other]

    cs.LG cs.CV

    Augmenting Ground-Level PM2.5 Prediction via Kriging-Based Pseudo-Label Generation

    Authors: Lei Duan, Ziyang Jiang, David Carlson

    Abstract: Fusing abundant satellite data with sparse ground measurements constitutes a major challenge in climate modeling. To address this, we propose a strategy to augment the training dataset by introducing unlabeled satellite images paired with pseudo-labels generated through a spatial interpolation technique known as ordinary kriging, thereby making full use of the available satellite data resources. W…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 8 pages, 4 figures, NeurIPS 2023 Workshop: Tackling Climate Change with Machine Learning

  18. arXiv:2312.11872  [pdf, other]

    cs.CV

    Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

    Authors: Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

    Abstract: One of the ultimate goals of representation learning is to achieve compactness within a class and well-separability between classes. Many outstanding metric-based and prototype-based methods following the Expectation-Maximization paradigm have been proposed for this objective. However, they inevitably introduce biases into the learning process, particularly with long-tail distributed training dat…

    Submitted 4 February, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  19. arXiv:2312.11112  [pdf, other]

    cs.CV

    ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding

    Authors: Lunhao Duan, Shanshan Zhao, Nan Xue, Mingming Gong, Gui-Song Xia, Dacheng Tao

    Abstract: Transformers have been recently explored for 3D point cloud understanding with impressive progress achieved. A large number of points, over 0.1 million, make the global self-attention infeasible for point cloud data. Thus, most methods propose to apply the transformer in a local region, e.g., spherical or cubic window. However, it still contains a large number of Query-Key pairs, which requires hi…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023. Code: https://github.com/LHDuan/ConDaFormer

  20. arXiv:2312.02515  [pdf, other]

    cs.LG cs.AI

    ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU

    Authors: Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang

    Abstract: Transformer-based large language models (LLMs) have demonstrated outstanding performance across diverse domains, particularly when fine-tuned for specific domains. Recent studies suggest that the resources required for fine-tuning LLMs can be economized through parameter-efficient methods such as Low-Rank Adaptation (LoRA). While LoRA effectively reduces computational burdens and resource demands…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 14 pages, 14 figures

  21. arXiv:2311.14281  [pdf, ps, other]

    cs.CV

    Multi-modal Instance Refinement for Cross-domain Action Recognition

    Authors: Yuan Qing, Naixing Wu, Shaohua Wan, Lixin Duan

    Abstract: Unsupervised cross-domain action recognition aims at adapting the model trained on an existing labeled source domain to a new unlabeled target domain. Most existing methods solve the task by directly aligning the feature distributions of source and target domains. However, this would cause negative transfer during domain adaptation due to some negative training samples in both domains. In the sour…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted by PRCV 2023

  22. arXiv:2311.10292  [pdf, other]

    quant-ph cs.ET physics.optics

    Realization of a programmable multi-purpose photonic quantum memory with over-thousand qubit manipulations

    Authors: Sheng Zhang, Jixuan Shi, Zhaibin Cui, Ye Wang, Yukai Wu, Luming Duan, Yunfei Pu

    Abstract: Quantum networks can enable various applications such as distributed quantum computing, long-distance quantum communication, and network-based quantum sensing with unprecedented performances. One of the most important building blocks for a quantum network is a photonic quantum memory which serves as the interface between the communication channel and the local functional unit. A programmable quant…

    Submitted 29 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: 17 pages, 19 figures

    Journal ref: Phys. Rev. X 14, 021018 (2024)

  23. arXiv:2311.07896  [pdf, other]

    physics.flu-dyn cs.LG

    Bayesian Conditional Diffusion Models for Versatile Spatiotemporal Turbulence Generation

    Authors: Han Gao, Xu Han, Xiantao Fan, Luning Sun, Li-Ping Liu, Lian Duan, Jian-Xun Wang

    Abstract: Turbulent flows have historically presented formidable challenges to predictive computational modeling. Traditional numerical simulations often require vast computational resources, making them infeasible for numerous engineering applications. As an alternative, deep learning-based surrogate models have emerged, offering data-driven solutions. However, these are typically constructed within determi…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 37 pages, 31 figures

  24. arXiv:2310.18596  [pdf, other]

    cs.CR cs.SI

    How Hard is Takeover in DPoS Blockchains? Understanding the Security of Coin-based Voting Governance

    Authors: Chao Li, Balaji Palanisamy, Runhua Xu, Li Duan, Jiqiang Liu, Wei Wang

    Abstract: Delegated-Proof-of-Stake (DPoS) blockchains, such as EOSIO, Steem and TRON, are governed by a committee of block producers elected via a coin-based voting system. We recently witnessed the first de facto blockchain takeover that happened between Steem and TRON. Within one hour of this incident, TRON founder took over the entire Steem committee, forcing the original Steem community to leave the blo…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: This work has been accepted by ACM CCS 2023

  25. arXiv:2310.13912  [pdf, other]

    cs.CV

    Learning Motion Refinement for Unsupervised Face Animation

    Authors: Jiale Tao, Shuhang Gu, Wen Li, Lixin Duan

    Abstract: Unsupervised face animation aims to generate a human face video based on the appearance of a source image, mimicking the motion from a driving video. Existing methods typically adopted a prior-based motion model (e.g., the local affine motion model or the local thin-plate-spline motion model). While it is able to capture the coarse facial motion, artifacts can often be observed around the tiny mot…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  26. arXiv:2309.12113  [pdf, other]

    cs.AI

    Incentivizing Massive Unknown Workers for Budget-Limited Crowdsensing: From Off-Line and On-Line Perspectives

    Authors: Feng Li, Yuqi Chai, Huan Yang, Pengfei Hu, Lingjie Duan

    Abstract: How to incentivize strategic workers using limited budget is a very fundamental problem for crowdsensing systems; nevertheless, since the sensing abilities of the workers may not always be known as prior knowledge due to the diversities of their sensor devices and behaviors, it is difficult to properly select and pay the unknown workers. Although the uncertainties of the workers can be addressed b…

    Submitted 2 January, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  27. arXiv:2309.01090  [pdf, other]

    cs.CR

    Liquid Democracy in DPoS Blockchains

    Authors: Chao Li, Runhua Xu, Li Duan

    Abstract: Voting mechanisms play a crucial role in decentralized governance of blockchain systems. Liquid democracy, also known as delegative voting, allows voters to vote directly or delegate their voting power to others, thereby contributing to the resolution of problems such as low voter turnout. In recent years, liquid democracy has been widely adopted by Delegated-Proof-of-Stake (DPoS) blockchains and…

    Submitted 3 September, 2023; originally announced September 2023.

    Journal ref: ACM BSCI 2023

  28. arXiv:2308.15074  [pdf, other]

    cs.CV cs.LG

    Exploring Model Transferability through the Lens of Potential Energy

    Authors: Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Ling-Yu Duan

    Abstract: Transfer learning has become crucial in computer vision tasks due to the vast availability of pre-trained deep learning models. However, selecting the optimal pre-trained model from a diverse pool for a specific downstream task remains a challenge. Existing methods for measuring the transferability of pre-trained models rely on statistical correlations between encoded static features and task labe…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  29. arXiv:2308.13303  [pdf, other]

    cs.SI cs.DM cs.IT

    Age of Information Diffusion on Social Networks: Optimizing Multi-Stage Seeding Strategies

    Authors: Songhua Li, Lingjie Duan

    Abstract: To promote viral marketing, major social platforms (e.g., Facebook Marketplace and Pinduoduo) repeatedly select and invite different users (as seeds) in online social networks to share fresh information about a product or service with their friends. Thereby, we are motivated to optimize a multi-stage seeding process of viral marketing in social networks and adopt the recent notions of the peak and…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: A long abstract of this work will appear at MobiHoc 2023

  30. arXiv:2308.13301  [pdf, other]

    cs.GT cs.MA

    On Incentivizing Social Information Sharing in Routing Games

    Authors: Songhua Li, Lingjie Duan

    Abstract: Crowdsourcing services, such as Waze, leverage a mass of mobile users to learn massive point-of-interest (PoI) information while traveling and share it as a public good. Given that crowdsourced users mind their travel costs and possess various preferences over the PoI information along different paths, we formulate the problem as a novel non-atomic multi-path routing game with positive network ext…

    Submitted 10 April, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: This version generalizes the results to multi-path routing games compared to the previous version, while also addressing numerous typos and grammar errors

  31. arXiv:2308.13260  [pdf, other]

    cs.SI cs.DM cs.DS

    Approximation Algorithms to Enhance Social Sharing of Fresh Point-of-Interest Information

    Authors: Songhua Li, Lingjie Duan

    Abstract: In location-based social networks (LBSNs), such as Gowalla and Waze, users sense urban point-of-interest (PoI) information (e.g., restaurants' queue length and real-time traffic conditions) in the vicinity and share such information with friends in online social networks. Given each user's social connections and the severe lags in disseminating fresh PoI to all users, major LBSNs aim to enhance us…

    Submitted 15 April, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

  32. Multi-Objective Optimization for UAV Swarm-Assisted IoT with Virtual Antenna Arrays

    Authors: Jiahui Li, Geng Sun, Lingjie Duan, Qingqing Wu

    Abstract: Unmanned aerial vehicle (UAV) network is a promising technology for assisting Internet-of-Things (IoT), where a UAV can use its limited service coverage to harvest and disseminate data from IoT devices with low transmission abilities. The existing UAV-assisted data harvesting and dissemination schemes largely require UAVs to frequently fly between the IoTs and access points, resulting in extra ene…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: This paper has been accepted by IEEE Transactions on Mobile Computing

  33. arXiv:2306.07973  [pdf, other]

    cs.CR cs.LG

    PrivaScissors: Enhance the Privacy of Collaborative Inference through the Lens of Mutual Information

    Authors: Lin Duan, Jingwei Sun, Yiran Chen, Maria Gorlatova

    Abstract: Edge-cloud collaborative inference empowers resource-limited IoT devices to support deep learning applications without disclosing their raw data to the cloud server, thus preserving privacy. Nevertheless, prior research has shown that collaborative inference still results in the exposure of data and predictions from edge devices. To enhance the privacy of collaborative inference, we introduce a de…

    Submitted 17 May, 2023; originally announced June 2023.

  34. arXiv:2306.06791  [pdf, other]

    cs.GT

    To Save Mobile Crowdsourcing from Cheap-talk: A Game Theoretic Learning Approach

    Authors: Shugang Hao, Lingjie Duan

    Abstract: Today mobile crowdsourcing platforms invite users to provide anonymous reviews about service experiences, yet many reviews are found biased to be extremely positive or negative. The existing methods find it difficult to learn from biased reviews to infer the actual service state, as the state can also be extreme and the platform cannot verify the truthfulness of reviews immediately. Further, revie…

    Submitted 29 December, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

  35. Cross-Consensus Measurement of Individual-level Decentralization in Blockchains

    Authors: Chao Li, Balaji Palanisamy, Runhua Xu, Li Duan

    Abstract: Decentralization is widely recognized as a crucial characteristic of blockchains that enables them to resist malicious attacks such as the 51% attack and the takeover attack. Prior research has primarily examined decentralization in blockchains employing the same consensus protocol or at the level of block producers. This paper presents the first individual-level measurement study comparing the de…

    Submitted 9 June, 2023; originally announced June 2023.

    Journal ref: IEEE BigDataSecurity 2023

  36. arXiv:2306.02393  [pdf, other]

    cs.RO cs.CV

    Accessible Robot Control in Mixed Reality

    Authors: Ganlin Zhang, Deheng Zhang, Longteng Duan, Guo Han

    Abstract: A novel method to control the Spot robot of Boston Dynamics by Hololens 2 is proposed. This method is mainly designed for people with physical disabilities; users can control the robot's movement and robot arm without using their hands. The eye gaze tracking and head motion tracking technologies of Hololens 2 are utilized for sending control commands. The movement of the robot would follow the eye…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Course Project of Mixed Reality at ETH Zurich

  37. arXiv:2305.12862  [pdf, other]

    cs.DC

    Average-Case Analysis of Greedy Matching for Large-Scale D2D Resource Sharing

    Authors: Shuqin Gao, Costas A. Courcoubetis, Lingjie Duan

    Abstract: Given the proximity of many wireless users and their diversity in consuming local resources (e.g., data-plans, computation and energy resources), device-to-device (D2D) resource sharing is a promising approach towards realizing a sharing economy. This paper adopts an easy-to-implement greedy matching algorithm that runs in a distributed fashion with only sub-linear O(log n) parallel complexity (in user numbe…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE Transactions on Mobile Computing. arXiv admin note: substantial text overlap with arXiv:2107.12581

  38. arXiv:2304.10477  [pdf, other]

    cs.CR cs.GT

    Location Privacy Protection Game against Adversary through Multi-user Cooperative Obfuscation

    Authors: Shu Hong, Lingjie Duan

    Abstract: In location-based services (LBSs), it is promising for users to crowdsource and share their Point-of-Interest (PoI) information with each other in a common cache to reduce query frequency and preserve location privacy. Yet most studies on multi-user privacy preservation overlook the opportunity of leveraging their service flexibility. This paper is the first to study multiple users' strategic cooper…

    Submitted 17 February, 2023; originally announced April 2023.

    Comments: Online technical report for a forthcoming paper in IEEE Transactions on Mobile Computing (TMC)

  39. arXiv:2303.14966  [pdf, other]

    cs.DC cs.LG

    Adaptive Federated Learning via New Entropy Approach

    Authors: Shensheng Zheng, Wenhao Yuan, Xuehe Wang, Lingjie Duan

    Abstract: Federated Learning (FL) has emerged as a prominent distributed machine learning framework that enables geographically discrete clients to train a global model collaboratively while preserving their privacy-sensitive data. However, due to the non-independent-and-identically-distributed (Non-IID) data generated by heterogeneous clients, the performances of the conventional federated optimization sch…

    Submitted 12 April, 2024; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: 16 pages, 13 figures

  40. arXiv:2303.09253  [pdf, other]

    cs.CV

    A Survey of Deep Visual Cross-Domain Few-Shot Learning

    Authors: Wenjian Wang, Lijuan Duan, Yuxi Wang, Junsong Fan, Zhi Gong, Zhaoxiang Zhang

    Abstract: Few-Shot transfer learning has become a major focus of research as it allows recognition of new classes with limited labeled data. While it is assumed that train and test data have the same data distribution, this is often not the case in real-world applications. This leads to decreased model transfer effects when the new class distribution differs significantly from the learned classes. Research…

    Submitted 16 March, 2023; originally announced March 2023.

  41. arXiv:2303.08646  [pdf, other]

    cs.CV

    High-level Feature Guided Decoding for Semantic Segmentation

    Authors: Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan

    Abstract: Existing pyramid-based upsamplers (e.g. SemanticFPN), although efficient, usually produce less accurate results compared to dilation-based models when using the same backbone. This is partially caused by the contaminated high-level features since they are fused and fine-tuned with noisy low-level features on limited data. To address this issue, we propose to use powerful pre-trained high-level fea…

    Submitted 27 November, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Revised version, refactored presentation and added more experiments

  42. arXiv:2302.12735  [pdf, other]

    cs.GT cs.LG

    Regulating Clients' Noise Adding in Federated Learning without Verification

    Authors: Shu Hong, Lingjie Duan

    Abstract: In federated learning (FL), clients cooperatively train a global model without revealing their raw data but gradients or parameters, while the local information can still be disclosed from local outputs transmitted to the parameter server. With such privacy concerns, a client may overly add artificial noise to his local updates to compromise the global model training, and we prove the selfish nois…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: 7 pages, to appear in IEEE ICC 2023

  43. arXiv:2301.06442  [pdf, other]

    cs.CV cs.LG

    Modeling Uncertain Feature Representation for Domain Generalization

    Authors: Xiaotong Li, Zixuan Hu, Jun Liu, Yixiao Ge, Yongxing Dai, Ling-Yu Duan

    Abstract: Though deep neural networks have achieved impressive success on various vision tasks, obvious performance degradation still exists when models are tested in out-of-distribution scenarios. In addressing this limitation, we ponder that the feature statistics (mean and standard deviation), which carry the domain characteristics of the training data, can be properly manipulated to improve the generali…

    Submitted 16 January, 2023; originally announced January 2023.

    Comments: This work is an extension of our ICLR 2022 paper [arXiv:2202.03958] https://openreview.net/forum?id=6HN7LHyzGgC

  44. arXiv:2301.04258  [pdf, other]

    cs.CV

    CARD: Semantic Segmentation with Efficient Class-Aware Regularized Decoder

    Authors: Ye Huang, Di Kang, Liang Chen, Wenjing Jia, Xiangjian He, Lixin Duan, Xuefei Zhe, Linchao Bao

    Abstract: Semantic segmentation has recently achieved notable advances by exploiting "class-level" contextual information during learning. However, these approaches simply concatenate class-level information to pixel features to boost the pixel representation learning, which cannot fully utilize intra-class and inter-class contextual information. Moreover, these approaches learn soft class centers based on…

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: Tech report, text extended from arXiv:2203.07160

  45. arXiv:2212.09035  [pdf, other]

    cs.CV cs.LG

    Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks

    Authors: Anqi Zhao, Tong Chu, Yahao Liu, Wen Li, Jingjing Li, Lixin Duan

    Abstract: In this work, we study the black-box targeted attack problem from the model discrepancy perspective. On the theoretical side, we present a generalization error bound for black-box targeted attacks, which gives a rigorous theoretical analysis for guaranteeing the success of the attack. We reveal that the attack error on a target model mainly depends on empirical attack error on the substitute model…

    Submitted 18 December, 2022; originally announced December 2022.

  46. arXiv:2212.00601  [pdf, other]

    eess.IV cs.CV

    Multi-rater Prism: Learning self-calibrated medical image segmentation from multiple raters

    Authors: Junde Wu, Huihui Fang, Yehui Yang, Yuanpei Liu, Jing Gao, Lixin Duan, Weihua Yang, Yanwu Xu

    Abstract: In medical image segmentation, it is often necessary to collect opinions from multiple experts to make the final decision. This clinical routine helps to mitigate individual bias. But when data is multiply annotated, standard deep learning models are often not applicable. In this paper, we propose a novel neural network framework, called Multi-Rater Prism (MrPrism) to learn the medical image segme…

    Submitted 1 December, 2022; originally announced December 2022.

  47. arXiv:2211.14029  [pdf, other]

    cs.GT

    When Congestion Games Meet Mobile Crowdsourcing: Selective Information Disclosure

    Authors: Hongbo Li, Lingjie Duan

    Abstract: In congestion games, users make myopic routing decisions to jam each other, and the social planner with the full information designs mechanisms on information or payment side to regulate. However, it is difficult to obtain time-varying traffic conditions, and emerging crowdsourcing platforms (e.g., Waze and Google Maps) provide a convenient way for mobile users travelling on the paths to learn and…

    Submitted 12 February, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: Online technical report for our forthcoming AAAI 2023 paper

  48. arXiv:2209.14529  [pdf, other]

    cs.CV cs.AI

    Motion and Appearance Adaptation for Cross-Domain Motion Transfer

    Authors: Borun Xu, Biao Wang, Jinhong Deng, Jiale Tao, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan

    Abstract: Motion transfer aims to transfer the motion of a driving video to a source image. When there are considerable differences between object in the driving video and that in the source image, traditional single domain motion transfer approaches often produce notable artifacts; for example, the synthesized image may fail to preserve the human shape of the source image (cf. Fig. 1 (a)). To address this…

    Submitted 6 October, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: fix bugs

  49. arXiv:2209.14024  [pdf, other]

    cs.CV

    Motion Transformer for Unsupervised Image Animation

    Authors: Jiale Tao, Biao Wang, Tiezheng Ge, Yuning Jiang, Wen Li, Lixin Duan

    Abstract: Image animation aims to animate a source image by using motion learned from a driving video. Current state-of-the-art methods typically use convolutional neural networks (CNNs) to predict motion information, such as motion keypoints and corresponding local transformations. However, these CNN based methods do not explicitly model the interactions between motions; as a result, the important underlyi…

    Submitted 28 September, 2022; originally announced September 2022.

  50. arXiv:2208.13695  [pdf, other]

    cs.RO

    A Data-Centric Approach For Dual-Arm Robotic Garment Flattening

    Authors: Li Duan, Gerardo Aragon-Camarasa

    Abstract: Due to the high dimensionality of object states, a garment flattening pipeline requires recognising the configurations of garments for a robot to produce/select manipulation plans to flatten garments. In this paper, we propose a data-centric approach to identify known configurations of garments based on a known configuration network (KCNet) trained on depth images that capture the known configurat…

    Submitted 29 August, 2022; originally announced August 2022.