Showing 1–50 of 879 results for author: Yu, X

  1. arXiv:2407.13229  [pdf, other]

    cs.RO eess.SY

    Disturbance Observer for Estimating Coupled Disturbances

    Authors: Jindou Jia, Yuhang Liu, Kexin Guo, Xiang Yu, Lihua Xie, Lei Guo

    Abstract: High-precision control for nonlinear systems is impeded by the low-fidelity dynamical model and external disturbance. Especially, the intricate coupling between internal uncertainty and external disturbance is usually difficult to be modeled explicitly. Here we show an effective and convergent algorithm enabling accurate estimation of the coupled disturbance via combining control and learning phil…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 8 pages, 3 figures

  2. arXiv:2407.11624  [pdf, other]

    cs.LG cs.AI cs.CY

    Rethinking Fair Graph Neural Networks from Re-balancing

    Authors: Zhixun Li, Yushun Dong, Qiang Liu, Jeffrey Xu Yu

    Abstract: Driven by the powerful representation ability of Graph Neural Networks (GNNs), plentiful GNN models have been widely deployed in many real-world applications. Nevertheless, due to distribution disparities between different demographic groups, fairness in high-stake decision-making systems is receiving increasing attention. Although lots of recent works devoted to improving the fairness of GNNs and…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted by SIGKDD 2024, research track

  3. arXiv:2407.11253  [pdf, other]

    cs.LG cs.CE

    Separable Operator Networks

    Authors: Xinling Yu, Sean Hooten, Ziyue Liu, Yequan Zhao, Marco Fiorentino, Thomas Van Vaerenbergh, Zheng Zhang

    Abstract: Operator learning has become a powerful tool in machine learning for modeling complex physical systems. Although Deep Operator Networks (DeepONet) show promise, they require extensive data acquisition. Physics-informed DeepONets (PI-DeepONet) mitigate data scarcity but suffer from inefficient training processes. We introduce Separable Operator Networks (SepONet), a novel framework that significant…

    Submitted 15 July, 2024; originally announced July 2024.

  4. arXiv:2407.09509  [pdf, other]

    q-bio.NC cs.HC

    Brain Dialogue Interface (BDI): A User-Friendly fMRI Model for Interactive Brain Decoding

    Authors: Heng Huang, Lin Zhao, Zihao Wu, Xiaowei Yu, Jing Zhang, Xintao Hu, Dajiang Zhu, Tianming Liu

    Abstract: Brain decoding techniques are essential for understanding the neurocognitive system. Although numerous methods have been introduced in this field, accurately aligning complex external stimuli with brain activities remains a formidable challenge. To alleviate alignment difficulties, many studies have simplified their models by employing single-task paradigms and establishing direct links between br…

    Submitted 17 June, 2024; originally announced July 2024.

  5. arXiv:2407.08164  [pdf, other]

    cs.AI cs.MA cs.RO

    Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks

    Authors: Pu Feng, Junkang Liang, Size Wang, Xin Yu, Rongye Shi, Wenjun Wu

    Abstract: In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 8 pages, 10 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  6. arXiv:2407.07587  [pdf, other]

    cs.CV

    Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction

    Authors: Yili Liu, Linzhan Mou, Xuan Yu, Chenrui Han, Sitong Mao, Rong Xiong, Yue Wang

    Abstract: Accurate perception of the dynamic environment is a fundamental task for autonomous driving and robot systems. This paper introduces Let Occ Flow, the first self-supervised work for joint 3D occupancy and occupancy flow prediction using only camera inputs, eliminating the need for 3D annotations. Utilizing TPV for unified scene representation and deformable attention layers for feature aggregation…

    Submitted 10 July, 2024; originally announced July 2024.

  7. arXiv:2407.06542  [pdf, other]

    cs.CL

    LIONs: An Empirically Optimized Approach to Align Language Models

    Authors: Xiao Yu, Qingyang Wu, Yu Li, Zhou Yu

    Abstract: Alignment is a crucial step to enhance the instruction-following and conversational abilities of language models. Despite many recent work proposing new algorithms, datasets, and training pipelines, there is a lack of comprehensive studies measuring the impact of various design choices throughout the whole training process. We first conduct a rigorous analysis over a three-stage training pipeline…

    Submitted 9 July, 2024; originally announced July 2024.

  8. arXiv:2407.04521  [pdf, ps, other]

    math.OC cs.LG q-fin.CP

    Unified continuous-time q-learning for mean-field game and mean-field control problems

    Authors: Xiaoli Wei, Xiang Yu, Fengyi Yuan

    Abstract: This paper studies the continuous-time q-learning in the mean-field jump-diffusion models from the representative agent's perspective. To overcome the challenge when the population distribution may not be directly observable, we introduce the integrated q-function in decoupled form (decoupled Iq-function) and establish its martingale characterization together with the value function, which provide…

    Submitted 5 July, 2024; originally announced July 2024.

  9. arXiv:2407.03888  [pdf, other]

    math.OC cs.LG

    Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy

    Authors: Lijun Bo, Yijie Huang, Xiang Yu, Tingting Zhang

    Abstract: This paper studies continuous-time reinforcement learning for controlled jump-diffusion models by featuring the q-function (the continuous-time counterpart of Q-function) and the q-learning algorithms under the Tsallis entropy regularization. Contrary to the conventional Shannon entropy, the general form of Tsallis entropy renders the optimal policy not necessary a Gibbs measure, where some Lagran…

    Submitted 4 July, 2024; originally announced July 2024.

  10. arXiv:2407.03009  [pdf, other]

    cs.CV

    Model Guidance via Explanations Turns Image Classifiers into Segmentation Models

    Authors: Xiaoyan Yu, Jannik Franzen, Wojciech Samek, Marina M. -C. Höhne, Dagmar Kainmueller

    Abstract: Heatmaps generated on inputs of image classification networks via explainable AI methods like Grad-CAM and LRP have been observed to resemble segmentations of input images in many cases. Consequently, heatmaps have also been leveraged for achieving weakly supervised segmentation with image-level supervision. On the other hand, losses can be imposed on differentiable heatmaps, which has been shown…

    Submitted 3 July, 2024; originally announced July 2024.

  11. arXiv:2407.01349  [pdf, other]

    cs.CV cs.RO

    PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction

    Authors: Xuan Yu, Yili Liu, Chenrui Han, Sitong Mao, Shunbo Zhou, Rong Xiong, Yiyi Liao, Yue Wang

    Abstract: Panoptic reconstruction is a challenging task in 3D scene understanding. However, most existing methods heavily rely on pre-trained semantic segmentation models and known 3D object bounding boxes for 3D panoptic segmentation, which is not available for in-the-wild scenes. In this paper, we propose a novel zero-shot panoptic reconstruction method from RGB-D images of scenes. For zero-shot segmentat…

    Submitted 1 July, 2024; originally announced July 2024.

  12. arXiv:2407.00949  [pdf, ps, other]

    cs.CV eess.IV

    SpectralKAN: Kolmogorov-Arnold Network for Hyperspectral Images Change Detection

    Authors: Yanheng Wang, Xiaohan Yu, Yongsheng Gao, Jianjun Sha, Jian Wang, Lianru Gao, Yonggang Zhang, Xianhui Rong

    Abstract: It has been verified that deep learning methods, including convolutional neural networks (CNNs), graph neural networks (GNNs), and transformers, can accurately extract features from hyperspectral images (HSIs). These algorithms perform exceptionally well on HSIs change detection (HSIs-CD). However, the downside of these impressive results is the enormous number of parameters, FLOPs, GPU memory, tr…

    Submitted 1 July, 2024; originally announced July 2024.

  13. arXiv:2406.19544  [pdf, other]

    cs.SE

    Where Are Large Language Models for Code Generation on GitHub?

    Authors: Xiao Yu, Lei Liu, Xing Hu, Jacky Wai Keung, Jin Liu, Xin Xia

    Abstract: The increasing use of Large Language Models (LLMs) in software development has garnered significant attention from researchers assessing the quality of the code they generate. However, much of the research focuses on controlled datasets such as HumanEval, which fail to adequately represent how developers actually utilize LLMs' code generation capabilities or clarify the characteristics of LLM-gene…

    Submitted 27 June, 2024; originally announced June 2024.

  14. arXiv:2406.19240  [pdf, other]

    cs.SE

    Data Preparation for Deep Learning based Code Smell Detection: A Systematic Literature Review

    Authors: Fengji Zhang, Zexian Zhang, Jacky Wai Keung, Xiangru Tang, Zhen Yang, Xiao Yu, Wenhua Hu

    Abstract: Code Smell Detection (CSD) plays a crucial role in improving software quality and maintainability. And Deep Learning (DL) techniques have emerged as a promising approach for CSD due to their superior performance. However, the effectiveness of DL-based CSD methods heavily relies on the quality of the training data. Despite its importance, little attention has been paid to analyzing the data prepara…

    Submitted 27 June, 2024; originally announced June 2024.

  15. arXiv:2406.18984  [pdf, other]

    cs.IR

    Amplify Graph Learning for Recommendation via Sparsity Completion

    Authors: Peng Yuan, Haojie Li, Minying Fang, Xu Yu, Yongjing Hao, Junwei Du

    Abstract: Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems. Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges, which significantly reduces the performance of recommendations. In this paper, we study how to enhance the graph structure for CF more effectively, thereby optimizing the…

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  16. arXiv:2406.17156  [pdf, other]

    cs.GR cs.HC

    Toward Ubiquitous 3D Object Digitization: A Wearable Computing Framework for Non-Invasive Physical Property Acquisition

    Authors: Yunxiang Zhang, Xin Sun, Dengfeng Li, Xinge Yu, Qi Sun

    Abstract: Accurately digitizing physical objects is central to many applications, including virtual/augmented reality, industrial design, and e-commerce. Prior research has demonstrated efficient and faithful reconstruction of objects' geometric shapes and visual appearances, which suffice for digitally representing rigid objects. In comparison, physical properties, such as elasticity and pressure, are also…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 10 pages, 6 figures

  17. arXiv:2406.16382  [pdf, other]

    cs.CL

    UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models

    Authors: Zhanyue Qin, Haochuan Wang, Deyuan Liu, Ziyang Song, Cunhang Fan, Zhao Lv, Jinlin Wu, Zhen Lei, Zhiying Tu, Dianhui Chu, Xiaoyan Yu, Dianbo Sui

    Abstract: Sequential decision-making refers to algorithms that take into account the dynamics of the environment, where early decisions affect subsequent decisions. With large language models (LLMs) demonstrating powerful capabilities between tasks, we can't help but ask: Can Current LLMs Effectively Make Sequential Decisions? In order to answer this question, we propose the UNO Arena based on the card game…

    Submitted 24 June, 2024; originally announced June 2024.

  18. arXiv:2406.14924  [pdf, other]

    cs.CV

    DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection

    Authors: Jia Syuen Lim, Zhuoxiao Chen, Mahsa Baktashmotlagh, Zhi Chen, Xin Yu, Zi Huang, Yadan Luo

    Abstract: Class-agnostic object detection (OD) can be a cornerstone or a bottleneck for many downstream vision tasks. Despite considerable advancements in bottom-up and multi-object discovery methods that leverage basic visual cues to identify salient objects, consistently achieving a high recall rate remains difficult due to the diversity of object types and their contextual complexity. In this work, we in…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 19 pages

  19. arXiv:2406.14497  [pdf, other]

    cs.SE cs.CL

    CodeRAG-Bench: Can Retrieval Augment Code Generation?

    Authors: Zora Zhiruo Wang, Akari Asai, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, Daniel Fried

    Abstract: While language models (LMs) have proven remarkably adept at generating code, many programs are challenging for LMs to generate using their parametric knowledge alone. Providing external contexts such as library documentation can facilitate generating accurate and functional code. Despite the success of retrieval-augmented generation (RAG) in various text-oriented tasks, its potential for improving…

    Submitted 20 June, 2024; originally announced June 2024.

  20. arXiv:2406.13375  [pdf, other]

    cs.CL

    ALiiCE: Evaluating Positional Fine-grained Citation Generation

    Authors: Yilong Xu, Jinhua Gao, Xiaoming Yu, Baolong Bi, Huawei Shen, Xueqi Cheng

    Abstract: Large Language Models (LLMs) can enhance the credibility and verifiability by generating text with citations. However, existing tasks and evaluation methods are predominantly limited to sentence-level statement, neglecting the significance of positional fine-grained citations that can appear anywhere within sentences. To facilitate further exploration of the fine-grained citation generation, we pr…

    Submitted 19 June, 2024; originally announced June 2024.

  21. arXiv:2406.12566  [pdf, other]

    cs.CL

    RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation

    Authors: Shuting Wang, Xin Yu, Mang Wang, Weipeng Chen, Yutao Zhu, Zhicheng Dou

    Abstract: Retrieval-augmented generation (RAG) effectively addresses issues of static knowledge and hallucination in large language models. Existing studies mostly focus on question scenarios with clear user intents and concise answers. However, it is prevalent that users issue broad, open-ended queries with diverse sub-intents, for which they desire rich and long-form answers covering multiple relevant asp…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  22. arXiv:2406.12465  [pdf, other]

    cs.CY cs.AI cs.IR

    RIGL: A Unified Reciprocal Approach for Tracing the Independent and Group Learning Processes

    Authors: Xiaoshan Yu, Chuan Qin, Dazhong Shen, Shangshang Yang, Haiping Ma, Hengshu Zhu, Xingyi Zhang

    Abstract: In the realm of education, both independent learning and group learning are esteemed as the most classic paradigms. The former allows learners to self-direct their studies, while the latter is typically characterized by teacher-directed scenarios. Recent studies in the field of intelligent education have leveraged deep temporal models to trace the learning process, capturing the dynamics of studen…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. 12 pages

  23. arXiv:2406.12426  [pdf, other]

    cs.IT eess.SP

    Multi-Active-IRS-Assisted Cooperative Sensing: Cramér-Rao Bound and Joint Beamforming Design

    Authors: Yuan Fang, Xianghao Yu, Jie Xu, Ying-Jun Angela Zhang

    Abstract: This paper studies the multi-intelligent reflecting surface (IRS)-assisted cooperative sensing, in which multiple active IRSs are deployed in a distributed manner to facilitate multi-view target sensing at the non-line-of-sight (NLoS) area of the base station (BS). Different from prior works employing passive IRSs, we leverage active IRSs with the capability of amplifying the reflected signals to…

    Submitted 18 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2404.13536

  24. arXiv:2406.12254  [pdf, other]

    eess.IV cs.CV

    Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation

    Authors: Xin Yu, Qi Yang, Han Liu, Ho Hin Lee, Yucheng Tang, Lucas W. Remedios, Michael E. Kim, Rendong Zhang, Shunxing Bao, Yuankai Huo, Ann Zenobia Moore, Luigi Ferrucci, Bennett A. Landman

    Abstract: 2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmenta…

    Submitted 12 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  25. arXiv:2406.11608  [pdf, other]

    cs.CV

    Learning Hierarchical Semantic Classification by Grounding on Consistent Image Segmentations

    Authors: Seulki Park, Youren Zhang, Stella X. Yu, Sara Beery, Jonathan Huang

    Abstract: Hierarchical semantic classification requires the prediction of a taxonomy tree instead of a single flat level of the tree, where both accuracies at individual levels and consistency across levels matter. We can train classifiers for individual levels, which has accuracy but not consistency, or we can train only the finest level classification and infer higher levels, which has consistency but not…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 34 pages

  26. arXiv:2406.10593  [pdf, other]

    cs.AI cs.DB cs.IR

    QDA-SQL: Questions Enhanced Dialogue Augmentation for Multi-Turn Text-to-SQL

    Authors: Yinggang Sun, Ziming Guo, Haining Yu, Chuanyi Liu, Xiang Li, Bingxuan Wang, Xiangzhan Yu, Tiancheng Zhao

    Abstract: Fine-tuning large language models (LLMs) for specific domain tasks has achieved great success in Text-to-SQL tasks. However, these fine-tuned models often face challenges with multi-turn Text-to-SQL tasks caused by ambiguous or unanswerable questions. It is desired to enhance LLMs to handle multiple types of questions in multi-turn Text-to-SQL tasks. To address this, we propose a novel data augmen…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 13 pages, 7 figures

  27. arXiv:2406.10223  [pdf, other]

    cs.LG cs.SD eess.AS

    Diffusion Synthesizer for Efficient Multilingual Speech to Speech Translation

    Authors: Nameer Hirschkind, Xiao Yu, Mahesh Kumar Nandwana, Joseph Liu, Eloi DuBois, Dao Le, Nicolas Thiebaut, Colin Sinclair, Kyle Spence, Charles Shang, Zoe Abrams, Morgan McGuire

    Abstract: We introduce DiffuseST, a low-latency, direct speech-to-speech translation system capable of preserving the input speaker's voice zero-shot while translating from multiple source languages into English. We experiment with the synthesizer component of the architecture, comparing a Tacotron-based synthesizer to a novel diffusion-based synthesizer. We find the diffusion-based synthesizer to improve M…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Published in Interspeech 2024

  28. arXiv:2406.10111  [pdf, other]

    cs.CV

    GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors

    Authors: Xiqian Yu, Hanxin Zhu, Tianyu He, Zhibo Chen

    Abstract: Achieving high-resolution novel view synthesis (HRNVS) from low-resolution input views is a challenging task due to the lack of high-resolution data. Previous methods optimize high-resolution Neural Radiance Field (NeRF) from low-resolution input views but suffer from slow rendering speed. In this work, we base our method on 3D Gaussian Splatting (3DGS) due to its capability of producing high-qual…

    Submitted 14 June, 2024; originally announced June 2024.

  29. arXiv:2406.08829  [pdf, other]

    cs.CV cs.CR

    Improving Adversarial Robustness via Feature Pattern Consistency Constraint

    Authors: Jiacong Hu, Jingwen Ye, Zunlei Feng, Jiazhen Yang, Shunyu Liu, Xiaotian Yu, Lingxiang Jia, Mingli Song

    Abstract: Convolutional Neural Networks (CNNs) are well-known for their vulnerability to adversarial attacks, posing significant security concerns. In response to these threats, various defense methods have emerged to bolster the model's robustness. However, most existing methods either focus on learning from adversarial perturbations, leading to overfitting to the adversarial examples, or aim to eliminate…

    Submitted 13 June, 2024; originally announced June 2024.

  30. arXiv:2406.07031   

    cs.MA

    Arbitrary-Order Distributed Finite-Time Differentiator for Multi-Agent Systems

    Authors: Weile Chen, Haibo Du, Shihua Li, Xinghuo Yu

    Abstract: This paper proposes arbitrary-order distributed finite-time differentiator (AODFD) for leader-follower multi-agent systems (MAS) under directed graph by only using relative or absolute output information. By using arbitrary-order distributed finite-time differentiator via relative output information (AODFD-R), each follower agent can obtain the relative output information between itself and leader…

    Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Because there are some mistakes in the expression of the article, in order not to mislead readers, I apply for withdrawal

  31. arXiv:2406.05488  [pdf, other]

    cs.LG cs.AI

    Online Policy Distillation with Decision-Attention

    Authors: Xinqiang Yu, Chuanguang Yang, Chengqing Yu, Libo Huang, Zhulin An, Yongjun Xu

    Abstract: Policy Distillation (PD) has become an effective method to improve deep reinforcement learning tasks. The core idea of PD is to distill policy knowledge from a teacher agent to a student agent. However, the teacher-student framework requires a well-trained teacher model which is computationally expensive. In the light of online knowledge distillation, we study the knowledge transfer between differe…

    Submitted 8 June, 2024; originally announced June 2024.

  32. arXiv:2406.04875  [pdf, other]

    cs.CV

    3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

    Authors: Xiaobiao Du, Haiyang Sun, Shuyun Wang, Zhuojie Wu, Hongwei Sheng, Jiaying Ying, Ming Lu, Tianqing Zhu, Kun Zhan, Xin Yu

    Abstract: 3D cars are commonly used in self-driving systems, virtual/augmented reality, and games. However, existing 3D car datasets are either synthetic or low-quality, presenting a significant gap toward the high-quality real-world 3D car datasets and limiting their applications in practical scenarios. In this paper, we propose the first large-scale 3D real car dataset, termed 3DRealCar, offering three di…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Project Page: https://xiaobiaodu.github.io/3drealcar

  33. arXiv:2406.04815  [pdf, other]

    cs.LG cs.AI cs.RO

    Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning

    Authors: Xuehui Yu, Mhairi Dunion, Xin Li, Stefano V. Albrecht

    Abstract: Meta-Reinforcement Learning (Meta-RL) agents can struggle to operate across tasks with varying environmental features that require different optimal skills (i.e., different modes of behaviours). Using context encoders based on contrastive learning to enhance the generalisability of Meta-RL agents is now widely studied but faces challenges such as the requirement for a large sample size, also refer…

    Submitted 7 June, 2024; originally announced June 2024.

  34. arXiv:2406.02913  [pdf, other]

    cs.LG cs.AI

    Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity

    Authors: Wentao Guo, Jikai Long, Yimeng Zeng, Zirui Liu, Xinyu Yang, Yide Ran, Jacob R. Gardner, Osbert Bastani, Christopher De Sa, Xiaodong Yu, Beidi Chen, Zhaozhuo Xu

    Abstract: Zeroth-order optimization (ZO) is a memory-efficient strategy for fine-tuning Large Language Models using only forward passes. However, the application of ZO fine-tuning in memory-constrained settings such as mobile phones and laptops is still challenging since full precision forward passes are infeasible. In this study, we address this limitation by integrating sparsity and quantization into ZO f…

    Submitted 5 June, 2024; originally announced June 2024.

  35. arXiv:2406.02616  [pdf, other]

    cs.LG cs.AI

    Adaptive Layer Splitting for Wireless LLM Inference in Edge Computing: A Model-Based Reinforcement Learning Approach

    Authors: Yuxuan Chen, Rongpeng Li, Xiaoxue Yu, Zhifeng Zhao, Honggang Zhang

    Abstract: Optimizing the deployment of large language models (LLMs) in edge computing environments is critical for enhancing privacy and computational efficiency. Toward efficient wireless LLM inference in edge computing, this study comprehensively analyzes the impact of different splitting points in mainstream open-source LLMs. On this basis, this study introduces a framework taking inspiration from model-…

    Submitted 8 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  36. arXiv:2406.01065  [pdf, other]

    cs.LG cs.AI

    Causal prompting model-based offline reinforcement learning

    Authors: Xuehui Yu, Yi Guan, Rujia Shen, Xin Li, Chen Tang, Jingchi Jiang

    Abstract: Model-based offline Reinforcement Learning (RL) allows agents to fully utilise pre-collected datasets without requiring additional or unethical explorations. However, applying model-based offline RL to online systems presents challenges, primarily due to the highly suboptimal (noise-filled) and diverse nature of datasets generated by online systems. To tackle these issues, we introduce the Causal…

    Submitted 3 June, 2024; originally announced June 2024.

  37. arXiv:2406.00939  [pdf, ps, other]

    cs.IT

    Bounds on f-Divergences between Distributions within Generalized Quasi-$\varepsilon$-Neighborhood

    Authors: Xinchun Yu, Shuangqing Wei, Shao-Lun Huang, Xiao-Ping Zhang

    Abstract: A general reverse Pinsker's inequality is derived to give an upper bound on f-divergences in terms of total variational distance when two distributions are close measured under our proposed generalized local information geometry framework. In addition, relationships between two f-divergences equipped with functions that are third order differentiable are established in terms of the lower and upper…

    Submitted 2 June, 2024; originally announced June 2024.

  38. arXiv:2405.19652  [pdf, other]

    cs.CV

    Dual sparse training framework: inducing activation map sparsity via Transformed $\ell1$ regularization

    Authors: Xiaolong Yu, Cong Tian

    Abstract: Although deep convolutional neural networks have achieved rapid development, it is challenging to widely promote and apply these models on low-power devices, due to computational and storage limitations. To address this issue, researchers have proposed techniques such as model compression, activation sparsity induction, and hardware accelerators. This paper presents a method to induce the sparsity…

    Submitted 29 May, 2024; originally announced May 2024.

  39. arXiv:2405.19610  [pdf, other]

    stat.ML cs.LG stat.ME

    Factor Augmented Tensor-on-Tensor Neural Networks

    Authors: Guanhao Zhou, Yuefeng Han, Xiufan Yu

    Abstract: This paper studies the prediction task of tensor-on-tensor regression in which both covariates and responses are multi-dimensional arrays (a.k.a., tensors) across time with arbitrary tensor order and data dimension. Existing methods either focused on linear models without accounting for possibly nonlinear relationships between covariates and responses, or directly employed black-box deep learning…

    Submitted 29 May, 2024; originally announced May 2024.

  40. Encouraging Bystander Assistance for Urban Robots: Introducing Playful Robot Help-Seeking as a Strategy

    Authors: Xinyan Yu, Marius Hoggenmueller, Martin Tomitsch

    Abstract: Robots in urban environments will inevitably encounter situations beyond their capabilities (e.g., delivery robots unable to press traffic light buttons), necessitating bystander assistance. These spontaneous collaborations possess challenges distinct from traditional human-robot collaboration, requiring design investigation and tailored interaction strategies. This study investigates playful help…

    Submitted 29 May, 2024; originally announced May 2024.

  41. arXiv:2405.18525  [pdf, other]

    cs.CV

    REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

    Authors: Haonan Han, Rui Yang, Huan Liao, Jiankai Xing, Zunnan Xu, Xiaoming Yu, Junwei Zha, Xiu Li, Wanhua Li

    Abstract: Traditional image-to-3D models often struggle with scenes containing multiple objects due to biases and occlusion complexities. To address this challenge, we present REPARO, a novel approach for compositional 3D asset generation from single images. REPARO employs a two-step process: first, it extracts individual objects from the scene and reconstructs their 3D meshes using off-the-shelf image-to-3…

    Submitted 28 May, 2024; originally announced May 2024.

  42. arXiv:2405.16063  [pdf, other]

    cs.SE

    Risk Scenario Generation for Autonomous Driving Systems based on Causal Bayesian Networks

    Authors: Jiangnan Zhao, Dehui Du, Xing Yu, Hang Li

    Abstract: Advancements in Autonomous Driving Systems (ADS) have brought significant benefits, but also raised concerns regarding their safety. Virtual tests are common practices to ensure the safety of ADS because they are more efficient and safer compared to field operational tests. However, capturing the complex dynamics of real-world driving environments and effectively generating risk scenarios for test…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 10 pages

  43. arXiv:2405.14749  [pdf, other]

    cs.LG cs.AI math.OC

    Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence

    Authors: Minheng Xiao, Xian Yu, Lei Ying

    Abstract: Risk-sensitive reinforcement learning (RL) is crucial for maintaining reliable performance in many high-stakes applications. While most RL methods aim to learn a point estimate of the random cumulative cost, distributional RL (DRL) seeks to estimate the entire distribution of it. The distribution provides all necessary information about the cost and leads to a unified framework for handling variou…

    Submitted 23 May, 2024; originally announced May 2024.

  44. arXiv:2405.14582  [pdf, other]

    cs.CV cs.AI

    PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control

    Authors: Yong Zhong, Min Zhao, Zebin You, Xiaofeng Yu, Changwang Zhang, Chongxuan Li

    Abstract: In this paper, we introduce PoseCrafter, a one-shot method for personalized video generation following the control of flexible poses. Built upon Stable Diffusion and ControlNet, we carefully design an inference process to produce high-quality videos without the corresponding ground-truth frames. First, we select an appropriate reference frame from the training video and invert it to initialize all…

    Submitted 18 July, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  45. arXiv:2405.14399  [pdf, other]

    cs.LG cs.CY

    Endowing Interpretability for Neural Cognitive Diagnosis by Efficient Kolmogorov-Arnold Networks

    Authors: Shangshang Yang, Linrui Qin, Xiaoshan Yu

    Abstract: In the realm of intelligent education, cognitive diagnosis plays a crucial role in subsequent recommendation tasks attributed to the revealed students' proficiency in knowledge concepts. Although neural network-based neural cognitive diagnosis models (CDMs) have exhibited significantly better performance than traditional models, neural cognitive diagnosis is criticized for the poor model interpret…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Leverage Kolmogorov-Arnold Networks (KANs) for cognitive diagnosis, enhancing the model interpretability. The diagnosis performance is also improved

    MSC Class: 68T30 ACM Class: I.2.4

  46. arXiv:2405.13937  [pdf, other]

    cs.LG

    DyGPrompt: Learning Feature and Time Prompts on Dynamic Graphs

    Authors: Xingtong Yu, Zhenghao Liu, Yuan Fang, Xinming Zhang

    Abstract: Dynamic graphs are pervasive in the real world, modeling dynamic relations between objects across various fields. For dynamic graph modeling, dynamic graph neural networks (DGNNs) have emerged as a mainstream technique, which are generally pre-trained on the link prediction task, leaving a significant gap from the objectives of downstream tasks such as node classification. To bridge the gap, promp…

    Submitted 2 July, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Under review

  47. arXiv:2405.13934  [pdf, other]

    cs.LG

    Text-Free Multi-domain Graph Pre-training: Toward Graph Foundation Models

    Authors: Xingtong Yu, Chang Zhou, Yuan Fang, Xinming Zhang

    Abstract: Given the ubiquity of graph data, it is intriguing to ask: Is it possible to train a graph foundation model on a broad range of graph data across diverse domains? A major hurdle toward this goal lies in the fact that graphs from different domains often exhibit profoundly divergent characteristics. Although there have been some initial efforts in integrating multi-domain graphs for pre-training, th…

    Submitted 28 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Under review

  48. arXiv:2405.13859  [pdf, other]

    cs.CV

    QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input

    Authors: Senmao Tian, Haoyu Gao, Gangyi Hong, Shuyun Wang, JingJie Wang, Xin Yu, Shunli Zhang

    Abstract: Existing deep learning methods have made significant progress in gait recognition. Typically, appearance-based models binarize inputs into silhouette sequences. However, mainstream quantization methods prioritize minimizing task loss over quantization error, which is detrimental to gait recognition with binarized inputs. Minor variations in silhouette sequences can be diminished in the network's i…

    Submitted 22 May, 2024; originally announced May 2024.

  49. arXiv:2405.13745  [pdf, other]

    cs.CV

    NeurCross: A Self-Supervised Neural Approach for Representing Cross Fields in Quad Mesh Generation

    Authors: Qiujie Dong, Huibiao Wen, Rui Xu, Xiaokang Yu, Jiaran Zhou, Shuangmin Chen, Shiqing Xin, Changhe Tu, Wenping Wang

    Abstract: Quadrilateral mesh generation plays a crucial role in numerical simulations within Computer-Aided Design and Engineering (CAD/E). The quality of the cross field is essential for generating a quadrilateral mesh. In this paper, we propose a self-supervised neural representation of the cross field, named NeurCross, comprising two modules: one to fit the signed distance function (SDF) and another to p…

    Submitted 22 May, 2024; originally announced May 2024.

  50. arXiv:2405.12875  [pdf]

    cs.CV cs.CL

    Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images

    Authors: Xiaofei Yu, Yitong Li, Jie Ma

    Abstract: Remote sensing image change captioning (RSICC) aims at generating human-like language to describe the semantic changes between bi-temporal remote sensing image pairs. It provides valuable insights into environmental dynamics and land management. Unlike conventional change captioning task, RSICC involves not only retrieving relevant information across different modalities and generating fluent capt…

    Submitted 21 May, 2024; originally announced May 2024.