Skip to main content

Showing 1–50 of 918 results for author: Jiang, J

  1. arXiv:2407.10804  [pdf, other

    cs.CL

    Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

    Authors: Jinhao Jiang, Junyi Li, Wayne Xin Zhao, Yang Song, Tao Zhang, Ji-Rong Wen

    Abstract: Adapting general large language models (LLMs) to specialized domains presents great challenges due to varied data distributions. This adaptation typically requires continual pre-training on massive domain-specific corpora to facilitate knowledge memorization, followed by training to apply this knowledge following human instructions and preferences. However, this method may result in inefficient kn… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: LLM, CPT, knowledge learning, format alignment; work in progress

  2. arXiv:2407.08801  [pdf, other

    cs.CV

    DG-PIC: Domain Generalized Point-In-Context Learning for Point Cloud Understanding

    Authors: Jincen Jiang, Qianyu Zhou, Yuhang Li, Xuequan Lu, Meili Wang, Lizhuang Ma, Jian Chang, Jian Jun Zhang

    Abstract: Recent point cloud understanding research suffers from performance drops on unseen data, due to the distribution shifts across different domains. While recent studies use Domain Generalization (DG) techniques to mitigate this by learning domain-invariant features, most are designed for a single task and neglect the potential of testing data. Despite In-Context Learning (ICL) showcasing multi-task… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  3. arXiv:2407.08366  [pdf, other

    cs.RO cs.CV

    An Economic Framework for 6-DoF Grasp Detection

    Authors: Xiao-Ming Wu, Jia-Feng Cai, Jian-Jian Jiang, Dian Zheng, Yi-Lin Wei, Wei-Shi Zheng

    Abstract: Robotic grasping in clutters is a fundamental task in robotic manipulation. In this work, we propose an economic framework for 6-DoF grasp detection, aiming to economize the resource cost in training and meanwhile maintain effective grasp performance. To begin with, we discover that the dense supervision is the bottleneck of current SOTA methods that severely encumbers the entire training overload… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 19 pages, 7 figures. Accepted in ECCV 2024!

  4. arXiv:2407.06931  [pdf, other

    cs.RO

    A Unified Approach to Multi-task Legged Navigation: Temporal Logic Meets Reinforcement Learning

    Authors: Jesse Jiang, Samuel Coogan, Ye Zhao

    Abstract: This study examines the problem of hopping robot navigation planning to achieve simultaneous goal-directed and environment exploration tasks. We consider a scenario in which the robot has mandatory goal-directed tasks defined using Linear Temporal Logic (LTL) specifications as well as optional exploration tasks represented using a reward function. Additionally, there exists uncertainty in the robo… ▽ More

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures

  5. arXiv:2407.06204  [pdf, other

    cs.LG cs.CL

    A Survey on Mixture of Experts

    Authors: Weilin Cai, Juyong Jiang, Fan Wang, Jing Tang, Sunghun Kim, Jiayi Huang

    Abstract: Large language models (LLMs) have garnered unprecedented advancements across diverse fields, ranging from natural language processing to computer vision and beyond. The prowess of LLMs is underpinned by their substantial model size, extensive and diverse datasets, and the vast computational power harnessed during training, all of which contribute to the emergent abilities of LLMs (e.g., in-context… ▽ More

    Submitted 26 June, 2024; originally announced July 2024.

  6. arXiv:2407.06196  [pdf, other

    cs.CV cs.AI

    Poetry2Image: An Iterative Correction Framework for Images Generated from Chinese Classical Poetry

    Authors: Jing Jiang, Yiran Ling, Binzhu Li, Pengxiang Li, Junming Piao, Yu Zhang

    Abstract: Text-to-image generation models often struggle with key element loss or semantic confusion in tasks involving Chinese classical poetry.Addressing this issue through fine-tuning models needs considerable training costs. Additionally, manual prompts for re-diffusion adjustments need professional knowledge. To solve this problem, we propose Poetry2Image, an iterative correction framework for images g… ▽ More

    Submitted 15 June, 2024; originally announced July 2024.

    Comments: 13 pages, 7 figures

  7. arXiv:2407.05712  [pdf, other

    cs.CV cs.AI

    MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices

    Authors: Jianwen Jiang, Gaojie Lin, Zhengkun Rong, Chao Liang, Yongming Zhu, Jiaqi Yang, Tianyun Zhong

    Abstract: Existing neural head avatars methods have achieved significant progress in the image quality and motion range of portrait animation. However, these methods neglect the computational overhead, and to the best of our knowledge, none is designed to run on mobile devices. This paper presents MobilePortrait, a lightweight one-shot neural head avatars method that reduces learning complexity by integrati… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  8. arXiv:2407.05609  [pdf, other

    cs.CL

    Open-world Multi-label Text Classification with Extremely Weak Supervision

    Authors: Xintong Li, Jinya Jiang, Ria Dharmani, Jayanth Srinivasa, Gaowen Liu, Jingbo Shang

    Abstract: We study open-world multi-label text classification under extremely weak supervision (XWS), where the user only provides a brief description for classification objectives without any labels or ground-truth label space. Similar single-label XWS settings have been explored recently, however, these methods cannot be easily adapted for multi-label. We observe that (1) most documents have a dominant cl… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Preprint

  9. arXiv:2407.04486  [pdf, other

    q-bio.QM cs.AI

    Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

    Authors: Tianshu Feng, Rohan Gnanaolivu, Abolfazl Safikhani, Yuanhang Liu, Jun Jiang, Nicholas Chia, Alexander Partin, Priyanka Vasanthakumari, Yitan Zhu, Chen Wang

    Abstract: Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomics profiling data that describes molecular activities in tumors and cancer cell lines are widely utilized for predicting anti-cancer drug responses. However, existing AI models face challenges due to noise in transcriptomics data and lack of biological i… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  10. arXiv:2407.01601  [pdf, other

    cs.LG cs.AI

    Unveiling and Controlling Anomalous Attention Distribution in Transformers

    Authors: Ruiqing Yan, Xingbo Du, Haoyu Deng, Linghan Zheng, Qiuzhuang Sun, Jifang Hu, Yuhang Shao, Penghao Jiang, Jinrong Jiang, Lian Zhao

    Abstract: With the advent of large models based on the Transformer architecture, researchers have observed an anomalous phenomenon in the Attention mechanism--there is a very high attention on the first element, which is prevalent across Transformer-based models. It is crucial to understand it for the development of techniques focusing on attention distribution, such as Key-Value (KV) Cache compression and… ▽ More

    Submitted 3 July, 2024; v1 submitted 26 June, 2024; originally announced July 2024.

  11. arXiv:2407.01492  [pdf, other

    cs.CL cs.AI

    RegMix: Data Mixture as Regression for Language Model Pre-training

    Authors: Qian Liu, Xiaosen Zheng, Niklas Muennighoff, Guangtao Zeng, Longxu Dou, Tianyu Pang, Jing Jiang, Min Lin

    Abstract: The data mixture for large language model pre-training significantly impacts performance, yet how to determine an effective mixture remains unclear. We propose RegMix to automatically identify a high-performing data mixture by formulating it as a regression task. RegMix involves training a set of small models with diverse data mixtures and fitting a regression model to predict their performance gi… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  12. Face4RAG: Factual Consistency Evaluation for Retrieval Augmented Generation in Chinese

    Authors: Yunqi Xu, Tianchi Cai, Jiyan Jiang, Xierui Song

    Abstract: The prevailing issue of factual inconsistency errors in conventional Retrieval Augmented Generation (RAG) motivates the study of Factual Consistency Evaluation (FCE). Despite the various FCE methods proposed earlier, these methods are evaluated on datasets generated by specific Large Language Models (LLMs). Without a comprehensive benchmark, it remains unexplored how these FCE methods perform on o… ▽ More

    Submitted 3 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Journal ref: KDD 2024 (oral)

  13. arXiv:2407.00024  [pdf, other

    cs.CV cs.AI cs.MM

    LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild

    Authors: Lang He, Kai Chen, Junnan Zhao, Yimeng Wang, Ercheng Pei, Haifeng Chen, Jiewei Jiang, Shiqing Zhang, Jie Zhang, Zhongmin Wang, Tao He, Prayag Tiwari

    Abstract: Depression can significantly impact many aspects of an individual's life, including their personal and social functioning, academic and work performance, and overall quality of life. Many researchers within the field of affective computing are adopting deep learning technology to explore potential patterns related to the detection of depression. However, because of subjects' privacy protection con… ▽ More

    Submitted 8 May, 2024; originally announced July 2024.

  14. arXiv:2406.19631  [pdf, other

    cs.LG cs.DC

    Personalized Interpretation on Federated Learning: A Virtual Concepts approach

    Authors: Peng Yan, Guodong Long, Jing Jiang, Michael Blumenstein

    Abstract: Tackling non-IID data is an open challenge in federated learning research. Existing FL methods, including robust FL and personalized FL, are designed to improve model performance without consideration of interpreting non-IID across clients. This paper aims to design a novel FL method to robust and interpret the non-IID data across clients. Specifically, we interpret each client's dataset as a mixt… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  15. arXiv:2406.18846  [pdf, other

    cs.CE

    AFBench: A Large-scale Benchmark for Airfoil Design

    Authors: Jian Liu, Jianyu Wu, Hairun Xie, Guoqing Zhang, Jing Wang, Wei Liu, Wanli Ouyang, Junjun Jiang, Xianming Liu, Shixiang Tang, Miao Zhang

    Abstract: Data-driven generative models have emerged as promising approaches towards achieving efficient mechanical inverse design. However, due to prohibitively high cost in time and money, there is still lack of open-source and large-scale benchmarks in this field. It is mainly the case for airfoil inverse design, which requires to generate and edit diverse geometric-qualified and aerodynamic-qualified ai… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Dataset & Benchmark Track

  16. arXiv:2406.18817  [pdf, other

    cs.CV cs.AI

    Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis

    Authors: Mingyang Zhao, Jingen Jiang, Lei Ma, Shiqing Xin, Gaofeng Meng, Dong-Ming Yan

    Abstract: This paper presents a novel non-rigid point set registration method that is inspired by unsupervised clustering analysis. Unlike previous approaches that treat the source and target point sets as separate entities, we develop a holistic framework where they are formulated as clustering centroids and clustering members, separately. We then adopt Tikhonov regularization with an $\ell_1$-induced Lapl… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: [CVPR 2024 Highlight] Project and code at: https://github.com/zikai1/CVPR24_PointSetReg

  17. arXiv:2406.18351  [pdf, other

    cs.LG cs.AI

    Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control

    Authors: Zifan Liu, Xinran Li, Shibo Chen, Gen Li, Jiashuo Jiang, Jun Zhang

    Abstract: Reinforcement learning (RL) has proven to be well-performed and general-purpose in the inventory control (IC). However, further improvement of RL algorithms in the IC domain is impeded due to two limitations of online experience. First, online experience is expensive to acquire in real-world applications. With the low sample efficiency nature of RL algorithms, it would take extensive time to train… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  18. arXiv:2406.18099  [pdf, other

    cs.DB

    CompassDB: Pioneering High-Performance Key-Value Store with Perfect Hash

    Authors: Jin Jiang, Dongsheng He, Yu Hu, Dong Liu, Chenfan Xiao, Hongxiao Bi, Yusong Zhang, Chaoqu Jiang, Zhijun Fu

    Abstract: Modern mainstream persistent key-value storage engines utilize Log-Structured Merge tree (LSM-tree) based designs, optimizing read/write performance by leveraging sequential disk I/O. However, the advent of SSDs, with their significant improvements in bandwidth and IOPS, shifts the bottleneck from I/O to CPU. The high compaction cost and large read/write amplification associated with LSM trees hav… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  19. arXiv:2406.17872  [pdf, other

    cs.HC

    Analysis of the Causes of Car Accidents in the United States of America in 2023: Gauge People Understanding of Data Visualisation

    Authors: Hamoud Alhazmi, Marcelo Morales, Jiachen Jiang, Jinxin Zhou, Jian Chen

    Abstract: This paper presents a comprehensive examination of interactive data visualization tools and their efficacy in the context of United States car accident data for the year 2023. We developed interactive heatmaps, histograms, and pie charts to enhance the understanding of accident severity distribution over time and location. Our research included the creation and distribution of an online survey, co… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 5 pages, 7 figures

  20. arXiv:2406.17343  [pdf, other

    cs.CV cs.AI

    Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers

    Authors: Lei Chen, Yuan Meng, Chen Tang, Xinzhu Ma, Jingyan Jiang, Xin Wang, Zhi Wang, Wenwu Zhu

    Abstract: Recent advancements in diffusion models, particularly the trend of architectural transformation from UNet-based Diffusion to Diffusion Transformer (DiT), have significantly improved the quality and scalability of image synthesis. Despite the incredible generative quality, the large computational requirements of these large-scale models significantly hinder the deployments in real-world scenarios.… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  21. arXiv:2406.16567  [pdf, other

    cs.CL

    Data Augmentation of Multi-turn Psychological Dialogue via Knowledge-driven Progressive Thought Prompting

    Authors: Jiyue Jiang, Liheng Chen, Sheng Wang, Lingpeng Kong, Yu Li, Chuan Wu

    Abstract: Existing dialogue data augmentation (DA) techniques predominantly focus on augmenting utterance-level dialogues, which makes it difficult to take dialogue contextual information into account. The advent of large language models (LLMs) has simplified the implementation of multi-turn dialogues. Due to absence of professional understanding and knowledge, it remains challenging to deliver satisfactory… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  22. arXiv:2406.16023  [pdf, other

    quant-ph cs.DS

    Quantum Metropolis Sampling via Weak Measurement

    Authors: Jiaqing Jiang, Sandy Irani

    Abstract: Gibbs sampling is a crucial computational technique used in physics, statistics, and many other scientific fields. For classical Hamiltonians, the most commonly used Gibbs sampler is the Metropolis algorithm, known for having the Gibbs state as its unique fixed point. For quantum Hamiltonians, designing provably correct Gibbs samplers has been more challenging. [TOV+11] introduced a novel method t… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 33 pages, 2 figures

  23. arXiv:2406.14966  [pdf, other

    cs.CY cs.CR

    AIGC-Chain: A Blockchain-Enabled Full Lifecycle Recording System for AIGC Product Copyright Management

    Authors: Jiajia Jiang, Moting Su, Xiangli Xiao, Yushu Zhang, Yuming Fang

    Abstract: As artificial intelligence technology becomes increasingly prevalent, Artificial Intelligence Generated Content (AIGC) is being adopted across various sectors. Although AIGC is playing an increasingly significant role in business and culture, questions surrounding its copyright have sparked widespread debate. The current legal framework for copyright and intellectual property is grounded in the co… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  24. arXiv:2406.14479  [pdf, other

    cs.AI cs.CL cs.CV cs.LG

    On Layer-wise Representation Similarity: Application for Multi-Exit Models with a Single Classifier

    Authors: Jiachen Jiang, Jinxin Zhou, Zhihui Zhu

    Abstract: Analyzing the similarity of internal representations within and across different models has been an important technique for understanding the behavior of deep neural networks. Most existing methods for analyzing the similarity between representations of high dimensions, such as those based on Canonical Correlation Analysis (CCA) and widely used Centered Kernel Alignment (CKA), rely on statistical… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  25. FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering

    Authors: Tianchi Cai, Zhiwen Tan, Xierui Song, Tao Sun, Jiyan Jiang, Yunqi Xu, Yinger Zhang, Jinjie Gu

    Abstract: Retrieval Augmented Generation (RAG) has become prevalent in question-answering (QA) tasks due to its ability of utilizing search engine to enhance the quality of long-form question-answering (LFQA). Despite the emergence of various open source methods and web-enhanced commercial systems such as Bing Chat, two critical problems remain unsolved, i.e., the lack of factuality and clear logic in the g… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Report number: 30th

    Journal ref: KDD 2024

  26. arXiv:2406.13724  [pdf, other

    cs.AI

    Heterogeneous Graph Neural Networks with Post-hoc Explanations for Multi-modal and Explainable Land Use Inference

    Authors: Xuehao Zhai, Junqi Jiang, Adam Dejl, Antonio Rago, Fangce Guo, Francesca Toni, Aruna Sivakumar

    Abstract: Urban land use inference is a critically important task that aids in city planning and policy-making. Recently, the increased use of sensor and location technologies has facilitated the collection of multi-modal mobility data, offering valuable insights into daily activity patterns. Many studies have adopted advanced data-driven techniques to explore the potential of these multi-modal mobility dat… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  27. arXiv:2406.12272  [pdf, other

    cs.AI

    Slot State Space Models

    Authors: Jindong Jiang, Fei Deng, Gautam Singh, Minseung Lee, Sungjin Ahn

    Abstract: Recent State Space Models (SSMs) such as S4, S5, and Mamba have shown remarkable computational benefits in long-range temporal dependency modeling. However, in many sequence modeling problems, the underlying process is inherently modular and it is of interest to have inductive biases that mimic this modular structure. In this paper, we introduce SlotSSMs, a novel framework for incorporating indepe… ▽ More

    Submitted 30 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  28. arXiv:2406.11553  [pdf, other

    cs.SI

    The Susceptibility Paradox in Online Social Influence

    Authors: Luca Luceri, Jinyi Ye, Julie Jiang, Emilio Ferrara

    Abstract: Understanding susceptibility to online influence is crucial for mitigating the spread of misinformation and protecting vulnerable audiences. This paper investigates susceptibility to influence within social networks, focusing on the differential effects of influence-driven versus spontaneous behaviors on user content adoption. Our analysis reveals that influence-driven adoption exhibits high homop… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  29. arXiv:2406.11551  [pdf, other

    cs.CV

    Simple Yet Efficient: Towards Self-Supervised FG-SBIR with Unified Sample Feature Alignment

    Authors: Jianan Jiang, Di Wu, Zhilin Jiang, Weiren Yu

    Abstract: Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) aims to minimize the distance between sketches and corresponding images in the embedding space. However, scalability is hindered by the growing complexity of solutions, mainly due to the abstract nature of fine-grained sketches. In this paper, we propose a simple yet efficient approach to narrow the gap between the two modes. It mainly facilitate… ▽ More

    Submitted 22 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages,8 figures, 4 tables

  30. arXiv:2406.10517  [pdf, other

    cs.IR cs.AI cs.LG

    ADSNet: Cross-Domain LTV Prediction with an Adaptive Siamese Network in Advertising

    Authors: Ruize Wang, Hui Xu, Ying Cheng, Qi He, Xing Zhou, Rui Feng, Wei Xu, Lei Huang, Jie Jiang

    Abstract: Advertising platforms have evolved in estimating Lifetime Value (LTV) to better align with advertisers' true performance metric. However, the sparsity of real-world LTV data presents a significant challenge to LTV predictive model(i.e., pLTV), severely limiting the their capabilities. Therefore, we propose to utilize external data, in addition to the internal data of advertising platform, to expan… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted to KDD 2024

  31. arXiv:2406.09332  [pdf, other

    cs.RO

    RoTipBot: Robotic Handling of Thin and Flexible Objects using Rotatable Tactile Sensors

    Authors: Jiaqi Jiang, Xuyang Zhang, Daniel Fernandes Gomes, Thanh-Toan Do, Shan Luo

    Abstract: This paper introduces RoTipBot, a novel robotic system for handling thin, flexible objects. Different from previous works that are limited to singulating them using suction cups or soft grippers, RoTipBot can grasp and count multiple layers simultaneously, emulating human handling in various environments. Specifically, we develop a novel vision-based tactile sensor named RoTip that can rotate and… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 20 pages, 21 figures

  32. arXiv:2406.07828  [pdf, other

    cs.CV

    Spatial Annealing Smoothing for Efficient Few-shot Neural Rendering

    Authors: Yuru Xiao, Xianming Liu, Deming Zhai, Kui Jiang, Junjun Jiang, Xiangyang Ji

    Abstract: Neural Radiance Fields (NeRF) with hybrid representations have shown impressive capabilities in reconstructing scenes for view synthesis, delivering high efficiency. Nonetheless, their performance significantly drops with sparse view inputs, due to the issue of overfitting. While various regularization strategies have been devised to address these challenges, they often depend on inefficient assum… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  33. arXiv:2406.07006  [pdf, other

    cs.CV

    MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

    Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan , et al. (17 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAWImage Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  34. arXiv:2406.06842  [pdf, ps, other

    cs.IT eess.SP

    Aerial Relay to Achieve Covertness and Security

    Authors: Jiacheng Jiang, Hongjiang Lei, Ki-Hong Park, Gaofeng Pan, Mohamed-Slim Alouini

    Abstract: In this work, a delay-tolerant unmanned aerial vehicle (UAV) relayed covert and secure communication framework is investigated. In this framework, a legitimate UAV serves as an aerial relay to realize communication when the direct link between the terrestrial transmitter and receiver is blocked and also acts as a friendly jammer to suppress the malicious nodes presented on the ground. Subsequently… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures, submitted to IEEE Journal for review

  35. arXiv:2406.05413  [pdf, other

    cs.LG cs.AI cs.CV cs.MM

    Discover Your Neighbors: Advanced Stable Test-Time Adaptation in Dynamic World

    Authors: Qinting Jiang, Chuyang Ye, Dongyan Wei, Yuan Xue, Jingyan Jiang, Zhi Wang

    Abstract: Despite progress, deep neural networks still suffer performance declines under distribution shifts between training and test domains, leading to a substantial decrease in Quality of Experience (QoE) for multimedia applications. Existing test-time adaptation (TTA) methods are challenged by dynamic, multiple test distributions within batches. This work provides a new perspective on analyzing batch n… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 10 pages

  36. arXiv:2406.01349  [pdf, other

    cs.CV

    Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

    Authors: Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

    Abstract: Using generative models to synthesize new data has become a de-facto standard in autonomous driving to address the data scarcity issue. Though existing approaches are able to boost perception models, we discover that these approaches fail to improve the performance of planning of end-to-end autonomous driving models as the generated videos are usually less than 8 frames and the spatial and tempora… ▽ More

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Project Page: https://westlake-autolab.github.io/delphi.github.io/, 8 figures

  37. arXiv:2406.01288  [pdf, other

    cs.CL cs.AI cs.CR cs.LG

    Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses

    Authors: Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Jing Jiang, Min Lin

    Abstract: Recently, Anil et al. (2024) show that many-shot (up to hundreds of) demonstrations can jailbreak state-of-the-art LLMs by exploiting their long-context capability. Nevertheless, is it possible to use few-shot demonstrations to efficiently jailbreak LLMs within limited context sizes? While the vanilla few-shot jailbreaking may be inefficient, we propose improved techniques such as injecting specia… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  38. arXiv:2406.01065  [pdf, other

    cs.LG cs.AI

    Causal prompting model-based offline reinforcement learning

    Authors: Xuehui Yu, Yi Guan, Rujia Shen, Xin Li, Chen Tang, Jingchi Jiang

    Abstract: Model-based offline Reinforcement Learning (RL) allows agents to fully utilise pre-collected datasets without requiring additional or unethical explorations. However, applying model-based offline RL to online systems presents challenges, primarily due to the highly suboptimal (noise-filled) and diverse nature of datasets generated by online systems. To tackle these issues, we introduce the Causal… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  39. arXiv:2406.01062  [pdf, other

    cs.CV

    SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

    Authors: Qilong Zhangli, Jindong Jiang, Di Liu, Licheng Yu, Xiaoliang Dai, Ankit Ramchandani, Guan Pang, Dimitris N. Metaxas, Praveen Krishnan

    Abstract: While diffusion models have significantly advanced the quality of image generation, their capability to accurately and coherently render text within these images remains a substantial challenge. Conventional diffusion-based methods for scene text generation are typically limited by their reliance on an intermediate layout output. This dependency often results in a constrained diversity of text sty… ▽ More

    Submitted 7 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7496-7506

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 7496-7506

  40. arXiv:2406.00992  [pdf, other

    cs.SE

    Hybrid Automated Program Repair by Combining Large Language Models and Program Analysis

    Authors: Fengjie Li, Jiajun Jiang, Jiajun Sun, Hongyu Zhang

    Abstract: Automated Program Repair (APR) has garnered significant attention due to its potential to streamline the bug repair process for human developers. Recently, LLM-based APR methods have shown promise in repairing real-world bugs. However, existing APR methods often utilize patches generated by LLMs without further optimization, resulting in reduced effectiveness due to the lack of program-specific kn… ▽ More

    Submitted 4 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures

  41. arXiv:2406.00515  [pdf, other

    cs.CL cs.AI cs.SE

    A Survey on Large Language Models for Code Generation

    Authors: Juyong Jiang, Fan Wang, Jiasi Shen, Sungju Kim, Sunghun Kim

    Abstract: Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks, known as Code LLMs, particularly in code generation that generates source code with LLM from natural language descriptions. This burgeoning field has captured significant interest from both academic researchers and industry professionals due to its practical significance in software development, e… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  42. arXiv:2406.00163  [pdf, other

    cs.IT eess.SY

    A Stochastic Incentive-based Demand Response Program for Virtual Power Plant with Solar, Battery, Electric Vehicles, and Controllable Loads

    Authors: Pratik Harsh, Hongjian Sun, Debapriya Das, Goyal Awagan, Jing Jiang

    Abstract: The growing integration of distributed energy resources (DERs) into the power grid necessitates an effective coordination strategy to maximize their benefits. Acting as an aggregator of DERs, a virtual power plant (VPP) facilitates this coordination, thereby amplifying their impact on the transmission level of the power grid. Further, a demand response program enhances the scheduling approach by m… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 11 pages, 8 figures, submitted to IEEE Transactions on Industry Applications for potential publication

  43. arXiv:2405.20858  [pdf, other

    cs.RO

    CSDO: Enhancing Efficiency and Success in Large-Scale Multi-Vehicle Trajectory Planning

    Authors: Yibin Yang, Shaobing Xu, Xintao Yan, Junkai Jiang, Jianqiang Wang, Heye Huang

    Abstract: This paper presents an efficient algorithm, naming Centralized Searching and Decentralized Optimization (CSDO), to find feasible solution for large-scale Multi-Vehicle Trajectory Planning (MVTP) problem. Due to the intractable growth of non-convex constraints with the number of agents, exploring various homotopy classes that imply different convex domains, is crucial for finding a feasible solutio… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 8 pages, 7 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  44. arXiv:2405.20348  [pdf, other

    physics.ao-ph cs.LG

    Personalized Adapter for Large Meteorology Model on Devices: Towards Weather Foundation Models

    Authors: Shengchao Chen, Guodong Long, Jing Jiang, Chengqi Zhang

    Abstract: This paper demonstrates that pre-trained language models (PLMs) are strong foundation models for on-device meteorological variables modeling. We present LM-Weather, a generic approach to taming PLMs, that have learned massive sequential knowledge from the universe of natural language databases, to acquire an immediate capability to obtain highly customized models for heterogeneous meteorological d… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 42 pages, under review

  45. arXiv:2405.19376  [pdf, other

    cs.LG cs.AI

    PureEBM: Universal Poison Purification via Mid-Run Dynamics of Energy-Based Models

    Authors: Omead Pooladzandi, Jeffrey Jiang, Sunay Bhat, Gregory Pottie

    Abstract: Data poisoning attacks pose a significant threat to the integrity of machine learning models by leading to misclassification of target distribution data by injecting adversarial examples during training. Existing state-of-the-art (SoTA) defense methods suffer from limitations, such as significantly reduced generalization performance and significant overhead during training, making them impractical… ▽ More

    Submitted 2 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2405.18627

  46. arXiv:2405.19291  [pdf, other

    cs.RO

    Grasp as You Say: Language-guided Dexterous Grasp Generation

    Authors: Yi-Lin Wei, Jian-Jian Jiang, Chengyi Xing, Xiantuo Tan, Xiao-Ming Wu, Hao Li, Mark Cutkosky, Wei-Shi Zheng

    Abstract: This paper explores a novel task ""Dexterous Grasp as You Say"" (DexGYS), enabling robots to perform dexterous grasping based on human commands expressed in natural language. However, the development of this field is hindered by the lack of datasets with natural human guidance; thus, we propose a language-guided dexterous grasp dataset, named DexGYSNet, offering high-quality dexterous grasp annota… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 9 pages, 7 figures

  47. arXiv:2405.18627  [pdf, other

    cs.LG cs.AI cs.CR

    PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics

    Authors: Sunay Bhat, Jeffrey Jiang, Omead Pooladzandi, Alexander Branch, Gregory Pottie

    Abstract: Train-time data poisoning attacks threaten machine learning models by introducing adversarial examples during training, leading to misclassification. Current defense methods often reduce generalization performance, are attack-specific, and impose significant training overhead. To address this, we introduce a set of universal data purification methods using a stochastic transform, $Ψ(x)$, realized… ▽ More

    Submitted 2 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  48. arXiv:2405.17458  [pdf, other

    cs.LG cs.AI

    Blood Glucose Control Via Pre-trained Counterfactual Invertible Neural Networks

    Authors: Jingchi Jiang, Rujia Shen, Boran Wang, Yi Guan

    Abstract: Type 1 diabetes mellitus (T1D) is characterized by insulin deficiency and blood glucose (BG) control issues. The state-of-the-art solution for continuous BG control is reinforcement learning (RL), where an agent can dynamically adjust exogenous insulin doses in time to maintain BG levels within the target range. However, due to the lack of action guidance, the agent often needs to learn from rando… ▽ More

    Submitted 18 July, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  49. arXiv:2405.17102  [pdf, other

    cs.CV cs.RO

    DINO-SD: Champion Solution for ICRA 2024 RoboDepth Challenge

    Authors: Yifan Mao, Ming Li, Jian Liu, Jiayang Liu, Zihan Qin, Chunxi Chu, Jialei Xu, Wenbo Zhao, Junjun Jiang, Xianming Liu

    Abstract: Surround-view depth estimation is a crucial task aims to acquire the depth maps of the surrounding views. It has many applications in real world scenarios such as autonomous driving, AR/VR and 3D reconstruction, etc. However, given that most of the data in the autonomous driving dataset is collected in daytime scenarios, this leads to poor depth model performance in the face of out-of-distribution… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Outstanding Champion in the RoboDepth Challenge (ICRA24) https://robodrive-24.github.io/

  50. arXiv:2405.17067  [pdf, other

    cs.CL cs.AI

    Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization

    Authors: Dixuan Wang, Yanda Li, Junyuan Jiang, Zepeng Ding, Guochao Jiang, Jiaqing Liang, Deqing Yang

    Abstract: Large Language Models (LLMs) have shown remarkable capabilities in language understanding and generation. Nonetheless, it was also witnessed that LLMs tend to produce inaccurate responses to specific queries. This deficiency can be traced to the tokenization step LLMs must undergo, which is an inevitable limitation inherent to all LLMs. In fact, incorrect tokenization is the critical point that hi… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 17 pages, 3 figures, this paper is submitted to neurips 2024