Showing 1–50 of 2,116 results for author: Gao, J

  1. arXiv:2407.08959  [pdf, other]

    cs.CL

    Domain-Hierarchy Adaptation via Chain of Iterative Reasoning for Few-shot Hierarchical Text Classification

    Authors: Ke Ji, Peng Wang, Wenjun Ke, Guozheng Li, Jiajun Liu, Jingsheng Gao, Ziyu Shang

    Abstract: Recently, various pre-trained language models (PLMs) have been proposed to prove their impressive performances on a wide range of few-shot tasks. However, limited by the unstructured prior knowledge in PLMs, it is difficult to maintain consistent performance on complex structured scenarios, such as hierarchical text classification (HTC), especially when the downstream data is extremely scarce. The…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 9 pages, 2 figures, Accepted by IJCAI2024

  2. arXiv:2407.08937  [pdf, other]

    cs.CL cs.AI

    Self-Evolving GPT: A Lifelong Autonomous Experiential Learner

    Authors: Jinglong Gao, Xiao Ding, Yiming Cui, Jianbai Zhao, Hepeng Wang, Ting Liu, Bing Qin

    Abstract: To improve the performance of large language models (LLMs), researchers have explored providing LLMs with textual task-solving experience via prompts. However, they rely on manual efforts to acquire and apply such experience for each task, which is not feasible for the growing demand for LLMs and the variety of user questions. To address this issue, we design a lifelong autonomous experiential lea…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024 MAIN

  3. arXiv:2407.08639  [pdf, other]

    cs.AI cs.LG

    $β$-DPO: Direct Preference Optimization with Dynamic $β$

    Authors: Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He

    Abstract: Direct Preference Optimization (DPO) has emerged as a compelling approach for training Large Language Models (LLMs) to adhere to human preferences. However, the performance of DPO is sensitive to the fine-tuning of its trade-off parameter $β$, as well as to the quality of the preference data. We analyze the impact of $β$ and data quality on DPO, uncovering that optimal $β$ values vary with the inf…

    Submitted 11 July, 2024; originally announced July 2024.

  4. arXiv:2407.07880  [pdf, other]

    cs.LG cs.AI cs.CL

    Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization

    Authors: Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jiawei Chen, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He

    Abstract: This study addresses the challenge of noise in training datasets for Direct Preference Optimization (DPO), a method for aligning Large Language Models (LLMs) with human preferences. We categorize noise into pointwise noise, which includes low-quality data points, and pairwise noise, which encompasses erroneous data pair associations that affect preference rankings. Utilizing Distributionally Robus…

    Submitted 10 July, 2024; originally announced July 2024.

  5. arXiv:2407.06453  [pdf, ps, other]

    math.RA

    Dual minus partial order

    Authors: Ju Gao, Hongxing Wang, Xiaoji Liu

    Abstract: In this paper, we introduce the Dual-minus partial order, get some characterizations of the partial order, and prove that both the dual star partial order and the dual sharp partial order are Dual-minus-type partial orders. Based on the Dual-minus partial order, we introduce the Dual-minus sharp partial order and the Dual-minus star partial order, which are also Dual-minus-type partial orders. In…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 23 pages

    MSC Class: 15A09; 15A24; 62G30

  6. arXiv:2407.06022  [pdf]

    math.NA

    Investigation of microstructural evolution of irradiation-induced defects in tungsten: an experimental-numerical approach

    Authors: Salahudeen Mohamed, Qian Yuan, Dimitri Litvinov, Jie Gao, Ermile Gaganidze, Dmitry Terentyev, Hans-Christian Schneider, Jarir Aktaa

    Abstract: The hostile condition in a fusion tokamak reactor poses the main challenge in the development and design of in-vessel components such as divertor and breeding blanket due to fusion relevant irradiation conditions (14 MeV) and large thermal loads. The current work describes the employment of an integrated experimental-numerical approach to assess the microstructure evolution of dislocation loops an…

    Submitted 8 July, 2024; originally announced July 2024.

  7. arXiv:2407.05771  [pdf, other]

    cs.CV

    Multi-times Monte Carlo Rendering for Inter-reflection Reconstruction

    Authors: Tengjie Zhu, Zhuo Chen, Jingnan Gao, Yichao Yan, Xiaokang Yang

    Abstract: Inverse rendering methods have achieved remarkable performance in reconstructing high-fidelity 3D objects with disentangled geometries, materials, and environmental light. However, they still face huge challenges in reflective surface reconstruction. Although recent methods model the light trace to learn specularity, the ignorance of indirect illumination makes it hard to handle inter-reflections…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 10 pages, 6 figures, submitted to NeurIPS 2024

  8. arXiv:2407.05705  [pdf, other]

    cs.AI

    Fast and Continual Knowledge Graph Embedding via Incremental LoRA

    Authors: Jiajun Liu, Wenjun Ke, Peng Wang, Jiahao Wang, Jinhua Gao, Ziyu Shang, Guozheng Li, Zijie Xu, Ke Ji, Yining Li

    Abstract: Continual Knowledge Graph Embedding (CKGE) aims to efficiently learn new knowledge and simultaneously preserve old knowledge. Dominant approaches primarily focus on alleviating catastrophic forgetting of old knowledge but neglect efficient learning for the emergence of new knowledge. However, in real-world scenarios, knowledge graphs (KGs) are continuously growing, which brings a significant chall…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI2024

  9. arXiv:2407.04422  [pdf, other]

    hep-ph hep-ex nucl-ex nucl-th

    Global analysis of fragmentation functions to charged hadrons with high-precision data from the LHC

    Authors: Jun Gao, ChongYang Liu, XiaoMin Shen, Hongxi Xing, Yuxiang Zhao

    Abstract: Fragmentation functions (FFs) are essential non-perturbative QCD inputs for predicting hadron production cross sections in high energy scatterings. In this study, we present a joint determination of FFs for light charged hadrons through a global analysis at next-to-leading order (NLO) in QCD. Our analysis incorporates a wide range of precision measurements from the LHC, as well as data from electr…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 44 pages, 45 figures

  10. arXiv:2407.03038  [pdf, other]

    cs.CL cs.DC cs.LG

    On the Client Preference of LLM Fine-tuning in Federated Learning

    Authors: Feijie Wu, Xiaoze Liu, Haoyu Wang, Xingchen Wang, Jing Gao

    Abstract: Reinforcement learning with human feedback (RLHF) fine-tunes a pretrained large language model (LLM) using preference datasets, enabling the LLM to generate outputs that align with human preferences. Given the sensitive nature of these preference datasets held by various clients, there is a need to implement RLHF within a federated learning (FL) framework, where clients are reluctant to share thei…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Work in progress

  11. arXiv:2407.01875  [pdf, ps, other]

    cs.AI

    Spatio-Temporal Graphical Counterfactuals: An Overview

    Authors: Mingyu Kang, Duxin Chen, Ziyuan Pu, Jianxi Gao, Wenwu Yu

    Abstract: Counterfactual thinking is a critical yet challenging topic for artificial intelligence to learn knowledge from data and ultimately improve their performances for new scenarios. Many research works, including Potential Outcome Model and Structural Causal Model, have been proposed to realize it. However, their modelings, theoretical foundations and application approaches are usually different. More…

    Submitted 1 July, 2024; originally announced July 2024.

  12. arXiv:2407.01414  [pdf, other]

    cs.CV

    StyleShot: A Snapshot on Any Style

    Authors: Junyao Gao, Yanchen Liu, Yanan Sun, Yinhao Tang, Yanhong Zeng, Kai Chen, Cairong Zhao

    Abstract: In this paper, we show that a good style representation is crucial and sufficient for generalized style transfer without test-time tuning. We achieve this through constructing a style-aware encoder and a well-organized style dataset called StyleGallery. With dedicated design for style learning, this style-aware encoder is trained to extract expressive style representation with decoupling training…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Project page: https://styleshot.github.io/

  13. arXiv:2407.00218  [pdf, other]

    eess.SY cs.RO

    Resilient Estimator-based Control Barrier Functions for Dynamical Systems with Disturbances and Noise

    Authors: Chuyuan Tao, Wenbin Wan, Junjie Gao, Bihao Mo, Hunmin Kim, Naira Hovakimyan

    Abstract: Control Barrier Function (CBF) is an emerging method that guarantees safety in path planning problems by generating a control command to ensure the forward invariance of a safety set. Most of the developments up to date assume availability of correct state measurements and absence of disturbances on the system. However, if the system incurs disturbances and is subject to noise, the CBF cannot guar…

    Submitted 28 June, 2024; originally announced July 2024.

  14. arXiv:2406.19605  [pdf, other]

    math.OC

    A Customized Augmented Lagrangian Method for Block-Structured Integer Programming

    Authors: Rui Wang, Chuwen Zhang, Shanwen Pu, Jianjun Gao, Zaiwen Wen

    Abstract: Integer programming with block structures has received considerable attention recently and is widely used in many practical applications such as train timetabling and vehicle routing problems. It is known to be NP-hard due to the presence of integer variables. We define a novel augmented Lagrangian function by directly penalizing the inequality constraints and establish the strong duality between…

    Submitted 27 June, 2024; originally announced June 2024.

  15. arXiv:2406.18966  [pdf, other]

    cs.CL

    UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models

    Authors: Siyuan Wu, Yue Huang, Chujie Gao, Dongping Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Xiangliang Zhang, Jianfeng Gao, Chaowei Xiao, Lichao Sun

    Abstract: Large Language Models (LLMs) such as GPT-4 and Llama3 have significantly impacted various fields by enabling high-quality synthetic data generation and reducing dependence on expensive human-generated datasets. Despite this, challenges remain in the areas of generalization, controllability, diversity, and truthfulness within the existing generative frameworks. To address these challenges, this pap…

    Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  16. arXiv:2406.18074  [pdf, other]

    cs.CV cs.AI

    Few-Shot Medical Image Segmentation with High-Fidelity Prototypes

    Authors: Song Tang, Shaxu Yan, Xiaozhi Qi, Jianxin Gao, Mao Ye, Jianwei Zhang, Xiatian Zhu

    Abstract: Few-shot Semantic Segmentation (FSS) aims to adapt a pretrained model to new classes with as few as a single labelled training sample per class. Although prototype-based approaches have achieved substantial success, existing models are limited to the imaging scenarios with considerably distinct objects and not highly complex background, e.g., natural images. This makes such models suboptimal fo…

    Submitted 26 June, 2024; originally announced June 2024.

  17. arXiv:2406.17706  [pdf, other]

    cs.LG cs.CL cs.DC

    FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model

    Authors: Feijie Wu, Zitao Li, Yaliang Li, Bolin Ding, Jing Gao

    Abstract: Large language models (LLMs) show amazing performance on many domain-specific tasks after fine-tuning with some appropriate data. However, many domain-specific data are privately distributed across multiple owners. Thus, this dilemma raises the interest in how to perform LLM fine-tuning in federated learning (FL). However, confronted with limited computation and communication capacities, FL client…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: KDD 2024

  18. arXiv:2406.16200  [pdf, other]

    cs.LG cs.CR cs.IT eess.SP

    Towards unlocking the mystery of adversarial fragility of neural networks

    Authors: Jingchao Gao, Raghu Mudumbai, Xiaodong Wu, Jirong Yi, Catherine Xu, Hui Xie, Weiyu Xu

    Abstract: In this paper, we study the adversarial robustness of deep neural networks for classification tasks. We look at the smallest magnitude of possible additive perturbations that can change the output of a classification algorithm. We provide a matrix-theoretic explanation of the adversarial fragility of deep neural networks for classification. In particular, our theoretical results show that neural ne…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 21 pages

  19. arXiv:2406.15781  [pdf, other]

    cs.CL

    DABL: Detecting Semantic Anomalies in Business Processes Using Large Language Models

    Authors: Wei Guan, Jian Cao, Jianqi Gao, Haiyan Zhao, Shiyou Qian

    Abstract: Detecting anomalies in business processes is crucial for ensuring operational success. While many existing methods rely on statistical frequency to detect anomalies, it's important to note that infrequent behavior doesn't necessarily imply undesirability. To address this challenge, detecting anomalies from a semantic viewpoint proves to be a more effective approach. However, current semantic anoma…

    Submitted 22 June, 2024; originally announced June 2024.

  20. arXiv:2406.15452  [pdf, other]

    physics.soc-ph

    A-TEAM: Advanced Traffic Event Analysis and Management Platform for Transportation Data-Driven Problem Solving

    Authors: Zilin Bian, Dachuan Zuo, Jingqin Gao, Kaan Ozbay, Matthew D. Maggio

    Abstract: The rapid growth in terms of the availability of transportation data provides great potential for the introduction of emerging data-driven methodologies into transportation-related research and development efforts. However, advanced data-driven models, such as artificial-intelligence based approaches, usually contain complicated modeling structures and require strict data formats along with a very…

    Submitted 6 June, 2024; originally announced June 2024.

  21. arXiv:2406.15439  [pdf]

    physics.soc-ph stat.AP

    Heterogeneous peer effects of college roommates on academic performance

    Authors: Yi Cao, Tao Zhou, Jian Gao

    Abstract: Understanding how student peers influence learning outcomes is crucial for effective education management in complex social systems. The complexities of peer selection and evolving peer relationships, however, pose challenges for identifying peer effects using static observational data. Here we use both null-model and regression approaches to examine peer effects using longitudinal data from 5,272…

    Submitted 29 May, 2024; originally announced June 2024.

    Comments: 56 pages, 4 figures, 2 tables, with Supplementary Information

    Journal ref: Nature Communications, 15(1), 4785 (2024)

  22. arXiv:2406.14558  [pdf, other]

    cs.RO cs.AI

    CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics

    Authors: Jiawei Gao, Ziqin Wang, Zeqi Xiao, Jingbo Wang, Tai Wang, Jinkun Cao, Xiaolin Hu, Si Liu, Jifeng Dai, Jiangmiao Pang

    Abstract: Recent years have seen significant advancements in humanoid control, largely due to the availability of large-scale motion capture data and the application of reinforcement learning methodologies. However, many real-world tasks, such as moving large and heavy furniture, require multi-character collaboration. Given the scarcity of data on multi-character collaboration and the efficiency challenges…

    Submitted 20 June, 2024; originally announced June 2024.

  23. arXiv:2406.14393  [pdf, other]

    cs.LG cs.CL

    Jailbreaking as a Reward Misspecification Problem

    Authors: Zhihui Xie, Jiahui Gao, Lei Li, Zhenguo Li, Qi Liu, Lingpeng Kong

    Abstract: The widespread adoption of large language models (LLMs) has raised concerns about their safety and reliability, particularly regarding their vulnerability to adversarial attacks. In this paper, we propose a novel perspective that attributes this vulnerability to reward misspecification during the alignment process. We introduce a metric ReGap to quantify the extent of reward misspecification and d…

    Submitted 12 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: GitHub URL added

  24. arXiv:2406.14067  [pdf]

    physics.optics eess.SP

    A microwave photonic prototype for concurrent radar detection and spectrum sensing over an 8 to 40 GHz bandwidth

    Authors: Taixia Shi, Dingding Liang, Lu Wang, Lin Li, Shaogang Guo, Jiawei Gao, Xiaowei Li, Chulun Lin, Lei Shi, Baogang Ding, Shiyang Liu, Fangyi Yang, Chi Jiang, Yang Chen

    Abstract: In this work, a microwave photonic prototype for concurrent radar detection and spectrum sensing is proposed, designed, built, and investigated. A direct digital synthesizer and an analog electronic circuit are integrated to generate an intermediate frequency (IF) linearly frequency-modulated (LFM) signal with a tunable center frequency from 2.5 to 9.5 GHz and an instantaneous bandwidth of 1 GHz.…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 18 pages, 12 figures, 1 table

  25. arXiv:2406.13986  [pdf, other]

    astro-ph.SR astro-ph.GA

    Novae: An Important Source of Lithium in the Galaxy

    Authors: Jun Gao, Chunhua Zhu, Guoliang Lü, Jinlong Yu, Lin Li, Helei Liu, Sufen Guo

    Abstract: The source of the Galactic Lithium (Li) has long been a puzzle. With the discovery of Li in novae, extensive research has been conducted. However, there still exists a significant disparity between the observed abundance of lithium in novae and the existing theoretical predictions. Using the Modules for Experiments in Stellar Astrophysics (MESA), we simulate the evolution of nova with element diff…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures. Accepted for publication in Astrophysical Journal

  26. arXiv:2406.13375  [pdf, other]

    cs.CL

    ALiiCE: Evaluating Positional Fine-grained Citation Generation

    Authors: Yilong Xu, Jinhua Gao, Xiaoming Yu, Baolong Bi, Huawei Shen, Xueqi Cheng

    Abstract: Large Language Models (LLMs) can enhance the credibility and verifiability by generating text with citations. However, existing tasks and evaluation methods are predominantly limited to sentence-level statement, neglecting the significance of positional fine-grained citations that can appear anywhere within sentences. To facilitate further exploration of the fine-grained citation generation, we pr…

    Submitted 19 June, 2024; originally announced June 2024.

  27. Learning Translations via Matrix Completion

    Authors: Derry Wijaya, Brendan Callahan, John Hewitt, Jie Gao, Xiao Ling, Marianna Apidianaki, Chris Callison-Burch

    Abstract: Bilingual Lexicon Induction is the task of learning word translations without bilingual parallel corpora. We model this task as a matrix completion problem, and present an effective and extendable framework for completing the matrix. This method harnesses diverse bilingual and monolingual signals, each of which may be incomplete or noisy. Our model achieves state-of-the-art performance for both hi…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: This is a late posting of an old paper as Google Scholar somehow misses indexing the ACL anthology version of the paper

    ACM Class: I.2.7

    Journal ref: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (2017), pp. 1452-1463

  28. arXiv:2406.13057  [pdf, other]

    cs.LG cs.AI

    Informed along the road: roadway capacity driven graph convolution network for network-wide traffic prediction

    Authors: Zilin Bian, Jingqin Gao, Kaan Ozbay, Fan Zuo, Dachuan Zuo, Zhenning Li

    Abstract: While deep learning has shown success in predicting traffic states, most methods treat it as a general prediction task without considering transportation aspects. Recently, graph neural networks have proven effective for this task, but few incorporate external factors that impact roadway capacity and traffic flow. This study introduces the Roadway Capacity Driven Graph Convolution Network (RCDGCN)…

    Submitted 18 June, 2024; originally announced June 2024.

  29. arXiv:2406.13038  [pdf, other]

    cs.AI eess.SP

    Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach

    Authors: Zilin Bian, Jingqin Gao, Kaan Ozbay, Zhenning Li

    Abstract: Although traffic prediction has been receiving considerable attention with a number of successes in the context of intelligent transportation systems, the prediction of traffic states over a complex transportation network that contains different road types has remained a challenge. This study proposes a multi-scale graph wavelet temporal convolution network (MSGWTCN) to predict the traffic states…

    Submitted 18 June, 2024; originally announced June 2024.

  30. arXiv:2406.12975  [pdf, other]

    cs.CL cs.AI cs.CY

    SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation

    Authors: Xiaoze Liu, Ting Sun, Tianyang Xu, Feijie Wu, Cunxiang Wang, Xiaoqian Wang, Jing Gao

    Abstract: Large Language Models (LLMs) have transformed machine learning but raised significant legal concerns due to their potential to produce text that infringes on copyrights, resulting in several high-profile lawsuits. The legal landscape is struggling to keep pace with these rapid advancements, with ongoing debates about whether generated text might plagiarize copyrighted materials. Current LLMs may i…

    Submitted 18 June, 2024; originally announced June 2024.

  31. arXiv:2406.12719  [pdf, other]

    cs.CL cs.AI

    On the Robustness of Language Models for Tabular Question Answering

    Authors: Kushal Raj Bhandari, Sixue Xing, Soham Dan, Jianxi Gao

    Abstract: Large Language Models (LLMs), originally shown to ace various text comprehension tasks, have also remarkably been shown to tackle table comprehension tasks without specific training. While previous research has explored LLM capabilities with tabular dataset tasks, our study assesses the influence of $\textit{in-context learning}$, $\textit{model scale}$, $\textit{instruction tuning}$, and…

    Submitted 18 June, 2024; originally announced June 2024.

  32. arXiv:2406.12433  [pdf, other]

    cs.IR

    LLM-enhanced Reranking in Recommender Systems

    Authors: Jingtong Gao, Bo Chen, Xiangyu Zhao, Weiwen Liu, Xiangyang Li, Yichao Wang, Zijian Zhang, Wanyu Wang, Yuyang Ye, Shanru Lin, Huifeng Guo, Ruiming Tang

    Abstract: Reranking is a critical component in recommender systems, playing an essential role in refining the output of recommendation algorithms. Traditional reranking models have focused predominantly on accuracy, but modern applications demand consideration of additional criteria such as diversity and fairness. Existing reranking approaches often fail to harmonize these diverse criteria effectively at th…

    Submitted 20 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  33. arXiv:2406.11340  [pdf, other]

    cs.CV cs.LG

    CM2-Net: Continual Cross-Modal Mapping Network for Driver Action Recognition

    Authors: Ruoyu Wang, Chen Cai, Wenqian Wang, Jianjun Gao, Dan Lin, Wenyang Liu, Kim-Hui Yap

    Abstract: Driver action recognition has significantly advanced in enhancing driver-vehicle interactions and ensuring driving safety by integrating multiple modalities, such as infrared and depth. Nevertheless, compared to RGB modality only, it is always laborious and costly to collect extensive data for all types of non-RGB modalities in car cabin environments. Therefore, previous works have suggested indep…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  34. arXiv:2406.11142  [pdf, other]

    cs.RO cs.CV

    Graspness Discovery in Clutters for Fast and Accurate Grasp Detection

    Authors: Chenxi Wang, Hao-Shu Fang, Minghao Gou, Hongjie Fang, Jin Gao, Cewu Lu

    Abstract: Efficient and robust grasp pose detection is vital for robotic manipulation. For general 6 DoF grasping, conventional methods treat all points in a scene equally and usually adopt uniform sampling to select grasp candidates. However, we discover that ignoring where to grasp greatly harms the speed and accuracy of current grasp pose detection methods. In this paper, we propose "graspness", a qualit…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: ICCV 2021

  35. arXiv:2406.10819  [pdf, other]

    cs.CV cs.AI cs.CL

    GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents

    Authors: Dongping Chen, Yue Huang, Siyuan Wu, Jingyu Tang, Liuyi Chen, Yilin Bai, Zhigang He, Chenlong Wang, Huichi Zhou, Yiqiang Li, Tianshuo Zhou, Yue Yu, Chujie Gao, Qihui Zhang, Yi Gui, Zhen Li, Yao Wan, Pan Zhou, Jianfeng Gao, Lichao Sun

    Abstract: Recently, Multimodal Large Language Models (MLLMs) have been used as agents to control keyboard and mouse inputs by directly perceiving the Graphical User Interface (GUI) and generating corresponding code. However, current agents primarily exhibit excellent understanding capabilities in static environments and are predominantly applied in relatively simple domains, such as Web or mobile interfaces…

    Submitted 16 June, 2024; originally announced June 2024.

  36. arXiv:2406.10777  [pdf, other]

    cs.CL

    RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning

    Authors: Haoyu Wang, Tianci Liu, Tuo Zhao, Jing Gao

    Abstract: Pre-trained language models, trained on large-scale corpora, demonstrate strong generalizability across various NLP tasks. Fine-tuning these models for specific tasks typically involves updating all parameters, which is resource-intensive. Parameter-efficient fine-tuning (PEFT) methods, such as the popular LoRA family, introduce low-rank matrices to learn only a few parameters efficiently. However…

    Submitted 15 June, 2024; originally announced June 2024.

  37. arXiv:2406.10421  [pdf, other]

    cs.CL

    SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading

    Authors: Tu Anh Dinh, Carlos Mullov, Leonard Bärmann, Zhaolin Li, Danni Liu, Simon Reiß, Jueun Lee, Nathan Lerzer, Fabian Ternava, Jianfeng Gao, Tobias Röddiger, Alexander Waibel, Tamim Asfour, Michael Beigl, Rainer Stiefelhagen, Carsten Dachsbacher, Klemens Böhm, Jan Niehues

    Abstract: With the rapid development of Large Language Models (LLMs), it is crucial to have benchmarks which can evaluate the ability of LLMs on different domains. One common use of LLMs is performing tasks on scientific topics, such as writing algorithms, querying databases or giving mathematical proofs. Inspired by the way university students are evaluated on such tasks, in this paper, we propose SciEx -…

    Submitted 12 July, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  38. arXiv:2406.09307  [pdf, other]

    cs.LG cs.CY stat.ML

    A tutorial on fairness in machine learning in healthcare

    Authors: Jianhui Gao, Benson Chou, Zachary R. McCaw, Hilary Thurston, Paul Varghese, Chuan Hong, Jessica Gronsbell

    Abstract: $\textbf{OBJECTIVE}…

    Submitted 15 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  39. arXiv:2406.08849  [pdf, other]

    physics.atom-ph

    Electronic processes in collisions between nitrogen impurity ions and hydrogen atoms

    Authors: C. C. Jia, Y. Y. Qi, J. J. Niu, Y. Wu, J. G. Wang, A. Dubois, N. Sisourat, J. W. Gao

    Abstract: In order to interpret and predict the behavior and properties of fusion plasma, accurate cross sections for electronic processes in collisions between plasma impurities and atomic hydrogen are required. In this work, we investigate the electron capture, target excitation, and ionization processes occurring in collision of ${\rm N}^{4+}$ with atomic hydrogen in a broad energy domain ranging from 0.…

    Submitted 1 July, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  40. arXiv:2406.08044  [pdf, other]

    cond-mat.mes-hall

    Hofstadter spectrum in a semiconductor moiré lattice

    Authors: Chen Zhao, Ming Wu, Zhen Ma, Miao Liang, Ming Lu, Jin-Hua Gao, X. C. Xie

    Abstract: Recently, the Hofstadter spectrum of a twisted $\mathrm{WSe_2/MoSe_2}$ heterobilayer has been observed in experiment [C. R. Kometter et al., Nat. Phys. 19, 1861 (2023)], but the origin of Hofstadter states remains unclear. Here, we present a comprehensive theoretical interpretation of the observed Hofstadter states by calculating its accurate Hofstadter spectrum. We point out that the valley Zeeman…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 7 pages, 4 figures

  41. arXiv:2406.07794  [pdf, other]

    cs.CL cs.AI

    Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests

    Authors: Amogh Mannekote, Jinseok Nam, Ziming Li, Jian Gao, Kristy Elizabeth Boyer, Bonnie J. Dorr

    Abstract: Indirect User Requests (IURs), such as "It's cold in here" instead of "Could you please increase the temperature?" are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener. While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints. M…

    Submitted 16 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  42. arXiv:2406.07588  [pdf, other]

    cs.MM cs.CL

    AIM: Let Any Multi-modal Large Language Models Embrace Efficient In-Context Learning

    Authors: Jun Gao, Qian Qiao, Ziqiang Cao, Zili Wang, Wenjie Li

    Abstract: In-context learning (ICL) facilitates Large Language Models (LLMs) exhibiting emergent ability on downstream tasks without updating billions of parameters. However, in the area of multi-modal Large Language Models (MLLMs), two problems hinder the application of multi-modal ICL: (1) Most primary MLLMs are only trained on single-image datasets, making them unable to read multi-modal demonstrations.…

    Submitted 30 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  43. arXiv:2406.06889  [pdf, other]

    physics.soc-ph cs.SI

    Universal spatial inflation of human mobility

    Authors: Lu Zhong, Lei Dong, Qi Wang, Chaoming Song, Jianxi Gao

    Abstract: Understanding the interplay between egocentric preference and urban structure in shaping human mobility has profound implications for improving epidemic intervention, social equity, and urban resilience. However, numerous existing studies either solely identify the egocentric preferences -- the anchoring effects from home -- or the impact of hierarchical urban structures. Here, we propose a networ…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures

  44. arXiv:2406.06828  [pdf, other]

    astro-ph.IM

    CCAT: Comparisons of 280 GHz TiN and Al Kinetic Inductance Detector Arrays

    Authors: Cody J. Duell, Jason Austermann, James Beall, James R. Burgoyne, Scott C. Chapman, Steve K. Choi, Rodrigo G. Freundt, Jiansong Gao, Christopher Groppi, Anthony I. Huber, Zachary B. Huber, Johannes Hubmayr, Ben Keller, Yaqiong Li, Lawrence T. Lin, Justin Matthewson, Philip Mauskopf, Alicia Middleton, Colin C. Murphy, Michael D. Niemack, Thomas Nikola, Adrian K. Sinclair, Ema Smith, Jeff van Lanen, Anna Vaskuri, et al. (5 additional authors not shown)

    Abstract: The CCAT Collaboration's six-meter Fred Young Submillimeter Telescope is scheduled to begin observing in the Chilean Atacama in 2025, targeting a variety of science goals throughout cosmic history. Prime-Cam is a 1.8-meter diameter cryostat that will host up to seven independent instrument modules designed for simultaneous spectroscopic and broadband, polarimetric surveys at millimeter to submilli…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 6 pages, 3 figures, conference proceedings submitted to the Journal of Low Temperature Physics

  45. arXiv:2406.05535  [pdf, other]

    cs.LG cs.AI cs.CR

    Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability

    Authors: Junqi Gao, Biqing Qi, Yao Li, Zhichang Guo, Dong Li, Yuming Xing, Dazhi Zhang

    Abstract: The transferability of adversarial perturbations provides an effective shortcut for black-box attacks. Targeted perturbations have greater practicality but are more difficult to transfer between models. In this paper, we experimentally and theoretically demonstrated that neural networks trained on the same dataset have more consistent performance in High-Sample-Density-Regions (HSDR) of each class…

    Submitted 8 June, 2024; originally announced June 2024.

    Journal ref: Advances in Neural Information Processing Systems 36, 2023

  46. arXiv:2406.05534  [pdf, other]

    cs.AI cs.CL cs.LG

    Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing

    Authors: Biqing Qi, Pengfei Li, Fangyuan Li, Junqi Gao, Kaiyan Zhang, Bowen Zhou

    Abstract: Direct Preference Optimization (DPO) improves the alignment of large language models (LLMs) with human values by training directly on human preference datasets, eliminating the need for reward models. However, due to the presence of cross-domain human preferences, direct continual training can lead to catastrophic forgetting, limiting DPO's performance and efficiency. Inspired by intraspecific com…

    Submitted 8 June, 2024; originally announced June 2024.

  47. arXiv:2406.05532  [pdf, other]

    cs.LG cs.AI

    Exploring Adversarial Robustness of Deep State Space Models

    Authors: Biqing Qi, Yang Luo, Junqi Gao, Pengfei Li, Kai Tian, Zhiyuan Ma, Bowen Zhou

    Abstract: Deep State Space Models (SSMs) have proven effective in numerous task scenarios but face significant security challenges due to Adversarial Perturbations (APs) in real-world deployments. Adversarial Training (AT) is a mainstream approach to enhancing Adversarial Robustness (AR) and has been validated on various traditional DNN architectures. However, its effectiveness in improving the AR of SSMs r…

    Submitted 8 June, 2024; originally announced June 2024.

  48. arXiv:2406.05531  [pdf, other]

    cs.LG cs.AI

    Enhancing Adversarial Transferability via Information Bottleneck Constraints

    Authors: Biqing Qi, Junqi Gao, Jianxing Liu, Ligang Wu, Bowen Zhou

    Abstract: From the perspective of information bottleneck (IB) theory, we propose a novel framework for performing black-box transferable adversarial attacks named IBTA, which leverages advancements in invariant features. Intuitively, diminishing the reliance of adversarial perturbations on the original data, under equivalent attack performance constraints, encourages a greater reliance on invariant features…

    Submitted 8 June, 2024; originally announced June 2024.

    Journal ref: IEEE Signal Processing Letters, 2024

  49. arXiv:2406.04334  [pdf, other]

    cs.CV

    DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs

    Authors: Lingchen Meng, Jianwei Yang, Rui Tian, Xiyang Dai, Zuxuan Wu, Jianfeng Gao, Yu-Gang Jiang

    Abstract: Most large multimodal models (LMMs) are implemented by feeding visual tokens as a sequence into the first layer of a large language model (LLM). The resulting architecture is simple but significantly increases computation and memory costs, as it has to handle a large number of additional tokens in its input layer. This paper presents a new architecture DeepStack for LMMs. Considering $N$ layers in…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project Page: https://deepstack-vl.github.io/

  50. arXiv:2406.03884  [pdf, ps, other]

    math.AP math-ph

    Steady supersonic combustion flows with a contact discontinuity in two-dimensional finitely long nozzles

    Authors: Junlei Gao, Feimin Huang, Jie Kuang, Dehua Wang, Wei Xiang

    Abstract: In this paper, we are concerned with the two-dimensional steady supersonic combustion flows with a contact discontinuity moving through a nozzle of finite length. Mathematically, it can be formulated as a free boundary value problem governed by the two-dimensional steady combustion Euler equations with a contact discontinuity as the free boundary. The main mathematical difficulties are that the c…

    Submitted 9 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.