Skip to main content

Showing 1–50 of 559 results for author: Huang, F

  1. arXiv:2407.10973  [pdf, other

    cs.AI

    Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion

    Authors: Yongyuan Liang, Tingqiang Xu, Kaizhe Hu, Guangqi Jiang, Furong Huang, Huazhe Xu

    Abstract: Can we generate a control policy for an agent using just one demonstration of desired behaviors as a prompt, as effortlessly as creating an image from a textual description? In this paper, we present Make-An-Agent, a novel policy parameter generator that leverages the power of conditional diffusion models for behavior-to-policy generation. Guided by behavior embeddings that encode trajectory infor… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2407.10671  [pdf, other

    cs.CL cs.AI

    Qwen2 Technical Report

    Authors: An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jianxin Yang, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin , et al. (37 additional authors not shown)

    Abstract: This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, a… ▽ More

    Submitted 17 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 25 pages, 1 figure

  3. arXiv:2407.08554  [pdf, other

    cs.AI cs.HC

    Establishing Rigorous and Cost-effective Clinical Trials for Artificial Intelligence Models

    Authors: Wanling Gao, Yunyou Huang, Dandan Cui, Zhuoming Yu, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Gangyuan Zhao, Chongrong Jiang, Fan Huang, Tianyi Wei, Suqin Tang, Bingjie Xia, Zhifei Zhang, Jianfeng Zhan

    Abstract: A profound gap persists between artificial intelligence (AI) and clinical practice in medicine, primarily due to the lack of rigorous and cost-effective evaluation methodologies. State-of-the-art and state-of-the-practice AI model evaluations are limited to laboratory studies on medical datasets or direct clinical trials with no or solely patient-centered controls. Moreover, the crucial role of cl… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 23 pages

  4. arXiv:2407.05682  [pdf, other

    cs.CL

    Retrieved In-Context Principles from Previous Mistakes

    Authors: Hao Sun, Yong Jiang, Bo Wang, Yingyan Hou, Yan Zhang, Pengjun Xie, Fei Huang

    Abstract: In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples. Recent advances have attempted to improve model performance through principles derived from mistakes, yet these approaches suffer from lack of customization and inadequate error coverage. To address these limitations, we propose Retrieved In-Context Prin… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2407.00942  [pdf, other

    cs.IR cs.AI cs.CL

    ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions

    Authors: Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang

    Abstract: This paper introduces the task of product demand clarification within an e-commercial scenario, where the user commences the conversation with ambiguous queries and the task-oriented agent is designed to achieve more accurate and tailored product searching by asking clarification questions. To address this task, we propose ProductAgent, a conversational information seeking agent equipped with abil… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 17 pages, 13 tables, 6 figures. Under review

  6. arXiv:2407.00891  [pdf, other

    cs.LG q-bio.BM

    ZeroDDI: A Zero-Shot Drug-Drug Interaction Event Prediction Method with Semantic Enhanced Learning and Dual-Modal Uniform Alignment

    Authors: Ziyan Wang, Zhankun Xiong, Feng Huang, Xuan Liu, Wen Zhang

    Abstract: Drug-drug interactions (DDIs) can result in various pharmacological changes, which can be categorized into different classes known as DDI events (DDIEs). In recent years, previously unobserved/unseen DDIEs have been emerging, posing a new classification task when unseen classes have no labelled instances in the training stage, which is formulated as a zero-shot DDIE prediction (ZS-DDIE) task. Howe… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI2024

  7. arXiv:2406.19156  [pdf, other

    cs.LG

    Heterogeneous Causal Metapath Graph Neural Network for Gene-Microbe-Disease Association Prediction

    Authors: Kexin Zhang, Feng Huang, Luotao Liu, Zhankun Xiong, Hongyu Zhang, Yuan Quan, Wen Zhang

    Abstract: The recent focus on microbes in human medicine highlights their potential role in the genetic framework of diseases. To decode the complex interactions among genes, microbes, and diseases, computational predictions of gene-microbe-disease (GMD) associations are crucial. Existing methods primarily address gene-disease and microbe-disease associations, but the more intricate triple-wise GMD associat… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  8. arXiv:2406.17419  [pdf, other

    cs.CL cs.AI

    Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA

    Authors: Minzheng Wang, Longze Chen, Cheng Fu, Shengyi Liao, Xinghua Zhang, Bingli Wu, Haiyang Yu, Nan Xu, Lei Zhang, Run Luo, Yunshui Li, Min Yang, Fei Huang, Yongbin Li

    Abstract: Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-context windows. Meanwhile, benchmarks for evaluating long-context LLMs are gradually catching up. However, existing benchmarks employ irrelevant noise texts to artificially extend the length of test cases, diverging from the real-world scenarios of long-contex… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: We release our code and data publicly at https://github.com/MozerWang/Loong

  9. arXiv:2406.15575  [pdf, ps, other

    cs.LG cs.AI stat.ML

    Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity

    Authors: Mucong Ding, Tahseen Rabbani, Bang An, Evan Z Wang, Furong Huang

    Abstract: Graph Neural Networks (GNNs) are widely applied to graph learning problems such as node classification. When scaling up the underlying graphs of GNNs to a larger size, we are forced to either train on the complete graph and keep the full graph adjacency and node embeddings in memory (which is often infeasible) or mini-batch sample the graph (which results in exponentially growing computational com… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2022

  10. arXiv:2406.15567  [pdf, other

    cs.LG cs.AI cs.CL stat.ML

    SAIL: Self-Improving Efficient Online Alignment of Large Language Models

    Authors: Mucong Ding, Souradip Chakraborty, Vibhu Agrawal, Zora Che, Alec Koppel, Mengdi Wang, Amrit Bedi, Furong Huang

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is a key method for aligning large language models (LLMs) with human preferences. However, current offline alignment approaches like DPO, IPO, and SLiC rely heavily on fixed preference datasets, which can lead to sub-optimal performance. On the other hand, recent literature has focused on designing online RLHF methods but still lacks a unified conc… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 24 pages, 6 figures, 3 tables

  11. arXiv:2406.14884  [pdf, other

    cs.CL

    FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents

    Authors: Ruixuan Xiao, Wentao Ma, Ke Wang, Yuchuan Wu, Junbo Zhao, Haobo Wang, Fei Huang, Yongbin Li

    Abstract: LLM-based agents have emerged as promising tools, which are crafted to fulfill complex tasks by iterative planning and action. However, these agents are susceptible to undesired planning hallucinations when lacking specific knowledge for expertise-intensive tasks. To address this, preliminary attempts are made to enhance planning reliability by incorporating external workflow-related knowledge. De… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  12. arXiv:2406.13114  [pdf, other

    cs.CL cs.AI

    Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

    Authors: Yuhang Zhou, Jing Zhu, Paiheng Xu, Xiaoyu Liu, Xiyao Wang, Danai Koutra, Wei Ai, Furong Huang

    Abstract: Large language models (LLMs) have significantly advanced various natural language processing tasks, but deploying them remains computationally expensive. Knowledge distillation (KD) is a promising solution, enabling the transfer of capabilities from larger teacher LLMs to more compact student models. Particularly, sequence-level KD, which distills rationale-based reasoning processes instead of mer… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: preprint

  13. arXiv:2406.12429  [pdf, other

    cs.AI

    Adaptive Selection for Homogeneous Tools: An Instantiation in the RAG Scenario

    Authors: Feiteng Mu, Yong Jiang, Liwen Zhang, Chu Liu, Wenjie Li, Pengjun Xie, Fei Huang

    Abstract: Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness, a crucial factor in human problem-solving. In this paper, we address the selection of homogeneous tools by predicting both their performance and the associated cost required to accomplish a given task. We then assign queries to the optimal too… ▽ More

    Submitted 11 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  14. arXiv:2406.12259  [pdf

    cs.AI

    Adversarial Attacks on Large Language Models in Medicine

    Authors: Yifan Yang, Qiao Jin, Furong Huang, Zhiyong Lu

    Abstract: The integration of Large Language Models (LLMs) into healthcare applications offers promising advancements in medical diagnostics, treatment recommendations, and patient care. However, the susceptibility of LLMs to adversarial attacks poses a significant threat, potentially leading to harmful outcomes in delicate medical contexts. This study investigates the vulnerability of LLMs to two types of a… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  15. arXiv:2406.12091  [pdf, other

    cs.LG cs.CL cs.CR

    Is poisoning a real threat to LLM alignment? Maybe more so than you think

    Authors: Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang

    Abstract: Recent advancements in Reinforcement Learning with Human Feedback (RLHF) have significantly impacted the alignment of Large Language Models (LLMs). The sensitivity of reinforcement learning algorithms such as Proximal Policy Optimization (PPO) has led to new line work on Direct Policy Optimization (DPO), which treats RLHF in a supervised learning framework. The increased practical use of these RLH… ▽ More

    Submitted 19 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Journal ref: ICML 2024 Workshop MHFAIA

  16. arXiv:2406.11882  [pdf

    cs.AI cs.LG

    Applications of Explainable artificial intelligence in Earth system science

    Authors: Feini Huang, Shijie Jiang, Lu Li, Yongkun Zhang, Ye Zhang, Ruqing Zhang, Qingliang Li, Danxi Li, Wei Shangguan, Yongjiu Dai

    Abstract: In recent years, artificial intelligence (AI) rapidly accelerated its influence and is expected to promote the development of Earth system science (ESS) if properly harnessed. In application of AI to ESS, a significant hurdle lies in the interpretability conundrum, an inherent problem of black-box nature arising from the complexity of AI algorithms. To address this, explainable AI (XAI) offers a s… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  17. arXiv:2406.11371  [pdf, other

    cs.CV physics.optics

    Video Frame Interpolation for Polarization via Swin-Transformer

    Authors: Feng Huang, Xin Zhang, Yixuan Xu, Xuesong Wang, Xianyu Wu

    Abstract: Video Frame Interpolation (VFI) has been extensively explored and demonstrated, yet its application to polarization remains largely unexplored. Due to the selective transmission of light by polarized filters, longer exposure times are typically required to ensure sufficient light intensity, which consequently lower the temporal sample rates. Furthermore, because polarization reflected by objects v… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 18 pages, 10 figures, 7 tables, 73 citations

  18. arXiv:2406.10900  [pdf, other

    cs.CV cs.CL

    AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

    Authors: Xiyang Wu, Tianrui Guan, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha

    Abstract: Large vision-language models (LVLMs) hallucinate: certain context cues in an image may trigger the language module's overconfident and incorrect reasoning on abnormal or hypothetical objects. Though a few benchmarks have been developed to investigate LVLM hallucinations, they mainly rely on hand-crafted corner cases whose fail patterns may hardly generalize, and finetuning on them could undermine… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  19. arXiv:2406.08426  [pdf, other

    cs.CL cs.AI cs.DB

    Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

    Authors: Zijin Hong, Zheng Yuan, Qinggang Zhang, Hao Chen, Junnan Dong, Feiran Huang, Xiao Huang

    Abstract: Generating accurate SQL from natural language questions (text-to-SQL) is a long-standing challenge due to the complexities in user question understanding, database schema comprehension, and SQL generation. Conventional text-to-SQL systems, comprising human engineering and deep neural networks, have made substantial progress. Subsequently, pre-trained language models (PLMs) have been developed and… ▽ More

    Submitted 16 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  20. GPT4Rec: Graph Prompt Tuning for Streaming Recommendation

    Authors: Peiyan Zhang, Yuchen Yan, Xi Zhang, Liying Kang, Chaozhuo Li, Feiran Huang, Senzhang Wang, Sunghun Kim

    Abstract: In the realm of personalized recommender systems, the challenge of adapting to evolving user preferences and the continuous influx of new users and items is paramount. Conventional models, typically reliant on a static training-test approach, struggle to keep pace with these dynamic demands. Streaming recommendation, particularly through continual graph learning, has emerged as a novel solution. H… ▽ More

    Submitted 11 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by SIGIR 2024

    ACM Class: H.3.3

  21. arXiv:2406.08116  [pdf, other

    cs.CL cs.AI

    Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling

    Authors: Zile Qiao, Wei Ye, Yong Jiang, Tong Mo, Pengjun Xie, Weiping Li, Fei Huang, Shikun Zhang

    Abstract: Retrieval-augmented language models (RALMs) have recently shown great potential in mitigating the limitations of implicit knowledge in LLMs, such as untimely updating of the latest expertise and unreliable retention of long-tail knowledge. However, since the external knowledge base, as well as the retriever, can not guarantee reliability, potentially leading to the knowledge retrieved not being he… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  22. arXiv:2406.07381  [pdf, other

    cs.AI cs.LG

    World Models with Hints of Large Language Models for Goal Achieving

    Authors: Zeyuan Liu, Ziyu Huan, Xiyao Wang, Jiafei Lyu, Jian Tao, Xiu Li, Furong Huang, Huazhe Xu

    Abstract: Reinforcement learning struggles in the face of long-horizon tasks and sparse goals due to the difficulty in manual reward specification. While existing methods address this by adding intrinsic rewards, they may fail to provide meaningful guidance in long-horizon decision-making tasks with large state and action spaces, lacking purposeful exploration. Inspired by human cognition, we propose a new… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  23. arXiv:2406.07362  [pdf, other

    cs.HC

    AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database

    Authors: Wanling Gao, Yuan Liu, Zhuoming Yu, Dandan Cui, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Fan Huang, Gangyuan Zhao, Chongrong Jiang, Tianyi Wei, Zhifei Zhang, Yunyou Huang, Jianfeng Zhan

    Abstract: Artificial Intelligence (AI) plays a crucial role in medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI f… ▽ More

    Submitted 15 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 12 pages

  24. arXiv:2406.05644  [pdf, other

    cs.CL cs.AI cs.CR cs.CY

    How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States

    Authors: Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Yongbin Li

    Abstract: Large language models (LLMs) rely on safety alignment to avoid responding to malicious user inputs. Unfortunately, jailbreak can circumvent safety guardrails, resulting in LLMs generating harmful content and raising concerns about LLM safety. Due to language models with intensive parameters often regarded as black boxes, the mechanisms of alignment and jailbreak are challenging to elucidate. In th… ▽ More

    Submitted 13 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: 27 pages

  25. arXiv:2406.03836  [pdf, other

    cs.CR cs.AI

    Proactive Detection of Physical Inter-rule Vulnerabilities in IoT Services Using a Deep Learning Approach

    Authors: Bing Huang, Chen Chen, Kwok-Yan Lam, Fuqun Huang

    Abstract: Emerging Internet of Things (IoT) platforms provide sophisticated capabilities to automate IoT services by enabling occupants to create trigger-action rules. Multiple trigger-action rules can physically interact with each other via shared environment channels, such as temperature, humidity, and illumination. We refer to inter-rule interactions via shared environment channels as a physical inter-ru… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE ICWS 2024 Workshop

  26. arXiv:2406.01422  [pdf, other

    cs.SE cs.CL

    How to Understand Whole Software Repository?

    Authors: Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li

    Abstract: Recently, Large Language Model (LLM) based agents have advanced the significant development of Automatic Software Engineering (ASE). Although verified effectiveness, the designs of the existing methods mainly focus on the local information of codes, e.g., issues, classes, and functions, leading to limitations in capturing the global context and interdependencies within the software system. From th… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  27. arXiv:2406.01014  [pdf, other

    cs.CL cs.CV

    Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

    Authors: Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

    Abstract: Mobile device operation tasks are increasingly becoming a popular multi-modal AI application scenario. Current Multi-modal Large Language Models (MLLMs), constrained by their training data, lack the capability to function effectively as operation assistants. Instead, MLLM-based agents, which enhance capabilities through tool invocation, are gradually being applied to this scenario. However, the tw… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 22 pages, 11 figures, 10 Tables

  28. arXiv:2405.20495  [pdf, other

    cs.CL cs.LG

    Transfer Q Star: Principled Decoding for LLM Alignment

    Authors: Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang

    Abstract: Aligning foundation models is essential for their safe and trustworthy deployment. However, traditional fine-tuning methods are computationally intensive and require updating billions of model parameters. A promising alternative, alignment via decoding, adjusts the response distribution directly without model updates to maximize a target reward $r$, thus providing a lightweight and adaptable frame… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  29. arXiv:2405.19856  [pdf, other

    cs.CL cs.SE

    DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories

    Authors: Jia Li, Ge Li, Yunfei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yuqi Zhu, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li

    Abstract: How to evaluate the coding abilities of Large Language Models (LLMs) remains an open question. We find that existing benchmarks are poorly aligned with real-world code repositories and are insufficient to evaluate the coding abilities of LLMs. To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. (1) DevEval aligns with real-world repositories in multi… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024). arXiv admin note: substantial text overlap with arXiv:2404.00599, arXiv:2401.06401

  30. arXiv:2405.17931  [pdf, other

    cs.CL cs.LG

    Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

    Authors: Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou

    Abstract: Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF). In this paper, we first discover that interpolating RLHF and SFT model parameters can adjust the trade-off between human preference and… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  31. arXiv:2405.17535  [pdf, other

    cs.LG cs.AI stat.ML

    Calibrated Dataset Condensation for Faster Hyperparameter Search

    Authors: Mucong Ding, Yuancheng Xu, Tahseen Rabbani, Xiaoyu Liu, Brian Gravelle, Teresa Ranadive, Tai-Ching Tuan, Furong Huang

    Abstract: Dataset condensation can be used to reduce the computational cost of training multiple models on a large dataset by condensing the training dataset into a small synthetic set. State-of-the-art approaches rely on matching the model gradients between the real and synthetic data. However, there is no theoretical guarantee of the generalizability of the condensed data: data condensation often generali… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  32. arXiv:2405.17404  [pdf, other

    cs.LG cs.AI stat.ML

    Spectral Greedy Coresets for Graph Neural Networks

    Authors: Mucong Ding, Yinhan He, Jundong Li, Furong Huang

    Abstract: The ubiquity of large-scale graphs in node-classification tasks significantly hinders the real-world applications of Graph Neural Networks (GNNs). Node sampling, graph coarsening, and dataset condensation are effective strategies for enhancing data efficiency. However, owing to the interdependence of graph nodes, coreset selection, which selects subsets of the data examples, has not been successfu… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  33. arXiv:2405.15973  [pdf, other

    cs.CV cs.AI cs.CL cs.LG

    Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement

    Authors: Xiyao Wang, Jiuhai Chen, Zhaoyang Wang, Yuhang Zhou, Yiyang Zhou, Huaxiu Yao, Tianyi Zhou, Tom Goldstein, Parminder Bhatia, Furong Huang, Cao Xiao

    Abstract: Large vision-language models (LVLMs) have achieved impressive results in various visual question-answering and reasoning tasks through vision instruction tuning on specific datasets. However, there is still significant room for improvement in the alignment between visual and language modalities. Previous methods to enhance this alignment typically require external models or data, heavily depending… ▽ More

    Submitted 7 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 15 pages, 8 figures

  34. arXiv:2405.14768  [pdf, other

    cs.CL cs.AI cs.CV cs.IR cs.LG

    WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

    Authors: Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Large language models (LLMs) need knowledge updates to meet the ever-growing world facts and correct the hallucinated responses, facilitating the methods of lifelong model editing. Where the updated knowledge resides in memories is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowle… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Work in progress

  35. arXiv:2405.14431  [pdf, other

    cs.CL cs.AI cs.IR

    RaFe: Ranking Feedback Improves Query Rewriting for RAG

    Authors: Shengyu Mao, Yong Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

    Abstract: As Large Language Models (LLMs) and Retrieval Augmentation Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into the RAG system for downstream tasks like open-domain QA. Many works have attempted to utilize small models with reinforcement learning rather than costly LLMs to improve query rewriting. However, current methods require annotations (e.g., labeled re… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 16 pages

  36. arXiv:2405.14205  [pdf, other

    cs.CL cs.AI cs.CV cs.LG cs.MA

    Agent Planning with World Knowledge Model

    Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with brainless trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the ''real'' physical world. Imitating humans' m… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Work in progress

  37. arXiv:2405.13879  [pdf, other

    cs.GT cs.DC cs.LG econ.TH

    FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?

    Authors: Marco Bornstein, Amrit Singh Bedi, Abdirisak Mohamed, Furong Huang

    Abstract: Standard federated learning (FL) approaches are vulnerable to the free-rider dilemma: participating agents can contribute little to nothing yet receive a well-trained aggregated model. While prior mechanisms attempt to solve the free-rider dilemma, none have addressed the issue of truthfulness. In practice, adversarial agents can provide false information to the server in order to cheat its way ou… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 18 pages, 5 figures

  38. arXiv:2405.13045  [pdf, other

    cs.HC cs.AI

    CoLay: Controllable Layout Generation through Multi-conditional Latent Diffusion

    Authors: Chin-Yi Cheng, Ruiqi Gao, Forrest Huang, Yang Li

    Abstract: Layout design generation has recently gained significant attention due to its potential applications in various fields, including UI, graphic, and floor plan design. However, existing models face two main challenges that limits their adoption in practice. Firstly, the limited expressiveness of individual condition types used in previous works restricts designers' ability to convey complex design i… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

  39. arXiv:2405.13026  [pdf, other

    cs.CL cs.AI

    Leveraging Human Revisions for Improving Text-to-Layout Models

    Authors: Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li

    Abstract: Learning from human feedback has shown success in aligning large, pretrained models with human values. Prior works have mostly focused on learning from high-level labels, such as preferences between pairs of model outputs. On the other hand, many domains could benefit from more involved, detailed feedback, such as revisions, explanations, and reasoning of human users. Our work proposes using nuanc… ▽ More

    Submitted 15 May, 2024; originally announced May 2024.

  40. arXiv:2405.11431  [pdf, other

    cs.LG q-fin.ST stat.ML

    Review of deep learning models for crypto price prediction: implementation and evaluation

    Authors: Jingyang Wu, Xinyi Zhang, Fangyixuan Huang, Haochen Zhou, Rohtiash Chandra

    Abstract: There has been much interest in accurate cryptocurrency price forecast models by investors and researchers. Deep Learning models are prominent machine learning techniques that have transformed various fields and have shown potential for finance and economics. Although various deep learning models have been explored for cryptocurrency price forecasting, it is not clear which models are suitable due… ▽ More

    Submitted 2 June, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

  41. arXiv:2405.05497  [pdf, other

    cs.CV

    Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution

    Authors: Yunxiang Li, Wenbin Zou, Qiaomu Wei, Feng Huang, Jing Wu

    Abstract: Stereo image super-resolution utilizes the cross-view complementary information brought by the disparity effect of left and right perspective images to reconstruct higher-quality images. Cascading feature extraction modules and cross-view feature interaction modules to make use of the information from stereo images is the focus of numerous methods. However, this adds a great deal of network parame… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures, CVPRWorkshop NTIRE2024

  42. arXiv:2405.03296  [pdf, ps, other

    cs.LG cs.AI

    Coefficient Decomposition for Spectral Graph Convolution

    Authors: Feng Huang, Wen Zhang

    Abstract: Spectral graph convolutional network (SGCN) is a kind of graph neural networks (GNN) based on graph signal filters, and has shown compelling expressivity for modeling graph-structured data. Most SGCNs adopt polynomial filters and learn the coefficients from the training data. Many of them focus on which polynomial basis leads to optimal expressive power and models' architecture is little discussed… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

  43. arXiv:2404.18702  [pdf, other

    cs.LG cs.CR stat.AP stat.ML

    Why You Should Not Trust Interpretations in Machine Learning: Adversarial Attacks on Partial Dependence Plots

    Authors: Xi Xin, Giles Hooker, Fei Huang

    Abstract: The adoption of artificial intelligence (AI) across industries has led to the widespread use of complex black-box models and interpretation tools for decision making. This paper proposes an adversarial framework to uncover the vulnerability of permutation-based interpretation methods for machine learning tasks, with a particular focus on partial dependence (PD) plots. This adversarial framework mo… ▽ More

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  44. arXiv:2404.18426  [pdf, other

    cs.CV

    Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images

    Authors: Wenbin Guan, Zijiu Yang, Xiaohong Wu, Liqiong Chen, Feng Huang, Xiaohai He, Honggang Chen

    Abstract: Presently, the task of few-shot object detection (FSOD) in remote sensing images (RSIs) has become a focal point of attention. Numerous few-shot detectors, particularly those based on two-stage detectors, face challenges when dealing with the multiscale complexities inherent in RSIs. Moreover, these detectors present impractical characteristics in real-world applications, mainly due to their unwie… ▽ More

    Submitted 16 June, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  45. arXiv:2404.16635  [pdf, other

    cs.CV

    TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning

    Authors: Liang Zhang, Anwen Hu, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin, Ji Zhang, Fei Huang

    Abstract: Charts are important for presenting and explaining complex data relationships. Recently, multimodal large language models (MLLMs) have shown remarkable capabilities in various chart understanding tasks. However, the sheer size of these models in terms of parameters and computational requirements limits their use in resource-constrained environments. In this paper, we present TinyChart, an efficien… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 13 pages, 11 figures

  46. arXiv:2404.16571  [pdf, other

    cs.CV

    MonoPCC: Photometric-invariant Cycle Constraint for Monocular Depth Estimation of Endoscopic Images

    Authors: Zhiwei Wang, Ying Zhou, Shiquan He, Ting Li, Fan Huang, Qiang Ding, Xinxia Feng, Mei Liu, Qiang Li

    Abstract: Photometric constraint is indispensable for self-supervised monocular depth estimation. It involves warping a source image onto a target view using estimated depth&pose, and then minimizing the difference between the warped and target images. However, the endoscopic built-in light causes significant brightness fluctuations, and thus makes the photometric constraint unreliable. Previous efforts onl… ▽ More

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 11 pages, 10 figures

  47. arXiv:2404.14999  [pdf, other

    cs.DB cs.LG

    A Unified Replay-based Continuous Learning Framework for Spatio-Temporal Prediction on Streaming Data

    Authors: Hao Miao, Yan Zhao, Chenjuan Guo, Bin Yang, Kai Zheng, Feiteng Huang, Jiandong Xie, Christian S. Jensen

    Abstract: The widespread deployment of wireless and mobile devices results in a proliferation of spatio-temporal data that is used in applications, e.g., traffic prediction, human mobility mining, and air quality prediction, where spatio-temporal prediction is often essential to enable safety, predictability, or reliability. Many recent proposals that target deep learning for spatio-temporal prediction suff… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by ICDE 2024

  48. arXiv:2404.14387  [pdf, other

    cs.CL cs.AI

    A Survey on Self-Evolution of Large Language Models

    Authors: Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, Dacheng Tao, Jingren Zhou

    Abstract: Large language models (LLMs) have significantly advanced in various fields and intelligent agent applications. However, current LLMs that learn from human or external model supervision are costly and may face performance ceilings as task complexity and diversity increase. To address this issue, self-evolution approaches that enable LLM to autonomously acquire, refine, and learn from experiences ge… ▽ More

    Submitted 3 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  49. arXiv:2404.11384  [pdf, other

    cs.CL cs.LG

    Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning

    Authors: Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang

    Abstract: Key Point Analysis (KPA), the summarization of multiple arguments into a concise collection of key points, continues to be a significant and unresolved issue within the field of argument mining. Existing models adapt a two-stage pipeline of clustering arguments or generating key points for argument clusters. This approach rely on semantic similarity instead of measuring the existence of shared key… ▽ More

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 11 pages, 4 figures, 4 tables. Accepted to NAACL 2024

  50. arXiv:2404.03264  [pdf, other

    cs.CY cs.AI

    Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions

    Authors: Yuting He, Fuxiang Huang, Xinrui Jiang, Yuxiang Nie, Minghao Wang, Jiguang Wang, Hao Chen

    Abstract: Foundation model, which is pre-trained on broad data and is able to adapt to a wide range of tasks, is advancing healthcare. It promotes the development of healthcare artificial intelligence (AI) models, breaking the contradiction between limited AI models and diverse healthcare practices. Much more widespread healthcare scenarios will benefit from the development of a healthcare foundation model… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.