Showing 1–50 of 1,312 results for author: Huang, F

  1. arXiv:2407.08554  [pdf, other]

    cs.AI cs.HC

    Establishing Rigorous and Cost-effective Clinical Trials for Artificial Intelligence Models

    Authors: Wanling Gao, Yunyou Huang, Dandan Cui, Zhuoming Yu, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Gangyuan Zhao, Chongrong Jiang, Fan Huang, Tianyi Wei, Suqin Tang, Bingjie Xia, Zhifei Zhang, Jianfeng Zhan

    Abstract: A profound gap persists between artificial intelligence (AI) and clinical practice in medicine, primarily due to the lack of rigorous and cost-effective evaluation methodologies. State-of-the-art and state-of-the-practice AI model evaluations are limited to laboratory studies on medical datasets or direct clinical trials with no or solely patient-centered controls. Moreover, the crucial role of cl…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 23 pages

  2. arXiv:2407.05682  [pdf, other]

    cs.CL

    Retrieved In-Context Principles from Previous Mistakes

    Authors: Hao Sun, Yong Jiang, Bo Wang, Yingyan Hou, Yan Zhang, Pengjun Xie, Fei Huang

    Abstract: In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples. Recent advances have attempted to improve model performance through principles derived from mistakes, yet these approaches suffer from lack of customization and inadequate error coverage. To address these limitations, we propose Retrieved In-Context Prin…

    Submitted 8 July, 2024; originally announced July 2024.

  3. arXiv:2407.00942  [pdf, other]

    cs.IR cs.AI cs.CL

    ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions

    Authors: Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang

    Abstract: This paper introduces the task of product demand clarification within an e-commercial scenario, where the user commences the conversation with ambiguous queries and the task-oriented agent is designed to achieve more accurate and tailored product searching by asking clarification questions. To address this task, we propose ProductAgent, a conversational information seeking agent equipped with abil…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 17 pages, 13 tables, 6 figures. Under review

  4. arXiv:2407.00891  [pdf, other]

    cs.LG q-bio.BM

    ZeroDDI: A Zero-Shot Drug-Drug Interaction Event Prediction Method with Semantic Enhanced Learning and Dual-Modal Uniform Alignment

    Authors: Ziyan Wang, Zhankun Xiong, Feng Huang, Xuan Liu, Wen Zhang

    Abstract: Drug-drug interactions (DDIs) can result in various pharmacological changes, which can be categorized into different classes known as DDI events (DDIEs). In recent years, previously unobserved/unseen DDIEs have been emerging, posing a new classification task when unseen classes have no labelled instances in the training stage, which is formulated as a zero-shot DDIE prediction (ZS-DDIE) task. Howe…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI2024

  5. arXiv:2407.00328  [pdf, other]

    astro-ph.HE

    Constraints on the energy spectrum of the diffuse cosmic neutrino flux from the ANTARES neutrino telescope

    Authors: ANTARES Collaboration, A. Albert, S. Alves, M. André, M. Ardid, S. Ardid, J. -J. Aubert, J. Aublin, B. Baret, S. Basa, Y. Becherini, B. Belhorma, M. Bendahman, F. Benfenati, V. Bertin, S. Biagi, J. Boumaaza, M. Bouta, M. C. Bouwhuis, H. Brânzaş, R. Bruijn, J. Brunner, J. Busto, B. Caiffi, D. Calvo, et al. (117 additional authors not shown)

    Abstract: High-significance evidences of the existence of a high-energy diffuse flux of cosmic neutrinos have emerged in the last decade from several observations by the IceCube Collaboration. The ANTARES neutrino telescope took data for 15 years in the Mediterranean Sea, from 2007 to 2022, and collected a high-purity all-flavour neutrino sample. The search for a diffuse cosmic neutrino signal using this da…

    Submitted 29 June, 2024; originally announced July 2024.

  6. arXiv:2406.19156  [pdf, other]

    cs.LG

    Heterogeneous Causal Metapath Graph Neural Network for Gene-Microbe-Disease Association Prediction

    Authors: Kexin Zhang, Feng Huang, Luotao Liu, Zhankun Xiong, Hongyu Zhang, Yuan Quan, Wen Zhang

    Abstract: The recent focus on microbes in human medicine highlights their potential role in the genetic framework of diseases. To decode the complex interactions among genes, microbes, and diseases, computational predictions of gene-microbe-disease (GMD) associations are crucial. Existing methods primarily address gene-disease and microbe-disease associations, but the more intricate triple-wise GMD associat…

    Submitted 27 June, 2024; originally announced June 2024.

  7. arXiv:2406.17419  [pdf, other]

    cs.CL cs.AI

    Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA

    Authors: Minzheng Wang, Longze Chen, Cheng Fu, Shengyi Liao, Xinghua Zhang, Bingli Wu, Haiyang Yu, Nan Xu, Lei Zhang, Run Luo, Yunshui Li, Min Yang, Fei Huang, Yongbin Li

    Abstract: Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-context windows. Meanwhile, benchmarks for evaluating long-context LLMs are gradually catching up. However, existing benchmarks employ irrelevant noise texts to artificially extend the length of test cases, diverging from the real-world scenarios of long-contex…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: We release our code and data publicly at https://github.com/MozerWang/Loong

  8. arXiv:2406.15575  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity

    Authors: Mucong Ding, Tahseen Rabbani, Bang An, Evan Z Wang, Furong Huang

    Abstract: Graph Neural Networks (GNNs) are widely applied to graph learning problems such as node classification. When scaling up the underlying graphs of GNNs to a larger size, we are forced to either train on the complete graph and keep the full graph adjacency and node embeddings in memory (which is often infeasible) or mini-batch sample the graph (which results in exponentially growing computational com…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2022

  9. arXiv:2406.15567  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    SAIL: Self-Improving Efficient Online Alignment of Large Language Models

    Authors: Mucong Ding, Souradip Chakraborty, Vibhu Agrawal, Zora Che, Alec Koppel, Mengdi Wang, Amrit Bedi, Furong Huang

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is a key method for aligning large language models (LLMs) with human preferences. However, current offline alignment approaches like DPO, IPO, and SLiC rely heavily on fixed preference datasets, which can lead to sub-optimal performance. On the other hand, recent literature has focused on designing online RLHF methods but still lacks a unified conc…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 24 pages, 6 figures, 3 tables

  10. arXiv:2406.14884  [pdf, other]

    cs.CL

    FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents

    Authors: Ruixuan Xiao, Wentao Ma, Ke Wang, Yuchuan Wu, Junbo Zhao, Haobo Wang, Fei Huang, Yongbin Li

    Abstract: LLM-based agents have emerged as promising tools, which are crafted to fulfill complex tasks by iterative planning and action. However, these agents are susceptible to undesired planning hallucinations when lacking specific knowledge for expertise-intensive tasks. To address this, preliminary attempts are made to enhance planning reliability by incorporating external workflow-related knowledge. De…

    Submitted 21 June, 2024; originally announced June 2024.

  11. arXiv:2406.14320  [pdf, ps, other]

    hep-th cond-mat.str-el math-ph math.CT quant-ph

    Anyon condensation in mixed-state topological order

    Authors: Ken Kikuchi, Kah-Sen Kam, Fu-Hsiang Huang

    Abstract: We discuss anyon condensation in mixed-state topological order. The phases were recently conjectured to be classified by pre-modular fusion categories. Just like anyon condensation in pure-state topological order, a bootstrap analysis shows condensable anyons are given by connected étale algebras. We explain how to perform generic anyon condensation including non-invertible anyons and successive c…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 52 pages, 14 figures

  12. arXiv:2406.13114  [pdf, other]

    cs.CL cs.AI

    Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

    Authors: Yuhang Zhou, Jing Zhu, Paiheng Xu, Xiaoyu Liu, Xiyao Wang, Danai Koutra, Wei Ai, Furong Huang

    Abstract: Large language models (LLMs) have significantly advanced various natural language processing tasks, but deploying them remains computationally expensive. Knowledge distillation (KD) is a promising solution, enabling the transfer of capabilities from larger teacher LLMs to more compact student models. Particularly, sequence-level KD, which distills rationale-based reasoning processes instead of mer…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: preprint

  13. arXiv:2406.12429  [pdf, other]

    cs.AI

    Adaptive Selection for Homogeneous Tools: An Instantiation in the RAG Scenario

    Authors: Feiteng Mu, Yong Jiang, Liwen Zhang, Chu Liu, Wenjie Li, Pengjun Xie, Fei Huang

    Abstract: Current research on tool learning primarily focuses on selecting the most effective tool from a wide array of options, often overlooking cost-effectiveness, a crucial factor in human problem-solving. In this paper, we address the selection of homogeneous tools by predicting both their performance and the associated cost required to accomplish a given task. We then assign queries to the optimal too…

    Submitted 11 July, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  14. arXiv:2406.12259  [pdf]

    cs.AI

    Adversarial Attacks on Large Language Models in Medicine

    Authors: Yifan Yang, Qiao Jin, Furong Huang, Zhiyong Lu

    Abstract: The integration of Large Language Models (LLMs) into healthcare applications offers promising advancements in medical diagnostics, treatment recommendations, and patient care. However, the susceptibility of LLMs to adversarial attacks poses a significant threat, potentially leading to harmful outcomes in delicate medical contexts. This study investigates the vulnerability of LLMs to two types of a…

    Submitted 18 June, 2024; originally announced June 2024.

  15. arXiv:2406.12091  [pdf, other]

    cs.LG cs.CL cs.CR

    Is poisoning a real threat to LLM alignment? Maybe more so than you think

    Authors: Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang

    Abstract: Recent advancements in Reinforcement Learning with Human Feedback (RLHF) have significantly impacted the alignment of Large Language Models (LLMs). The sensitivity of reinforcement learning algorithms such as Proximal Policy Optimization (PPO) has led to new line work on Direct Policy Optimization (DPO), which treats RLHF in a supervised learning framework. The increased practical use of these RLH…

    Submitted 19 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Journal ref: ICML 2024 Workshop MHFAIA

  16. arXiv:2406.11882  [pdf]

    cs.AI cs.LG

    Applications of Explainable artificial intelligence in Earth system science

    Authors: Feini Huang, Shijie Jiang, Lu Li, Yongkun Zhang, Ye Zhang, Ruqing Zhang, Qingliang Li, Danxi Li, Wei Shangguan, Yongjiu Dai

    Abstract: In recent years, artificial intelligence (AI) rapidly accelerated its influence and is expected to promote the development of Earth system science (ESS) if properly harnessed. In application of AI to ESS, a significant hurdle lies in the interpretability conundrum, an inherent problem of black-box nature arising from the complexity of AI algorithms. To address this, explainable AI (XAI) offers a s…

    Submitted 12 June, 2024; originally announced June 2024.

  17. arXiv:2406.11371  [pdf, other]

    cs.CV physics.optics

    Video Frame Interpolation for Polarization via Swin-Transformer

    Authors: Feng Huang, Xin Zhang, Yixuan Xu, Xuesong Wang, Xianyu Wu

    Abstract: Video Frame Interpolation (VFI) has been extensively explored and demonstrated, yet its application to polarization remains largely unexplored. Due to the selective transmission of light by polarized filters, longer exposure times are typically required to ensure sufficient light intensity, which consequently lower the temporal sample rates. Furthermore, because polarization reflected by objects v…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 18 pages, 10 figures, 7 tables, 73 citations

  18. arXiv:2406.10900  [pdf, other]

    cs.CV cs.CL

    AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

    Authors: Xiyang Wu, Tianrui Guan, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha

    Abstract: Large vision-language models (LVLMs) hallucinate: certain context cues in an image may trigger the language module's overconfident and incorrect reasoning on abnormal or hypothetical objects. Though a few benchmarks have been developed to investigate LVLM hallucinations, they mainly rely on hand-crafted corner cases whose fail patterns may hardly generalize, and finetuning on them could undermine…

    Submitted 16 June, 2024; originally announced June 2024.

  19. arXiv:2406.08426  [pdf, other]

    cs.CL cs.AI cs.DB

    Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL

    Authors: Zijin Hong, Zheng Yuan, Qinggang Zhang, Hao Chen, Junnan Dong, Feiran Huang, Xiao Huang

    Abstract: Generating accurate SQL according to natural language questions (text-to-SQL) is a long-standing challenge due to the complexities involved in user question understanding, database schema comprehension, and SQL generation. Conventional text-to-SQL systems, comprising human engineering and deep neural networks, have made substantial progress. Subsequently, pre-trained language models (PLMs) have be…

    Submitted 27 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  20. GPT4Rec: Graph Prompt Tuning for Streaming Recommendation

    Authors: Peiyan Zhang, Yuchen Yan, Xi Zhang, Liying Kang, Chaozhuo Li, Feiran Huang, Senzhang Wang, Sunghun Kim

    Abstract: In the realm of personalized recommender systems, the challenge of adapting to evolving user preferences and the continuous influx of new users and items is paramount. Conventional models, typically reliant on a static training-test approach, struggle to keep pace with these dynamic demands. Streaming recommendation, particularly through continual graph learning, has emerged as a novel solution. H…

    Submitted 11 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by SIGIR 2024

    ACM Class: H.3.3

  21. arXiv:2406.08116  [pdf, other]

    cs.CL cs.AI

    Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling

    Authors: Zile Qiao, Wei Ye, Yong Jiang, Tong Mo, Pengjun Xie, Weiping Li, Fei Huang, Shikun Zhang

    Abstract: Retrieval-augmented language models (RALMs) have recently shown great potential in mitigating the limitations of implicit knowledge in LLMs, such as untimely updating of the latest expertise and unreliable retention of long-tail knowledge. However, since the external knowledge base, as well as the retriever, can not guarantee reliability, potentially leading to the knowledge retrieved not being he…

    Submitted 12 June, 2024; originally announced June 2024.

  22. arXiv:2406.07381  [pdf, other]

    cs.AI cs.LG

    World Models with Hints of Large Language Models for Goal Achieving

    Authors: Zeyuan Liu, Ziyu Huan, Xiyao Wang, Jiafei Lyu, Jian Tao, Xiu Li, Furong Huang, Huazhe Xu

    Abstract: Reinforcement learning struggles in the face of long-horizon tasks and sparse goals due to the difficulty in manual reward specification. While existing methods address this by adding intrinsic rewards, they may fail to provide meaningful guidance in long-horizon decision-making tasks with large state and action spaces, lacking purposeful exploration. Inspired by human cognition, we propose a new…

    Submitted 11 June, 2024; originally announced June 2024.

  23. arXiv:2406.07362  [pdf, other]

    cs.HC

    AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database

    Authors: Wanling Gao, Yuan Liu, Zhuoming Yu, Dandan Cui, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Fan Huang, Gangyuan Zhao, Chongrong Jiang, Tianyi Wei, Zhifei Zhang, Yunyou Huang, Jianfeng Zhan

    Abstract: Artificial Intelligence (AI) plays a crucial role in medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI f…

    Submitted 15 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 12 pages

  24. arXiv:2406.06830  [pdf, other]

    astro-ph.CO hep-ph hep-th

    Cosmological Stasis from Dynamical Scalars: Tracking Solutions and the Possibility of a Stasis-Induced Inflation

    Authors: Keith R. Dienes, Lucien Heurtier, Fei Huang, Tim M. P. Tait, Brooks Thomas

    Abstract: It has recently been realized that many theories of physics beyond the Standard Model give rise to cosmological histories exhibiting extended epochs of cosmological stasis. During such epochs, the abundances of different energy components such as matter, radiation, and vacuum energy each remain fixed despite cosmological expansion. In previous analyses of the stasis phenomenon, these different ene…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 25 pages, LaTeX, 11 figures

    Report number: KCL-PH-TH/2024-23, UCI-HEP-TR-2024-09

  25. arXiv:2406.05644  [pdf, other]

    cs.CL cs.AI cs.CR cs.CY

    How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States

    Authors: Zhenhong Zhou, Haiyang Yu, Xinghua Zhang, Rongwu Xu, Fei Huang, Yongbin Li

    Abstract: Large language models (LLMs) rely on safety alignment to avoid responding to malicious user inputs. Unfortunately, jailbreak can circumvent safety guardrails, resulting in LLMs generating harmful content and raising concerns about LLM safety. Due to language models with intensive parameters often regarded as black boxes, the mechanisms of alignment and jailbreak are challenging to elucidate. In th…

    Submitted 13 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: 27 pages

  26. arXiv:2406.03884  [pdf, ps, other]

    math.AP math-ph

    Steady supersonic combustion flows with a contact discontinuity in two-dimensional finitely long nozzles

    Authors: Junlei Gao, Feimin Huang, Jie Kuang, Dehua Wang, Wei Xiang

    Abstract: In this paper, we are concerned with the two-dimensional steady supersonic combustion flows with a contact discontinuity moving through a nozzle of finite length. Mathematically, it can be formulated as a free boundary value problem governed by the two-dimensional steady combustion Euler equations with a contact discontinuity as the free boundary. The main mathematical difficulties are that the c…

    Submitted 9 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  27. arXiv:2406.03836  [pdf, other]

    cs.CR cs.AI

    Proactive Detection of Physical Inter-rule Vulnerabilities in IoT Services Using a Deep Learning Approach

    Authors: Bing Huang, Chen Chen, Kwok-Yan Lam, Fuqun Huang

    Abstract: Emerging Internet of Things (IoT) platforms provide sophisticated capabilities to automate IoT services by enabling occupants to create trigger-action rules. Multiple trigger-action rules can physically interact with each other via shared environment channels, such as temperature, humidity, and illumination. We refer to inter-rule interactions via shared environment channels as a physical inter-ru…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE ICWS 2024 Workshop

  28. arXiv:2406.01422  [pdf, other]

    cs.SE cs.CL

    How to Understand Whole Software Repository?

    Authors: Yingwei Ma, Qingping Yang, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li

    Abstract: Recently, Large Language Model (LLM) based agents have advanced the significant development of Automatic Software Engineering (ASE). Although verified effectiveness, the designs of the existing methods mainly focus on the local information of codes, e.g., issues, classes, and functions, leading to limitations in capturing the global context and interdependencies within the software system. From th…

    Submitted 3 June, 2024; originally announced June 2024.

  29. arXiv:2406.01014  [pdf, other]

    cs.CL cs.CV

    Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration

    Authors: Junyang Wang, Haiyang Xu, Haitao Jia, Xi Zhang, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang

    Abstract: Mobile device operation tasks are increasingly becoming a popular multi-modal AI application scenario. Current Multi-modal Large Language Models (MLLMs), constrained by their training data, lack the capability to function effectively as operation assistants. Instead, MLLM-based agents, which enhance capabilities through tool invocation, are gradually being applied to this scenario. However, the tw…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 22 pages, 11 figures, 10 Tables

  30. arXiv:2406.00476  [pdf, other]

    astro-ph.HE

    Revisiting Energy Distribution and Formation Rate of CHIME Fast Radio Bursts

    Authors: K. J. Zhang, X. F. Dong, A. E. Rodin, V. A. Fedorova, Y. F. Huang, D. Li, P. Wang, Q. M. Li, C. Du, F. Xu, Z. B. Zhang

    Abstract: Using a large sample of fast radio bursts (FRBs) from the first CHIME/FRB catalog, we apply the Lynden-Bell's c$^-$ method to study their energy function and formation rate evolutions with redshift. It is found with the non-parametric Kendell's $τ$ statistics that the FRB energy strongly evolves with the cosmological redshift as $E(z)\propto(1 + z)^{5.23}$. After removing the redshift dependence,…

    Submitted 1 June, 2024; originally announced June 2024.

  31. arXiv:2405.20495  [pdf, other]

    cs.CL cs.LG

    Transfer Q Star: Principled Decoding for LLM Alignment

    Authors: Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang

    Abstract: Aligning foundation models is essential for their safe and trustworthy deployment. However, traditional fine-tuning methods are computationally intensive and require updating billions of model parameters. A promising alternative, alignment via decoding, adjusts the response distribution directly without model updates to maximize a target reward $r$, thus providing a lightweight and adaptable frame…

    Submitted 30 May, 2024; originally announced May 2024.

  32. arXiv:2405.19856  [pdf, other]

    cs.CL cs.SE

    DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories

    Authors: Jia Li, Ge Li, Yunfei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yuqi Zhu, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li

    Abstract: How to evaluate the coding abilities of Large Language Models (LLMs) remains an open question. We find that existing benchmarks are poorly aligned with real-world code repositories and are insufficient to evaluate the coding abilities of LLMs. To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. (1) DevEval aligns with real-world repositories in multi…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted by the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024). arXiv admin note: substantial text overlap with arXiv:2404.00599, arXiv:2401.06401

  33. arXiv:2405.17931  [pdf, other]

    cs.CL cs.LG

    Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

    Authors: Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou

    Abstract: Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF). In this paper, we first discover that interpolating RLHF and SFT model parameters can adjust the trade-off between human preference and…

    Submitted 28 May, 2024; originally announced May 2024.

  34. arXiv:2405.17535  [pdf, other]

    cs.LG cs.AI stat.ML

    Calibrated Dataset Condensation for Faster Hyperparameter Search

    Authors: Mucong Ding, Yuancheng Xu, Tahseen Rabbani, Xiaoyu Liu, Brian Gravelle, Teresa Ranadive, Tai-Ching Tuan, Furong Huang

    Abstract: Dataset condensation can be used to reduce the computational cost of training multiple models on a large dataset by condensing the training dataset into a small synthetic set. State-of-the-art approaches rely on matching the model gradients between the real and synthetic data. However, there is no theoretical guarantee of the generalizability of the condensed data: data condensation often generali…

    Submitted 27 May, 2024; originally announced May 2024.

  35. arXiv:2405.17404  [pdf, other]

    cs.LG cs.AI stat.ML

    Spectral Greedy Coresets for Graph Neural Networks

    Authors: Mucong Ding, Yinhan He, Jundong Li, Furong Huang

    Abstract: The ubiquity of large-scale graphs in node-classification tasks significantly hinders the real-world applications of Graph Neural Networks (GNNs). Node sampling, graph coarsening, and dataset condensation are effective strategies for enhancing data efficiency. However, owing to the interdependence of graph nodes, coreset selection, which selects subsets of the data examples, has not been successfu…

    Submitted 27 May, 2024; originally announced May 2024.

  36. arXiv:2405.16863  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    All-voltage control of Giant Magnetoresistance

    Authors: Lujun Wei, Yiyang Zhang, Fei Huang, Jiajv Yang, Jincheng Peng, Yanghui Li, Yu Lu, Jiarui Chen, Tianyu Liu, Yong Pu, Jun Du

    Abstract: The aim of voltage control of magnetism is to reduce the power consumption of spintronic devices. For a spin valve, the magnetization directions of two ferromagnetic layers determine the giant magnetoresistance magnitude. However, achieving all-voltage manipulation of the magnetization directions between parallel and antiparallel states is a significant challenge. Here, we demonstrate that by util…

    Submitted 27 May, 2024; originally announced May 2024.

  37. arXiv:2405.15973  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement

    Authors: Xiyao Wang, Jiuhai Chen, Zhaoyang Wang, Yuhang Zhou, Yiyang Zhou, Huaxiu Yao, Tianyi Zhou, Tom Goldstein, Parminder Bhatia, Furong Huang, Cao Xiao

    Abstract: Large vision-language models (LVLMs) have achieved impressive results in various visual question-answering and reasoning tasks through vision instruction tuning on specific datasets. However, there is still significant room for improvement in the alignment between visual and language modalities. Previous methods to enhance this alignment typically require external models or data, heavily depending…

    Submitted 7 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 15 pages, 8 figures

  38. arXiv:2405.14768  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models

    Authors: Peng Wang, Zexi Li, Ningyu Zhang, Ziwen Xu, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Large language models (LLMs) need knowledge updates to meet the ever-growing world facts and correct the hallucinated responses, facilitating the methods of lifelong model editing. Where the updated knowledge resides in memories is a fundamental question for model editing. In this paper, we find that editing either long-term memory (direct model parameters) or working memory (non-parametric knowle…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Work in progress

  39. arXiv:2405.14431  [pdf, other]

    cs.CL cs.AI cs.IR

    RaFe: Ranking Feedback Improves Query Rewriting for RAG

    Authors: Shengyu Mao, Yong Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

    Abstract: As Large Language Models (LLMs) and Retrieval Augmentation Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into the RAG system for downstream tasks like open-domain QA. Many works have attempted to utilize small models with reinforcement learning rather than costly LLMs to improve query rewriting. However, current methods require annotations (e.g., labeled re…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 16 pages

  40. arXiv:2405.14205  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MA

    Agent Planning with World Knowledge Model

    Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with brainless trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the "real" physical world. Imitating humans' m…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Work in progress

  41. arXiv:2405.13879  [pdf, other]

    cs.GT cs.DC cs.LG econ.TH

    FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?

    Authors: Marco Bornstein, Amrit Singh Bedi, Abdirisak Mohamed, Furong Huang

    Abstract: Standard federated learning (FL) approaches are vulnerable to the free-rider dilemma: participating agents can contribute little to nothing yet receive a well-trained aggregated model. While prior mechanisms attempt to solve the free-rider dilemma, none have addressed the issue of truthfulness. In practice, adversarial agents can provide false information to the server in order to cheat its way ou…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 18 pages, 5 figures

  42. arXiv:2405.13045  [pdf, other]

    cs.HC cs.AI

    CoLay: Controllable Layout Generation through Multi-conditional Latent Diffusion

    Authors: Chin-Yi Cheng, Ruiqi Gao, Forrest Huang, Yang Li

    Abstract: Layout design generation has recently gained significant attention due to its potential applications in various fields, including UI, graphic, and floor plan design. However, existing models face two main challenges that limit their adoption in practice. Firstly, the limited expressiveness of individual condition types used in previous works restricts designers' ability to convey complex design i…

    Submitted 18 May, 2024; originally announced May 2024.

  43. arXiv:2405.13026  [pdf, other]

    cs.CL cs.AI

    Leveraging Human Revisions for Improving Text-to-Layout Models

    Authors: Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li

    Abstract: Learning from human feedback has shown success in aligning large, pretrained models with human values. Prior works have mostly focused on learning from high-level labels, such as preferences between pairs of model outputs. On the other hand, many domains could benefit from more involved, detailed feedback, such as revisions, explanations, and reasoning of human users. Our work proposes using nuanc…

    Submitted 15 May, 2024; originally announced May 2024.

  44. arXiv:2405.11431  [pdf, other]

    cs.LG q-fin.ST stat.ML

    Review of deep learning models for crypto price prediction: implementation and evaluation

    Authors: Jingyang Wu, Xinyi Zhang, Fangyixuan Huang, Haochen Zhou, Rohtiash Chandra

    Abstract: There has been much interest in accurate cryptocurrency price forecast models by investors and researchers. Deep Learning models are prominent machine learning techniques that have transformed various fields and have shown potential for finance and economics. Although various deep learning models have been explored for cryptocurrency price forecasting, it is not clear which models are suitable due…

    Submitted 2 June, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

  45. arXiv:2405.07426  [pdf]

    physics.optics

    Multiple Bound States in the Continuum: Towards Intense Terahertz Matter Interaction

    Authors: Quanlong Yang, Zhibo Yao, Lei Xu, Yapeng Dou, Lingli Ba, Fan Huang, Quan Xu, Longqing Cong, Jianqiang Gu, Junliang Yang, Mohsen Rahmani, Jiaguang Han, Ilya Shadrivov

    Abstract: Bound states in the continuum (BICs) are an excellent platform enabling highly efficient light-matter interaction in applications for lasing, nonlinear generation, and sensing. However, the current focus in implementing BICs has primarily been on single sharp resonances, limiting the extent of electric field enhancement for multiple resonances. In this study, we conducted experimental demonstratio…

    Submitted 12 May, 2024; originally announced May 2024.

  46. arXiv:2405.07230  [pdf, other]

    astro-ph.IM physics.ins-det

    Acoustic Positioning for Deep Sea Neutrino Telescopes with a System of Piezo Sensors Integrated into Glass Spheres

    Authors: A. Albert, S. Alves, M. André, M. Ardid, S. Ardid, J. -J. Aubert, J. Aublin, B. Baret, S. Basa, Y. Becherini, B. Belhorma, M. Bendahman, F. Benfenati, V. Bertin, S. Biagi, J. Boumaaza, M. Bouta, M. C. Bouwhuis, H. Brânzaş, R. Bruijn, J. Brunner, J. Busto, B. Caiffi, D. Calvo, S. Campion, et al. (115 additional authors not shown)

    Abstract: Position calibration in the deep sea is typically done by means of acoustic multilateration using three or more acoustic emitters installed at known positions. Rather than using hydrophones as receivers that are exposed to the ambient pressure, the sound signals can be coupled to piezo ceramics glued to the inside of existing containers for electronics or measuring instruments of a deep sea infras…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: submitted to "Experimental Astronomy"

  47. arXiv:2405.05497  [pdf, other]

    cs.CV

    Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution

    Authors: Yunxiang Li, Wenbin Zou, Qiaomu Wei, Feng Huang, Jing Wu

    Abstract: Stereo image super-resolution utilizes the cross-view complementary information brought by the disparity effect of left and right perspective images to reconstruct higher-quality images. Cascading feature extraction modules and cross-view feature interaction modules to make use of the information from stereo images is the focus of numerous methods. However, this adds a great deal of network parame…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures, CVPRWorkshop NTIRE2024

  48. arXiv:2405.05094  [pdf, other]

    gr-qc astro-ph.HE hep-ph

    Mass function of stellar black holes as revealed by the LIGO-Virgo-KAGRA observations

    Authors: Xiao-Fei Dong, Yong-Feng Huang, Zhi-Bin Zhang, Xiu-Juan Li, Ze-Cheng Zou, Chen-Ran Hu, Chen Deng, Yang Liu

    Abstract: Ninety gravitational wave events have been detected by the LIGO-Virgo-KAGRA network and are released in the Gravitational-Wave Transient Catalog. Among these events, 83 cases are definitely binary black hole mergers since the masses of all the objects involved significantly exceed the upper limit of neutron stars. The black holes in these merger events naturally form two interesting samples, a pre…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 14 pages, 5 figures, 1 table

    MSC Class: 85-08; 62L10; 62G09; 62E10 ACM Class: F.2.1

  49. arXiv:2405.03296  [pdf, ps, other]

    cs.LG cs.AI

    Coefficient Decomposition for Spectral Graph Convolution

    Authors: Feng Huang, Wen Zhang

    Abstract: Spectral graph convolutional network (SGCN) is a kind of graph neural network (GNN) based on graph signal filters, and has shown compelling expressivity for modeling graph-structured data. Most SGCNs adopt polynomial filters and learn the coefficients from the training data. Many of them focus on which polynomial basis leads to optimal expressive power and models' architecture is little discussed…

    Submitted 6 May, 2024; originally announced May 2024.

  50. arXiv:2405.03126  [pdf]

    eess.IV eess.SP

    Infrared Polarization Imaging-based Non-destructive Thermography Inspection

    Authors: Xianyu Wu, Bin Zhou, Peng Lin, Rongjin Cao, Feng Huang

    Abstract: Infrared pulse thermography non-destructive testing (NDT) method is developed based on the difference in the infrared radiation intensity emitted by defective and non-defective areas of an object. However, when the radiation intensity of the defective target is similar to that of the non-defective area of the object, the detection results are poor. To address this issue, this study investigated th…

    Submitted 5 May, 2024; originally announced May 2024.