
Showing 1–50 of 144 results for author: Ye, F

  1. arXiv:2407.13246  [pdf, other]

    cs.CV

    STS MICCAI 2023 Challenge: Grand challenge on 2D and 3D semi-supervised tooth segmentation

    Authors: Yaqi Wang, Yifan Zhang, Xiaodiao Chen, Shuai Wang, Dahong Qian, Fan Ye, Feng Xu, Hongyuan Zhang, Qianni Zhang, Chengyu Wu, Yunxiang Li, Weiwei Cui, Shan Luo, Chengkai Wang, Tianhao Li, Yi Liu, Xiang Feng, Huiyu Zhou, Dongyun Liu, Qixuan Wang, Zhouhao Lin, Wei Song, Yuanlin Li, Bing Wang, Chunshi Wang, et al. (2 additional authors not shown)

    Abstract: Computer-aided design (CAD) tools are increasingly popular in modern dental practice, particularly for treatment planning or comprehensive prognosis evaluation. In particular, the 2D panoramic X-ray image efficiently detects invisible caries, impacted teeth and supernumerary teeth in children, while the 3D dental cone beam computed tomography (CBCT) is widely used in orthodontics and endodontics d…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.04272  [pdf, other]

    cs.LG cs.DC

    Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression

    Authors: Hao Feng, Boyuan Zhang, Fanjiang Ye, Min Si, Ching-Hsiang Chu, Jiannan Tian, Chunxing Yin, Summer Deng, Yuchen Hao, Pavan Balaji, Tong Geng, Dingwen Tao

    Abstract: DLRM is a state-of-the-art recommendation system model that has gained widespread adoption across various industry applications. The large size of DLRM models, however, necessitates the use of multiple devices/GPUs for efficient training. A significant bottleneck in this process is the time-consuming all-to-all communication required to collect embedding data from all devices. To mitigate this, we…

    Submitted 11 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: accepted by SC '24

  3. arXiv:2407.01445  [pdf, other]

    cs.LG cs.CV

    FastCLIP: A Suite of Optimization Techniques to Accelerate CLIP Training with Limited Resources

    Authors: Xiyuan Wei, Fanjiang Ye, Ori Yonay, Xingyu Chen, Baixi Sun, Dingwen Tao, Tianbao Yang

    Abstract: Existing studies of training state-of-the-art Contrastive Language-Image Pretraining (CLIP) models on large-scale data involve hundreds of or even thousands of GPUs due to the requirement of a large batch size. However, such a large amount of resources is not accessible to most people. While advanced compositional optimization techniques for optimizing global contrastive losses have been demonstra…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 23 pages

  4. arXiv:2406.13922  [pdf, ps, other]

    cs.IT

    Explicit Performance Bound of Finite Blocklength Coded MIMO: Time-Domain versus Spatiotemporal Channel Coding

    Authors: Feng Ye, Xiaohu You, Jiamin Li, Chuan Zhang, Chen Ji

    Abstract: In the sixth generation (6G), ultra-reliable low-latency communications (URLLC) will further develop to achieve TKμ extreme connectivity, and multiple-input multiple-output (MIMO) is expected to be a key enabler for its realization. Since the latency constraint can be represented by the blocklength of a codeword, it is essential to analyze different coded MIMO schemes under finite blocklength regi…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures

  5. arXiv:2406.13249  [pdf, other]

    cs.CL cs.AI cs.IR

    R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation

    Authors: Fuda Ye, Shuangyin Li, Yongqi Zhang, Lei Chen

    Abstract: Retrieval augmented generation (RAG) has been applied in many scenarios to augment large language models (LLMs) with external documents provided by retrievers. However, a semantic gap exists between LLMs and retrievers due to differences in their training objectives and architectures. This misalignment forces LLMs to passively accept the documents provided by the retrievers, leading to incomprehen…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.12844  [pdf, other]

    cs.LG cs.AI

    Synergizing Foundation Models and Federated Learning: A Survey

    Authors: Shenghui Li, Fanghua Ye, Meng Fang, Jiaxu Zhao, Yun-Hin Chan, Edith C. -H. Ngai, Thiemo Voigt

    Abstract: The recent development of Foundation Models (FMs), represented by large language models, vision transformers, and multimodal models, has been making a significant impact on both academia and industry. Compared with small-scale models, FMs have a much stronger demand for high-volume data during the pre-training phase. Although general FMs can be pre-trained on data collected from open sources such…

    Submitted 18 June, 2024; originally announced June 2024.

  7. arXiv:2406.00114  [pdf, other]

    cs.RO cs.NE

    Dynamic Multi-Objective Lion Swarm Optimization with Multi-strategy Fusion: An application in 6R robot trajectory planning

    Authors: Bao Liu, Tianbao Liu, Zhongshuo Hu, Fei Ye, Lei Gao

    Abstract: The advancement of industrialization has spurred the development of innovative swarm intelligence algorithms, with Lion Swarm Optimization (LSO) notable for its robustness, parallelism, simplicity, and efficiency. While LSO excels in single-objective optimization, its multi-objective variants face challenges such as poor initialization, local optima entrapment, and so on. This study proposes Dynam…

    Submitted 7 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

  8. arXiv:2405.13034  [pdf, other]

    cs.CL cs.AI cs.HC

    Autonomous Workflow for Multimodal Fine-Grained Training Assistants Towards Mixed Reality

    Authors: Jiahuan Pei, Irene Viola, Haochen Huang, Junxiao Wang, Moonisa Ahsan, Fanghua Ye, Jiang Yiming, Yao Sai, Di Wang, Zhumin Chen, Pengjie Ren, Pablo Cesar

    Abstract: Autonomous artificial intelligence (AI) agents have emerged as promising protocols for automatically understanding the language-based environment, particularly with the exponential development of large language models (LLMs). However, a fine-grained, comprehensive understanding of multimodal environments remains under-explored. This work designs an autonomous workflow tailored for integrating AI a…

    Submitted 5 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024

  9. arXiv:2405.05542  [pdf, other]

    cs.RO cs.MA

    Dynamic Deep Factor Graph for Multi-Agent Reinforcement Learning

    Authors: Yuchen Shi, Shihong Duan, Cheng Xu, Ran Wang, Fangwen Ye, Chau Yuen

    Abstract: This work introduces a novel value decomposition algorithm, termed Dynamic Deep Factor Graphs (DDFG). Unlike traditional coordination graphs, DDFG leverages factor graphs to articulate the decomposition of value functions, offering enhanced flexibility and adaptability to complex value function structures. Central to DDFG is a graph structure generation policy that innovatively generates…

    Submitted 7 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: submitted to IEEE TPAMI

  10. arXiv:2405.03692  [pdf, other]

    eess.IV cs.NI eess.SY

    Imitation Learning for Adaptive Video Streaming with Future Adversarial Information Bottleneck Principle

    Authors: Shuoyao Wang, Jiawei Lin, Fangwei Ye

    Abstract: Adaptive video streaming plays a crucial role in ensuring high-quality video streaming services. Despite extensive research efforts devoted to Adaptive BitRate (ABR) techniques, the current reinforcement learning (RL)-based ABR algorithms may benefit the average Quality of Experience (QoE) but suffer from fluctuating performance in individual video sessions. In this paper, we present a novel appr…

    Submitted 12 March, 2024; originally announced May 2024.

    Comments: submitted to IEEE Journal

  11. arXiv:2403.10971  [pdf, other]

    cs.CV

    Task-Aware Low-Rank Adaptation of Segment Anything Model

    Authors: Xuehao Wang, Feiyang Ye, Yu Zhang

    Abstract: The Segment Anything Model (SAM), with its remarkable zero-shot capability, has been proven to be a powerful foundation model for image segmentation tasks, which is an important task in computer vision. However, the transfer of its rich semantic information to multiple different downstream tasks remains unexplored. In this paper, we propose the Task-Aware Low-Rank Adaptation (TA-LoRA) method, whic…

    Submitted 16 March, 2024; originally announced March 2024.

  12. arXiv:2403.06568  [pdf, other]

    cs.AI

    Better Understandings and Configurations in MaxSAT Local Search Solvers via Anytime Performance Analysis

    Authors: Furong Ye, Chuan Luo, Shaowei Cai

    Abstract: Though numerous solvers have been proposed for the MaxSAT problem, and the benchmark environment such as MaxSAT Evaluations provides a platform for the comparison of the state-of-the-art solvers, existing assessments were usually evaluated based on the quality, e.g., fitness, of the best-found solutions obtained within a given running time budget. However, concerning solely the final obtained solu…

    Submitted 11 March, 2024; originally announced March 2024.

  13. arXiv:2403.06144  [pdf, other]

    cs.CY

    Simulating Family Conversations using LLMs: Demonstration of Parenting Styles

    Authors: Frank Tian-fang Ye, Xiaozi Gao

    Abstract: This study presents a framework for conducting psychological and linguistic research through simulated conversations using large language models (LLMs). The proposed methodology offers significant advantages, particularly for simulating human interactions involving potential unethical language or behaviors that would be impermissible in traditional experiments with human participants. As a demonst…

    Submitted 10 March, 2024; originally announced March 2024.

  14. arXiv:2403.03310  [pdf, other]

    quant-ph cs.LG

    Graph Learning for Parameter Prediction of Quantum Approximate Optimization Algorithm

    Authors: Zhiding Liang, Gang Liu, Zheyuan Liu, Jinglei Cheng, Tianyi Hao, Kecheng Liu, Hang Ren, Zhixin Song, Ji Liu, Fanny Ye, Yiyu Shi

    Abstract: In recent years, quantum computing has emerged as a transformative force in the field of combinatorial optimization, offering novel approaches to tackling complex problems that have long challenged classical computational methods. Among these, the Quantum Approximate Optimization Algorithm (QAOA) stands out for its potential to efficiently solve the Max-Cut problem, a quintessential example of com…

    Submitted 5 March, 2024; originally announced March 2024.

  15. arXiv:2402.18567  [pdf, other]

    cs.LG q-bio.BM

    Diffusion Language Models Are Versatile Protein Learners

    Authors: Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, Quanquan Gu

    Abstract: This paper introduces diffusion protein language model (DPLM), a versatile protein language model that demonstrates strong generative and predictive capabilities for protein sequences. We first pre-train scalable DPLMs from evolutionary-scale protein sequences within a generative self-supervised discrete diffusion probabilistic framework, which generalizes language modeling for proteins in a princ…

    Submitted 28 February, 2024; originally announced February 2024.

  16. arXiv:2402.18070  [pdf, other]

    cs.AR eess.SP

    A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

    Authors: Limin Jiang, Yi Shi, Haiqin Hu, Qingyu Deng, Siyi Xu, Yintao Liu, Feng Yuan, Si Wang, Yihao Shen, Fangfang Ye, Shan Cao, Zhiyuan Jiang

    Abstract: Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and more recently, graphic processing units (GPUs), provide various degrees of parallelism, yet they both fail to take into account the cyclical and…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 7 figures, conference

  17. arXiv:2402.07654  [pdf, other]

    cs.NE

    Impact of spatial transformations on landscape features of CEC2022 basic benchmark problems

    Authors: Haoran Yin, Diederick Vermetten, Furong Ye, Thomas H. W. Bäck, Anna V. Kononova

    Abstract: When benchmarking optimization heuristics, we need to take care to avoid an algorithm exploiting biases in the construction of the used problems. One way in which this might be done is by providing different versions of each problem but with transformations applied to ensure the algorithms are equipped with mechanisms for successfully tackling a range of problems. In this paper, we investigate sev…

    Submitted 12 February, 2024; originally announced February 2024.

  18. arXiv:2402.07616  [pdf, other]

    cs.CL cs.AI

    Anchor-based Large Language Models

    Authors: Jianhui Pang, Fanghua Ye, Derek Fai Wong, Xin He, Wanshun Chen, Longyue Wang

    Abstract: Large language models (LLMs) predominantly employ decoder-only transformer architectures, necessitating the retention of keys/values information for historical tokens to provide contextual information and avoid redundant computation. However, the substantial size and parameter volume of these LLMs require massive GPU memory. This memory demand increases with the length of the input text, leading t…

    Submitted 1 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: The paper has been accepted by the ACL2024 conference. Work was done when Jianhui Pang and Fanghua Ye were interning at Tencent AI Lab

  19. arXiv:2401.13246  [pdf, other]

    cs.CL

    SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning

    Authors: Guoxin Chen, Kexin Tang, Chao Yang, Fuying Ye, Yu Qiao, Yiming Qian

    Abstract: Elucidating the reasoning process with structured explanations from question to answer is crucial, as it significantly enhances the interpretability, traceability, and trustworthiness of question-answering (QA) systems. However, structured explanations demand models to perform intricately structured reasoning, which poses great challenges. Most existing methods focus on single-step reasoning throu…

    Submitted 4 June, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Camera ready version for ACL 2024 Main Conference

  20. arXiv:2401.12794  [pdf, other]

    cs.CL

    Benchmarking LLMs via Uncertainty Quantification

    Authors: Fanghua Ye, Mingming Yang, Jianhui Pang, Longyue Wang, Derek F. Wong, Emine Yilmaz, Shuming Shi, Zhaopeng Tu

    Abstract: The proliferation of open-source Large Language Models (LLMs) from various institutions has highlighted the urgent need for comprehensive evaluation methods. However, current evaluation platforms, such as the widely recognized HuggingFace open LLM leaderboard, neglect a crucial aspect -- uncertainty, which is vital for thoroughly assessing LLMs. To bridge this gap, we introduce a new benchmarking…

    Submitted 25 April, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: 25 pages, preprints

  21. arXiv:2401.11929  [pdf, other]

    cs.LG

    Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting

    Authors: Jinliang Deng, Feiyang Ye, Du Yin, Xuan Song, Ivor W. Tsang, Hui Xiong

    Abstract: Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis, characterized by extensive input sequences, as opposed to the shorter spans typical of traditional approaches. While longer sequences inherently offer richer information for enhanced predictive precision, prevailing studies often respond by escalating model complexity. These intricate models can inflat…

    Submitted 23 May, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

  22. arXiv:2401.09257  [pdf, other]

    cs.LG

    A First-Order Multi-Gradient Algorithm for Multi-Objective Bi-Level Optimization

    Authors: Feiyang Ye, Baijiong Lin, Xiaofeng Cao, Yu Zhang, Ivor Tsang

    Abstract: In this paper, we study the Multi-Objective Bi-Level Optimization (MOBLO) problem, where the upper-level subproblem is a multi-objective optimization problem and the lower-level subproblem is for scalar optimization. Existing gradient-based MOBLO algorithms need to compute the Hessian matrix, which makes them computationally inefficient. To address this, we propose an efficient first-order multi-…

    Submitted 10 July, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: ECAI 2024

  23. arXiv:2401.08350  [pdf, other]

    cs.CL

    Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models

    Authors: Jianhui Pang, Fanghua Ye, Longyue Wang, Dian Yu, Derek F. Wong, Shuming Shi, Zhaopeng Tu

    Abstract: The evolution of Neural Machine Translation (NMT) has been significantly influenced by six core challenges (Koehn and Knowles, 2017), which have acted as benchmarks for progress in this field. This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs): domain mismatch, amount of parallel data, rare word prediction, t…

    Submitted 17 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: 17 pages. Longyue Wang is the Corresponding Author

  24. arXiv:2312.11083  [pdf, other]

    cs.NE

    MA-BBOB: A Problem Generator for Black-Box Optimization Using Affine Combinations and Shifts

    Authors: Diederick Vermetten, Furong Ye, Thomas Bäck, Carola Doerr

    Abstract: Choosing a set of benchmark problems is often a key component of any empirical evaluation of iterative optimization heuristics. In continuous, single-objective optimization, several sets of problems have become widespread, including the well-established BBOB suite. While this suite is designed to enable rigorous benchmarking, it is also commonly used for testing methods such as algorithm selection…

    Submitted 18 December, 2023; originally announced December 2023.

  25. arXiv:2312.08924  [pdf, other]

    cs.CV

    Training-free Zero-shot Composed Image Retrieval with Local Concept Reranking

    Authors: Shitong Sun, Fanghua Ye, Shaogang Gong

    Abstract: Composed image retrieval attempts to retrieve an image of interest from gallery images through a composed query of a reference image and its corresponding modified text. It has recently attracted attention due to the collaboration of information-rich images and concise language to precisely express the requirements of target images. Most current composed image retrieval methods follow a supervised…

    Submitted 24 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Under Review

  26. arXiv:2312.05024  [pdf, other]

    cs.CV

    A Unified Framework for Unsupervised Domain Adaptation based on Instance Weighting

    Authors: Jinjing Zhu, Feiyang Ye, Qiao Xiao, Pengxin Guo, Yu Zhang, Qiang Yang

    Abstract: Despite the progress made in domain adaptation, solving Unsupervised Domain Adaptation (UDA) problems with a general method under complex conditions caused by label shifts between domains remains a formidable task. In this work, we comprehensively investigate four distinct UDA settings including closed set domain adaptation, partial domain adaptation, open set domain adaptation, and universal doma…

    Submitted 8 December, 2023; originally announced December 2023.

  27. arXiv:2310.14513  [pdf, other]

    cs.CL

    Turn-Level Active Learning for Dialogue State Tracking

    Authors: Zihan Zhang, Meng Fang, Fanghua Ye, Ling Chen, Mohammad-Reza Namazi-Rad

    Abstract: Dialogue state tracking (DST) plays an important role in task-oriented dialogue systems. However, collecting a large amount of turn-by-turn annotated dialogue data is costly and inefficient. In this paper, we propose a novel turn-level active learning framework for DST to actively select turns in dialogues to annotate. Given the limited labelling budget, experimental results demonstrate the effect…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Conference

  28. arXiv:2310.09716  [pdf, other]

    cs.HC cs.AI cs.CL cs.IR

    Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting

    Authors: Fanghua Ye, Meng Fang, Shenghui Li, Emine Yilmaz

    Abstract: Query rewriting plays a vital role in enhancing conversational search by transforming context-dependent user queries into standalone forms. Existing approaches primarily leverage human-rewritten queries as labels to train query rewriting models. However, human rewrites may lack sufficient information for optimal retrieval performance. To overcome this limitation, we propose utilizing large languag…

    Submitted 18 October, 2023; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: 22 pages, accepted to EMNLP Findings 2023

  29. arXiv:2310.04230  [pdf, other]

    cs.IR

    Lending Interaction Wings to Recommender Systems with Conversational Agents

    Authors: Jiarui Jin, Xianyu Chen, Fanghua Ye, Mengyue Yang, Yue Feng, Weinan Zhang, Yong Yu, Jun Wang

    Abstract: Recommender systems trained on offline historical user behaviors are embracing conversational techniques to online query user preference. Unlike prior conversational recommendation approaches that systemically combine conversational and recommender parts through a reinforcement learning framework, we propose CORE, a new offline-training and online-checking paradigm that bridges a COnversational ag…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  30. arXiv:2310.00339  [pdf, other]

    cs.LG cs.AI

    FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation

    Authors: Xiang Liu, Liangxi Liu, Feiyang Ye, Yunheng Shen, Xia Li, Linshan Jiang, Jialin Li

    Abstract: Efficiently aggregating trained neural networks from local clients into a global model on a server is a widely researched topic in federated learning. Recently, motivated by diminishing privacy concerns, mitigating potential attacks, and reducing communication overhead, one-shot federated learning (i.e., limiting client-server communication into a single round) has gained popularity among research…

    Submitted 21 May, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: 46 pages, 6 figures

  31. arXiv:2309.13909  [pdf]

    cs.HC

    Chinese herb medicine in augmented reality

    Authors: Qianyun Zhu, Yifeng Xie, Fangyang Ye, Zhenyuan Gao, Binjie Che, Zhenglin Chen, Dongmei Yu

    Abstract: Augmented reality is gradually becoming popular in education, as it provides a contextual and adaptive learning experience. Here, we develop a Chinese herb medicine AR platform based on 3ds Max and Unity that allows users to visualize and interact with the herb model and learn the related information. The users use their mobile camera to scan the 2D herb picture to trigger the presentation of 3D A…

    Submitted 25 September, 2023; originally announced September 2023.

  32. arXiv:2309.10218  [pdf, other]

    cs.CY cs.HC

    A Hierarchy-based Analysis Approach for Blended Learning: A Case Study with Chinese Students

    Authors: Yu Ye, Gongjin Zhang, Hongbiao Si, Liang Xu, Shenghua Hu, Yong Li, Xulong Zhang, Kaiyu Hu, Fangzhou Ye

    Abstract: Blended learning is generally defined as the combination of traditional face-to-face learning and online learning. This learning mode has been widely used in advanced education across the globe due to the COVID-19 pandemic's social distance restriction as well as the development of technology. Online learning plays an important role in blended learning, and as it requires more student autonomy, th…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Accepted by the 7th APWeb-WAIM International Joint Conference on Web and Big Data. (APWeb 2023)

  33. arXiv:2309.07589  [pdf, other]

    cs.MM eess.IV

    MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding

    Authors: Chuanmin Jia, Feng Ye, Fanke Dong, Kai Lin, Leonardo Chiariglione, Siwei Ma, Huifang Sun, Wen Gao

    Abstract: The rapid advancement of artificial intelligence (AI) technology has led to the prioritization of standardizing the processing, coding, and transmission of video using neural networks. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." Th…

    Submitted 14 September, 2023; originally announced September 2023.

  34. arXiv:2308.12029  [pdf, other]

    cs.LG cs.AI

    Dual-Balancing for Multi-Task Learning

    Authors: Baijiong Lin, Weisen Jiang, Feiyang Ye, Yu Zhang, Pengguang Chen, Ying-Cong Chen, Shu Liu, James T. Kwok

    Abstract: Multi-task learning (MTL), a learning paradigm to learn multiple related tasks simultaneously, has achieved great success in various fields. However, the task balancing problem remains a significant challenge in MTL, with the disparity in loss/gradient scales often leading to performance compromises. In this paper, we propose a Dual-Balancing Multi-Task Learning (DB-MTL) method to alleviate the task b…

    Submitted 29 September, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Technical Report

  35. arXiv:2308.09991  [pdf, other]

    cs.CV

    AltDiffusion: A Multilingual Text-to-Image Diffusion Model

    Authors: Fulong Ye, Guang Liu, Xinya Wu, Ledell Wu

    Abstract: Large Text-to-Image (T2I) diffusion models have shown a remarkable capability to produce photorealistic and diverse images based on text inputs. However, existing works only support limited language input, e.g., English, Chinese, and Japanese, leaving users beyond these languages underserved and blocking the global expansion of T2I models. Therefore, this paper presents AltDiffusion, a novel multil…

    Submitted 23 August, 2023; v1 submitted 19 August, 2023; originally announced August 2023.

    Comments: 15 pages; 17 figures

  36. arXiv:2308.09977  [pdf, other]

    cs.CV

    Whether you can locate or not? Interactive Referring Expression Generation

    Authors: Fulong Ye, Yuxing Long, Fangxiang Feng, Xiaojie Wang

    Abstract: Referring Expression Generation (REG) aims to generate unambiguous Referring Expressions (REs) for objects in a visual scene, with a dual task of Referring Expression Comprehension (REC) to locate the referred object. Existing methods construct REG models independently by using only the REs as ground truth for model training, without considering the potential interaction between REG and REC models…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: 10 pages, 7 figures

  37. Pseudo-Bag Mixup Augmentation for Multiple Instance Learning-Based Whole Slide Image Classification

    Authors: Pei Liu, Luping Ji, Xinyu Zhang, Feng Ye

    Abstract: Given the special situation of modeling gigapixel images, multiple instance learning (MIL) has become one of the most important frameworks for Whole Slide Image (WSI) classification. In current practice, most MIL networks often face two unavoidable problems in training: i) insufficient WSI data and ii) the sample memorization inclination inherent in neural networks. These problems may hinder MIL m…

    Submitted 2 November, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: 12 pages, 6 figures, 10 tables

  38. arXiv:2306.10627  [pdf, other]

    cs.LG cs.NE math.OC

    MA-BBOB: Many-Affine Combinations of BBOB Functions for Evaluating AutoML Approaches in Noiseless Numerical Black-Box Optimization Contexts

    Authors: Diederick Vermetten, Furong Ye, Thomas Bäck, Carola Doerr

    Abstract: Extending a recent suggestion to generate new instances for numerical black-box optimization benchmarking by interpolating pairs of the well-established BBOB functions from the COmparing COntinuous Optimizers (COCO) platform, we propose in this work a further generalization that allows multiple affine combinations of the original instances and arbitrarily chosen locations of the global optima. We…

    Submitted 18 June, 2023; originally announced June 2023.

    Comments: To appear in the AutoML 2023 proceedings (ABCD track)

  39. arXiv:2305.12594  [pdf, other]

    cs.CL

    Modeling User Satisfaction Dynamics in Dialogue via Hawkes Process

    Authors: Fanghua Ye, Zhiyuan Hu, Emine Yilmaz

    Abstract: Dialogue systems have received increasing attention while automatically evaluating their performance remains challenging. User satisfaction estimation (USE) has been proposed as an alternative. It assumes that the performance of a dialogue system can be measured by user satisfaction and uses an estimator to simulate users. The effectiveness of USE depends heavily on the estimator. Existing estimat…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: To appear at ACL 2023

  40. arXiv:2304.13117  [pdf, other]

    cs.NE math.OC

    When to be Discrete: Analyzing Algorithm Performance on Discretized Continuous Problems

    Authors: André Thomaser, Jacob de Nobel, Diederick Vermetten, Furong Ye, Thomas Bäck, Anna V. Kononova

    Abstract: The domain of an optimization problem is seen as one of its most important characteristics. In particular, the distinction between continuous and discrete optimization is rather impactful. Based on this, the optimizing algorithm, analyzing method, and more are specified. However, in practice, no problem is ever truly continuous. Whether this is caused by computing limits or more tangible propertie…

    Submitted 25 April, 2023; originally announced April 2023.

  41. arXiv:2303.15520  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Learning Harmonic Molecular Representations on Riemannian Manifold

    Authors: Yiqun Wang, Yuning Shen, Shi Chen, Lihao Wang, Fei Ye, Hao Zhou

    Abstract: Molecular representation learning plays a crucial role in AI-assisted drug discovery research. Encoding 3D molecular structures through Euclidean neural networks has become the prevailing method in the geometric deep learning community. However, the equivariance constraints and message passing in Euclidean space may limit the network expressive power. In this work, we propose a Harmonic Molecular…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: 25 pages including Appendix

  42. arXiv:2303.04611  [pdf, other]

    cs.NE

    Towards Self-adaptive Mutation in Evolutionary Multi-Objective Algorithms

    Authors: Furong Ye, Frank Neumann, Jacob de Nobel, Aneta Neumann, Thomas Bäck

    Abstract: Parameter control has succeeded in accelerating the convergence process of evolutionary algorithms. While empirical and theoretical studies have shed light on the behavior of algorithms for single-objective optimization, little is known about how self-adaptation influences multi-objective evolutionary algorithms. In this work, we contribute (1) extensive experimental analysis of the Global Simple…

    Submitted 8 May, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: submitted to FOGA 2023

  43. arXiv:2303.04573  [pdf, other]

    cs.NE

    Using Affine Combinations of BBOB Problems for Performance Assessment

    Authors: Diederick Vermetten, Furong Ye, Carola Doerr

    Abstract: Benchmarking plays a major role in the development and analysis of optimization algorithms. As such, the way in which the used benchmark problems are defined significantly affects the insights that can be gained from any given benchmark study. One way to easily extend the range of available benchmark functions is through affine combinations between pairs of functions. From the perspective of lands…

    Submitted 8 March, 2023; originally announced March 2023.

  44. arXiv:2302.01649  [pdf, other]

    cs.LG

    Structure-informed Language Models Are Protein Designers

    Authors: Zaixiang Zheng, Yifan Deng, Dongyu Xue, Yi Zhou, Fei YE, Quanquan Gu

    Abstract: This paper demonstrates that language models are strong structure-based protein designers. We present LM-Design, a generic approach to reprogramming sequence-based protein language models (pLMs), that have learned massive sequential evolutionary knowledge from the universe of natural protein sequences, to acquire an immediate capability to design preferable protein sequences for given folds. We co…

    Submitted 9 February, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: 10 pages; ver.2 update: added image credit to RFdiffusion (Watson et al., 2022) in Fig. 1F, and fixed some small presentation errors

  45. arXiv:2302.01464  [pdf, other]

    cs.AI cs.NE

    Benchmarking Algorithms for Submodular Optimization Problems Using IOHProfiler

    Authors: Frank Neumann, Aneta Neumann, Chao Qian, Viet Anh Do, Jacob de Nobel, Diederick Vermetten, Saba Sadeghi Ahouei, Furong Ye, Hao Wang, Thomas Bäck

    Abstract: Submodular functions play a key role in the area of optimization as they allow to model many real-world problems that face diminishing returns. Evolutionary algorithms have been shown to obtain strong theoretical performance guarantees for a wide class of submodular problems under various types of constraints while clearly outperforming standard greedy approximation algorithms. This paper introduc…

    Submitted 2 February, 2023; originally announced February 2023.

  46. arXiv:2301.12112  [pdf, other]

    cs.CL q-bio.BM

    On Pre-trained Language Models for Antibody

    Authors: Danqing Wang, Fei Ye, Hao Zhou

    Abstract: Antibodies are vital proteins offering robust protection for the human body from pathogens. The development of general protein and antibody-specific pre-trained language models both facilitate antibody prediction tasks. However, there have been limited studies that comprehensively explore the representation capability of distinct pre-trained language models on different antibody tasks. To investig…

    Submitted 1 March, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

    Comments: Accepted in ICLR 2023

  47. arXiv:2301.06115  [pdf, other]

    cs.CV eess.IV

    Learning to Compress Unmanned Aerial Vehicle (UAV) Captured Video: Benchmark and Analysis

    Authors: Chuanmin Jia, Feng Ye, Huifang Sun, Siwei Ma, Wen Gao

    Abstract: During the past decade, the Unmanned-Aerial-Vehicles (UAVs) have attracted increasing attention due to their flexible, extensive, and dynamic space-sensing capabilities. The volume of video captured by UAVs is exponentially growing along with the increased bitrate generated by the advancement of the sensors mounted on UAVs, bringing new challenges for on-device UAV storage and air-ground data tran…

    Submitted 15 January, 2023; originally announced January 2023.

    Comments: MPAI End-to-end Video group progress report, DCC 2023

  48. arXiv:2301.01949  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MM

    SPRING: Situated Conversation Agent Pretrained with Multimodal Questions from Incremental Layout Graph

    Authors: Yuxing Long, Binyuan Hui, Fulong Ye, Yanyang Li, Zhuoxin Han, Caixia Yuan, Yongbin Li, Xiaojie Wang

    Abstract: Existing multimodal conversation agents have shown impressive abilities to locate absolute positions or retrieve attributes in simple scenarios, but they fail to perform well when complex relative positions and information alignments are involved, which poses a bottleneck in response quality. In this paper, we propose a Situated Conversation Agent Pretrained with Multimodal Questions from INcrement…

    Submitted 5 January, 2023; originally announced January 2023.

    Comments: AAAI 2023

  49. arXiv:2212.09450  [pdf, other]

    q-bio.BM cs.CE cs.LG

    Accelerating Antimicrobial Peptide Discovery with Latent Structure

    Authors: Danqing Wang, Zeyu Wen, Fei Ye, Lei Li, Hao Zhou

    Abstract: Antimicrobial peptides (AMPs) are promising therapeutic approaches against drug-resistant pathogens. Recently, deep generative models are used to discover new AMPs. However, previous studies mainly focus on peptide sequence attributes and do not consider crucial structure information. In this paper, we propose a latent sequence-structure model for designing AMPs (LSSAMP). LSSAMP exploits multi-sca…

    Submitted 20 August, 2023; v1 submitted 28 November, 2022; originally announced December 2022.

    Comments: KDD 2023

  50. AdvMIL: Adversarial Multiple Instance Learning for the Survival Analysis on Whole-Slide Images

    Authors: Pei Liu, Luping Ji, Feng Ye, Bo Fu

    Abstract: The survival analysis on histological whole-slide images (WSIs) is one of the most important means to estimate patient prognosis. Although many weakly-supervised deep learning models have been developed for gigapixel WSIs, their potential is generally restricted by classical survival analysis rules and fully-supervised learning requirements. As a result, these models provide patients only with a c…

    Submitted 5 April, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: 15 pages, 10 figures, 8 tables

    Journal ref: Medical Image Analysis, 103020 (2023)