Showing 1–50 of 283 results for author: Qu, L

  1. arXiv:2407.05236  [pdf, other]

    astro-ph.HE

    A timing view of the additional high-energy spectral component discovered in the black hole candidate Swift J1727.8-1613

    Authors: Zi-Xu Yang, Liang Zhang, Shuang-Nan Zhang, L. Tao, Shu Zhang, Ruican Ma, Qingcui Bu, Yue Huang, He-Xin Liu, Wei Yu, Guang C. Xiao, Peng-Ju Wang, Hua Feng, Li-Ming Song, Xiang Ma, Mingyu Ge, QingChang Zhao, J. L. Qu

    Abstract: We present an energy-dependent analysis for the type-C quasi-periodic oscillations (QPOs) observed in the black hole X-ray binary Swift J1727.8-1613 using Insight-HXMT observations. We find that the QPO fractional rms at energies above 40 keV is significantly higher than that below 20 keV. This is the first report of a high energy (HE)-rms excess in the rms spectrum of a black hole X-ray binary. I…

    Submitted 6 July, 2024; originally announced July 2024.

  2. arXiv:2407.04955  [pdf, other]

    cs.CV

    Asynchronous Multimodal Video Sequence Fusion via Learning Modality-Exclusive and -Agnostic Representations

    Authors: Dingkang Yang, Mingcheng Li, Linhao Qu, Kun Yang, Peng Zhai, Song Wang, Lihua Zhang

    Abstract: Understanding human intentions (e.g., emotions) from videos has received considerable attention recently. Video streams generally constitute a blend of temporal data stemming from distinct modalities, including natural language, facial expressions, and auditory clues. Despite the impressive advancements of previous works via attention-based paradigms, the inherent temporal asynchrony and modality…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: TCSVT 2024

  3. arXiv:2407.02031  [pdf, other]

    cs.DC cs.AI cs.LG

    SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

    Authors: Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

    Abstract: This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generatin…

    Submitted 2 July, 2024; originally announced July 2024.

  4. arXiv:2406.18156  [pdf, other]

    cs.LG cs.DC cs.NI eess.SP

    FedAQ: Communication-Efficient Federated Edge Learning via Joint Uplink and Downlink Adaptive Quantization

    Authors: Linping Qu, Shenghui Song, Chi-Ying Tsui

    Abstract: Federated learning (FL) is a powerful machine learning paradigm which leverages the data as well as the computational resources of clients, while protecting clients' data privacy. However, the substantial model size and frequent aggregation between the server and clients result in significant communication overhead, making it challenging to deploy FL in resource-limited wireless networks. In this…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  5. arXiv:2406.17430  [pdf, other]

    cs.CL cs.SD eess.AS

    Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights

    Authors: Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari

    Abstract: Large Multimodal Models (LMMs) have achieved great success recently, demonstrating a strong capability to understand multimodal information and to interact with human users. Despite the progress made, the challenge of detecting high-risk interactions in multimodal settings, and in particular in speech modality, remains largely unexplored. Conventional research on risk for speech modality primarily…

    Submitted 25 June, 2024; originally announced June 2024.

  6. arXiv:2406.17300  [pdf, other]

    cs.CL

    CausalScore: An Automatic Reference-Free Metric for Assessing Response Relevance in Open-Domain Dialogue Systems

    Authors: Tao Feng, Lizhen Qu, Xiaoxi Kang, Gholamreza Haffari

    Abstract: Automatically evaluating the quality of responses in open-domain dialogue systems is a challenging but crucial task. Current evaluation metrics often fail to align with human judgments, especially when assessing responses that are grammatically correct. To address this issue, we propose a novel metric, called CausalScore, which assesses the relevance of responses by measuring the causal strength b…

    Submitted 25 June, 2024; originally announced June 2024.

  7. arXiv:2406.15490  [pdf, other]

    cs.CL cs.AI cs.LG

    Causal Discovery Inspired Unsupervised Domain Adaptation for Emotion-Cause Pair Extraction

    Authors: Yuncheng Hua, Yujin Huang, Shuo Huang, Tao Feng, Lizhen Qu, Chris Bain, Richard Bassed, Gholamreza Haffari

    Abstract: This paper tackles the task of emotion-cause pair extraction in the unsupervised domain adaptation setting. The problem is challenging as the distributions of the events causing emotions in target domains are dramatically different from those in source domains, even though the distributions of emotional expressions across domains overlap. Inspired by causal discovery, we propose a novel deep l…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures, 4 tables; Under Review in EMNLP 2024

    ACM Class: I.2.4

  8. arXiv:2406.14133  [pdf, other]

    physics.optics

    Beam shaping by nonlinear moiré metasurfaces

    Authors: Lun Qu, Wei Wu, Di Zhang, Chenxiong Wang, Lu Bai, Chenyang Li, Wei Cai, Mengxin Ren, Andrea Alù, Jingjun Xu

    Abstract: This paper explores the interplay of momentum transfer and nonlinear optical processes through moiré phenomena. Momentum transfer plays a crucial role in the interaction between photons and matter. Here, we study stacked metasurfaces with tailored dispersion and rotated against each other with varying twisted angles. The stacking introduces interlayer interactions, which can be controlled by the r…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  9. arXiv:2406.13217  [pdf, other]

    cs.CL

    Bridging Law and Data: Augmenting Reasoning via a Semi-Structured Dataset with IRAC methodology

    Authors: Xiaoxi Kang, Lizhen Qu, Lay-Ki Soon, Zhuang Li, Adnan Trakic

    Abstract: The effectiveness of Large Language Models (LLMs) in legal reasoning is often limited due to the unique legal terminologies and the necessity for highly specialized knowledge. These limitations highlight the need for high-quality data tailored for complex legal reasoning tasks. This paper introduces LEGALSEMI, a benchmark specifically curated for legal scenario analysis. LEGALSEMI comprises 54 leg…

    Submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.10882  [pdf, other]

    cs.CL

    SCAR: Efficient Instruction-Tuning for Large Language Models via Style Consistency-Aware Response Ranking

    Authors: Zhuang Li, Yuncheng Hua, Thuy-Trang Vu, Haolan Zhan, Lizhen Qu, Gholamreza Haffari

    Abstract: Recent studies have shown that maintaining a consistent response style by human experts and enhancing data quality in training sets can significantly improve the performance of fine-tuned Large Language Models (LLMs) while reducing the number of training examples needed. However, the precise definition of style and the relationship between style, data quality, and LLM performance remains unclear.…

    Submitted 10 July, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: 21 pages

  11. arXiv:2406.08811  [pdf, other]

    cs.CL

    Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models

    Authors: Minghao Wu, Thuy-Trang Vu, Lizhen Qu, Gholamreza Haffari

    Abstract: Large language models (LLMs) are typically fine-tuned on diverse and extensive datasets sourced from various origins to develop a comprehensive range of skills, such as writing, reasoning, chatting, coding, and more. Each skill has unique characteristics, and these datasets are often heterogeneous and imbalanced, making the fine-tuning process highly challenging. Balancing the development of each…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Work in progress; 15 pages, 7 tables, 4 figures

  12. arXiv:2406.05814  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    Unified Text-to-Image Generation and Retrieval

    Authors: Leigang Qu, Haochuan Li, Tan Wang, Wenjie Wang, Yongqi Li, Liqiang Nie, Tat-Seng Chua

    Abstract: How humans can efficiently and effectively acquire images has always been a perennial question. A typical solution is text-to-image retrieval from an existing database given the text query; however, the limited database typically lacks creativity. By contrast, recent breakthroughs in text-to-image generation have made it possible to produce fancy and diverse visual content, but it faces challenges…

    Submitted 9 June, 2024; originally announced June 2024.

  13. arXiv:2406.05615  [pdf, other]

    cs.CL

    Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives

    Authors: Thong Nguyen, Yi Bin, Junbin Xiao, Leigang Qu, Yicong Li, Jay Zhangjie Wu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan

    Abstract: Humans use multiple senses to comprehend the environment. Vision and language are two of the most vital senses since they allow us to easily communicate our thoughts and perceive the world around us. There has been a lot of interest in creating video-language understanding systems with human-like senses since a video-language pair can mimic both our linguistic medium and visual environment with te…

    Submitted 1 July, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024 (Findings)

  14. arXiv:2406.05387  [pdf, other]

    cs.IR

    PTF-FSR: A Parameter Transmission-Free Federated Sequential Recommender System

    Authors: Wei Yuan, Chaoqun Yang, Liang Qu, Quoc Viet Hung Nguyen, Guanhua Ye, Hongzhi Yin

    Abstract: Sequential recommender systems have made significant progress. Recently, due to increasing concerns about user data privacy, some researchers have implemented federated learning for sequential recommendation, a.k.a., Federated Sequential Recommender Systems (FedSeqRecs), in which a public sequential recommender model is shared and frequently transmitted between a central server and clients to achi…

    Submitted 8 June, 2024; originally announced June 2024.

  15. arXiv:2406.03749  [pdf, other]

    cs.CL

    NAP^2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human

    Authors: Shuo Huang, William MacLean, Xiaoxi Kang, Anqi Wu, Lizhen Qu, Qiongkai Xu, Zhuang Li, Xingliang Yuan, Gholamreza Haffari

    Abstract: Increasing concerns about privacy leakage issues in academia and industry arise when employing NLP models from third-party providers to process sensitive texts. To protect privacy before sending sensitive data to those models, we suggest sanitizing sensitive text using two common strategies used by humans: i) deleting sensitive expressions, and ii) obscuring sensitive details by abstracting them.…

    Submitted 6 June, 2024; originally announced June 2024.

  16. arXiv:2406.01375  [pdf, other]

    cs.CL

    D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models

    Authors: Haoran Que, Jiaheng Liu, Ge Zhang, Chenchen Zhang, Xingwei Qu, Yinghao Ma, Feiyu Duan, Zhiqi Bai, Jiakai Wang, Yuanxing Zhang, Xu Tan, Jie Fu, Wenbo Su, Jiamang Wang, Lin Qu, Bo Zheng

    Abstract: Continual Pre-Training (CPT) on Large Language Models (LLMs) has been widely used to expand the model's fundamental understanding of specific downstream domains (e.g., math and code). For the CPT on domain-specific LLMs, one important question is how to choose the optimal mixture ratio between the general-corpus (e.g., Dolma, Slim-pajama) and the downstream domain-corpus. Existing methods usually…

    Submitted 3 June, 2024; originally announced June 2024.

  17. arXiv:2405.19637  [pdf, other]

    stat.ME math.ST

    Inference in semiparametric formation models for directed networks

    Authors: Lianqiang Qu, Lu Chen, Ting Yan, Yuguo Chen

    Abstract: We propose a semiparametric model for dyadic link formations in directed networks. The model contains a set of degree parameters that measure different effects of popularity or outgoingness across nodes, a regression parameter vector that reflects the homophily effect resulting from the nodal attributes or pairwise covariates associated with edges, and a set of latent random noises with unknown di…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 28 pages, 3 figures

  18. arXiv:2405.10084  [pdf, other]

    eess.AS cs.AI cs.SD

    Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation

    Authors: Manh Luong, Khai Nguyen, Nhat Ho, Reza Haf, Dinh Phung, Lizhen Qu

    Abstract: The Learning-to-match (LTM) framework proves to be an effective inverse optimal transport approach for learning the underlying ground metric between two sources of data, facilitating subsequent matching. However, the conventional LTM framework faces scalability challenges, necessitating the use of the entire dataset each time the parameters of the ground metric are updated. In adapting LTM to the…

    Submitted 16 May, 2024; originally announced May 2024.

  19. arXiv:2405.06563  [pdf, other]

    cs.CL

    What Can Natural Language Processing Do for Peer Review?

    Authors: Ilia Kuznetsov, Osama Mohammed Afzal, Koen Dercksen, Nils Dycke, Alexander Goldberg, Tom Hope, Dirk Hovy, Jonathan K. Kummerfeld, Anne Lauscher, Kevin Leyton-Brown, Sheng Lu, Mausam, Margot Mieskes, Aurélie Névéol, Danish Pruthi, Lizhen Qu, Roy Schwartz, Noah A. Smith, Thamar Solorio, Jingyan Wang, Xiaodan Zhu, Anna Rogers, Nihar B. Shah, Iryna Gurevych

    Abstract: The number of scientific articles produced every year is growing rapidly. Providing quality control over them is crucial for scientists and, ultimately, for the public good. In modern science, this process is largely delegated to peer review -- a distributed procedure in which each submission is evaluated by several independent experts in the field. Peer review is widely used, yet it is hard, time…

    Submitted 10 May, 2024; originally announced May 2024.

  20. arXiv:2405.04254  [pdf, ps, other]

    stat.ME

    Distributed variable screening for generalized linear models

    Authors: Tianbo Diao, Lianqiang Qu, Bo Li, Liuquan Sun

    Abstract: In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed method selects relevant covariates by using a sparsity-restricted surrogate likelihood estimator. It takes into account the joint effects of the covariates rather…

    Submitted 7 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  21. arXiv:2404.15889  [pdf, other]

    cs.CV cs.GR

    Sketch2Human: Deep Human Generation with Disentangled Geometry and Appearance Control

    Authors: Linzi Qu, Jiaxiang Shang, Hui Ye, Xiaoguang Han, Hongbo Fu

    Abstract: Geometry- and appearance-controlled full-body human image generation is an interesting but challenging task. Existing solutions are either unconditional or dependent on coarse conditions (e.g., pose, text), thus lacking explicit geometry and appearance control of body and garment. Sketching offers such editing ability and has been adopted in various sketch-based face generation and editing solutio…

    Submitted 24 April, 2024; originally announced April 2024.

  22. arXiv:2404.15585  [pdf, other]

    cs.LG eess.IV

    Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification

    Authors: Liang Qu, Cunze Wang, Yuhui Shi

    Abstract: The application of deep learning techniques to medical problems has garnered widespread research interest in recent years, such as applying convolutional neural networks to medical image classification tasks. However, data in the medical field is often highly private, preventing different hospitals from sharing data to train an accurate model. Federated learning, as a privacy-preserving machine le…

    Submitted 23 April, 2024; originally announced April 2024.

  23. arXiv:2404.13504  [pdf, other]

    cs.CL

    IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models

    Authors: Tao Feng, Lizhen Qu, Zhuang Li, Haolan Zhan, Yuncheng Hua, Gholamreza Haffari

    Abstract: Machine learning models have made incredible progress, but they still struggle when applied to examples from unseen domains. This study focuses on a specific problem of domain generalization, where a model is trained on one source domain and tested on multiple target domains that are unseen during training. We propose IMO: Invariant features Masks for Out-of-Distribution text classification, to ac…

    Submitted 20 April, 2024; originally announced April 2024.

  24. arXiv:2404.11818  [pdf, other]

    cs.IR

    Automated Similarity Metric Generation for Recommendation

    Authors: Liang Qu, Yun Lin, Wei Yuan, Xiaojun Wan, Yuhui Shi, Hongzhi Yin

    Abstract: The embedding-based architecture has become the dominant approach in modern recommender systems, mapping users and items into a compact vector space. It then employs predefined similarity metrics, such as the inner product, to calculate similarity scores between user and item embeddings, thereby guiding the recommendation of items that align closely with a user's preferences. Given the critical ro…

    Submitted 17 April, 2024; originally announced April 2024.

  25. arXiv:2404.07598  [pdf, other]

    physics.optics physics.app-ph

    Electro-optically Modulated Nonlinear Metasurfaces

    Authors: Zhengqing He, Lun Qu, Wei Wu, Jikun Liu, Jingfei You, Weiye Liu, Lu Bai, Chunyan Jin, Chenxiong Wang, Zhidong Gu, Wei Cai, Mengxin Ren, Jingjun Xu

    Abstract: Tunable nonlinearity facilitates the creation of reconfigurable nonlinear metasurfaces, enabling innovative applications in signal processing, light switching, and sensing. This paper presents a novel approach to electrically modulate SHG from a lithium niobate (LN) metasurface, exploiting the electro-optical (EO) effect. By fabricating a nanohole array metasurface on a thin LN film and applying a…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 4 pages, 4 figures

  26. arXiv:2404.01177  [pdf, other]

    cs.CR cs.IR

    Poisoning Decentralized Collaborative Recommender System and Its Countermeasures

    Authors: Ruiqi Zheng, Liang Qu, Tong Chen, Kai Zheng, Yuhui Shi, Hongzhi Yin

    Abstract: To make room for privacy and efficiency, the deployment of many recommender systems is experiencing a shift from central servers to personal devices, where the federated recommender systems (FedRecs) and decentralized collaborative recommender systems (DecRecs) are arguably the two most representative paradigms. While both leverage knowledge (e.g., gradients) sharing to facilitate learning local m…

    Submitted 1 April, 2024; originally announced April 2024.

  27. arXiv:2404.00358  [pdf, other]

    cs.CV

    Spread Your Wings: A Radial Strip Transformer for Image Deblurring

    Authors: Duosheng Chen, Shihao Zhou, Jinshan Pan, Jinglei Shi, Lishen Qu, Jufeng Yang

    Abstract: Exploring motion information is important for the motion deblurring task. Recently, window-based transformer approaches have achieved decent performance in image deblurring. Note that the motion causing blurry results is usually composed of translation and rotation movements and the window-shift operation in the Cartesian coordinate system by the window-based transformer approaches only directly…

    Submitted 21 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

  28. arXiv:2404.00313  [pdf, other]

    cs.CV

    Harmonizing Light and Darkness: A Symphony of Prior-guided Data Synthesis and Adaptive Focus for Nighttime Flare Removal

    Authors: Lishen Qu, Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Jufeng Yang

    Abstract: Intense light sources often produce flares in captured images at night, which deteriorates the visual quality and negatively affects downstream applications. In order to train an effective flare removal network, a reliable dataset is essential. The mainstream flare removal datasets are semi-synthetic to reduce human labour, but these datasets do not cover typical scenarios involving multiple scatt…

    Submitted 30 March, 2024; originally announced April 2024.

  29. arXiv:2404.00288  [pdf, other]

    cs.CV

    Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration

    Authors: Shihao Zhou, Jinshan Pan, Jinglei Shi, Duosheng Chen, Lishen Qu, Jufeng Yang

    Abstract: How to explore useful features from images as prompts to guide the deep image restoration models is an effective way to solve image restoration. In contrast to mining spatial relations within images as prompt, which leads to characteristics of different frequencies being neglected and further remaining subtle or undetectable artifacts in the restored image, we develop a Frequency Prompting image r…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 18 pages, 10 figures

  30. arXiv:2403.20107  [pdf, other]

    cs.IR

    Robust Federated Contrastive Recommender System against Model Poisoning Attack

    Authors: Wei Yuan, Chaoqun Yang, Liang Qu, Guanhua Ye, Quoc Viet Hung Nguyen, Hongzhi Yin

    Abstract: Federated Recommender Systems (FedRecs) have garnered increasing attention recently, thanks to their privacy-preserving benefits. However, the decentralized and open characteristics of current FedRecs present two dilemmas. First, the performance of FedRecs is compromised due to highly sparse on-device data for each client. Second, the system's robustness is undermined by the vulnerability to model…

    Submitted 29 March, 2024; originally announced March 2024.

  31. arXiv:2403.18702  [pdf, other]

    cs.AR

    Toward CXL-Native Memory Tiering via Device-Side Profiling

    Authors: Zhe Zhou, Yiqi Chen, Tao Zhang, Yang Wang, Ran Shu, Shuotao Xu, Peng Cheng, Lei Qu, Yongqiang Xiong, Guangyu Sun

    Abstract: The Compute Express Link (CXL) interconnect has provided the ability to integrate diverse memory types into servers via byte-addressable SerDes links. Harnessing the full potential of such heterogeneous memory systems requires efficient memory tiering. However, existing research in this domain has been constrained by low-resolution and high-overhead memory access profiling techniques. To address t…

    Submitted 27 March, 2024; originally announced March 2024.

  32. arXiv:2403.18271  [pdf, other]

    cs.CV

    Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding

    Authors: Zhiheng Cheng, Qingyue Wei, Hongru Zhu, Yan Wang, Liangqiong Qu, Wei Shao, Yuyin Zhou

    Abstract: The Segment Anything Model (SAM) has garnered significant attention for its versatile segmentation abilities and intuitive prompt-based interface. However, its application in medical imaging presents challenges, requiring either substantial training costs and extensive medical datasets for full model fine-tuning or high-quality prompts for optimal performance. This paper introduces H-SAM: a prompt…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  33. arXiv:2403.10191  [pdf, other]

    cs.CV

    Generative Region-Language Pretraining for Open-Ended Object Detection

    Authors: Chuang Lin, Yi Jiang, Lizhen Qu, Zehuan Yuan, Jianfei Cai

    Abstract: In recent research, significant attention has been devoted to the open-vocabulary object detection task, aiming to generalize beyond the limited number of classes labeled during training and detect objects described by arbitrary category names at inference. Compared with conventional object detection, open vocabulary object detection largely extends the object detection categories. However, it rel…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  34. arXiv:2403.06131  [pdf, other]

    cs.CR cs.AI

    FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning

    Authors: Zhuo Zhang, Jingyuan Zhang, Jintao Huang, Lizhen Qu, Hongzhi Zhang, Qifan Wang, Xun Zhou, Zenglin Xu

    Abstract: Instruction tuning has been identified as a crucial technique for optimizing the performance of large language models (LLMs) in generating human-aligned responses. Nonetheless, gathering diversified and superior-quality instruction data for such tuning presents notable obstacles, especially in domains with rigid privacy provisions. Federated instruction tuning (FedIT) has emerged as a promising so…

    Submitted 20 June, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: Work in progress

  35. arXiv:2403.04321  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM

    Discriminative Probing and Tuning for Text-to-Image Generation

    Authors: Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang, Liqiang Nie, Tat-Seng Chua

    Abstract: Despite advancements in text-to-image generation (T2I), prior methods often face text-image misalignment problems such as relation confusion in generated images. Existing solutions involve cross-attention manipulation for better compositional understanding or integrating large language models for improved layout planning. However, the inherent alignment capabilities of T2I models are still inadequ…

    Submitted 14 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: CVPR 2024; project page: https://dpt-t2i.github.io/

  36. arXiv:2402.18467  [pdf, other]

    cs.CV

    Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation

    Authors: Zhiwei Yang, Kexue Fu, Minghong Duan, Linhao Qu, Shuo Wang, Zhijian Song

    Abstract: Weakly supervised semantic segmentation (WSSS) with image-level labels aims to achieve segmentation tasks without dense annotations. However, attributed to the frequent coupling of co-occurring objects and the limited supervision from image-level labels, the challenging co-occurrence problem is widely present and leads to false activation of objects in WSSS. In this work, we devise a 'Separate and…

    Submitted 21 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR 2024

  37. arXiv:2402.14226  [pdf, other]

    astro-ph.HE hep-ph

    Broadband noise and quasi-periodic oscillation characteristics of the X-ray pulsar RX J0440.9+4431

    Authors: P. P. Li, L. Tao, R. C. Ma, M. Y. Ge, Q. C. Zhao, S. J. Zhao, L. Zhang, Q. C. Bu, L. D. Kong, Y. L. Tuo, L. Ji, S. Zhang, J. L. Qu, S. N. Zhang, Y. Huang, X. Ma, W. T. Ye, Q. C. Shui

    Abstract: We present a comprehensive timing analysis on the Be/X-ray binary pulsar RX J0440.9+4431 using observations from NICER and Insight-HXMT during the 2022–2023 outburst. The power density spectrum (PDS) of RX J0440.9+4431 exhibits typical aperiodic variability in X-ray flux across a wide frequency range. During a super-critical accretion state, we detect quasi-periodic oscillations…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 8 pages, 7 figures. Accepted in MNRAS

  38. arXiv:2402.11178  [pdf, other]

    cs.CL

    RENOVI: A Benchmark Towards Remediating Norm Violations in Socio-Cultural Conversations

    Authors: Haolan Zhan, Zhuang Li, Xiaoxi Kang, Tao Feng, Yuncheng Hua, Lizhen Qu, Yi Ying, Mei Rianto Chandra, Kelly Rosalin, Jureynolds Jureynolds, Suraj Sharma, Shilin Qu, Linhao Luo, Lay-Ki Soon, Zhaleh Semnani Azad, Ingrid Zukerman, Gholamreza Haffari

    Abstract: Norm violations occur when individuals fail to conform to culturally accepted behaviors, which may lead to potential conflicts. Remediating norm violations requires social awareness and cultural sensitivity of the nuances at play. To equip interactive AI systems with a remediation ability, we offer ReNoVi - a large-scale corpus of 9,258 multi-turn dialogues annotated with social norms, as well as…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: work in progress. 15 pages, 7 figures

  39. arXiv:2402.10805  [pdf, other]

    cs.MM cs.AI cs.CL cs.CV cs.IR

    Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond

    Authors: Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua

    Abstract: The recent advancements in generative language models have demonstrated their ability to memorize knowledge from documents and recall knowledge to respond to user queries effectively. Building upon this capability, we propose to enable multimodal large language models (MLLMs) to memorize and recall images within their parameters. Given a user query for visual content, the MLLM is anticipated to "r…

    Submitted 16 February, 2024; originally announced February 2024.

  40. arXiv:2402.09522  [pdf, other]

    hep-th cond-mat.stat-mech quant-ph

    Krylov complexity of density matrix operators

    Authors: Pawel Caputa, Hyun-Sik Jeong, Sinong Liu, Juan F. Pedraza, Le-Chen Qu

    Abstract: Quantifying complexity in quantum systems has witnessed a surge of interest in recent years, with Krylov-based measures such as Krylov complexity ($C_K$) and Spread complexity ($C_S$) gaining prominence. In this study, we investigate their interplay by considering the complexity of states represented by density matrix operators. After setting up the problem, we analyze a handful of analytical and…

    Submitted 6 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: v1: 41 pages, 10 figures; v2: references added; v3: matching the published version

    Report number: YITP-24-21, IFT-UAM/CSIC-24-25

    Journal ref: J. High Energ. Phys. 2024, 337 (2024)

  41. arXiv:2402.06194  [pdf, other]

    cs.DC

    SuperBench: Improving Cloud AI Infrastructure Reliability with Proactive Validation

    Authors: Yifan Xiong, Yuting Jiang, Ziyue Yang, Lei Qu, Guoshuai Zhao, Shuguang Liu, Dong Zhong, Boris Pinzur, Jie Zhang, Yang Wang, Jithin Jose, Hossein Pourreza, Jeff Baxter, Kushal Datta, Prabhat Ram, Luke Melton, Joe Chau, Peng Cheng, Yongqiang Xiong, Lidong Zhou

    Abstract: Reliability in cloud AI infrastructure is crucial for cloud service providers, prompting the widespread use of hardware redundancies. However, these redundancies can inadvertently lead to hidden degradation, so-called "gray failure", for AI workloads, significantly affecting end-to-end performance and concealing performance issues, which complicates root cause analysis for failures and regressions…

    Submitted 7 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: USENIX ATC '24

  42. arXiv:2402.01737  [pdf, other]

    cs.CL cs.AI

    Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues

    Authors: Yuncheng Hua, Lizhen Qu, Gholamreza Haffari

    Abstract: We develop assistive agents based on Large Language Models (LLMs) that aid interlocutors in business negotiations. Specifically, we simulate business negotiations by letting two LLM-based agents engage in role play. A third LLM acts as a remediator agent to rewrite utterances violating norms for improving negotiation outcomes. We introduce a simple tuning-free and label-free In-Context Learning (I…

    Submitted 18 June, 2024; v1 submitted 29 January, 2024; originally announced February 2024.

    Comments: 25 pages, 3 figures, 13 tables; Under review in EMNLP 2024

    ACM Class: I.2.7

  43. arXiv:2402.01736  [pdf, other]

    cs.CL cs.AI

    SADAS: A Dialogue Assistant System Towards Remediating Norm Violations in Bilingual Socio-Cultural Conversations

    Authors: Yuncheng Hua, Zhuang Li, Linhao Luo, Kadek Ananta Satriadi, Tao Feng, Haolan Zhan, Lizhen Qu, Suraj Sharma, Ingrid Zukerman, Zhaleh Semnani-Azad, Gholamreza Haffari

    Abstract: In today's globalized world, bridging the cultural divide is more critical than ever for forging meaningful connections. The Socially-Aware Dialogue Assistant System (SADAS) is our answer to this global challenge, and it's designed to ensure that conversations between individuals from diverse cultural backgrounds unfold with respect and understanding. Our system's novel architecture includes: (1)…

    Submitted 29 January, 2024; originally announced February 2024.

    Comments: 8 pages, 2 figures

    ACM Class: I.2.7

  44. arXiv:2402.01097  [pdf, other]

    cs.CL

    Let's Negotiate! A Survey of Negotiation Dialogue Systems

    Authors: Haolan Zhan, Yufei Wang, Tao Feng, Yuncheng Hua, Suraj Sharma, Zhuang Li, Lizhen Qu, Zhaleh Semnani Azad, Ingrid Zukerman, Gholamreza Haffari

    Abstract: Negotiation is a crucial ability in human communication. Recently, there has been a resurgent research interest in negotiation dialogue systems, whose goal is to create intelligent agents that can assist people in resolving conflicts or reaching agreements. Although there have been many explorations into negotiation dialogue systems, a systematic review of this task has not been performed to date.…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted by EACL 2024 (findings). arXiv admin note: substantial text overlap with arXiv:2212.09072

  45. arXiv:2401.17630  [pdf, other]

    cs.IR

    Towards Personalized Privacy: User-Governed Data Contribution for Federated Recommendation

    Authors: Liang Qu, Wei Yuan, Ruiqi Zheng, Lizhen Cui, Yuhui Shi, Hongzhi Yin

    Abstract: Federated recommender systems (FedRecs) have gained significant attention for their potential to protect user's privacy by keeping user privacy data locally and only communicating model parameters/gradients to the server. Nevertheless, the currently existing architecture of FedRecs assumes that all users have the same 0-privacy budget, i.e., they do not upload any data to the server, thus overlook…

    Submitted 31 January, 2024; originally announced January 2024.

  46. arXiv:2401.17094  [pdf, ps, other]

    math.CO cs.IT

    Constructing rotatable permutations of $\mathbb{F}_{2^m}^3$ with $3$-homogeneous functions

    Authors: Yunwen Chi, Kangquan Li, Longjiang Qu

    Abstract: In the literature, there are many results about permutation polynomials over finite fields. However, very few permutations of vector spaces are constructed although it has been shown that permutations of vector spaces have many applications in cryptography, especially in constructing permutations with low differential and boomerang uniformities. In this paper, motivated by the butterfly structur…

    Submitted 30 January, 2024; originally announced January 2024.

  47. arXiv:2401.15992  [pdf, other]

    astro-ph.HE

    Pulsed Iron line Emission from the First Galactic Ultraluminous X-ray Pulsar Swift J0243.6+6124

    Authors: Y. X. Xiao, Y. J. Xu, M. Y. Ge, F. J. Lu, S. N. Zhang, S. Zhang, L. Tao, J. L. Qu, P. J. Wang, L. D. Kong, Y. L. Tuo, Y. You, S. J. Zhao, J. Q. Peng, Y. F. Du, Y. H. Zhang, W. T. Ye

    Abstract: We report the phase-resolved spectral results of the first Galactic Pulsating Ultra-Luminous X-ray source (PULX) Swift J0243.6+6124, modeling at its 2017-2018 outburst peak using data collected by the Hard X-ray Modulation Telescope (Insight-HXMT). The broad energy coverage of Insight-HXMT allows us to obtain more accurate spectral continuum to reduce the coupling of broad iron line profiles with…

    Submitted 29 January, 2024; originally announced January 2024.

  48. arXiv:2401.15613  [pdf, other]

    eess.IV cs.CV

    Towards Arbitrary-Scale Histopathology Image Super-resolution: An Efficient Dual-branch Framework via Implicit Self-texture Enhancement

    Authors: Minghong Duan, Linhao Qu, Zhiwei Yang, Manning Wang, Chenxi Zhang, Zhijian Song

    Abstract: High-quality whole-slide scanners are expensive, complex, and time-consuming, thus limiting the acquisition and utilization of high-resolution pathology whole-slide images in daily clinical work. Deep learning-based single-image super-resolution techniques are an effective way to solve this problem by synthesizing high-resolution images from low-resolution ones. However, the existing super-resolut…

    Submitted 11 July, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  49. arXiv:2401.15360  [pdf, other]

    cs.CL

    Importance-Aware Data Augmentation for Document-Level Neural Machine Translation

    Authors: Minghao Wu, Yufei Wang, George Foster, Lizhen Qu, Gholamreza Haffari

    Abstract: Document-level neural machine translation (DocNMT) aims to generate translations that are both coherent and cohesive, in contrast to its sentence-level counterpart. However, due to its longer input length and limited availability of training data, DocNMT often faces the challenge of data sparsity. To overcome this issue, we propose a novel Importance-Aware Data Augmentation (IADA) algorithm for Do…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 13 pages, 4 figures, 7 tables, accepted by EACL2024 main conference

  50. arXiv:2401.13448  [pdf, other]

    cs.IR

    Decentralized Collaborative Learning with Adaptive Reference Data for On-Device POI Recommendation

    Authors: Ruiqi Zheng, Liang Qu, Tong Chen, Lizhen Cui, Yuhui Shi, Hongzhi Yin

    Abstract: In Location-based Social Networks, Point-of-Interest (POI) recommendation helps users discover interesting places. There is a trend to move from the cloud-based model to on-device recommendations for privacy protection and reduced server reliance. Due to the scarcity of local user-item interactions on individual devices, solely relying on local instances is not adequate. Collaborative Learning (CL…

    Submitted 24 January, 2024; v1 submitted 24 January, 2024; originally announced January 2024.