Showing 1–30 of 30 results for author: Luan, Z

  1. arXiv:2406.19633  [pdf, other]

    cs.SE

    Combating Missed Recalls in E-commerce Search: A CoT-Prompting Testing Approach

    Authors: Shengnan Wu, Yongxiang Hu, Yingchuan Wang, Jiazhen Gu, Jin Meng, Liujie Fan, Zhongshi Luan, Xin Wang, Yangfan Zhou

    Abstract: Search components in e-commerce apps, often complex AI-based systems, are prone to bugs that can lead to missed recalls - situations where items that should be listed in search results aren't. This can frustrate shop owners and harm the app's profitability. However, testing for missed recalls is challenging due to difficulties in generating user-aligned test cases and the absence of oracles. In th…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (FSE Companion '24), July 15--19, 2024, Porto de Galinhas, Brazil

  2. arXiv:2406.07925  [pdf, other]

    cs.DC

    FDLoRA: Personalized Federated Learning of Large Language Model via Dual LoRA Tuning

    Authors: Jiaxing Qi, Zhongzhi Luan, Shaohan Huang, Carol Fung, Hailong Yang, Depei Qian

    Abstract: Large language models (LLMs) have emerged as important components across various fields, yet their training requires substantial computation resources and abundant labeled data. This poses a challenge for robustly training LLMs for individual users (clients). To tackle this challenge, the intuitive idea is to introduce federated learning (FL), which can collaboratively train models on distributed pri…

    Submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2406.03046  [pdf, other]

    cs.NE

    When Spiking neural networks meet temporal attention image decoding and adaptive spiking neuron

    Authors: Xuerui Qiu, Zheng Luan, Zhaorui Wang, Rui-Jie Zhu

    Abstract: Spiking Neural Networks (SNNs) are capable of encoding and processing temporal information in a biologically plausible way. However, most existing SNN-based methods for image tasks do not fully exploit this feature. Moreover, they often overlook the role of adaptive threshold in spiking neurons, which can enhance their dynamic behavior and learning ability. To address these issues, we propose a no…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by ICLR23 tiny paper

  4. arXiv:2404.19286  [pdf, other]

    cs.CV

    Soft Prompt Generation for Domain Generalization

    Authors: Shuanghao Bai, Yuedi Zhang, Wanqi Zhou, Zhirong Luan, Badong Chen

    Abstract: Large pre-trained vision language models (VLMs) have shown impressive zero-shot ability on downstream tasks with manually designed prompt. To further adapt VLMs to downstream tasks, soft prompt is proposed to replace manually designed prompt, which undergoes fine-tuning based on specific domain data. Prior prompt learning methods primarily learn a fixed prompt or residuled prompt from training sam…

    Submitted 12 July, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 25 pages, 4 figures, accepted by ECCV 2024

  5. Multiple single-photon generations in three-level atoms coupled to cavity with non-Markovian effects

    Authors: H. Z. Shen, Y. Chen, T. Z. Luan, X. X. Yi

    Abstract: In this paper, we show how to generate the multiple single-photon wavepackets of arbitrary temporal shape from an optical cavity coupled with $N$ three-level atoms driven by a driving field in the non-Markovian regime. We derive an exact analytical expression of the optimal driving field for generating such wavepackets, which depends on two detunings of the cavity and driving field with respect to…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 20 pages, 15 figures

    Journal ref: Phys. Rev. A 107, 053705 (2023)

  6. Shortcuts to adiabaticity with general two-level non-Hermitian systems

    Authors: T. Z. Luan, H. Z. Shen, X. X. Yi

    Abstract: Shortcuts to adiabaticity are alternative fast processes which reproduce the same final state as the adiabatic process in a finite or even shorter time, which have been extended from Hermitian systems to non-Hermitian systems in recent years, but they are barely explored for general non-Hermitian systems where off-diagonal elements of the Hamiltonian are not Hermitian. In this paper, we propose a…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 19 pages, 8 figures

    Journal ref: Phys. Rev. A 105, 013714 (2022)

  7. arXiv:2404.03226  [pdf, other]

    cs.DC

    INSPIRIT: Optimizing Heterogeneous Task Scheduling through Adaptive Priority in Task-based Runtime Systems

    Authors: Yiqing Wang, Xiaoyan Liu, Hailong Yang, Xinyu Yang, Pengbo Wang, Yi Liu, Zhongzhi Luan, Depei Qian

    Abstract: As modern HPC computing platforms become increasingly heterogeneous, it is challenging for programmers to fully leverage the computation power of massive parallelism offered by such heterogeneity. Consequently, task-based runtime systems have been proposed as an intermediate layer to hide the complex heterogeneity from the application programmers. The core functionality of these systems is to real…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 11 pages

  8. arXiv:2403.18320  [pdf, other]

    math.OC

    Online Prediction for Streaming Tensor Time Series

    Authors: Zhenting Luan, Haoning Wang, Liping Zhang, Shansuo Liang, Wei Han

    Abstract: Real-time prediction plays a vital role in various control systems, such as traffic congestion control and wireless channel resource allocation. In these scenarios, the predictor usually needs to track the evolution of the latent statistical patterns in the modern high-dimensional streaming time series continuously and quickly, which presents new challenges for traditional prediction methods. This…

    Submitted 27 March, 2024; originally announced March 2024.

  9. arXiv:2402.15678  [pdf, other]

    cs.DC

    Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding

    Authors: Siqi Wang, Hailong Yang, Xuezhu Wang, Tongxuan Liu, Pengbo Wang, Xuning Liang, Kejie Ma, Tianyu Feng, Xin You, Yongjun Bao, Yi Liu, Zhongzhi Luan, Depei Qian

    Abstract: Large language models (LLMs) have recently attracted surging interest due to their outstanding capabilities across various domains. However, enabling efficient LLM inference is challenging due to its autoregressive decoding that generates tokens only one at a time. Although research works apply pruning or quantization to speed up LLM inference, they typically require fine-tuning the LLM, incurring…

    Submitted 23 February, 2024; originally announced February 2024.

  10. arXiv:2402.03703  [pdf]

    cs.RO

    Hierarchical Large Language Models in Cloud Edge End Architecture for Heterogeneous Robot Cluster Control

    Authors: Zhirong Luan, Yujun Lai, Rundong Huang, Yan Yan, Jingwei Wang, Jizhou Lu, Badong Chen

    Abstract: Despite their powerful semantic understanding and code generation capabilities, Large Language Models (LLMs) still face challenges when dealing with complex tasks. Multi-agent strategy generation and motion control are highly complex domains that inherently require experts from multiple fields to collaborate. To enhance multi-agent strategy generation and motion control, we propose an innovative a…

    Submitted 16 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  11. arXiv:2402.03699  [pdf]

    cs.RO cs.CV

    Automatic Robotic Development through Collaborative Framework by Large Language Models

    Authors: Zhirong Luan, Yujun Lai, Rundong Huang, Xiaruiqi Lan, Liangjun Chen, Badong Chen

    Abstract: Despite the remarkable code generation abilities of large language models (LLMs), they still face challenges in complex task handling. Robot development, a highly intricate field, inherently demands human involvement in task allocation and collaborative teamwork. To enhance robot development, we propose an innovative automated collaboration framework inspired by real-world robot developers. This fr…

    Submitted 16 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  12. arXiv:2312.09589  [pdf, other]

    cs.CV

    Improving Cross-domain Few-shot Classification with Multilayer Perceptron

    Authors: Shuanghao Bai, Wanqi Zhou, Zhirong Luan, Donglin Wang, Badong Chen

    Abstract: Cross-domain few-shot classification (CDFSC) is a challenging and tough task due to the significant distribution discrepancies across different domains. To address this challenge, many approaches aim to learn transferable representations. Multilayer perceptron (MLP) has shown its capability to learn transferable representations in various downstream tasks, such as unsupervised image classification…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 5 pages, 4 figures

  13. arXiv:2312.09553  [pdf, other]

    cs.CV

    Prompt-based Distribution Alignment for Unsupervised Domain Adaptation

    Authors: Shuanghao Bai, Min Zhang, Wanqi Zhou, Siteng Huang, Zhirong Luan, Donglin Wang, Badong Chen

    Abstract: Recently, despite the unprecedented success of large pre-trained visual-language models (VLMs) on a wide range of downstream tasks, the real-world unsupervised domain adaptation (UDA) problem is still not well explored. Therefore, in this paper, we first experimentally demonstrate that the unsupervised-trained VLMs can significantly reduce the distribution discrepancy between source and target dom…

    Submitted 26 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 13 pages, 6 figures

  14. arXiv:2309.01189  [pdf, other]

    cs.LG cs.AI cs.SE

    LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection

    Authors: Jiaxing Qi, Shaohan Huang, Zhongzhi Luan, Carol Fung, Hailong Yang, Depei Qian

    Abstract: The increasing volume of log data produced by software-intensive systems makes it impractical to analyze them manually. Many deep learning-based methods have been proposed for log-based anomaly detection. These methods face several challenges such as high-dimensional and noisy log data, class imbalance, generalization, and model interpretability. Recently, ChatGPT has shown promising results in va…

    Submitted 3 September, 2023; originally announced September 2023.

  15. arXiv:2307.16645  [pdf, other]

    cs.CL

    Scaling Sentence Embeddings with Large Language Models

    Authors: Ting Jiang, Shaohan Huang, Zhongzhi Luan, Deqing Wang, Fuzhen Zhuang

    Abstract: Large language models (LLMs) have recently garnered significant interest. With in-context learning, LLMs achieve impressive results in various natural language tasks. However, the application of LLMs to sentence embeddings remains an area of ongoing research. In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance. Our approach involves adapt…

    Submitted 31 July, 2023; originally announced July 2023.

  16. arXiv:2303.11715  [pdf, other]

    cs.NI

    LogQA: Question Answering in Unstructured Logs

    Authors: Shaohan Huang, Yi Liu, Carol Fung, Jiaxing Qi, Hailong Yang, Zhongzhi Luan

    Abstract: Modern systems produce a large volume of logs to record run-time status and events. System operators use these raw logs to track a system in order to obtain some useful information to diagnose system anomalies. One of the most important problems in this area is to help operators find the answers to log-based questions efficiently and in a user-friendly way. In this work, we propose LogQA, which aims at ans…

    Submitted 21 March, 2023; originally announced March 2023.

  17. arXiv:2209.02478  [pdf, other]

    cs.DC

    Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU

    Authors: Jianjin Liao, Mingzhen Li, Qingxiao Sun, Jiwei Hao, Fengwei Yu, Shengdong Chen, Ye Tao, Zicheng Zhang, Hailong Yang, Zhongzhi Luan, Depei Qian

    Abstract: Larger deep learning models usually lead to higher model quality with an ever-increasing GPU memory footprint. Although tensor checkpointing techniques have been proposed to enable training under a restricted GPU memory budget, the input tensor dynamics have been unexploited for optimizing performance while reducing GPU memory footprint. Specifically, due to the diverse datasets and subsequent dat…

    Submitted 6 September, 2022; originally announced September 2022.

  18. arXiv:2208.14228  [pdf, other]

    cs.DC

    EasyScale: Accuracy-consistent Elastic Training for Deep Learning

    Authors: Mingzhen Li, Wencong Xiao, Biao Sun, Hanyu Zhao, Hailong Yang, Shiru Ren, Zhongzhi Luan, Xianyan Jia, Yi Liu, Yong Li, Wei Lin, Depei Qian

    Abstract: Distributed synchronized GPU training is commonly used for deep learning. The resource constraint of using a fixed number of GPUs makes large-scale training jobs suffer from long queuing time for resource allocation, and lowers the cluster utilization. Adapting to resource elasticity can alleviate this but often introduces inconsistent model accuracy, due to the lack of capability to decouple model…

    Submitted 6 November, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: To appear at SC'23. Link: https://sc23.supercomputing.org/presentation/?id=pap262&sess=sess168

  19. arXiv:2201.00194  [pdf, other]

    cs.LG cs.DC cs.PL

    FamilySeer: Towards Optimized Tensor Codes by Exploiting Computation Subgraph Similarity

    Authors: Shanjun Zhang, Mingzhen Li, Hailong Yang, Yi Liu, Zhongzhi Luan, Depei Qian

    Abstract: Deploying various deep learning (DL) models efficiently has boosted the research on DL compilers. The difficulty of generating optimized tensor codes drives DL compilers to ask for auto-tuning approaches, and the increasing demands require increasing auto-tuning efficiency and quality. Currently, the DL compilers partition the input DL models into several subgraphs and leverage the auto-tuning…

    Submitted 1 January, 2022; originally announced January 2022.

  20. arXiv:2112.02629  [pdf, other]

    eess.SP math.OC

    A Tensor-BTD-based Modulation for Massive Unsourced Random Access

    Authors: Zhenting Luan, Yuchi Wu, Shansuo Liang, Liping Zhang, Wei Han, Bo Bai

    Abstract: In this letter, we propose a novel tensor-based modulation scheme for massive unsourced random access. The proposed modulation can be deemed as a summation of third-order tensors, of which the factors are representatives of subspaces. A constellation design based on high-dimensional Grassmann manifold is presented for information encoding. The uniqueness of tensor decomposition provides theoretica…

    Submitted 5 December, 2021; originally announced December 2021.

  21. arXiv:2111.14780  [pdf, ps, other]

    eess.SP math.OC

    Harmonic Retrieval with $L_1$-Tucker Tensor Decomposition

    Authors: Zhenting Luan, Zhenyu Ming, Yuchi Wu, Wei Han, Xiang Chen, Bo Bai, Liping Zhang

    Abstract: Harmonic retrieval (HR) has a wide range of applications in the scenes where signals are modelled as a summation of sinusoids. Past works have developed a number of approaches to recover the original signals. Most of them rely on classical singular value decomposition, which is vulnerable to unexpected outliers. In this paper, we present new decomposition algorithms of third-order complex-valued…

    Submitted 30 November, 2021; v1 submitted 29 November, 2021; originally announced November 2021.

  22. Accelerating Sparse Approximate Matrix Multiplication on GPUs

    Authors: Xiaoyan Liu, Yi Liu, Ming Dun, Bohong Yin, Hailong Yang, Zhongzhi Luan, Depei Qian

    Abstract: Although the matrix multiplication plays a vital role in computational linear algebra, there are few efficient solutions for matrix multiplication of the near-sparse matrices. The Sparse Approximate Matrix Multiply (SpAMM) is one of the algorithms to fill the performance gap neglected by traditional optimizations for dense/sparse matrix multiplication. However, existing SpAMM algorithms fail to ex…

    Submitted 24 March, 2021; originally announced March 2021.

  23. The Deep Learning Compiler: A Comprehensive Survey

    Authors: Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian

    Abstract: The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed from both industry and academia such as Tensorflow XLA and TVM. Similarly, the DL compilers take the DL models described in different DL frameworks as input, and then generate optimized codes for dive…

    Submitted 28 August, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Journal ref: IEEE Transactions on Parallel & Distributed Systems, vol. 32, no. 03, pp. 708-727, 2021

  24. arXiv:2001.00493  [pdf, other]

    cs.CR cs.LG

    Privacy for Rescue: A New Testimony Why Privacy is Vulnerable In Deep Models

    Authors: Ruiyuan Gao, Ming Dun, Hailong Yang, Zhongzhi Luan, Depei Qian

    Abstract: The huge computation demand of deep learning models and limited computation resources on the edge devices call for the cooperation between edge device and cloud service by splitting the deep models into two halves. However, transferring the intermediate results from the partial models between edge device and cloud service makes the user privacy vulnerable since the attacker can intercept the int…

    Submitted 31 December, 2019; originally announced January 2020.

  25. arXiv:1910.13346  [pdf, other]

    cs.DC cs.PF cs.PL

    Intelligent-Unrolling: Exploiting Regular Patterns in Irregular Applications

    Authors: Changxi Liu, Hailong Yang, Xu Liu, Zhongzhi Luan, Depei Qian

    Abstract: Modern optimizing compilers are able to exploit memory access or computation patterns to generate vectorization codes. However, such patterns in irregular applications are unknown until runtime due to the input dependence. Thus, either compiler's static optimization or profile-guided optimization based on specific inputs cannot predict the patterns for any common input, which leads to suboptimal c…

    Submitted 24 October, 2019; originally announced October 2019.

  26. arXiv:1908.03695  [pdf, other]

    gr-qc astro-ph.CO hep-ph hep-th

    Towards degeneracy breaking of early universe models

    Authors: Ze Luan, Taotao Qiu

    Abstract: There are many possibilities of scenarios in the early universe, which can give rise to the same observational signals due to the degeneracy among each other, caused by equivalence under the conformal transformations. In order to break the degeneracy, in this paper we take into account the so-called "frame-invariant variables" proposed by A. Ijjas and P. J. Steinhardt in \cite{Ijjas:2015zma}. We…

    Submitted 10 August, 2019; originally announced August 2019.

    Comments: 13 pages, 6 figures, comments welcome

    Journal ref: Phys. Rev. D 101, 023517 (2020)

  27. arXiv:1907.11678  [pdf, other]

    cs.DC

    Massively Scaling Seismic Processing on Sunway TaihuLight Supercomputer

    Authors: Yongmin Hu, Hailong Yang, Zhongzhi Luan, Depei Qian

    Abstract: Common Midpoint (CMP) and Common Reflection Surface (CRS) are widely used methods for improving the signal-to-noise ratio in the field of seismic processing. These methods are computationally intensive and require high performance computing. This paper optimizes these methods on the Sunway many-core architecture and implements large-scale seismic processing on the Sunway TaihuLight supercomputer.…

    Submitted 4 August, 2019; v1 submitted 26 July, 2019; originally announced July 2019.

  28. arXiv:1905.11669  [pdf, other]

    cs.LG stat.ML

    CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks

    Authors: Weicheng Li, Rui Wang, Zhongzhi Luan, Di Huang, Zidong Du, Yunji Chen, Depei Qian

    Abstract: Convolutional Neural Network (CNN) based Deep Learning (DL) has achieved great progress in many real-life applications. Meanwhile, due to the complex model structures against strict latency and memory restriction, the implementation of CNN models on the resource-limited platforms is becoming more challenging. This work proposes a solution, called CompactNet\footnote{Project URL: \url{https://githu…

    Submitted 28 May, 2019; originally announced May 2019.

  29. arXiv:1904.07404  [pdf, other]

    cs.LG cs.PL stat.ML

    swTVM: Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor

    Authors: Mingzhen Li, Changxi Liu, Jianjin Liao, Xuegui Zheng, Hailong Yang, Rujun Sun, Jun Xu, Lin Gan, Guangwen Yang, Zhongzhi Luan, Depei Qian

    Abstract: The flourishing of deep learning frameworks and hardware platforms has been demanding an efficient compiler that can shield the diversity in both software and hardware in order to provide application portability. Among the existing deep learning compilers, TVM is well known for its efficiency in code generation and optimization across diverse hardware devices. Meanwhile, the Sunway many-core p…

    Submitted 11 July, 2022; v1 submitted 15 April, 2019; originally announced April 2019.

  30. arXiv:1712.03322  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Observation of spin-orbit magnetoresistance in metallic thin films on magnetic insulators

    Authors: Lifan Zhou, Hongkang Song, Kai Liu, Zhongzhi Luan, Peng Wang, Lei Sun, Shengwei Jiang, Hongjun Xiang, Yanbin Chen, Jun Du, Haifeng Ding, Ke Xia, Jiang Xiao, Di Wu

    Abstract: A magnetoresistance effect induced by the Rashba spin-orbit interaction was predicted, but not yet observed, in bilayers consisting of normal metal and ferromagnetic insulator. Here, we present an experimental observation of this new type of spin-orbit magnetoresistance (SOMR) effect in a bilayer structure Cu[Pt]/Y3Fe5O12 (YIG), where the Cu/YIG interface is decorated with nanosize Pt islands. Thi…

    Submitted 8 December, 2017; originally announced December 2017.

    Comments: 12 pages, 4 figures, 14 pages in supplementary. To appear in Science Advances

    Journal ref: Science Advances 4, eaao3318 (2018)