Showing 1–7 of 7 results for author: Zan, C

  1. arXiv:2403.14399  [pdf, other]

    cs.CL cs.AI

    Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning

    Authors: Changtong Zan, Liang Ding, Li Shen, Yibing Zhan, Weifeng Liu, Dacheng Tao

    Abstract: Translation-tailored large language models (LLMs) exhibit remarkable translation capabilities, even competing with supervised-trained commercial translation systems. However, off-target translation remains an unsolved problem, especially for low-resource languages, hindering us from developing accurate LLM-based translation models. To mitigate the off-target translation problem and enhance the pe…

    Submitted 21 March, 2024; originally announced March 2024.

  2. arXiv:2309.16599  [pdf, other]

    cs.CL

    Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation

    Authors: Changtong Zan, Liang Ding, Li Shen, Yibin Lei, Yibing Zhan, Weifeng Liu, Dacheng Tao

    Abstract: Zero-shot translation (ZST), which is generally based on a multilingual neural machine translation model, aims to translate between unseen language pairs in training data. The common practice to guide the zero-shot language mapping during inference is to deliberately insert the source and target language IDs, e.g., <EN> for English and <DE> for German. Recent studies have shown that language IDs s…

    Submitted 28 September, 2023; originally announced September 2023.

  3. arXiv:2306.03166  [pdf, other]

    cs.IR cs.CL

    Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training

    Authors: Yibin Lei, Liang Ding, Yu Cao, Changtong Zan, Andrew Yates, Dacheng Tao

    Abstract: Dense retrievers have achieved impressive performance, but their demand for abundant training data limits their application scenarios. Contrastive pre-training, which constructs pseudo-positive examples from unlabeled data, has shown great potential to solve this problem. However, the pseudo-positive examples crafted by data augmentations can be irrelevant. To this end, we propose relevance-aware…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Findings (Short), 5 pages main + 1 page references + 1 page appendix

  4. arXiv:2304.10354  [pdf, other]

    cs.CL

    Prompt-Learning for Cross-Lingual Relation Extraction

    Authors: Chiaming Hsu, Changtong Zan, Liang Ding, Longyue Wang, Xiaoting Wang, Weifeng Liu, Fu Lin, Wenbin Hu

    Abstract: Relation Extraction (RE) is a crucial task in Information Extraction, which entails predicting relationships between entities within a given sentence. However, extending pre-trained RE models to other languages is challenging, particularly in real-world scenarios where Cross-Lingual Relation Extraction (XRE) is required. Despite recent advancements in Prompt-Learning, which involves transferring k…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: IJCNN 2023

  5. arXiv:2209.09444  [pdf, other]

    cs.CL

    Vega-MT: The JD Explore Academy Translation System for WMT22

    Authors: Changtong Zan, Keqin Peng, Liang Ding, Baopu Qiu, Boan Liu, Shwai He, Qingyu Lu, Zheng Zhang, Chuang Liu, Weifeng Liu, Yibing Zhan, Dacheng Tao

    Abstract: We describe the JD Explore Academy's submission to the WMT 2022 shared general translation task. We participated in all high-resource tracks and one medium-resource track, including Chinese-English, German-English, Czech-English, Russian-English, and Japanese-English. We push the limit of our previous work, bidirectional training for translation, by scaling up two main factors, i.e. language pair…

    Submitted 6 May, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: WMT 2022 (Among all constrained systems, Vega-MT won 7 champions, 2 runners-up and 1 third place w.r.t. sacreBLEU, and won 8 champions and 2 runners-up w.r.t. COMET.)

  6. arXiv:2209.03316  [pdf, other]

    cs.CL

    On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation

    Authors: Changtong Zan, Liang Ding, Li Shen, Yu Cao, Weifeng Liu, Dacheng Tao

    Abstract: Pre-Training (PT) of text representations has been successfully applied to low-resource Neural Machine Translation (NMT). However, it usually fails to achieve notable gains (and is sometimes even worse) on resource-rich NMT compared with its Random-Initialization (RI) counterpart. We take the first step to investigate the complementarity between PT and RI in resource-rich scenarios via two probing analyse…

    Submitted 17 October, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: COLING 2022

  7. arXiv:2204.07834  [pdf, other]

    cs.CL

    Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation and Understanding

    Authors: Changtong Zan, Liang Ding, Li Shen, Yu Cao, Weifeng Liu, Dacheng Tao

    Abstract: For multilingual sequence-to-sequence pretrained language models (multilingual Seq2Seq PLMs), e.g. mBART, the self-supervised pretraining task is trained on monolingual data from a wide range of languages, e.g. 25 languages from CommonCrawl, while the downstream cross-lingual tasks generally operate on a bilingual language subset, e.g. English-German, which creates a data discrepancy, namely domain…

    Submitted 21 September, 2022; v1 submitted 16 April, 2022; originally announced April 2022.