
Showing 1–42 of 42 results for author: Freitag, M

  1. arXiv:2406.02832  [pdf, other]

    cs.CL cs.LG

    Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms

    Authors: Firas Trabelsi, David Vilar, Mara Finkelstein, Markus Freitag

    Abstract: Minimum Bayes Risk (MBR) decoding is a powerful decoding strategy widely used for text generation tasks, but its quadratic computational complexity limits its practical application. This paper presents a novel approach for approximating MBR decoding using matrix completion techniques, focusing on the task of machine translation. We formulate MBR decoding as a matrix completion problem, where the u…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2405.11814  [pdf, other]

    cs.CV cs.AI cs.CY

    Climatic & Anthropogenic Hazards to the Nasca World Heritage: Application of Remote Sensing, AI, and Flood Modelling

    Authors: Masato Sakai, Marcus Freitag, Akihisa Sakurai, Conrad M Albrecht, Hendrik F Hamann

    Abstract: Preservation of the Nasca geoglyphs at the UNESCO World Heritage Site in Peru is urgent as natural and human impact accelerates. More frequent weather extremes such as flash floods threaten Nasca artifacts. We demonstrate that runoff models based on (sub-)meter scale, LiDAR-derived digital elevation data can highlight AI-detected geoglyphs that are in danger of erosion. We recommend measures of mit…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: accepted at IGARSS 2024

  3. arXiv:2404.16653  [pdf, other]

    cs.CL cs.AI

    Análise de ambiguidade linguística em modelos de linguagem de grande escala (LLMs) [Analysis of linguistic ambiguity in large language models (LLMs)]

    Authors: Lavínia de Carvalho Moraes, Irene Cristina Silvério, Rafael Alexandre Sousa Marques, Bianca de Castro Anaia, Dandara Freitas de Paula, Maria Carolina Schincariol de Faria, Iury Cleveston, Alana de Santana Correia, Raquel Meister Ko Freitag

    Abstract: Linguistic ambiguity continues to represent a significant challenge for natural language processing (NLP) systems, notwithstanding the advancements in architectures such as Transformers and BERT. Inspired by the recent success of instructional models like ChatGPT and Gemini (known as Bard in 2023), this study aims to analyze and discuss linguistic ambiguity within t…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: in Portuguese language, 16 pages, 5 pages of appendix, and 4 images

  4. arXiv:2404.01474  [pdf, other]

    cs.CL

    Finding Replicable Human Evaluations via Stable Ranking Probability

    Authors: Parker Riley, Daniel Deutsch, George Foster, Viresh Ratnakar, Ali Dabirmoghaddam, Markus Freitag

    Abstract: Reliable human evaluation is critical to the development of successful natural language generation models, but achieving it is notoriously difficult. Stability is a crucial requirement when ranking systems by quality: consistent ranking of systems across repeated evaluations is not just desirable, but essential. Without it, there is no reliable foundation for hill-climbing or product launch decisi…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: To appear at NAACL 2024

  5. arXiv:2311.09336  [pdf, other]

    cs.CL

    LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback

    Authors: Wenda Xu, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Biao Zhang, Zhongtao Liu, William Yang Wang, Lei Li, Markus Freitag

    Abstract: Recent large language models (LLM) are leveraging human feedback to improve their generation quality. However, human feedback is costly to obtain, especially during inference. In this work, we propose LLMRefine, an inference time optimization method to refine LLM's output. The core idea is to use a learned fine-grained feedback model to pinpoint defects and guide LLM to refine them iteratively. Us…

    Submitted 2 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024

  6. arXiv:2311.05350  [pdf, other]

    cs.CL

    There's no Data Like Better Data: Using QE Metrics for MT Data Filtering

    Authors: Jan-Thorsten Peter, David Vilar, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Markus Freitag

    Abstract: Quality Estimation (QE), the evaluation of machine translation output without the need of explicit references, has seen big improvements in the last years with the use of neural metrics. In this paper we analyze the viability of using QE metrics for filtering out bad quality sentence pairs in the training data of neural machine translation systems (NMT). While most corpus filtering methods are foc…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: to be published at WMT23

  7. arXiv:2310.06707  [pdf, other]

    cs.CL cs.AI

    Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model

    Authors: Christian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers

    Abstract: Maximum-a-posteriori (MAP) decoding is the most widely used decoding strategy for neural machine translation (NMT) models. The underlying assumption is that model probability correlates well with human judgment, with better translations getting assigned a higher score by the model. However, research has shown that this assumption does not always hold, and generation quality can be improved by deco…

    Submitted 11 July, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

  8. arXiv:2309.10966  [pdf, other]

    cs.CL

    MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods

    Authors: Mara Finkelstein, Subhajit Naskar, Mehdi Mirzazadeh, Apurva Shah, Markus Freitag

    Abstract: Recent research in decoding methods for Natural Language Generation (NLG) tasks has shown that MAP decoding is not optimal, because model probabilities do not always align with human preferences. Stronger decoding methods, including Quality Estimation (QE) reranking and Minimum Bayes' Risk (MBR) decoding, have since been proposed to mitigate the model-perplexity-vs-quality mismatch. While these de…

    Submitted 25 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

  9. arXiv:2309.02094  [pdf]

    cs.LG cs.AI cs.DB cs.IR

    TensorBank: Tensor Lakehouse for Foundation Model Training

    Authors: Romeo Kienzler, Leonardo Pondian Tizzei, Benedikt Blumenstiel, Zoltan Arnold Nagy, S. Karthik Mukkavilli, Johannes Schmude, Marcus Freitag, Michael Behrendt, Daniel Salles Civitarese, Naomi Simumba, Daiki Kimura, Hendrik Hamann

    Abstract: Storing and streaming high dimensional data for foundation model training became a critical requirement with the rise of foundation models beyond natural language. In this paper we introduce TensorBank, a petabyte scale tensor lakehouse capable of streaming tensors from Cloud Object Store (COS) to GPU memory at wire speed based on complex relational queries. We use Hierarchical Statistical Indices…

    Submitted 21 March, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

  10. arXiv:2308.13506  [pdf, other]

    cs.CL

    Training and Meta-Evaluating Machine Translation Evaluation Metrics at the Paragraph Level

    Authors: Daniel Deutsch, Juraj Juraska, Mara Finkelstein, Markus Freitag

    Abstract: As research on machine translation moves to translating text beyond the sentence level, it remains unclear how effective automatic evaluation metrics are at scoring longer translations. In this work, we first propose a method for creating paragraph-level data for training and meta-evaluating metrics from existing sentence-level data. Then, we use these new datasets to benchmark existing sentence-l…

    Submitted 28 August, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Removing extra "and" from author list

  11. arXiv:2308.07286  [pdf, other]

    cs.CL cs.LG

    The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation

    Authors: Patrick Fernandes, Daniel Deutsch, Mara Finkelstein, Parker Riley, André F. T. Martins, Graham Neubig, Ankush Garg, Jonathan H. Clark, Markus Freitag, Orhan Firat

    Abstract: Automatic evaluation of machine translation (MT) is a critical tool driving the rapid iterative development of MT systems. While considerable progress has been made on estimating a single scalar quality score, current metrics lack the informativeness of more detailed schemes that annotate individual errors, such as Multidimensional Quality Metrics (MQM). In this paper, we help fill this gap by pro…

    Submitted 14 August, 2023; originally announced August 2023.

    Comments: 19 pages

  12. arXiv:2305.14324  [pdf, other]

    cs.CL

    Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration

    Authors: Daniel Deutsch, George Foster, Markus Freitag

    Abstract: Kendall's tau is frequently used to meta-evaluate how well machine translation (MT) evaluation metrics score individual translations. Its focus on pairwise score comparisons is intuitive but raises the question of how ties should be handled, a gray area that has motivated different variants in the literature. We demonstrate that, in settings like modern MT meta-evaluation, existing variants have w…

    Submitted 17 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  13. arXiv:2305.14282  [pdf, other]

    cs.CL cs.AI

    INSTRUCTSCORE: Explainable Text Generation Evaluation with Finegrained Feedback

    Authors: Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei Li

    Abstract: Automatically evaluating the quality of language generation is critical. Although recent learned metrics show high correlation with human judgement, these metrics cannot explain their verdict or associate the scores with defects in generated text. To address this limitation, we present InstructScore, an explainable evaluation metric for text generation. By harnessing both explicit human instructi…

    Submitted 26 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP2023 Main Conference

  14. arXiv:2305.10403  [pdf, other]

    cs.CL cs.AI

    PaLM 2 Technical Report

    Authors: Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, et al. (103 additional authors not shown)

    Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on…

    Submitted 13 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  15. arXiv:2305.09860  [pdf, other]

    cs.CL cs.AI cs.LG

    Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation

    Authors: Markus Freitag, Behrooz Ghorbani, Patrick Fernandes

    Abstract: Recent advances in machine translation (MT) have shown that Minimum Bayes Risk (MBR) decoding can be a powerful alternative to beam search decoding, especially when combined with neural-based utility functions. However, the performance of MBR decoding depends heavily on how and how many candidates are sampled from the model. In this paper, we explore how different sampling approaches for generatin…

    Submitted 17 May, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

  16. arXiv:2302.09650  [pdf, other]

    cs.CL cs.LG

    Scaling Laws for Multilingual Neural Machine Translation

    Authors: Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia, Markus Freitag, Orhan Firat

    Abstract: In this work, we provide a large-scale empirical study of the scaling properties of multilingual neural machine translation models. We examine how increases in the model size affect the model performance and investigate the role of the training mixture composition on the scaling behavior. We find that changing the weightings of the individual language pairs in the training mixture only affect the…

    Submitted 19 February, 2023; originally announced February 2023.

    Comments: 19 pages, 20 figures

  17. arXiv:2211.09102  [pdf, other]

    cs.CL

    Prompting PaLM for Translation: Assessing Strategies and Performance

    Authors: David Vilar, Markus Freitag, Colin Cherry, Jiaming Luo, Viresh Ratnakar, George Foster

    Abstract: Large language models (LLMs) that have been trained on multilingual but not parallel text exhibit a remarkable ability to translate between languages. We probe this ability in an in-depth study of the pathways language model (PaLM), which has demonstrated the strongest machine translation (MT) performance among similarly-trained LLMs to date. We investigate various strategies for choosing translat…

    Submitted 25 June, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: ACL 2023

  18. arXiv:2210.03057  [pdf, other]

    cs.CL cs.AI cs.LG

    Language Models are Multilingual Chain-of-Thought Reasoners

    Authors: Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

    Abstract: We evaluate the reasoning abilities of large language models in multilingual settings. We introduce the Multilingual Grade School Math (MGSM) benchmark, by manually translating 250 grade-school math problems from the GSM8K dataset (Cobbe et al., 2021) into ten typologically diverse languages. We find that the ability to solve MGSM problems via chain-of-thought prompting emerges with increasing mod…

    Submitted 6 October, 2022; originally announced October 2022.

  19. arXiv:2205.02293  [pdf, other]

    cs.CL cs.AI cs.LG

    Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

    Authors: Jingwei Ni, Zhijing Jin, Markus Freitag, Mrinmaya Sachan, Bernhard Schölkopf

    Abstract: Human-translated text displays distinct features from naturally written text in the same language. This phenomenon, known as translationese, has been argued to confound the machine translation (MT) evaluation. Yet, we find that existing work on translationese neglects some important factors and the conclusions are mostly correlational but not causal. In this work, we collect CausalMT, a dataset whe…

    Submitted 8 June, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: NAACL 2022 (Oral)

  20. arXiv:2204.05307  [pdf, other]

    cs.CL cs.LG

    Toward More Effective Human Evaluation for Machine Translation

    Authors: Belén Saldías, George Foster, Markus Freitag, Qijun Tan

    Abstract: Improvements in text generation technologies such as machine translation have necessitated more costly and time-consuming human evaluation procedures to ensure an accurate signal. We investigate a simple way to reduce cost by reducing the number of text segments that must be annotated in order to accurately predict a score for a complete test set. Using a sampling approach, we demonstrate that inf…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: ACL 2022 Workshop on Human Evaluation of NLP Systems

  21. arXiv:2111.09388  [pdf, other]

    cs.CL cs.AI cs.LG

    High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metrics

    Authors: Markus Freitag, David Grangier, Qijun Tan, Bowen Liang

    Abstract: In Neural Machine Translation, it is typically assumed that the sentence with the highest estimated probability should also be the translation with the highest quality as measured by humans. In this work, we question this assumption and show that model estimates and translation quality only vaguely correlate. We apply Minimum Bayes Risk (MBR) decoding on unbiased samples to optimize diverse automa…

    Submitted 25 April, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: Accepted at TACL, presented at NAACL22

  22. arXiv:2109.07740  [pdf, other]

    cs.LG cs.AI cs.CL

    Scaling Laws for Neural Machine Translation

    Authors: Behrooz Ghorbani, Orhan Firat, Markus Freitag, Ankur Bapna, Maxim Krikun, Xavier Garcia, Ciprian Chelba, Colin Cherry

    Abstract: We present an empirical study of scaling properties of encoder-decoder Transformer models used in neural machine translation (NMT). We show that cross-entropy loss as a function of model size follows a certain scaling law. Specifically (i) We propose a formula which describes the scaling behavior of cross-entropy loss as a bivariate function of encoder and decoder size, and show that it gives accu…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: 31 pages, 23 figures

  23. arXiv:2107.04512  [pdf, other]

    cs.CL cs.LG

    Using Machine Translation to Localize Task Oriented NLG Output

    Authors: Scott Roy, Cliff Brunk, Kyu-Young Kim, Justin Zhao, Markus Freitag, Mihir Kale, Gagan Bansal, Sidharth Mudgal, Chris Varano

    Abstract: One of the challenges in a task oriented natural language application like the Google Assistant, Siri, or Alexa is to localize the output to many languages. This paper explores doing this by applying machine translation to the English output. Using machine translation is very scalable, as it can work with any English output and can handle dynamic text, but otherwise the problem is a poor fit. The…

    Submitted 9 July, 2021; originally announced July 2021.

    Comments: 12 pages, 10 figures

  24. arXiv:2106.15818  [pdf, other]

    cs.CL

    On Systematic Style Differences between Unsupervised and Supervised MT and an Application for High-Resource Machine Translation

    Authors: Kelly Marchisio, Markus Freitag, David Grangier

    Abstract: Modern unsupervised machine translation (MT) systems reach reasonable translation quality under clean and controlled data conditions. As the performance gap between supervised and unsupervised MT narrows, it is interesting to ask whether the different training methods result in systematically different output beyond what is visible via quality metrics like adequacy or BLEU. We compare translations…

    Submitted 13 April, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: NAACL 2022 Camera-Ready. Tiny text changes to deal with compiler differences between arxiv and Overleaf

  25. arXiv:2104.14478  [pdf, other]

    cs.CL cs.AI cs.LG

    Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

    Authors: Markus Freitag, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, Wolfgang Macherey

    Abstract: Human evaluation of modern high-quality machine translation systems is a difficult problem, and there is increasing evidence that inadequate evaluation procedures can lead to erroneous conclusions. While there has been considerable research on human evaluation, the field still lacks a commonly-accepted standard procedure. As a step toward this goal, we propose an evaluation methodology grounded in…

    Submitted 29 April, 2021; originally announced April 2021.

  26. arXiv:2104.05146  [pdf, other]

    cs.CL

    Assessing Reference-Free Peer Evaluation for Machine Translation

    Authors: Sweta Agrawal, George Foster, Markus Freitag, Colin Cherry

    Abstract: Reference-free evaluation has the potential to make machine translation evaluation substantially more scalable, allowing us to pivot easily to new languages or domains. It has been recently shown that the probabilities given by a large, multilingual model can achieve state of the art results when used as a reference-free metric. We experiment with various modifications to this model and demonstrat…

    Submitted 11 April, 2021; originally announced April 2021.

    Comments: NAACL 2021

  27. arXiv:2010.10245  [pdf, other]

    cs.CL cs.LG

    Human-Paraphrased References Improve Neural Machine Translation

    Authors: Markus Freitag, George Foster, David Grangier, Colin Cherry

    Abstract: Automatic evaluation comparing candidate translations to human-generated paraphrases of reference translations has recently been proposed by Freitag et al. When used in place of original references, the paraphrased versions produce metric scores that correlate better with human judgment. This effect holds for a variety of different automatic metrics, and tends to favor natural formulations over mo…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted at WMT 2020

  28. arXiv:2010.10239  [pdf, other]

    cs.CL cs.LG

    Complete Multilingual Neural Machine Translation

    Authors: Markus Freitag, Orhan Firat

    Abstract: Multilingual Neural Machine Translation (MNMT) models are commonly trained on a joint set of bilingual corpora which is acutely English-centric (i.e. English either as the source or target language). While direct data between two languages that are non-English is explicitly available at times, its use is not common. In this paper, we first take a step back and look at the commonly used bilingual c…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted at WMT 2020

  29. arXiv:2010.04297  [pdf, other]

    cs.CL

    Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task

    Authors: Thibault Sellam, Amy Pu, Hyung Won Chung, Sebastian Gehrmann, Qijun Tan, Markus Freitag, Dipanjan Das, Ankur P. Parikh

    Abstract: The quality of machine translation systems has dramatically improved over the last decade, and as a result, evaluation has become an increasingly challenging problem. This paper describes our contribution to the WMT 2020 Metrics Shared Task, the main benchmark for automatic evaluation of translation. We make several submissions based on BLEURT, a previously published metric based on transfer learn…

    Submitted 19 October, 2020; v1 submitted 8 October, 2020; originally announced October 2020.

  30. arXiv:2009.11027  [pdf, other]

    cs.CL

    KoBE: Knowledge-Based Machine Translation Evaluation

    Authors: Zorik Gekhman, Roee Aharoni, Genady Beryozkin, Markus Freitag, Wolfgang Macherey

    Abstract: We propose a simple and effective method for machine translation evaluation which does not require reference translations. Our approach is based on (1) grounding the entity mentions found in each source sentence and candidate translation against a large-scale multilingual knowledge base, and (2) measuring the recall of the grounded entities found in the candidate vs. those found in the source. Our…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Accepted as a short paper in Findings of EMNLP 2020

  31. arXiv:2004.06063  [pdf, other]

    cs.CL cs.AI cs.LG

    BLEU might be Guilty but References are not Innocent

    Authors: Markus Freitag, David Grangier, Isaac Caswell

    Abstract: The quality of automatic metrics for machine translation has been increasingly called into question, especially for high-quality systems. This paper demonstrates that, while choice of metric is important, the nature of the references is also critical. We study different methods to collect references and compare their value in automated evaluation by reporting correlation with human evaluation for…

    Submitted 20 October, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: Accepted at EMNLP 2020

  32. Learning and Recognizing Archeological Features from LiDAR Data

    Authors: Conrad M Albrecht, Chris Fisher, Marcus Freitag, Hendrik F Hamann, Sharathchandra Pankanti, Florencia Pezzutti, Francesca Rossi

    Abstract: We present a remote sensing pipeline that processes LiDAR (Light Detection And Ranging) data through machine & deep learning for the application of archeological feature detection on big geo-spatial data platforms such as IBM PAIRS Geoscope. Today, archeologists get overwhelmed by the task of visually surveying huge amounts of (raw) LiDAR data in order to identify areas of interest for insp…

    Submitted 5 April, 2020; originally announced April 2020.

    Journal ref: 2019 IEEE International Conference on Big Data (Big Data)

  33. Event Clustering & Event Series Characterization on Expected Frequency

    Authors: Conrad M Albrecht, Marcus Freitag, Theodore G van Kessel, Siyuan Lu, Hendrik F Hamann

    Abstract: We present an efficient clustering algorithm applicable to one-dimensional data such as a series of timestamps. Given an expected frequency $\Delta T^{-1}$, we introduce an $\mathcal{O}(N)$-efficient method of characterizing $N$ events represented by an ordered series of timestamps $t_1,t_2,\dots,t_N$. In practice, the method proves useful, e.g., to identify time intervals of "missing" data or to loc…

    Submitted 5 April, 2020; originally announced April 2020.

    Journal ref: 2017 IEEE International Conference on Big Data (Big Data)

  34. arXiv:1911.03823  [pdf, other]

    cs.CL

    Translationese as a Language in "Multilingual" NMT

    Authors: Parker Riley, Isaac Caswell, Markus Freitag, David Grangier

    Abstract: Machine translation has an undesirable propensity to produce "translationese" artifacts, which can lead to higher BLEU scores while being liked less by human raters. Motivated by this, we model translationese and original (i.e. natural) text as separate languages in a multilingual model, and pose the question: can we perform zero-shot translation between original source text and original target te…

    Submitted 9 July, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

    Journal ref: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020) 7737-7746

  35. arXiv:1904.04790  [pdf, other]

    cs.CL

    APE at Scale and its Implications on MT Evaluation Biases

    Authors: Markus Freitag, Isaac Caswell, Scott Roy

    Abstract: In this work, we train an Automatic Post-Editing (APE) model and use it to reveal biases in standard Machine Translation (MT) evaluation procedures. The goal of our APE model is to correct typical errors introduced by the translation process, and convert the "translationese" output into natural text. Our APE model is trained entirely on monolingual data that has been round-trip translated through…

    Submitted 14 June, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

    Comments: Accepted at WMT 2019

  36. arXiv:1804.07899  [pdf, other]

    cs.CL

    Unsupervised Natural Language Generation with Denoising Autoencoders

    Authors: Markus Freitag, Scott Roy

    Abstract: Generating text from structured data is important for various tasks such as question answering and dialog systems. We show that in at least one domain, without any supervision and only based on unlabeled text, we are able to build a Natural Language Generation (NLG) system with higher performance than supervised approaches. In our approach, we interpret the structured data as a corrupt representat…

    Submitted 24 August, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

    Comments: Accepted at EMNLP 2018

  37. arXiv:1712.04382  [pdf, other]

    cs.SD eess.AS

    auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks

    Authors: Michael Freitag, Shahin Amiriparian, Sergey Pugachevskiy, Nicholas Cummins, Björn Schuller

    Abstract: auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data. It is based on a recurrent sequence to sequence autoencoder approach which can learn representations of time series data by taking into account their temporal dynamics. We provide an extensive command line interface in addition to a Python API for users and developers, both of which are comprehensively doc…

    Submitted 22 December, 2017; v1 submitted 12 December, 2017; originally announced December 2017.

  38. arXiv:1706.03824  [pdf, other]

    cs.CL

    Attention-based Vocabulary Selection for NMT Decoding

    Authors: Baskaran Sankaran, Markus Freitag, Yaser Al-Onaizan

    Abstract: Neural Machine Translation (NMT) models usually use large target vocabulary sizes to capture most of the words in the target language. The vocabulary size is a big factor when decoding new sentences as the final softmax layer normalizes over all possible target words. To address this problem, it is widely common to restrict the target vocabulary with candidate lists based on the source sentence. U…

    Submitted 12 June, 2017; originally announced June 2017.

    Comments: Submitted to Second Conference on Machine Translation (WMT-17); 7 pages

  39. Local System Voting Feature for Machine Translation System Combination

    Authors: Markus Freitag, Jan-Thorsten Peter, Stephan Peitz, Minwei Feng, Hermann Ney

    Abstract: In this paper, we enhance the traditional confusion network system combination approach with an additional model trained by a neural network. This work is motivated by the fact that the commonly used binary system voting models only assign each input system a global weight which is responsible for the global impact of each input system on all translations. This prevents individual systems with low…

    Submitted 9 February, 2017; originally announced February 2017.

    Comments: published WMT 2015

    Journal ref: Proceedings of the Tenth Workshop on Statistical Machine Translation (WMT), 2015

  40. Beam Search Strategies for Neural Machine Translation

    Authors: Markus Freitag, Yaser Al-Onaizan

    Abstract: The basic concept in Neural Machine Translation (NMT) is to train a large Neural Network that maximizes the translation performance on a given parallel corpus. NMT then uses a simple left-to-right beam-search decoder to generate new translations that approximately maximize the trained conditional probability. The current beam search strategy generates the target sentence word by word from left…

    Submitted 13 June, 2017; v1 submitted 6 February, 2017; originally announced February 2017.

    Comments: First Workshop on Neural Machine Translation, 2017

    Journal ref: Proceedings of the First Workshop on Neural Machine Translation, 2017

  41. arXiv:1702.01802  [pdf, ps, other]

    cs.CL

    Ensemble Distillation for Neural Machine Translation

    Authors: Markus Freitag, Yaser Al-Onaizan, Baskaran Sankaran

    Abstract: Knowledge distillation describes a method for training a student network to perform better by learning from a stronger teacher network. Translating a sentence with a Neural Machine Translation (NMT) engine is time expensive and having a smaller model speeds up this process. We demonstrate how to transfer the translation quality of an ensemble and an oracle BLEU teacher network into a single NMT s…

    Submitted 7 August, 2017; v1 submitted 6 February, 2017; originally announced February 2017.

  42. arXiv:1612.06897  [pdf, ps, other]

    cs.CL

    Fast Domain Adaptation for Neural Machine Translation

    Authors: Markus Freitag, Yaser Al-Onaizan

    Abstract: Neural Machine Translation (NMT) is a new approach for automatic translation of text from one human language into another. The basic concept in NMT is to train a large Neural Network that maximizes the translation performance on a given parallel corpus. NMT is gaining popularity in the research community because it outperformed traditional SMT approaches in several translation tasks at WMT and oth…

    Submitted 20 December, 2016; originally announced December 2016.