Showing 1–7 of 7 results for author: Ung, M

  1. arXiv:2406.19470

    cs.CL

    Changing Answer Order Can Decrease MMLU Accuracy

    Authors: Vipul Gupta, David Pantoja, Candace Ross, Adina Williams, Megan Ung

    Abstract: As large language models (LLMs) have grown in prevalence, particular benchmarks have become essential for the evaluation of these models and for understanding model capabilities. Most commonly, we use test accuracy averaged across multiple subtasks in order to rank models on leaderboards, to determine which model is best for our purposes. In this paper, we investigate the robustness of the accurac…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Short paper, 9 pages

  2. arXiv:2311.18140

    cs.CL

    ROBBIE: Robust Bias Evaluation of Large Generative Language Models

    Authors: David Esiobu, Xiaoqing Tan, Saghar Hosseini, Megan Ung, Yuchen Zhang, Jude Fernandes, Jane Dwivedi-Yu, Eleonora Presani, Adina Williams, Eric Michael Smith

    Abstract: As generative large language models (LLMs) grow more performant and prevalent, we must develop comprehensive enough tools to measure and improve their fairness. Different prompt-based datasets can be used to measure social bias across multiple text domains and demographic axes, meaning that testing LLMs on more datasets can potentially help us characterize their biases more fully, and better ensur…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023

  3. arXiv:2307.02768

    cs.CL

    Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts

    Authors: Mounica Maddela, Megan Ung, Jing Xu, Andrea Madotto, Heather Foran, Y-Lan Boureau

    Abstract: Many cognitive approaches to well-being, such as recognizing and reframing unhelpful thoughts, have received considerable empirical support over the past decades, yet still lack truly widespread adoption in self-help format. A barrier to that adoption is a lack of adequately specific and diverse dedicated practice material. This work examines whether current language models can be leveraged to bot…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: ACL 2023

  4. arXiv:2306.04707

    cs.CL cs.AI

    Improving Open Language Models by Learning from Organic Interactions

    Authors: Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster

    Abstract: We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety. We are publicly releasing the participating de-identified interaction data for use by the research community, in order to spur further progress. Training models with org…

    Submitted 7 June, 2023; originally announced June 2023.

  5. arXiv:2208.03270

    cs.CL cs.AI

    Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback

    Authors: Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, Jason Weston

    Abstract: Frozen models trained to mimic static datasets can never improve their performance. Models that can employ internet-retrieval for up-to-date information and obtain feedback from humans during deployment provide the promise of both adapting to new information, and improving their performance. In this work we study how to improve internet-driven conversational skills in such a learning framework. We…

    Submitted 16 August, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  6. arXiv:2208.03188

    cs.CL cs.AI

    BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

    Authors: Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

    Abstract: We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks. We release both the model weights and code, and have also deployed the model on a public web page to interact with organic users. This technical report describes how the model was built (arc…

    Submitted 10 August, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  7. arXiv:2110.07518

    cs.CL cs.AI

    SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures

    Authors: Megan Ung, Jing Xu, Y-Lan Boureau

    Abstract: Current open-domain conversational models can easily be made to talk in inadequate ways. Online learning from conversational feedback given by the conversation partner is a promising avenue for a model to improve and adapt, so as to generate fewer of these safety failures. However, current state-of-the-art models tend to react to feedback with defensive or oblivious responses. This makes for an un…

    Submitted 4 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Accepted at ACL 2022