Showing 1–17 of 17 results for author: Dorr, B J

  1. arXiv:2406.07794  [pdf, other]

    cs.CL cs.AI

    Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests

    Authors: Amogh Mannekote, Jinseok Nam, Ziming Li, Jian Gao, Kristy Elizabeth Boyer, Bonnie J. Dorr

    Abstract: Indirect User Requests (IURs), such as "It's cold in here" instead of "Could you please increase the temperature?" are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener. While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints. M…

    Submitted 16 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2404.09371  [pdf, other]

    cs.CL

    The Effect of Data Partitioning Strategy on Model Generalizability: A Case Study of Morphological Segmentation

    Authors: Zoey Liu, Bonnie J. Dorr

    Abstract: Recent work to enhance data partitioning strategies for more realistic model evaluation faces challenges in providing a clear optimal choice. This study addresses these challenges, focusing on morphological segmentation and synthesizing limitations related to language diversity, adoption of multiple datasets and splits, and detailed model comparisons. Our study leverages data from 19 languages, inc…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted to 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (16 pages including 9 tables and 1 figure)

    ACM Class: I.2.7

  3. arXiv:2402.14155  [pdf, other]

    cs.CL cs.AI

    Can Similarity-Based Domain-Ordering Reduce Catastrophic Forgetting for Intent Recognition?

    Authors: Amogh Mannekote, Xiaoyi Tian, Kristy Elizabeth Boyer, Bonnie J. Dorr

    Abstract: Task-oriented dialogue systems are expected to handle a constantly expanding set of intents and domains even after they have been deployed to support more and more functionalities. To live up to this expectation, it becomes critical to mitigate the catastrophic forgetting problem (CF) that occurs in continual learning (CL) settings for a task such as intent recognition. While existing dialogue sys…

    Submitted 21 February, 2024; originally announced February 2024.

  4. arXiv:2311.11215  [pdf, other]

    cs.CL cs.AI

    SPLAIN: Augmenting Cybersecurity Warnings with Reasons and Data

    Authors: Vera A. Kazakova, Jena D. Hwang, Bonnie J. Dorr, Yorick Wilks, J. Blake Gage, Alex Memory, Mark A. Clark

    Abstract: Effective cyber threat recognition and prevention demand comprehensible forecasting systems, as prior approaches commonly offer limited and, ultimately, unconvincing information. We introduce Simplified Plaintext Language (SPLAIN), a natural language generator that converts warning data into user-friendly cyber threat explanations. SPLAIN is designed to generate clear, actionable outputs, incorpor…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Presented at FLAIRS-2019 as poster (see ancillary files)

    ACM Class: I.2

    Journal ref: FLAIRS-2019

  5. arXiv:2307.06524  [pdf, other]

    cs.CL

    Agreement Tracking for Multi-Issue Negotiation Dialogues

    Authors: Amogh Mannekote, Bonnie J. Dorr, Kristy Elizabeth Boyer

    Abstract: Automated negotiation support systems aim to help human negotiators reach more favorable outcomes in multi-issue negotiations (e.g., an employer and a candidate negotiating over issues such as salary, hours, and promotions before a job offer). To be successful, these systems must accurately track agreements reached by participants in real-time. Existing approaches either focus on task-oriented dia…

    Submitted 12 July, 2023; originally announced July 2023.

  6. arXiv:2305.18736  [pdf, other]

    cs.CL cs.SI

    LonXplain: Lonesomeness as a Consequence of Mental Disturbance in Reddit Posts

    Authors: Muskan Garg, Chandni Saxena, Debabrata Samanta, Bonnie J. Dorr

    Abstract: Social media is a potential source of information that infers latent mental states through Natural Language Processing (NLP). While narrating real-life experiences, social media users convey their feeling of loneliness or isolated lifestyle, impacting their mental well-being. Existing literature on psychological theories points to loneliness as the major consequence of interpersonal risk factors,…

    Submitted 30 May, 2023; originally announced May 2023.

  7. arXiv:2305.18585  [pdf, other]

    cs.CL cs.AI

    Exploiting Explainability to Design Adversarial Attacks and Evaluate Attack Resilience in Hate-Speech Detection Models

    Authors: Pranath Reddy Kumbam, Sohaib Uddin Syed, Prashanth Thamminedi, Suhas Harish, Ian Perera, Bonnie J. Dorr

    Abstract: The advent of social media has given rise to numerous ethical challenges, with hate speech among the most significant concerns. Researchers are attempting to tackle this problem by leveraging hate-speech detection and employing language models to automatically moderate content and promote civil discourse. Unfortunately, recent studies have revealed that hate-speech detection systems can be misled…

    Submitted 29 May, 2023; originally announced May 2023.

  8. arXiv:2301.11004  [pdf, other]

    cs.CL

    NLP as a Lens for Causal Analysis and Perception Mining to Infer Mental Health on Social Media

    Authors: Muskan Garg, Chandni Saxena, Usman Naseem, Bonnie J Dorr

    Abstract: Interactions among humans on social media often convey intentions behind their actions, yielding a psychological language resource for Mental Health Analysis (MHA) of online users. The success of Computational Intelligence Techniques (CIT) for inferring mental illness from such social media resources points to NLP as a lens for causal analysis and perception mining. However, we argue that more con…

    Submitted 22 August, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

  9. arXiv:2207.04674  [pdf, other]

    cs.CL

    CAMS: An Annotated Corpus for Causal Analysis of Mental Health Issues in Social Media Posts

    Authors: Muskan Garg, Chandni Saxena, Veena Krishnan, Ruchi Joshi, Sriparna Saha, Vijay Mago, Bonnie J Dorr

    Abstract: The research community has witnessed substantial growth in the detection of mental health issues and their associated reasons from analysis of social media. We introduce a new dataset for Causal Analysis of Mental health issues in Social media posts (CAMS). Our contributions for causal analysis are two-fold: causal interpretation and causal categorization. We introduce an annotation schema for this ta…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: 10 pages

    Report number: 6387--6396

    Journal ref: Proceedings of the Thirteenth Language Resources and Evaluation Conference, LREC 2022

  10. arXiv:2203.10659  [pdf, other]

    cs.CL cs.AI

    From Stance to Concern: Adaptation of Propositional Analysis to New Tasks and Domains

    Authors: Brodie Mather, Bonnie J Dorr, Adam Dalton, William de Beaumont, Owen Rambow, Sonja M. Schmer-Galunder

    Abstract: We present a generalized paradigm for adaptation of propositional analysis (predicate-argument pairs) to new tasks and domains. We leverage an analogy between stances (belief-driven sentiment) and concerns (topical issues with moral dimensions/endorsements) to produce an explanatory representation. A key contribution is the combination of semi-automatic resource building for extraction of domain-d…

    Submitted 20 March, 2022; originally announced March 2022.

    Comments: Accepted to Findings of the Association for Computational Linguistics, 2022

    MSC Class: 68T50; ACM Class: I.2.7

  11. arXiv:2004.09662  [pdf, other]

    cs.CL cs.CR

    The Panacea Threat Intelligence and Active Defense Platform

    Authors: Adam Dalton, Ehsan Aghaei, Ehab Al-Shaer, Archna Bhatia, Esteban Castillo, Zhuo Cheng, Sreekar Dhaduvai, Qi Duan, Md Mazharul Islam, Younes Karimi, Amir Masoumzadeh, Brodie Mather, Sashank Santhanam, Samira Shaikh, Tomek Strzalkowski, Bonnie J. Dorr

    Abstract: We describe Panacea, a system that supports natural language processing (NLP) components for active defenses against social engineering attacks. We deploy a pipeline of human language technology, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. Panacea processes modern message formats through a plug-in architecture to accommodate innovative appro…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted at STOC

  12. arXiv:2004.09050  [pdf, ps, other]

    cs.CL

    Adaptation of a Lexical Organization for Social Engineering Detection and Response Generation

    Authors: Archna Bhatia, Adam Dalton, Brodie Mather, Sashank Santhanam, Samira Shaikh, Alan Zemel, Tomek Strzalkowski, Bonnie J. Dorr

    Abstract: We present a paradigm for extensible lexicon development based on Lexical Conceptual Structure to support social engineering detection and response generation. We leverage the central notions of ask (elicitation of behaviors such as providing access to money) and framing (risk/reward implied by the ask). We demonstrate improvements in ask/framing detection through refinements to our lexical organi…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted at STOC

  13. arXiv:2002.10931  [pdf, other]

    cs.CL

    Detecting Asks in SE attacks: Impact of Linguistic and Structural Knowledge

    Authors: Bonnie J. Dorr, Archna Bhatia, Adam Dalton, Brodie Mather, Bryanna Hebenstreit, Sashank Santhanam, Zhuo Cheng, Samira Shaikh, Alan Zemel, Tomek Strzalkowski

    Abstract: Social engineers attempt to manipulate users into undertaking actions such as downloading malware by clicking links or providing access to money or sensitive information. Natural language processing, computational sociolinguistics, and media-specific structural clues provide a means for detecting both the ask (e.g., buy gift card) and the risk/reward implied by the ask, which we call framing (e.g.…

    Submitted 25 February, 2020; originally announced February 2020.

    Comments: Accepted at AAAI 2020

  14. arXiv:1502.01682  [pdf, other]

    cs.CL cs.LG stat.ML

    Use of Modality and Negation in Semantically-Informed Syntactic MT

    Authors: Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Chris Callison-Burch, Nathaniel W. Filardo, Christine Piatko, Lori Levin, Scott Miller

    Abstract: This paper describes the resource- and system-building efforts of an eight-week Johns Hopkins University Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation (SIMT). We describe a new modality/negation (MN) annotation scheme, the creation of a (publicly available) MN lexicon, and two automated MN tagge…

    Submitted 5 February, 2015; originally announced February 2015.

    Comments: 28 pages, 13 figures, 2 tables; appeared in Computational Linguistics, 38(2):411-438, 2012

    ACM Class: I.2.7; I.2.6; I.5.1; I.5.4

    Journal ref: Computational Linguistics, 38(2):411-438, 2012

  15. arXiv:1410.4868  [pdf, other]

    cs.CL

    A Modality Lexicon and its use in Automatic Tagging

    Authors: Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Christine Piatko

    Abstract: This paper describes our resource-building results for an eight-week JHU Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation. Specifically, we describe the construction of a modality annotation scheme, a modality lexicon, and two automated modality taggers that were built using the lexicon and annotat…

    Submitted 17 October, 2014; originally announced October 2014.

    Comments: 6 pages, 5 figures; appeared in Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), May 2010

    ACM Class: I.2.7

    Journal ref: In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), pages 1402-1407, Valletta, Malta, May 2010. European Language Resources Association

  16. arXiv:1409.7085  [pdf, other]

    cs.CL cs.LG stat.ML

    Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach

    Authors: Kathryn Baker, Michael Bloodgood, Chris Callison-Burch, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Scott Miller, Christine Piatko

    Abstract: We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation. Semantically enriched syntactic tags assigned to the target-language training texts improved translation quality. The resulting system significantly outperformed a linguistically naive baseline model (Hiero), and reached the highest scores yet reporte…

    Submitted 24 September, 2014; originally announced September 2014.

    Comments: 10 pages, 7 figures, 3 tables; appeared in Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA), October 2010

    ACM Class: I.2.7; I.2.6; I.5.1; I.5.4

    Journal ref: In Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA), Denver, Colorado, October 2010

  17. arXiv:1308.6300  [pdf, ps, other]

    cs.CL

    Computing Lexical Contrast

    Authors: Saif M. Mohammad, Bonnie J. Dorr, Graeme Hirst, Peter D. Turney

    Abstract: Knowing the degree of semantic contrast between words has widespread application in natural language processing, including machine translation, information retrieval, and dialogue systems. Manually-created lexicons focus on opposites, such as hot and cold. Opposites are of many kinds such as antipodals, complementaries, and gradable. However, existing lexicons often do not classify opp…

    Submitted 28 August, 2013; originally announced August 2013.

    Journal ref: Computational Linguistics, 39 (3), 555-590, 2013