Kyle Shaffer

Greater Seattle Area

About

Experienced Data Scientist with a demonstrated history of working on machine learning…


Experience & Education

  • Language Weaver


Publications

  • AbLit: A Resource for Analyzing and Generating Abridged Versions of English Literature

    EACL 2023

    Creating an abridged version of a text involves shortening it while maintaining its linguistic qualities. In this paper, we examine this task from an NLP perspective for the first time. We present a new resource, AbLit, which is derived from abridged versions of English literature books. The dataset captures passage-level alignments between the original and abridged texts. We characterize the linguistic relations of these alignments, and create automated models to predict these relations as well as to generate abridgements for new texts. Our findings establish abridgement as a challenging task, motivating future resources and research. The dataset is available at this http URL.

    Other authors
    See publication
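The passage-level alignments AbLit captures could be approximated, very roughly, by a greedy overlap matcher like the one below. This is a hypothetical sketch, not the paper's annotation procedure: the `align_passages` name, the Jaccard scoring, and the threshold are all assumptions for illustration.

```python
def align_passages(original, abridged, threshold=0.3):
    """Link each abridged passage to the original passage with the
    highest token overlap (Jaccard similarity).

    Returns a list of (original_index, abridged_index) pairs for
    matches whose overlap meets the threshold.
    """
    def tokens(passage):
        return set(passage.lower().split())

    links = []
    for i, a in enumerate(abridged):
        # Score every original passage against this abridged one.
        scores = [
            (len(tokens(a) & tokens(o)) / max(1, len(tokens(a) | tokens(o))), j)
            for j, o in enumerate(original)
        ]
        best_score, j = max(scores)
        if best_score >= threshold:
            links.append((j, i))
    return links
```

A real aligner would work at a finer linguistic level, but the sketch shows the shape of the task: many-to-one links from shortened passages back to their sources.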
  • Language Clustering for Multilingual Named Entity Recognition

    Findings of EMNLP 2021

    Recent work in multilingual natural language processing has shown progress in various tasks such as natural language inference and joint multilingual translation. Despite success in learning across many languages, challenges arise where multilingual training regimes often boost performance on some languages at the expense of others. For multilingual named entity recognition (NER) we propose a simple technique that groups similar languages together by using embeddings from a pre-trained masked language model, and automatically discovering language clusters in this embedding space. Specifically, we fine-tune an XLM-Roberta model on a language identification task, and use embeddings from this model for clustering. We conduct experiments on 15 diverse languages in the WikiAnn dataset and show our technique largely outperforms three baselines: (1) training a multilingual model jointly on all available languages, (2) training one monolingual model per language, and (3) grouping languages by linguistic family. We also conduct analyses showing meaningful multilingual transfer for low-resource languages (Swahili and Yoruba), despite being automatically grouped with other seemingly disparate languages.

    See publication
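The clustering step described above can be sketched as a plain k-means over per-language embedding vectors. This is an illustrative stand-in: in the paper the vectors come from an XLM-RoBERTa model fine-tuned on language identification, whereas here any vectors will do, and the function name and signature are assumptions.

```python
import numpy as np

def cluster_languages(embeddings, n_clusters, n_iter=50, seed=0):
    """Group languages by k-means over their embedding vectors.

    `embeddings` maps language code -> embedding vector.
    Returns a dict mapping cluster index -> sorted list of language codes.
    """
    langs = sorted(embeddings)
    X = np.stack([np.asarray(embeddings[l], dtype=float) for l in langs])
    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen languages.
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        # Assign each language to its nearest center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        # Recompute centers; keep the old center if a cluster is empty.
        for k in range(n_clusters):
            if (assign == k).any():
                centers[k] = X[assign == k].mean(0)
    clusters = {}
    for lang, k in zip(langs, assign):
        clusters.setdefault(int(k), []).append(lang)
    return clusters
```

Each multilingual NER model would then be trained jointly on the languages within one cluster rather than on all languages at once.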
  • Beyond Fine-Tuning: Adding Capacity to Leverage Few Labels

    31st Conference on Neural Information Processing Systems (Limited Labeled Data Workshop)

    In this paper we present a technique to train neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural network or the use of domain-specific hand-engineered features. Here we take the approach of treating network layers, or entire networks, as modules, and combine pre-trained modules with untrained modules to learn the shift in distributions between data sets. The central impact of using a modular approach comes from adding new representations to a network, as opposed to replacing representations via fine-tuning. Using this technique, we are able to surpass results from standard fine-tuning transfer learning approaches, and we are also able to significantly increase performance over such approaches when using smaller amounts of data.

    See publication
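The modular idea, adding frozen pre-trained representations alongside fresh trainable ones rather than overwriting them, can be sketched in a few lines. This is a toy illustration under assumed names and dimensions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class Module:
    """A single dense layer; `trainable` marks whether its weights
    would receive gradient updates during training."""
    def __init__(self, d_in, d_out, trainable):
        self.W = rng.normal(0.0, 0.1, (d_in, d_out))
        self.b = np.zeros(d_out)
        self.trainable = trainable

    def __call__(self, x):
        return relu(x @ self.W + self.b)

def modular_features(x, pretrained, fresh):
    # Concatenate frozen pre-trained representations with new,
    # trainable ones, instead of overwriting them via fine-tuning.
    return np.concatenate([pretrained(x), fresh(x)], axis=-1)
```

A downstream classifier would sit on top of the concatenated features, with gradients flowing only into the fresh module and the classifier head.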
  • Predicting Speech Acts in MOOC Forum Posts

    Proceedings of the 9th International Conference on Weblogs and Social Media, ICWSM 2015

    Students in a Massive Open Online Course (MOOC) interact with each other and the course staff through online discussion forums. While discussion forums play a central role in MOOCs, they also pose a challenge for instructors. The large number of student posts makes it difficult for instructors to know where to intervene to answer questions, resolve issues, and provide feedback.
    In this work, we focus on automatically predicting speech acts in MOOC forum posts. Our speech act categories describe the purpose or function of the post in the ongoing discussion. Specifically, we address three main research questions. First, we investigate whether crowdsourced workers can reliably label MOOC forum posts using our speech act definitions. Second, we investigate whether our speech acts can help predict instructor interventions and assignment completion and performance. Finally, we investigate which types of features (derived from the post content, author, and surrounding context) are most effective for predicting our different speech act categories.

    Other authors
    • Jaime Arguello
    See publication
  • Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter

    Association for Computational Linguistics (ACL 2017)

    Other authors
  • Using Natural Language Processing to Facilitate Medical Record Abstraction in Epidemiological Studies

    American Medical Informatics Association Annual Symposium (Poster)

    The Atherosclerosis Risk in Communities (ARIC) study conducts ongoing surveillance of hospitalized cardiovascular health events and death in 4 communities in the United States (NC, MI, MN and MD). Diagnostic criteria for heart failure (HF) have been manually abstracted from medical records since 2005, including the presence of symptoms consistent with HF decompensation (new onset or worsening shortness of breath, edema, paroxysmal nocturnal dyspnea, and orthopnea) during patients' hospitalizations. The manual chart abstraction process has high repeatability under a stringent quality control protocol, but is time consuming and costly. The goal of this study is to develop and test natural language processing (NLP) tools to extract information on complex symptoms from free-text electronic medical records.

    Other authors
    • Carlton Moore
    • Anna Kucharska-Newton
    • Stephanie Haas
    • Gerardo Heiss
