subscribe to arXiv mailings

Counterspeakers' Perspectives: Unveiling Barriers and AI Needs in the Fight against Online Hate

Authors: Jimin Mun, Cathy Buerger, Jenny T. Liang, Joshua Garland, Maarten Sap

Abstract: Counterspeech, i.e., direct responses against hate speech, has become an important tool to address the increasing amount of hate online while avoiding censorship. Although AI has been proposed to help scale up counterspeech efforts, this raises questions of how exactly AI could assist in this process, since counterspeech is a deeply empathetic and agentic process for those involved. In this work,… ▽ More Counterspeech, i.e., direct responses against hate speech, has become an important tool to address the increasing amount of hate online while avoiding censorship. Although AI has been proposed to help scale up counterspeech efforts, this raises questions of how exactly AI could assist in this process, since counterspeech is a deeply empathetic and agentic process for those involved. In this work, we aim to answer this question, by conducting in-depth interviews with 10 extensively experienced counterspeakers and a large scale public survey with 342 everyday social media users. In participant responses, we identified four main types of barriers and AI needs related to resources, training, impact, and personal harms. However, our results also revealed overarching concerns of authenticity, agency, and functionality in using AI tools for counterspeech. To conclude, we discuss considerations for designing AI assistants that lower counterspeaking barriers without jeopardizing its meaning and purpose. △ Less

Submitted 29 February, 2024; originally announced March 2024.

Comments: To appear in CHI 2024. 22 pages, 3 figures, 7 tables

arXiv:2310.01727 [pdf, other]

Can GPT-4 Replicate Empirical Software Engineering Research?

Authors: Jenny T. Liang, Carmen Badea, Christian Bird, Robert DeLine, Denae Ford, Nicole Forsgren, Thomas Zimmermann

Abstract: Empirical software engineering research on production systems has brought forth a better understanding of the software engineering process for practitioners and researchers alike. However, only a small subset of production systems is studied, limiting the impact of this research. While software engineering practitioners could benefit from replicating research on their own data, this poses its own… ▽ More Empirical software engineering research on production systems has brought forth a better understanding of the software engineering process for practitioners and researchers alike. However, only a small subset of production systems is studied, limiting the impact of this research. While software engineering practitioners could benefit from replicating research on their own data, this poses its own set of challenges, since performing replications requires a deep understanding of research methodologies and subtle nuances in software engineering data. Given that large language models (LLMs), such as GPT-4, show promise in tackling both software engineering- and science-related tasks, these models could help replicate and thus democratize empirical software engineering research. In this paper, we examine GPT-4's abilities to perform replications of empirical software engineering research on new data. We study their ability to surface assumptions made in empirical software engineering research methodologies, as well as their ability to plan and generate code for analysis pipelines on seven empirical software engineering papers. We perform a user study with 14 participants with software engineering research expertise, who evaluate GPT-4-generated assumptions and analysis plans (i.e., a list of module specifications) from the papers. We find that GPT-4 is able to surface correct assumptions, but struggles to generate ones that apply common knowledge about software engineering data. In a manual analysis of the generated code, we find that the GPT-4-generated code contains correct high-level logic, given a subset of the methodology. However, the code contains many small implementation-level errors, reflecting a lack of software engineering knowledge. Our findings have implications for leveraging LLMs for software engineering research as well as practitioner data scientists in software teams. △ Less

Submitted 19 June, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

arXiv:2310.00242 [pdf, ps, other]

Walking = Traversable? : Traversability Prediction via Multiple Human Object Tracking under Occlusion

Authors: Jonathan Tay Yu Liang, Kanji Tanaka

Abstract: The emerging ``Floor plan from human trails (PfH)" technique has great potential for improving indoor robot navigation by predicting the traversability of occluded floors. This study presents an innovative approach that replaces first-person-view sensors with a third-person-view monocular camera mounted on the observer robot. This approach can gather measurements from multiple humans, expanding it… ▽ More The emerging ``Floor plan from human trails (PfH)" technique has great potential for improving indoor robot navigation by predicting the traversability of occluded floors. This study presents an innovative approach that replaces first-person-view sensors with a third-person-view monocular camera mounted on the observer robot. This approach can gather measurements from multiple humans, expanding its range of applications. The key idea is to use two types of trackers, SLAM and MOT, to monitor stationary objects and moving humans and assess their interactions. This method achieves stable predictions of traversability even in challenging visual scenarios, such as occlusions, nonlinear perspectives, depth uncertainty, and intersections involving multiple humans. Additionally, we extend map quality metrics to apply to traversability maps, facilitating future research. We validate our proposed method through fusion and comparison with established techniques. △ Less

Submitted 29 September, 2023; originally announced October 2023.

Comments: 6 figures, technical report

arXiv:2307.14105 [pdf, ps, other]

Active Robot Vision for Distant Object Change Detection: A Lightweight Training Simulator Inspired by Multi-Armed Bandits

Authors: Kouki Terashima, Kanji Tanaka, Ryogo Yamamoto, Jonathan Tay Yu Liang

Abstract: In ground-view object change detection, the recently emerging mapless navigation has great potential to navigate a robot to objects distantly detected (e.g., books, cups, clothes) and acquire high-resolution object images, to identify their change states (no-change/appear/disappear). However, naively performing full journeys for every distant object requires huge sense/plan/action costs, proportio… ▽ More In ground-view object change detection, the recently emerging mapless navigation has great potential to navigate a robot to objects distantly detected (e.g., books, cups, clothes) and acquire high-resolution object images, to identify their change states (no-change/appear/disappear). However, naively performing full journeys for every distant object requires huge sense/plan/action costs, proportional to the number of objects and the robot-to-object distance. To address this issue, we explore a new map-based active vision problem in this work: ``Which journey should the robot select next?" However, the feasibility of the active vision framework remains unclear; Since distant objects are only uncertainly recognized, it is unclear whether they can provide sufficient cues for action planning. This work presents an efficient simulator for feasibility testing, to accelerate the early-stage R&D cycles (e.g., prototyping, training, testing, and evaluation). The proposed simulator is designed to identify the degree of difficulty that a robot vision system (sensors/recognizers/planners/actuators) would face when applied to a given environment (workspace/objects). Notably, it requires only one real-world journey experience per distant object to function, making it suitable for an efficient R&D cycle. Another contribution of this work is to present a new lightweight planner inspired by the traditional multi-armed bandit problem. Specifically, we build a lightweight map-based planner on top of the mapless planner, which constitutes a hierarchical action planner. We verified the effectiveness of the proposed framework using a semantically non-trivial scenario ``sofa as bookshelf". △ Less

Submitted 24 October, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

Comments: 7 pages, 7 figures, technical report

arXiv:2307.10168 [pdf, other]

LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs

Authors: Tongshuang Wu, Haiyi Zhu, Maya Albayrak, Alexis Axon, Amanda Bertsch, Wenxing Deng, Ziqi Ding, Bill Guo, Sireesh Gururaja, Tzu-Sheng Kuo, Jenny T. Liang, Ryan Liu, Ihita Mandal, Jeremiah Milbauer, Xiaolin Ni, Namrata Padmanabhan, Subhashini Ramkumar, Alexis Sudjianto, Jordan Taylor, Ying-Jui Tseng, Patricia Vaidos, Zhijin Wu, Wei Wu, Chenyang Yang

Abstract: LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but… ▽ More LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but the level of success is variable and influenced by requesters' understanding of LLM capabilities, the specific skills required for sub-tasks, and the optimal interaction modality for performing these sub-tasks. We reflect on human and LLMs' different sensitivities to instructions, stress the importance of enabling human-facing safeguards for LLMs, and discuss the potential of training humans and LLMs with complementary skill sets. Crucially, we show that replicating crowdsourcing pipelines offers a valuable platform to investigate (1) the relative strengths of LLMs on different tasks (by cross-comparing their performances on sub-tasks) and (2) LLMs' potential in complex tasks, where they can complete part of the tasks while leaving others to humans. △ Less

Submitted 19 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.01943 [pdf, other]

NLPositionality: Characterizing Design Biases of Datasets and Models

Authors: Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Katharina Reinecke, Maarten Sap

Abstract: Design biases in NLP systems, such as performance differences for different populations, often stem from their creator's positionality, i.e., views and lived experiences shaped by identity and background. Despite the prevalence and risks of design biases, they are hard to quantify because researcher, system, and dataset positionality is often unobserved. We introduce NLPositionality, a framework f… ▽ More Design biases in NLP systems, such as performance differences for different populations, often stem from their creator's positionality, i.e., views and lived experiences shaped by identity and background. Despite the prevalence and risks of design biases, they are hard to quantify because researcher, system, and dataset positionality is often unobserved. We introduce NLPositionality, a framework for characterizing design biases and quantifying the positionality of NLP datasets and models. Our framework continuously collects annotations from a diverse pool of volunteer participants on LabintheWild, and statistically quantifies alignment with dataset labels and model predictions. We apply NLPositionality to existing datasets and models for two tasks -- social acceptability and hate speech detection. To date, we have collected 16,299 annotations in over a year for 600 instances from 1,096 annotators across 87 countries. We find that datasets and models align predominantly with Western, White, college-educated, and younger populations. Additionally, certain groups, such as non-binary people and non-native English speakers, are further marginalized by datasets and models as they rank least in alignment across all tasks. Finally, we draw from prior literature to discuss how researchers can examine their own positionality and that of their datasets and models, opening the door for more inclusive NLP systems. △ Less

Submitted 2 June, 2023; originally announced June 2023.

Comments: ACL 2023

arXiv:2303.17125 [pdf, other]

A Large-Scale Survey on the Usability of AI Programming Assistants: Successes and Challenges

Authors: Jenny T. Liang, Chenyang Yang, Brad A. Myers

Abstract: The software engineering community recently has witnessed widespread deployment of AI programming assistants, such as GitHub Copilot. However, in practice, developers do not accept AI programming assistants' initial suggestions at a high frequency. This leaves a number of open questions related to the usability of these tools. To understand developers' practices while using these tools and the imp… ▽ More The software engineering community recently has witnessed widespread deployment of AI programming assistants, such as GitHub Copilot. However, in practice, developers do not accept AI programming assistants' initial suggestions at a high frequency. This leaves a number of open questions related to the usability of these tools. To understand developers' practices while using these tools and the important usability challenges they face, we administered a survey to a large population of developers and received responses from a diverse set of 410 developers. Through a mix of qualitative and quantitative analyses, we found that developers are most motivated to use AI programming assistants because they help developers reduce key-strokes, finish programming tasks quickly, and recall syntax, but resonate less with using them to help brainstorm potential solutions. We also found the most important reasons why developers do not use these tools are because these tools do not output code that addresses certain functional or non-functional requirements and because developers have trouble controlling the tool to generate the desired output. Our findings have implications for both creators and users of AI programming assistants, such as designing minimal cognitive effort interactions with these tools to reduce distractions for users while they are programming. △ Less

Submitted 17 September, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

Comments: Accepted to ICSE'24

arXiv:2301.09789 [pdf, other]

A Qualitative Study on the Implementation Design Decisions of Developers

Authors: Jenny T. Liang, Maryam Arab, Minhyuk Ko, Amy J. Ko, Thomas D. LaToza

Abstract: Decision-making is a key software engineering skill. Developers constantly make choices throughout the software development process, from requirements to implementation. While prior work has studied developer decision-making, the choices made while choosing what solution to write in code remain understudied. In this mixed-methods study, we examine the phenomenon where developers select one specifi… ▽ More Decision-making is a key software engineering skill. Developers constantly make choices throughout the software development process, from requirements to implementation. While prior work has studied developer decision-making, the choices made while choosing what solution to write in code remain understudied. In this mixed-methods study, we examine the phenomenon where developers select one specific way to implement a behavior in code, given many potential alternatives. We call these decisions implementation design decisions. Our mixed-methods study includes 46 survey responses and 14 semi-structured interviews with professional developers about their decision types, considerations, processes, and expertise for implementation design decisions. We find that implementation design decisions, rather than being a natural outcome from higher levels of design, require constant monitoring of higher level design choices, such as requirements and architecture. We also show that developers have a consistent general structure to their implementation decision-making process, but no single process is exactly the same. We discuss the implications of our findings on research, education, and practice, including insights on teaching developers how to make implementation design decisions. △ Less

Submitted 23 January, 2023; originally announced January 2023.

arXiv:2209.02222 [pdf, other]

Understanding Skills for OSS Communities on GitHub

Authors: Jenny T. Liang, Thomas Zimmermann, Denae Ford

Abstract: The development of open source software (OSS) is a broad field which requires diverse skill sets. For example, maintainers help lead the project and promote its longevity, technical writers assist with documentation, bug reporters identify defects in software, and developers program the software. However, it is unknown which skills are used in OSS development as well as OSS contributors' general a… ▽ More The development of open source software (OSS) is a broad field which requires diverse skill sets. For example, maintainers help lead the project and promote its longevity, technical writers assist with documentation, bug reporters identify defects in software, and developers program the software. However, it is unknown which skills are used in OSS development as well as OSS contributors' general attitudes towards skills in OSS. In this paper, we address this gap by administering a survey to a diverse set of 455 OSS contributors. Guided by these responses as well as prior literature on software development expertise and social factors of OSS, we develop a model of skills in OSS that considers the many contexts OSS contributors work in. This model has 45 skills in the following 9 categories: technical skills, working styles, problem solving, contribution types, project-specific skills, interpersonal skills, external relations, management, and characteristics. Through a mix of qualitative and quantitative analyses, we find that OSS contributors are actively motivated to improve skills and perceive many benefits in sharing their skills with others. We then use this analysis to derive a set of design implications and best practices for those who incorporate skills into OSS tools and platforms, such as GitHub. △ Less

Submitted 6 September, 2022; originally announced September 2022.

arXiv:2203.02027 [pdf, other]

Towards Mining OSS Skills from GitHub Activity

Authors: Jenny T. Liang, Thomas Zimmermann, Denae Ford

Abstract: Open source software (OSS) development relies on diverse skill sets. However, to our knowledge, there are no tools which detect OSS-related skills. In this paper, we present a novel method to detect OSS skills and prototype it in a tool called Disko. Our approach relies on identifying relevant signals, which are measurable activities or cues associated with a skill. Our tool detects how contributo… ▽ More Open source software (OSS) development relies on diverse skill sets. However, to our knowledge, there are no tools which detect OSS-related skills. In this paper, we present a novel method to detect OSS skills and prototype it in a tool called Disko. Our approach relies on identifying relevant signals, which are measurable activities or cues associated with a skill. Our tool detects how contributors 1) teach others to be involved in OSS projects, 2) show commitment towards an OSS project, 3) have knowledge in specific programming languages, and 4) are familiar with OSS practices. We then evaluate the tool by administering a survey to 455 OSS contributors. We demonstrate that Disko yields promising results: it detects the presence of these skills with precision scores between 77% to 97%. We also find that over 54% of participants would display their high-proficiency skills. Our approach can be used to transform existing OSS experiences, such as identifying collaborators, matching mentors to mentees, and assigning project roles. Given the positive results and potential impact of our approach, we outline future research opportunities in interpreting and sharing OSS skills. △ Less

Submitted 3 March, 2022; originally announced March 2022.

Showing 1–10 of 10 results for author: Liang, J T