Showing 1–16 of 16 results for author: Si, M

  1. arXiv:2407.06955  [pdf, other]

    cs.CR cs.CL

    ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization

    Authors: Wai Man Si, Michael Backes, Yang Zhang

    Abstract: In-context learning (ICL) is a recent advancement in the capabilities of large language models (LLMs). This feature allows users to perform a new task without updating the model. Concretely, users can address tasks during the inference time by conditioning on a few input-label pair demonstrations along with the test input. It is different than the conventional fine-tuning paradigm and offers more…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2407.04272  [pdf, other]

    cs.LG cs.DC

    Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression

    Authors: Hao Feng, Boyuan Zhang, Fanjiang Ye, Min Si, Ching-Hsiang Chu, Jiannan Tian, Chunxing Yin, Summer Deng, Yuchen Hao, Pavan Balaji, Tong Geng, Dingwen Tao

    Abstract: DLRM is a state-of-the-art recommendation system model that has gained widespread adoption across various industry applications. The large size of DLRM models, however, necessitates the use of multiple devices/GPUs for efficient training. A significant bottleneck in this process is the time-consuming all-to-all communication required to collect embedding data from all devices. To mitigate this, we…

    Submitted 11 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: accepted by SC '24

  3. arXiv:2312.00029  [pdf, other]

    cs.CR cs.AI cs.CL

    Bergeron: Combating Adversarial Attacks through a Conscience-Based Alignment Framework

    Authors: Matthew Pisano, Peter Ly, Abraham Sanders, Bingsheng Yao, Dakuo Wang, Tomek Strzalkowski, Mei Si

    Abstract: Research into AI alignment has grown considerably since the recent introduction of increasingly capable Large Language Models (LLMs). Unfortunately, modern methods of alignment still fail to fully prevent harmful responses when models are deliberately attacked. These attacks can trick seemingly aligned models into giving manufacturing instructions for dangerous materials, inciting violence, or rec…

    Submitted 15 March, 2024; v1 submitted 16 November, 2023; originally announced December 2023.

  4. arXiv:2311.16185  [pdf, other]

    cs.LG cs.AI cs.CL

    Enhancing Sentiment Analysis Results through Outlier Detection Optimization

    Authors: Yuetian Chen, Mei Si

    Abstract: When dealing with text data containing subjective labels like speaker emotions, inaccuracies or discrepancies among labelers are not uncommon. Such discrepancies can significantly affect the performance of machine learning algorithms. This study investigates the potential of identifying and addressing outliers in text data with subjective labels, aiming to enhance classification outcomes. We utili…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: 11 pages, 5 figures

  5. arXiv:2311.14685  [pdf, other]

    cs.CY cs.CL cs.CR cs.LG

    Comprehensive Assessment of Toxicity in ChatGPT

    Authors: Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang

    Abstract: Moderating offensive, hateful, and toxic language has always been an important but challenging topic in the domain of safe use in NLP. The emerging large language models (LLMs), such as ChatGPT, can potentially further accentuate this threat. Previous works have discovered that ChatGPT can generate toxic responses using carefully crafted inputs. However, limited research has been done to systemati…

    Submitted 3 November, 2023; originally announced November 2023.

  6. arXiv:2308.03558  [pdf, other]

    cs.CR cs.CL

    Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing

    Authors: Wai Man Si, Michael Backes, Yang Zhang

    Abstract: The Machine Learning as a Service (MLaaS) market is rapidly expanding and becoming more mature. For example, OpenAI's ChatGPT is an advanced large language model (LLM) that generates responses for various queries with associated fees. Although these models can deliver satisfactory performance, they are far from perfect. Researchers have long studied the vulnerabilities and limitations of LLMs, suc…

    Submitted 7 August, 2023; originally announced August 2023.

  7. arXiv:2306.13195  [pdf, other]

    cs.CL cs.AI

    Prompt to GPT-3: Step-by-Step Thinking Instructions for Humor Generation

    Authors: Yuetian Chen, Bowen Shi, Mei Si

    Abstract: Artificial intelligence has made significant progress in natural language processing, with models like GPT-3 demonstrating impressive capabilities. However, these models still have limitations when it comes to complex tasks that require an understanding of the user, such as mastering human comedy writing strategies. This paper explores humor generation using GPT-3 by modeling human comedy writing…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 5 pages, 1 figure; ICCC '23 preprint

  8. Accelerating MPI Collectives with Process-in-Process-based Multi-object Techniques

    Authors: Jiajun Huang, Kaiming Ouyang, Yujia Zhai, Jinyang Liu, Min Si, Ken Raffenetti, Hui Zhou, Atsushi Hori, Zizhong Chen, Yanfei Guo, Rajeev Thakur

    Abstract: In the exascale computing era, optimizing MPI collective performance in high-performance computing (HPC) applications is critical. Current algorithms face performance degradation due to system call overhead, page faults, or data-copy latency, affecting HPC applications' efficiency and scalability. To address these issues, we propose PiP-MColl, a Process-in-Process-based Multi-object Inter-process…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted by ACM HPDC 2023

  9. arXiv:2305.07406  [pdf, other]

    cs.CR cs.CL cs.LG

    Two-in-One: A Model Hijacking Attack Against Text Generation Models

    Authors: Wai Man Si, Michael Backes, Yang Zhang, Ahmed Salem

    Abstract: Machine learning has progressed significantly in various applications ranging from face recognition to text generation. However, its success has been accompanied by different attacks. Recently a new attack has been proposed which raises both accountability and parasitic computing risks, namely the model hijacking attack. Nevertheless, this attack has only focused on image classification tasks. In…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: To appear in the 32nd USENIX Security Symposium, August 2023, Anaheim, CA, USA

  10. arXiv:2301.02777  [pdf, other]

    cs.AI cs.CL cs.LG

    Visual Story Generation Based on Emotion and Keywords

    Authors: Yuetian Chen, Ruohua Li, Bowen Shi, Peiru Liu, Mei Si

    Abstract: Automated visual story generation aims to produce stories with corresponding illustrations that exhibit coherence, progression, and adherence to characters' emotional development. This work proposes a story generation pipeline to co-create visual stories with the users. The pipeline allows the user to control events and emotions on the generated content. The pipeline includes two parts: narrative…

    Submitted 6 January, 2023; originally announced January 2023.

    Comments: 8 pages, 8 figures, AIIDE INT 2022

  11. arXiv:2209.03463  [pdf, other]

    cs.CY cs.AI cs.CR cs.SI

    Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots

    Authors: Wai Man Si, Michael Backes, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Savvas Zannettou, Yang Zhang

    Abstract: Chatbots are used in many applications, e.g., automated agents, smart home assistants, interactive characters in online games, etc. Therefore, it is crucial to ensure they do not behave in undesired manners, providing offensive or toxic responses to users. This is not a trivial task as state-of-the-art chatbot models are trained on large, public datasets openly collected from the Internet. This pa…

    Submitted 9 September, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Journal ref: Published in ACM CCS 2022. Please cite the CCS version

  12. arXiv:2208.09052  [pdf, ps, other]

    cs.LG

    A Review of Uncertainty for Deep Reinforcement Learning

    Authors: Owen Lockwood, Mei Si

    Abstract: Uncertainty is ubiquitous in games, both in the agents playing games and often in the games themselves. Working with uncertainty is therefore an important component of successful deep reinforcement learning agents. While there has been substantial effort and progress in understanding and working with uncertainty for supervised learning, the body of literature for uncertainty aware deep reinforceme…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted to AIIDE 2022

  13. arXiv:2205.03692  [pdf, other]

    cs.CL cs.AI

    Towards a Progression-Aware Autonomous Dialogue Agent

    Authors: Abraham Sanders, Tomek Strzalkowski, Mei Si, Albert Chang, Deepanshu Dey, Jonas Braasch, Dakuo Wang

    Abstract: Recent advances in large-scale language modeling and generation have enabled the creation of dialogue agents that exhibit human-like responses in a wide range of conversational scenarios spanning a diverse set of tasks, from general chit-chat to focused goal-oriented discourse. While these agents excel at generating high-quality responses that are relevant to prior context, they suffer from a lack…

    Submitted 10 May, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  14. arXiv:2201.09880  [pdf, other]

    cs.AI

    A System for Image Understanding using Sensemaking and Narrative

    Authors: Zev Battad, Mei Si

    Abstract: Sensemaking and narrative are two inherently interconnected concepts about how people understand the world around them. Sensemaking is the process by which people structure and interconnect the information they encounter in the world with the knowledge and inferences they have made in the past. Narratives are important constructs that people use sensemaking to create; ones that reflect provide a m…

    Submitted 21 January, 2022; originally announced January 2022.

    Comments: Presented at The Ninth Advances in Cognitive Systems (ACS) Conference 2021 (arXiv:2201.06134)

    Report number: ACS2021/26

  15. arXiv:2105.15054  [pdf, other]

    cs.CL cs.AI

    Telling Stories through Multi-User Dialogue by Modeling Character Relations

    Authors: Wai Man Si, Prithviraj Ammanabrolu, Mark O. Riedl

    Abstract: This paper explores character-driven story continuation, in which the story emerges through characters' first- and second-person narration as well as dialogue -- requiring models to select language that is consistent with a character's persona and their relationships with other characters while following and advancing the story. We hypothesize that a multi-task model that trains on character dialo…

    Submitted 31 May, 2021; originally announced May 2021.

    Comments: In Proceedings of SIGDIAL 2021

  16. arXiv:2008.07524  [pdf, other]

    quant-ph cs.LG stat.ML

    Reinforcement Learning with Quantum Variational Circuits

    Authors: Owen Lockwood, Mei Si

    Abstract: The development of quantum computational techniques has advanced greatly in recent years, parallel to the advancements in techniques for deep reinforcement learning. This work explores the potential for quantum computing to facilitate reinforcement learning problems. Quantum computing approaches offer important potential improvements in time and space complexity over traditional algorithms because…

    Submitted 28 August, 2020; v1 submitted 14 August, 2020; originally announced August 2020.

    Comments: Accepted to AIIDE 2020. Updated to better reflect AAAI formatting