Experience & Education
Publications
-
Sequential Neural Networks as Automata
Deep Learning and Formal Languages workshop at ACL 2019
This work attempts to explain the types of computation that neural networks can perform by relating them to automata. We first define what it means for a real-time network with bounded precision to accept a language. A measure of network memory follows from this definition. We then characterize the classes of languages acceptable by various recurrent networks, attention, and convolutional networks. We find that LSTMs function like counter machines and relate convolutional networks to the subregular hierarchy. Overall, this work attempts to increase our understanding and ability to interpret neural networks through the lens of theory. These theoretical insights help explain neural computation, as well as the relationship between neural networks and natural language grammar.
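As a rough illustration of the counter-machine idea (a hypothetical symbolic example, not the paper's construction): a single counter suffices to recognize a language like a^n b^n, which is the style of strategy the paper argues LSTM cell states can implement.

```python
# Toy counter-machine recognizer for a^n b^n with n >= 1
# (hypothetical illustration of the counter strategy, not the paper's proof).

def accepts_anbn(s: str) -> bool:
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:            # an 'a' after a 'b' is out of order
                return False
            count += 1            # increment the counter on each 'a'
        elif ch == "b":
            seen_b = True
            count -= 1            # decrement on each 'b'
            if count < 0:         # more b's than a's so far
                return False
        else:
            return False
    return count == 0 and seen_b  # balanced, non-empty string

assert accepts_anbn("aaabbb")
assert not accepts_anbn("aabbb")
```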
-
Finding Syntactic Representations in Neural Stacks
Analyzing and Interpreting Neural Networks for NLP workshop at ACL 2019
Neural network architectures have been augmented with differentiable stacks in order to introduce a bias toward learning hierarchy-sensitive regularities. It has, however, proven difficult to assess the degree to which such a bias is effective, as the operation of the differentiable stack is not always interpretable. In this paper, we attempt to detect the presence of latent representations of hierarchical structure through an exploration of the unsupervised learning of constituency structure. Using a technique due to Shen et al. (2018a,b), we extract syntactic trees from the pushing behavior of stack RNNs trained on language modeling and classification objectives. We find that our models produce parses that reflect natural language syntactic constituencies, demonstrating that stack RNNs do indeed infer linguistically relevant hierarchical structure.
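A minimal sketch of the extraction step, assuming we already have one push strength per token from a trained stack RNN (the strengths below are hypothetical), in the spirit of the Shen et al. greedy top-down split:

```python
# Greedy top-down tree extraction from per-token push strengths
# (hypothetical values; real strengths come from the trained stack RNN).

def build_tree(tokens, strengths):
    """Recursively split each span at the position of maximum push strength."""
    if len(tokens) <= 1:
        return tokens[0] if tokens else None
    split = max(range(1, len(tokens)), key=lambda i: strengths[i])
    left = build_tree(tokens[:split], strengths[:split])
    right = build_tree(tokens[split:], strengths[split:])
    return (left, right)

tokens = ["the", "cat", "sat", "down"]
strengths = [0.1, 0.3, 0.9, 0.2]      # hypothetical push strengths
print(build_tree(tokens, strengths))  # (('the', 'cat'), ('sat', 'down'))
```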
-
Detecting Syntactic Change Using a Neural Part-of-Speech Tagger
Computational Approaches to Historical Language Change workshop at ACL 2019
We train a diachronic long short-term memory (LSTM) part-of-speech tagger on a large corpus of American English from the 19th, 20th, and 21st centuries. We analyze the tagger's ability to implicitly learn temporal structure between years, and the extent to which this knowledge can be transferred to date new sentences. The learned year embeddings show a strong linear correlation between their first principal component and time. We show that temporal information encoded in the model can be used to predict novel sentences' years of composition relatively well. Comparisons to a feedforward baseline suggest that the temporal change learned by the LSTM is syntactic rather than purely lexical. Thus, our results suggest that our tagger is implicitly learning to model syntactic change in American English over the course of the 19th, 20th, and early 21st centuries.
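The principal-component analysis can be sketched as follows (synthetic stand-in data and names; the real embeddings come from the trained diachronic tagger):

```python
# Correlating the first principal component of year embeddings with time
# (synthetic embeddings with a linear drift, standing in for learned ones).
import numpy as np

years = np.arange(1810, 2010)
rng = np.random.default_rng(0)
emb = np.outer(years - years.mean(), rng.normal(size=32)) / 100.0
emb += rng.normal(scale=0.5, size=emb.shape)  # add noise

# First principal component via SVD of the centered embedding matrix.
centered = emb - emb.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
pc1 = centered @ vt[0]

print(np.corrcoef(pc1, years)[0, 1])  # near +/-1 for a strong linear trend
```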
-
Context-Free Transductions with Neural Stacks
Analyzing and Interpreting Neural Networks for NLP workshop at EMNLP 2018
Co-lead author.
This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modeling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex networks often find approximate solutions by using the stack as unstructured memory.
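As a toy illustration of the stack strategy for the string-reversal task (a hypothetical symbolic version of what a stack RNN can learn, not the trained network itself): push the input, then pop it back out.

```python
# Stack-based string reversal in the style of a pushdown transducer
# (symbolic toy version of the strategy, not the learned stack RNN).

def reverse_with_stack(s: str, sep: str = "#") -> str:
    """Read input up to the separator, pushing symbols; then pop to emit."""
    stack, out = [], []
    for ch in s:
        if ch == sep:
            break
        stack.append(ch)             # push phase: store the prefix
    while stack:
        out.append(stack.pop())      # pop phase: emit in reverse order
    return "".join(out)

assert reverse_with_stack("abc#") == "cba"
```
-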
End-to-end Graph-based TAG Parsing with Neural Networks
NAACL 2018
We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs. Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points. The graph-based parsing architecture allows for global inference and rich feature representations for TAG parsing, alleviating the fundamental trade-off between transition-based and graph-based parsing systems. We also demonstrate that the proposed parser achieves state-of-the-art performance in the downstream tasks of Parsing Evaluation using Textual Entailments (PETE) and Unbounded Dependency Recovery. This provides further support for the claim that TAG is a viable formalism for problems that require rich structural analysis of sentences.
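The core of graph-based parsing can be sketched in a few lines (hypothetical shapes and a toy bilinear scorer with greedy decoding; the paper's parser uses richer BiLSTM features and global inference):

```python
# Toy graph-based arc scoring: score every head-dependent pair, then decode
# (bilinear scorer and greedy decoding are simplifying assumptions here).
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8                      # 5 tokens, 8-dim encodings
H = rng.normal(size=(n, d))      # stand-in for BiLSTM token encodings
W = rng.normal(size=(d, d))      # bilinear weight matrix

scores = H @ W @ H.T             # scores[i, j]: arc from head i to dependent j
np.fill_diagonal(scores, -np.inf)

# Greedy decoding: each dependent picks its best head. A real graph-based
# parser would replace this with global inference (e.g. MST decoding).
heads = scores.argmax(axis=0)
print(heads)
```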
Courses
-
Advanced NLP
CPSC 677
-
Algorithms
CPSC 365
-
Computational Complexity Theory
CPSC 468
-
Deep Learning Theory and Applications
CPSC 663
-
Formal Foundations of Linguistic Theories
LING 224
-
Introduction to Analysis
MATH 301
-
Introduction to Systems Programming and Computer Organization
CPSC 323
-
NLP
CPSC 477
-
Neural Networks and Language
LING 380
-
Semantics I
LING 263
-
Syntax I
LING 253
-
Vector Calculus and Linear Algebra
MATH 230/231
Projects
-
The Book of Thoth
January 2016 - Present
In January 2016, Toby Jaroslaw and I began work on the Book of Thoth, an indie game in which players write spells to get past enemies and puzzles. The game uses natural language processing techniques to interpret words (written in a language based on ancient Egyptian) as spells. There are no preset spells, so players must create their own by combining the words they have unlocked in dynamic ways. I created the spell interpreter system, core game engine components such as hit detection and texture blending, and many of the levels and gameplay elements. The game is written from scratch in Java.
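The compositional spell idea can be sketched like this (hypothetical vocabulary and effects for illustration; the actual interpreter is part of the Java game engine):

```python
# Toy compositional spell interpreter: each known word contributes a piece
# of the spell's semantics (hypothetical lexicon, not the game's real one).

WORDS = {
    "fire":  {"element": "fire"},
    "water": {"element": "water"},
    "big":   {"power": 2.0},
    "throw": {"action": "projectile"},
}

def interpret(spell: str) -> dict:
    """Merge the semantics of each known word into one spell effect."""
    effect = {"power": 1.0}
    for word in spell.split():
        effect.update(WORDS.get(word, {}))  # unknown words fizzle (no-op)
    return effect

print(interpret("big fire throw"))
# {'power': 2.0, 'element': 'fire', 'action': 'projectile'}
```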
-
DeepMusic
-
A generative neural network architecture for composing music from a single song.
-
Voynich2Vec
-
Used word embedding techniques to analyze the Voynich manuscript, an undeciphered medieval manuscript held in Yale's Beinecke Library. We believe we have successfully identified morphological alternations in the manuscript.
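A minimal sketch of the embedding setup (the file name and hyperparameters are assumptions; the input would be a transliteration of the manuscript such as EVA):

```python
# Train skip-gram embeddings on a Voynich transliteration and inspect
# nearest neighbors, where morphological variants tend to surface.
from gensim.models import Word2Vec

# Assumed file: one transliterated manuscript line per text line.
with open("voynich_eva.txt") as f:
    lines = [line.split() for line in f]

model = Word2Vec(sentences=lines, vector_size=100, window=5,
                 min_count=2, sg=1)  # sg=1: skip-gram

# "daiin" is a frequent token in the EVA transliteration; its neighbors
# often share a stem with different endings (candidate alternations).
print(model.wv.most_similar("daiin"))
```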
-
Honors & Awards
-
Grace Hopper Prize for Computer Science Finalist
Yale University Computer Science Department
My team designed a deep learning model that picks winning teams in the video game Dota 2, for which we were named a finalist in Yale's Grace Hopper Prize for Computer Science.
After the prize, we went on to build a live web app demo for the project.
Live demo: http://draftnet.herokuapp.com/
-
Rising Scientist Award
Child Mind Institute
Awarded for research I conducted in high school on the neurolinguistics of texting acronyms.
-
Keynote Speaker at Packer Science Research Symposium 2018
Packer Collegiate Institute
I gave the keynote speech at the research symposium at my former high school. Former speakers had all been tenured professors.
Languages
-
Icelandic
Limited working proficiency
-
Latin
Full professional proficiency
-
Old Norse
Full professional proficiency
-
Old English (ca. 450-1100)
Full professional proficiency