William Merrill

New York, New York, United States

About

I’m currently a PhD student at NYU and have previously worked at Google Research and the…

Experience & Education

  • NYU Center for Data Science


Publications

  • Sequential Neural Networks as Automata

    Deep Learning and Formal Languages workshop at ACL 2019

    This work attempts to explain the types of computation that neural networks can perform by relating them to automata. We first define what it means for a real-time network with bounded precision to accept a language. A measure of network memory follows from this definition. We then characterize the classes of languages acceptable by various recurrent networks, attention, and convolutional networks. We find that LSTMs function like counter machines and relate convolutional networks to the subregular hierarchy. Overall, this work attempts to increase our understanding and ability to interpret neural networks through the lens of theory. These theoretical insights help explain neural computation, as well as the relationship between neural networks and natural language grammar.

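To make the counter-machine analogy above concrete, here is a minimal sketch (not taken from the paper) of a one-counter acceptor for the language a^n b^n, the kind of real-time counting behavior the paper argues LSTM memory cells can simulate; the function name and example strings are illustrative only.

```python
def accepts_anbn(string: str) -> bool:
    """Accept a^n b^n using a single counter, read in real time."""
    count = 0
    seen_b = False
    for ch in string:
        if ch == "a":
            if seen_b:           # an 'a' after a 'b' can never be repaired
                return False
            count += 1           # increment the counter on 'a'
        elif ch == "b":
            seen_b = True
            count -= 1           # decrement the counter on 'b'
            if count < 0:        # more b's than a's so far
                return False
        else:
            return False
    return count == 0            # accept iff the counter returns to zero

assert accepts_anbn("aaabbb")
assert not accepts_anbn("aabbb")
```
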
  • Finding Syntactic Representations in Neural Stacks

    Analyzing and Interpreting Neural Networks for NLP workshop at ACL 2019

    Neural network architectures have been augmented with differentiable stacks in order to introduce a bias toward learning hierarchy-sensitive regularities. It has, however, proven difficult to assess the degree to which such a bias is effective, as the operation of the differentiable stack is not always interpretable. In this paper, we attempt to detect the presence of latent representations of hierarchical structure through an exploration of the unsupervised learning of constituency structure. Using a technique due to Shen et al. (2018a,b), we extract syntactic trees from the pushing behavior of stack RNNs trained on language modeling and classification objectives. We find that our models produce parses that reflect natural language syntactic constituencies, demonstrating that stack RNNs do indeed infer linguistically relevant hierarchical structure.

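The tree-extraction step mentioned above can be illustrated with a toy example. The sketch below is a simplification rather than the paper's code: it treats a per-token score, such as a stack RNN's push strength, as a Shen-et-al.-style syntactic distance and recursively splits the sentence at the largest score to induce an unlabeled binary tree. The words and scores are invented for illustration.

```python
def distances_to_tree(words, scores):
    """Recursively split at the largest score to form a binary tree (nested lists)."""
    if len(words) == 1:
        return words[0]
    # split just before the token with the largest "distance" score
    k = max(range(1, len(words)), key=lambda i: scores[i])
    return [distances_to_tree(words[:k], scores[:k]),
            distances_to_tree(words[k:], scores[k:])]

words = ["the", "cat", "sat", "on", "the", "mat"]
push_strengths = [0.1, 0.3, 0.9, 0.4, 0.1, 0.2]   # hypothetical push strengths
print(distances_to_tree(words, push_strengths))
# [['the', 'cat'], ['sat', [['on', 'the'], 'mat']]]
```
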
  • Detecting Syntactic Change Using a Neural Part-of-Speech Tagger

    Computational Approaches to Historical Language Change workshop at ACL 2019

    We train a diachronic long short-term memory (LSTM) part-of-speech tagger on a large corpus of American English from the 19th, 20th, and 21st centuries. We analyze the tagger's ability to implicitly learn temporal structure between years, and the extent to which this knowledge can be transferred to date new sentences. The learned year embeddings show a strong linear correlation between their first principal component and time. We show that temporal information encoded in the model can be used to predict novel sentences' years of composition relatively well. Comparisons to a feedforward baseline suggest that the temporal change learned by the LSTM is syntactic rather than purely lexical. Thus, our results suggest that our tagger is implicitly learning to model syntactic change in American English over the course of the 19th, 20th, and early 21st centuries.

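The correlation analysis described above can be sketched in a few lines. The embeddings below are random placeholders rather than the paper's learned vectors, and the one-embedding-per-decade granularity is an assumption made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

years = np.arange(1810, 2010, 10)        # hypothetical: one embedding per decade
rng = np.random.default_rng(0)
# placeholder year embeddings: a drift along one random direction plus noise
drift = np.linspace(-1.0, 1.0, len(years))[:, None] * rng.normal(size=(1, 32))
year_embeddings = drift + 0.1 * rng.normal(size=(len(years), 32))

# project onto the first principal component and correlate with time
pc1 = PCA(n_components=1).fit_transform(year_embeddings).ravel()
print(np.corrcoef(pc1, years)[0, 1])     # near +/-1 indicates a strong linear trend
```
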
  • Context-Free Transductions with Neural Stacks

    Analyzing and Interpreting Neural Networks for NLP workshop at EMNLP 2018

    Co-lead author.

    This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modeling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex networks often find approximate solutions by using the stack as unstructured memory.

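As a point of reference for the string-reversal task mentioned above, here is the classical stack-based strategy written as plain Python (not the neural model): push every input symbol, then pop to emit the output in reverse, exactly as a pushdown transducer would.

```python
def reverse_with_stack(s: str) -> str:
    """String reversal via the push-then-pop strategy of a pushdown transducer."""
    stack = []
    for ch in s:             # reading phase: push each input symbol
        stack.append(ch)
    out = []
    while stack:             # emitting phase: pop in last-in, first-out order
        out.append(stack.pop())
    return "".join(out)

assert reverse_with_stack("abca") == "acba"
```
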
  • End-to-end Graph-based TAG Parsing with Neural Networks

    NAACL 2018

    We present a graph-based Tree Adjoining Grammar (TAG) parser that uses BiLSTMs, highway connections, and character-level CNNs. Our best end-to-end parser, which jointly performs supertagging, POS tagging, and parsing, outperforms the previously reported best results by more than 2.2 LAS and UAS points. The graph-based parsing architecture allows for global inference and rich feature representations for TAG parsing, alleviating the fundamental trade-off between transition-based and graph-based parsing systems. We also demonstrate that the proposed parser achieves state-of-the-art performance in the downstream tasks of Parsing Evaluation using Textual Entailments (PETE) and Unbounded Dependency Recovery. This provides further support for the claim that TAG is a viable formalism for problems that require rich structural analysis of sentences.

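The graph-based scoring idea above can be reduced to a toy sketch: score every head-dependent pair with a bilinear function of token encodings, then pick each token's best head. This is a deliberately simplified stand-in (random placeholder encodings, greedy selection instead of global inference, no supertagging or POS tagging) and not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, dim = 6, 16
H = rng.normal(size=(n_tokens, dim))   # placeholder token encodings (BiLSTM states in a real parser)
U = rng.normal(size=(dim, dim))        # bilinear arc-scoring weights

scores = H @ U @ H.T                   # scores[d, h]: how good token h is as the head of token d
np.fill_diagonal(scores, -np.inf)      # a token may not head itself
heads = scores.argmax(axis=1)          # greedy head choice; a real parser decodes a tree globally
print(heads)
```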

Courses

  • Advanced NLP

    CPSC 677

  • Algorithms

    CPSC 365

  • Computational Complexity Theory

    CPSC 468

  • Deep Learning Theory and Applications

    CPSC 663

  • Formal Foundations of Linguistic Theories

    LING 224

  • Introduction to Analysis

    MATH 301

  • Introduction to Systems Programming and Computer Organization

    CPSC 323

  • NLP

    CPSC 477

  • Neural Networks and Language

    LING 380

  • Semantics I

    LING 263

  • Syntax I

    LING 253

  • Vector Calculus and Linear Algebra

    MATH 230/231

Projects

  • The Book of Thoth

    January 2016 - Present

    In January 2016, Toby Jaroslaw and I began work on the Book of Thoth, an indie game that has players write spells to get past enemies and puzzles. The game uses natural language processing techniques to interpret words (written in a language based on ancient Egyptian) as spells. There are no preset spells, so players must create their own by combining the words they have unlocked in dynamic ways. I created the spell interpreter system, core game engine components such as hit detection and texture blending, and much of the level design and gameplay. The game is written from scratch in Java.

  • DeepMusic

    A generative neural network architecture for composing music from a single song.

  • Voynich2Vec

    Used word embedding techniques to analyze the Voynich manuscript, an undeciphered medieval text held in Yale's Beinecke Library. We believe we have successfully identified morphological alternations in the manuscript.

    Other creators
    • Eli Baum
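For a sense of the approach, the sketch below trains gensim word vectors on a couple of made-up lines in a Voynich-style transliteration; the real project presumably trained on a full transliteration of the manuscript, and the tokens and parameters here are only illustrative.

```python
from gensim.models import Word2Vec

# hypothetical tokenized lines in an EVA-style transliteration (illustrative only)
lines = [
    ["daiin", "chedy", "qokeedy", "shedy"],
    ["qokaiin", "chedy", "daiin", "okaiin"],
]
model = Word2Vec(lines, vector_size=50, window=3, min_count=1, epochs=50)

# nearest neighbours in embedding space can surface candidate
# morphological alternations (shared stems with different affixes)
print(model.wv.most_similar("daiin", topn=3))
```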

Honors & Awards

  • Grace Hopper Prize for Computer Science Finalist

    Yale University Computer Science Department

    My team designed a deep learning model that picks winning teams in the video game Dota 2, for which we were named a finalist in Yale's Grace Hopper Prize for Computer Science.

    After the prize, we went on to build a live web app demo for the project.

    Live demo: http://draftnet.herokuapp.com/

  • Rising Scientist Award

    Child Mind Institute

    Awarded for my research on the neurolinguistics of texting acronyms done during high school.

  • Keynote Speaker at Packer Science Research Symposium 2018

    Packer Collegiate Institute

    I gave the keynote speech at the research symposium at my former high school. Former speakers had all been tenured professors.

Languages

  • Icelandic

    Limited working proficiency

  • Latin

    Full professional proficiency

  • Old Norse

    Full professional proficiency

  • Old English (ca. 450-1100)

    Full professional proficiency
