Skip to main content

Showing 1–50 of 80 results for author: Lillicrap, T

  1. arXiv:2405.14573  [pdf, other

    cs.AI cs.LG

    AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

    Authors: Christopher Rawles, Sarah Clinckemaillie, Yifan Chang, Jonathan Waltz, Gabrielle Lau, Marybeth Fair, Alice Li, William Bishop, Wei Li, Folawiyo Campbell-Ajala, Daniel Toyama, Robert Berry, Divya Tyamagundlu, Timothy Lillicrap, Oriana Riva

    Abstract: Autonomous agents that execute human tasks by controlling computers can enhance human productivity and application accessibility. However, progress in this field will be driven by realistic and reproducible benchmarks. We present AndroidWorld, a fully functional Android environment that provides reward signals for 116 programmatic tasks across 20 real-world Android apps. Unlike existing interactiv… ▽ More

    Submitted 10 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2405.12205  [pdf, other

    cs.AI cs.LG

    Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

    Authors: Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora

    Abstract: Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly possess some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, including ability to name skills and procedures to apply given a task. We explore this primarily in context of math reasoning, developing a prompt-guided interac… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review

  3. arXiv:2405.00451  [pdf, other

    cs.AI cs.LG

    Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning

    Authors: Yuxi Xie, Anirudh Goyal, Wenyue Zheng, Min-Yen Kan, Timothy P. Lillicrap, Kenji Kawaguchi, Michael Shieh

    Abstract: We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process inspired by the successful strategy employed by AlphaZero. Our work leverages Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level… ▽ More

    Submitted 17 June, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: 10 pages, 4 figures, 4 tables (24 pages, 9 figures, 9 tables including references and appendices)

  4. arXiv:2404.02258  [pdf, other

    cs.LG cs.CL

    Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

    Authors: David Raposo, Sam Ritter, Blake Richards, Timothy Lillicrap, Peter Conway Humphreys, Adam Santoro

    Abstract: Transformer-based language models spread FLOPs uniformly across input sequences. In this work we demonstrate that transformers can instead learn to dynamically allocate FLOPs (or compute) to specific positions in a sequence, optimising the allocation along the sequence for different layers across the model depth. Our method enforces a total compute budget by capping the number of tokens ($k$) that… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

  5. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  6. arXiv:2312.11805  [pdf, other

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2307.10088  [pdf, other

    cs.LG cs.CL cs.HC

    Android in the Wild: A Large-Scale Dataset for Android Device Control

    Authors: Christopher Rawles, Alice Li, Daniel Rodriguez, Oriana Riva, Timothy Lillicrap

    Abstract: There is a growing interest in device-control systems that can interpret human natural language instructions and execute them on a digital device by directly controlling its user interface. We present a dataset for device-control research, Android in the Wild (AITW), which is orders of magnitude larger than current datasets. The dataset contains human demonstrations of device interactions, includi… ▽ More

    Submitted 27 October, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

  8. arXiv:2301.04104  [pdf, other

    cs.AI cs.LG stat.ML

    Mastering Diverse Domains through World Models

    Authors: Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap

    Abstract: Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement learning algorithms can be readily applied to tasks similar to what they have been developed for, configuring them for new application domains requires significant human expertise and experimentation. We present Dr… ▽ More

    Submitted 17 April, 2024; v1 submitted 10 January, 2023; originally announced January 2023.

    Comments: Website: https://danijar.com/dreamerv3

  9. arXiv:2211.11602  [pdf, other

    cs.LG cs.HC cs.MA

    Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback

    Authors: Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Jirka Lhotka, Timothy Lillicrap, Alistair Muldal, George Powell, Adam Santoro, Guy Scully, Sanjana Srivastava, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

    Abstract: An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. First, we collected data of humans interacting with agents in a simulate… ▽ More

    Submitted 21 November, 2022; originally announced November 2022.

  10. arXiv:2210.13383  [pdf, other

    cs.AI cs.LG

    Evaluating Long-Term Memory in 3D Mazes

    Authors: Jurgis Pasukonis, Timothy Lillicrap, Danijar Hafner

    Abstract: Intelligent agents need to remember salient information to reason in partially-observed environments. For example, agents with a first-person view should remember the positions of relevant objects even if they go out of view. Similarly, to effectively navigate through rooms agents need to remember the floor plan of how rooms are connected. However, most benchmark tasks in reinforcement learning do… ▽ More

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Project website: https://github.com/jurgisp/memory-maze

  11. arXiv:2210.08340  [pdf

    cs.AI q-bio.NC

    Toward Next-Generation Artificial Intelligence: Catalyzing the NeuroAI Revolution

    Authors: Anthony Zador, Sean Escola, Blake Richards, Bence Ölveczky, Yoshua Bengio, Kwabena Boahen, Matthew Botvinick, Dmitri Chklovskii, Anne Churchland, Claudia Clopath, James DiCarlo, Surya Ganguli, Jeff Hawkins, Konrad Koerding, Alexei Koulakov, Yann LeCun, Timothy Lillicrap, Adam Marblestone, Bruno Olshausen, Alexandre Pouget, Cristina Savin, Terrence Sejnowski, Eero Simoncelli, Sara Solla, David Sussillo , et al. (2 additional authors not shown)

    Abstract: Neuroscience has long been an essential driver of progress in artificial intelligence (AI). We propose that to accelerate progress in AI, we must invest in fundamental research in NeuroAI. A core component of this is the embodied Turing test, which challenges AI animal models to interact with the sensorimotor world at skill levels akin to their living counterparts. The embodied Turing test shifts… ▽ More

    Submitted 22 February, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: White paper, 10 pages + 8 pages of references, 1 figures

  12. arXiv:2206.05314  [pdf, other

    cs.LG cs.AI

    Large-Scale Retrieval for Reinforcement Learning

    Authors: Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent Sifre, Théophane Weber, Timothy Lillicrap

    Abstract: Effective decision making involves flexibly relating past experiences and relevant contextual information to a novel situation. In deep reinforcement learning (RL), the dominant paradigm is for an agent to amortise information that helps decision making into its network weights via gradient descent on training losses. Here, we pursue an alternative approach in which agents can utilise large-scale… ▽ More

    Submitted 16 December, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: Thirty-sixth Annual Conference on Neural Information Processing Systems (NeurIPS 2022), 16 pages

  13. arXiv:2206.03139  [pdf, other

    cs.LG cs.AI cs.CL

    Intra-agent speech permits zero-shot task acquisition

    Authors: Chen Yan, Federico Carnevale, Petko Georgiev, Adam Santoro, Aurelia Guy, Alistair Muldal, Chia-Chun Hung, Josh Abramson, Timothy Lillicrap, Gregory Wayne

    Abstract: Human language learners are exposed to a trickle of informative, context-sensitive language, but a flood of raw sensory data. Through both social language use and internal processes of rehearsal and practice, language learners are able to build high-level, semantic representations that explain their perceptions. Here, we take inspiration from such processes of "inner speech" in humans (Vygotsky, 1… ▽ More

    Submitted 7 June, 2022; originally announced June 2022.

  14. arXiv:2205.13274  [pdf, other

    cs.LG cs.AI

    Evaluating Multimodal Interactive Agents

    Authors: Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Timothy Lillicrap, Alistair Muldal, Blake Richards, Adam Santoro, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan

    Abstract: Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and prese… ▽ More

    Submitted 14 July, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  15. arXiv:2202.12795  [pdf, other

    cs.LG cs.AI stat.ML

    Equilibrium Aggregation: Encoding Sets via Optimization

    Authors: Sergey Bartunov, Fabian B. Fuchs, Timothy Lillicrap

    Abstract: Processing sets or other unordered, potentially variable-sized inputs in neural networks is usually handled by aggregating a number of input tensors into a single representation. While a number of aggregation methods already exist from simple sum pooling to multi-head attention, they are limited in their representational power both from theoretical and empirical perspectives. On the search of a pr… ▽ More

    Submitted 3 July, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

    Comments: Published at UAI 2022

  16. arXiv:2202.08417  [pdf, other

    cs.LG

    Retrieval-Augmented Reinforcement Learning

    Authors: Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent Sifre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell

    Abstract: Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior policies or value functions via gradient updates. While effective, this approach has several disadvantages: (1) it is computationally expensive, (2) it can take many updates to integrate experiences into the parametric model, (3) experiences that are not fully integrated do not appropriately influence the… ▽ More

    Submitted 24 May, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

  17. arXiv:2202.08137  [pdf, other

    cs.LG

    A data-driven approach for learning to control computers

    Authors: Peter C Humphreys, David Raposo, Toby Pohlen, Gregory Thornton, Rachita Chhaparia, Alistair Muldal, Josh Abramson, Petko Georgiev, Alex Goldin, Adam Santoro, Timothy Lillicrap

    Abstract: It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks. This is a setting in which there is also the potential to leverage large-scale expert demonstrations and human judgements of interactive behaviour, which are two ingredients that have driven much recent success in AI. Here we investigate the setting of computer control using keyboard and mouse,… ▽ More

    Submitted 11 November, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

  18. arXiv:2112.03763  [pdf, other

    cs.LG

    Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning

    Authors: DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Mansi Gupta, Tim Harley, Felix Hill, Peter C Humphreys, Alden Hung, Jessica Landon, Timothy Lillicrap, Hamza Merzic, Alistair Muldal, Adam Santoro, Guy Scully, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

    Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. We show that imitation learning of human-human interactions in a… ▽ More

    Submitted 2 February, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

  19. arXiv:2106.13031  [pdf, other

    cs.LG cs.NE q-bio.NC

    Towards Biologically Plausible Convolutional Networks

    Authors: Roman Pogodin, Yash Mehta, Timothy P. Lillicrap, Peter E. Latham

    Abstract: Convolutional networks are ubiquitous in deep learning. They are particularly useful for images, as they reduce the number of parameters, reduce training time, and increase accuracy. However, as a model of the brain they are seriously problematic, since they require weight sharing - something real neurons simply cannot do. Consequently, while neurons in the brain can be locally connected (one of t… ▽ More

    Submitted 15 January, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

  20. arXiv:2102.03406  [pdf, other

    cs.AI cs.LG

    Symbolic Behaviour in Artificial Intelligence

    Authors: Adam Santoro, Andrew Lampinen, Kory Mathewson, Timothy Lillicrap, David Raposo

    Abstract: The ability to use symbols is the pinnacle of human intelligence, but has yet to be fully replicated in machines. Here we argue that the path towards symbolically fluent artificial intelligence (AI) begins with a reinterpretation of what symbols are, how they come to exist, and how a system behaves when it uses them. We begin by offering an interpretation of symbols as entities whose meaning is es… ▽ More

    Submitted 21 January, 2022; v1 submitted 5 February, 2021; originally announced February 2021.

  21. arXiv:2012.05672  [pdf, other

    cs.LG cs.AI cs.MA

    Imitating Interactive Intelligence

    Authors: Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne , et al. (4 additional authors not shown)

    Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central cha… ▽ More

    Submitted 20 January, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

  22. arXiv:2010.15040  [pdf, other

    stat.ML cs.LG

    Training Generative Adversarial Networks by Solving Ordinary Differential Equations

    Authors: Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff Donahue, Timothy P. Lillicrap, Pushmeet Kohli

    Abstract: The instability of Generative Adversarial Network (GAN) training has frequently been attributed to gradient descent. Consequently, recent methods have aimed to tailor the models and training procedures to stabilise the discrete updates. In contrast, we study the continuous-time dynamics induced by GAN training. Both theory and toy experiments suggest that these dynamics are in fact surprisingly st… ▽ More

    Submitted 28 November, 2020; v1 submitted 28 October, 2020; originally announced October 2020.

  23. arXiv:2010.02193  [pdf, other

    cs.LG cs.AI stat.ML

    Mastering Atari with Discrete World Models

    Authors: Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba

    Abstract: Intelligent agents need to generalize from past experience to achieve goals in complex environments. World models facilitate such generalization and allow learning behaviors from imagined outcomes to increase sample-efficiency. While learning world models from image inputs has recently become feasible for some tasks, modeling Atari games accurately enough to derive successful behaviors has remaine… ▽ More

    Submitted 12 February, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Published at ICLR 2021. Website: https://danijar.com/dreamerv2

  24. arXiv:2010.01298  [pdf, other

    cs.LG cs.AI cs.RO stat.ML

    Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban

    Authors: Peter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber

    Abstract: Intelligent robots need to achieve abstract objectives using concrete, spatiotemporally complex sensory information and motor control. Tabula rasa deep reinforcement learning (RL) has tackled demanding tasks in terms of either visual, abstract, or physical reasoning, but solving these jointly remains a formidable challenge. One recent, unsolved benchmark task that integrates these challenges is Mu… ▽ More

    Submitted 3 October, 2020; originally announced October 2020.

  25. arXiv:2009.05524  [pdf, other

    cs.AI cs.LG

    Physically Embedded Planning Problems: New Challenges for Reinforcement Learning

    Authors: Mehdi Mirza, Andrew Jaegle, Jonathan J. Hunt, Arthur Guez, Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, Peter Karkus, Sébastien Racanière, Lars Buesing, Timothy Lillicrap, Nicolas Heess

    Abstract: Recent work in deep reinforcement learning (RL) has produced algorithms capable of mastering challenging games such as Go, chess, or shogi. In these works the RL agent directly observes the natural state of the game and controls that state directly with its actions. However, when humans play such games, they do not just reason about the moves but also interact with their physical environment. They… ▽ More

    Submitted 29 October, 2020; v1 submitted 11 September, 2020; originally announced September 2020.

    Comments: 17 pages + appendix. Updated text and references

  26. dm_control: Software and Tasks for Continuous Control

    Authors: Yuval Tassa, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Piotr Trochim, Siqi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy Lillicrap, Nicolas Heess

    Abstract: The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation. A MuJoCo wrapper provides convenient bindings to functions and data structures. The PyMJCF and Composer libraries enable procedural model manipulation and task authoring. The Control Suite is a fixed set of tasks with standardised structure, inten… ▽ More

    Submitted 7 September, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: arXiv admin note: text overlap with arXiv:1801.00690

  27. arXiv:1912.01603  [pdf, other

    cs.LG cs.AI cs.RO

    Dream to Control: Learning Behaviors by Latent Imagination

    Authors: Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi

    Abstract: Learned world models summarize an agent's experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. We effic… ▽ More

    Submitted 17 March, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 9 pages, 12 figures

  28. arXiv:1912.00953  [pdf, other

    cs.LG stat.ML

    LOGAN: Latent Optimisation for Generative Adversarial Networks

    Authors: Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap

    Abstract: Training generative adversarial networks requires balancing of delicate adversarial dynamics. Even with careful tuning, training may diverge or end up in a bad equilibrium with dropped modes. In this work, we improve CS-GAN with natural gradient-based latent optimisation and show that it improves adversarial dynamics by enhancing interactions between the discriminator and the generator. Our experi… ▽ More

    Submitted 1 July, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: Improved writing, added new analysis and evaluation

  29. Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

    Authors: Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

    Abstract: Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the… ▽ More

    Submitted 21 February, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

  30. arXiv:1911.05507  [pdf, other

    cs.LG stat.ML

    Compressive Transformers for Long-Range Sequence Modelling

    Authors: Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap

    Abstract: We present the Compressive Transformer, an attentive sequence model which compresses past memories for long-range sequence learning. We find the Compressive Transformer obtains state-of-the-art language modelling results in the WikiText-103 and Enwik8 benchmarks, achieving 17.1 ppl and 0.97 bpc respectively. We also find it can model high-frequency speech effectively and can be used as a memory me… ▽ More

    Submitted 13 November, 2019; originally announced November 2019.

    Comments: 19 pages, 6 figures, 10 tables

  31. arXiv:1910.02720  [pdf, other

    stat.ML cs.LG cs.NE

    Meta-Learning Deep Energy-Based Memory Models

    Authors: Sergey Bartunov, Jack W Rae, Simon Osindero, Timothy P Lillicrap

    Abstract: We study the problem of learning associative memory -- a system which is able to retrieve a remembered pattern based on its distorted or incomplete version. Attractor networks provide a sound model of associative memory: patterns are stored as attractors of the network dynamics and associative retrieval is performed by running the dynamics starting from a query pattern until it converges to an att… ▽ More

    Submitted 20 April, 2021; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: ICLR 2020

  32. arXiv:1909.12892  [pdf, other

    cs.LG cs.AI stat.ML

    Automated curricula through setter-solver interactions

    Authors: Sebastien Racaniere, Andrew K. Lampinen, Adam Santoro, David P. Reichert, Vlad Firoiu, Timothy P. Lillicrap

    Abstract: Reinforcement learning algorithms use correlations between policies and rewards to improve agent performance. But in dynamic or sparsely rewarding environments these correlations are often too small, or rewarding events are too infrequent to make learning feasible. Human education instead relies on curricula--the breakdown of tasks into simpler, static challenges with dense rewards--to build up to… ▽ More

    Submitted 21 January, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

    Journal ref: International Conference on Learning Representations, 2020

  33. arXiv:1907.06374  [pdf, other

    cs.LG q-bio.NC stat.ML

    What does it mean to understand a neural network?

    Authors: Timothy P. Lillicrap, Konrad P. Kording

    Abstract: We can define a neural network that can learn to recognize objects in less than 100 lines of code. However, after training, it is characterized by millions of weights that contain the knowledge about many object types across visual scenes. Such networks are thus dramatically easier to understand in terms of the code that makes them than the resulting properties, such as tuning or connections. In a… ▽ More

    Submitted 15 July, 2019; originally announced July 2019.

    Comments: 9 pages, 2 figures

  34. arXiv:1906.04304  [pdf, other

    cs.LG cs.DB cs.DS stat.ML

    Meta-Learning Neural Bloom Filters

    Authors: Jack W Rae, Sergey Bartunov, Timothy P Lillicrap

    Abstract: There has been a recent trend in training neural networks to replace data structures that have been crafted by hand, with an aim for faster execution, better accuracy, or greater compression. In this setting, a neural data structure is instantiated by training a network over many epochs of its inputs until convergence. In applications where inputs arrive at high throughput, or are ephemeral, train… ▽ More

    Submitted 10 June, 2019; originally announced June 2019.

    Comments: International Conference on Machine Learning 2019

  35. arXiv:1905.06723  [pdf, other

    cs.LG eess.SP stat.ML

    Deep Compressed Sensing

    Authors: Yan Wu, Mihaela Rosca, Timothy Lillicrap

    Abstract: Compressed sensing (CS) provides an elegant framework for recovering sparse signals from compressed measurements. For example, CS can exploit the structure of natural images and recover an image from only a few random measurements. CS is flexible and data efficient, but its application has been restricted by the strong assumption of sparsity and costly reconstruction process. A recent approach tha… ▽ More

    Submitted 18 May, 2019; v1 submitted 16 May, 2019; originally announced May 2019.

    Comments: ICML 2019

  36. arXiv:1904.10396  [pdf, other

    q-bio.NC cs.AI cs.LG

    Is coding a relevant metaphor for building AI? A commentary on "Is coding a relevant metaphor for the brain?", by Romain Brette

    Authors: Adam Santoro, Felix Hill, David Barrett, David Raposo, Matthew Botvinick, Timothy Lillicrap

    Abstract: Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does. Here, we argue that it is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.

    Submitted 18 April, 2019; originally announced April 2019.

  37. arXiv:1904.05391  [pdf, other

    cs.LG stat.ML

    Deep Learning without Weight Transport

    Authors: Mohamed Akrout, Collin Wilson, Peter C. Humphreys, Timothy Lillicrap, Douglas Tweed

    Abstract: Current algorithms for deep learning probably cannot run in the brain because they rely on weight transport, where forward-path neurons transmit their synaptic weights to a feedback path, in a way that is likely impossible biologically. An algorithm called feedback alignment achieves deep learning without weight transport by using random feedback weights, but it performs poorly on hard visual-reco… ▽ More

    Submitted 9 January, 2020; v1 submitted 10 April, 2019; originally announced April 2019.

    Comments: Accepted for the Conference on Neural Information Processing Systems (NeurIPS) 2019

  38. arXiv:1902.00120  [pdf, other

    cs.AI

    Learning to Make Analogies by Contrasting Abstract Relational Structure

    Authors: Felix Hill, Adam Santoro, David G. T. Barrett, Ari S. Morcos, Timothy Lillicrap

    Abstract: Analogical reasoning has been a principal focus of various waves of AI research. Analogy is particularly challenging for machines because it requires relational structures to be represented such that they can be flexibly applied across diverse domains of experience. Here, we study how analogical reasoning can be induced in neural networks that learn to perceive and reason about raw visual data. We… ▽ More

    Submitted 31 January, 2019; originally announced February 2019.

  39. arXiv:1901.03559  [pdf, other

    cs.LG cs.AI stat.ML

    An investigation of model-free planning

    Authors: Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy Lillicrap

    Abstract: The field of reinforcement learning (RL) is facing increasingly challenging domains with combinatorial complexity. For an RL agent to address these challenges, it is essential that it can plan effectively. Prior work has typically utilized an explicit model of the environment, combined with a specific planning algorithm (such as tree search). More recently, a new family of methods have been propos… ▽ More

    Submitted 20 May, 2019; v1 submitted 11 January, 2019; originally announced January 2019.

  40. arXiv:1812.02216  [pdf, other

    cs.LG stat.ML

    Composing Entropic Policies using Divergence Correction

    Authors: Jonathan J Hunt, Andre Barreto, Timothy P Lillicrap, Nicolas Heess

    Abstract: Composing previously mastered skills to solve novel tasks promises dramatic improvements in the data efficiency of reinforcement learning. Here, we analyze two recent works composing behaviors represented in the form of action-value functions and show that they perform poorly in some situations. As part of this analysis, we extend an important generalization of policy improvement to the maximum en… ▽ More

    Submitted 5 July, 2019; v1 submitted 5 December, 2018; originally announced December 2018.

  41. arXiv:1811.11682  [pdf, other

    cs.LG cs.AI stat.ML

    Experience Replay for Continual Learning

    Authors: David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy P. Lillicrap, Greg Wayne

    Abstract: Continual learning is the problem of learning new tasks or knowledge while protecting old knowledge and ideally generalizing from old experience to learn new tasks faster. Neural networks trained by stochastic gradient descent often degrade on old tasks when trained successively on new tasks with different data distributions. This phenomenon, referred to as catastrophic forgetting, is considered a… ▽ More

    Submitted 26 November, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

    Comments: NeurIPS 2019

  42. arXiv:1811.09556  [pdf, other

    cs.LG cs.AI stat.ML

    Learning Attractor Dynamics for Generative Memory

    Authors: Yan Wu, Greg Wayne, Karol Gregor, Timothy Lillicrap

    Abstract: A central challenge faced by memory systems is the robust retrieval of a stored pattern in the presence of interference due to other stored patterns and noise. A theoretically well-founded solution to robust retrieval is given by attractor dynamics, which iteratively clean up patterns during recall. However, incorporating attractor dynamics into modern deep learning systems poses difficulties: att… ▽ More

    Submitted 23 November, 2018; originally announced November 2018.

  43. arXiv:1811.04551  [pdf, other

    cs.LG cs.AI stat.ML

    Learning Latent Dynamics for Planning from Pixels

    Authors: Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson

    Abstract: Planning has been very successful for control tasks with known environment dynamics. To leverage planning in unknown environments, the agent needs to learn the dynamics from interactions with the world. However, learning dynamics models that are accurate enough for planning has been a long-standing challenge, especially in image-based domains. We propose the Deep Planning Network (PlaNet), a purel… ▽ More

    Submitted 4 June, 2019; v1 submitted 11 November, 2018; originally announced November 2018.

    Comments: 20 pages, 12 figures, 1 table

  44. arXiv:1810.06721  [pdf, other

    cs.AI cs.LG

    Optimizing Agent Behavior over Long Time Scales by Transporting Value

    Authors: Chia-Chun Hung, Timothy Lillicrap, Josh Abramson, Yan Wu, Mehdi Mirza, Federico Carnevale, Arun Ahuja, Greg Wayne

    Abstract: Humans spend a remarkable fraction of waking life engaged in acts of "mental time travel". We dwell on our actions in the past and experience satisfaction or regret. More than merely autobiographical storytelling, we use these event recollections to change how we will act in similar scenarios in the future. This process endows us with a computationally important ability to link actions and consequ… ▽ More

    Submitted 21 December, 2018; v1 submitted 15 October, 2018; originally announced October 2018.

  45. arXiv:1810.02274  [pdf, other

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Episodic Curiosity through Reachability

    Authors: Nikolay Savinov, Anton Raichuk, Raphaël Marinier, Damien Vincent, Marc Pollefeys, Timothy Lillicrap, Sylvain Gelly

    Abstract: Rewards are sparse in the real world and most of today's reinforcement learning algorithms struggle with such sparsity. One solution to this problem is to allow the agent to create rewards for itself - thus making rewards dense and more suitable for learning. In particular, inspired by curious behaviour in animals, observing something novel could be rewarded with a bonus. Such bonus is summed up w… ▽ More

    Submitted 6 August, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: Accepted to ICLR 2019. Code at https://github.com/google-research/episodic-curiosity/. Videos at https://sites.google.com/view/episodic-curiosity/

  46. arXiv:1807.09289  [pdf, other

    stat.ML cs.LG

    Noise Contrastive Priors for Functional Uncertainty

    Authors: Danijar Hafner, Dustin Tran, Timothy Lillicrap, Alex Irpan, James Davidson

    Abstract: Obtaining reliable uncertainty estimates of neural network predictions is a long standing challenge. Bayesian neural networks have been proposed as a solution, but it remains open how to specify their prior. In particular, the common practice of an independent normal prior in weight space imposes relatively weak constraints on the function posterior, allowing it to generalize in unforeseen ways on… ▽ More

    Submitted 30 June, 2019; v1 submitted 24 July, 2018; originally announced July 2018.

    Comments: 12 pages, 6 figures

  47. arXiv:1807.04587  [pdf, other

    cs.LG cs.AI cs.NE stat.ML

    Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

    Authors: Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap

    Abstract: The backpropagation of error algorithm (BP) is impossible to implement in a real brain. The recent success of deep networks in machine learning and AI, however, has inspired proposals for understanding how the brain might learn across multiple layers, and hence how it might approximate BP. As of yet, none of these proposals have been rigorously evaluated on tasks where BP-guided deep learning has… ▽ More

    Submitted 20 November, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: NIPS 2018. Version 2 contains more experimental data including best hyperparameters found

  48. arXiv:1807.04225  [pdf, other

    cs.LG stat.ML

    Measuring abstract reasoning in neural networks

    Authors: David G. T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap

    Abstract: Whether neural networks can learn abstract reasoning or whether they merely rely on superficial statistics is a topic of recent debate. Here, we propose a dataset and challenge designed to probe abstract reasoning, inspired by a well-known human IQ test. To succeed at this challenge, models must cope with various generalisation `regimes' in which the training and test data differ in clearly-define… ▽ More

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: ICML 2018

  49. arXiv:1806.01830  [pdf, other

    cs.LG stat.ML

    Relational Deep Reinforcement Learning

    Authors: Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

    Abstract: We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and… ▽ More

    Submitted 28 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

  50. arXiv:1806.01822  [pdf, other

    cs.LG stat.ML

    Relational recurrent neural networks

    Authors: Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap

    Abstract: Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods. It is unclear, however, whether they also have an ability to perform complex relational reasoning with the information they remember. Here, we first confirm our intuitions that standard memory architectures may struggle at tasks that heavily involve an understanding of the ways in wh… ▽ More

    Submitted 28 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.