Showing 1–24 of 24 results for author: Irving, G

  1. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  2. arXiv:2311.14125  [pdf, other]

    cs.AI cs.LG

    Scalable AI Safety via Doubly-Efficient Debate

    Authors: Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras

    Abstract: The emergence of pre-trained AI systems with powerful capabilities across a diverse and ever-increasing set of complex domains has raised a critical challenge for AI safety as tasks can become too complicated for humans to judge directly. Irving et al. [2018] proposed a debate method in this direction with the goal of pitting the power of such AI models against each other until the problem of iden…

    Submitted 23 November, 2023; originally announced November 2023.

  3. arXiv:2307.09458  [pdf, other]

    cs.LG

    Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

    Authors: Tom Lieberum, Matthew Rahtz, János Kramár, Neel Nanda, Geoffrey Irving, Rohin Shah, Vladimir Mikulik

    Abstract: Circuit analysis is a promising technique for understanding the internal mechanisms of language models. However, existing analyses are done in small models far from the state of the art. To address this, we present a case study of circuit analysis in the 70B Chinchilla model, aiming to test the scalability of circuit analysis. In particular, we study multiple-choice question answering, and…

    Submitted 24 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  4. arXiv:2302.01318  [pdf, other]

    cs.CL

    Accelerating Large Language Model Decoding with Speculative Sampling

    Authors: Charlie Chen, Sebastian Borgeaud, Geoffrey Irving, Jean-Baptiste Lespiau, Laurent Sifre, John Jumper

    Abstract: We present speculative sampling, an algorithm for accelerating transformer decoding by enabling the generation of multiple tokens from each transformer call. Our algorithm relies on the observation that the latency of parallel scoring of short continuations, generated by a faster but less powerful draft model, is comparable to that of sampling a single token from the larger target model. This is c…

    Submitted 2 February, 2023; originally announced February 2023.

  5. arXiv:2211.14275  [pdf, other]

    cs.LG cs.AI cs.CL

    Solving math word problems with process- and outcome-based feedback

    Authors: Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins

    Abstract: Recent work has shown that asking language models to generate reasoning steps improves performance on many reasoning tasks. When moving beyond prompting, this raises the question of how we should supervise such models: outcome-based approaches which supervise the final result, or process-based approaches which supervise the reasoning process itself? Differences between these approaches might natur…

    Submitted 25 November, 2022; originally announced November 2022.

  6. arXiv:2211.01568  [pdf, other]

    cs.CL cs.AI

    Fine-Tuning Language Models via Epistemic Neural Networks

    Authors: Ian Osband, Seyed Mohammad Asghari, Benjamin Van Roy, Nat McAleese, John Aslanides, Geoffrey Irving

    Abstract: Language models often pre-train on large unsupervised text corpora, then fine-tune on additional task-specific data. However, typical fine-tuning schemes do not prioritize the examples that they tune on. We show that, if you can prioritize informative training data, you can achieve better performance while using fewer labels. To do this we augment a language model with an epinet: a small additiona…

    Submitted 10 May, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

  7. arXiv:2209.14375  [pdf, other]

    cs.LG cs.CL

    Improving alignment of dialogue agents via targeted human judgements

    Authors: Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Soňa Mokrá, Nicholas Fernando, Boxi Wu, et al. (9 additional authors not shown)

    Abstract: We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines. We use reinforcement learning from human feedback to train our models with two new additions to help human raters judge agent behaviour. First, to make our agent more helpful and harmless, we break down the requirements for good dialogue into na…

    Submitted 28 September, 2022; originally announced September 2022.

  8. arXiv:2206.08325  [pdf, ps, other]

    cs.CL cs.AI cs.CY

    Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

    Authors: Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks

    Abstract: Large language models produce human-like text that drive a growing number of applications. However, recent literature and, increasingly, real world observations, have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful. Though work to evaluate language model harms is under way, translating foresight about which harms may arise into rigorous b…

    Submitted 28 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted to NeurIPS 2022 Datasets and Benchmarks Track; 10 pages plus appendix

  9. arXiv:2203.11147  [pdf, other]

    cs.CL cs.LG

    Teaching language models to support answers with verified quotes

    Authors: Jacob Menick, Maja Trebacz, Vladimir Mikulik, John Aslanides, Francis Song, Martin Chadwick, Mia Glaese, Susannah Young, Lucy Campbell-Gillingham, Geoffrey Irving, Nat McAleese

    Abstract: Recent large language models often answer factual questions correctly. But users can't trust any given claim a model makes without fact-checking, because language models can hallucinate convincing nonsense. In this work we use reinforcement learning from human preferences (RLHP) to train "open-book" QA models that generate answers whilst also citing specific evidence for their claims, which aids i…

    Submitted 21 March, 2022; originally announced March 2022.

  10. arXiv:2203.07472  [pdf, other]

    cs.CL cs.AI cs.LG

    Uncertainty Estimation for Language Reward Models

    Authors: Adam Gleave, Geoffrey Irving

    Abstract: Language models can learn a range of capabilities from unsupervised training on text corpora. However, to solve a particular problem (such as text summarization) it is typically necessary to fine-tune them on a task-specific dataset. It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward mo…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: 8 pages main paper, 17 pages total

    ACM Class: I.2.7

  11. arXiv:2202.03286  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Red Teaming Language Models with Language Models

    Authors: Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, Geoffrey Irving

    Abstract: Language Models (LMs) often cannot be deployed because of their potential to harm users in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using human annotators to hand-write test cases. However, human annotation is expensive, limiting the number and diversity of test cases. In this work, we automatically find cases where a target LM behaves in a harmful way, by…

    Submitted 7 February, 2022; originally announced February 2022.

  12. arXiv:2112.11446  [pdf, other]

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop…

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  13. arXiv:2112.04426  [pdf, other]

    cs.CL cs.LG

    Improving language models by retrieving from trillions of tokens

    Authors: Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, et al. (3 additional authors not shown)

    Abstract: We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. With a $2$ trillion token database, our Retrieval-Enhanced Transformer (RETRO) obtains comparable performance to GPT-3 and Jurassic-1 on the Pile, despite using 25$\times$ fewer parameters. After fine-tuning, RETRO performance translates to d…

    Submitted 7 February, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Fix incorrect reported numbers in Table 14

  14. arXiv:2112.04359  [pdf, other]

    cs.CL cs.AI cs.CY

    Ethical and social risks of harm from Language Models

    Authors: Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

    Abstract: This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguist…

    Submitted 8 December, 2021; originally announced December 2021.

  15. arXiv:2103.14659  [pdf, other]

    cs.AI cs.LG

    Alignment of Language Agents

    Authors: Zachary Kenton, Tom Everitt, Laura Weidinger, Iason Gabriel, Vladimir Mikulik, Geoffrey Irving

    Abstract: For artificial intelligence to be beneficial to humans the behaviour of AI agents needs to be aligned with what humans want. In this paper we discuss some behavioural issues for language agents, arising from accidental misspecification by the system designer. We highlight some ways that misspecification can occur and discuss some behavioural issues that could arise from misspecification, including…

    Submitted 26 March, 2021; originally announced March 2021.

  16. arXiv:1909.08593  [pdf, other]

    cs.CL cs.LG stat.ML

    Fine-Tuning Language Models from Human Preferences

    Authors: Daniel M. Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, Geoffrey Irving

    Abstract: Reward learning enables the application of reinforcement learning (RL) to tasks where reward is defined by human judgment, building a model of reward by asking humans questions. Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and saf…

    Submitted 8 January, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

  17. arXiv:1811.06521  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Reward learning from human preferences and demonstrations in Atari

    Authors: Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

    Abstract: To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions. Instead, we can have humans communicate an objective to the agent directly. In this work, we combine two approaches to learning from human feedback: expert demonstrations and trajectory preferences. We train a deep neural network to model the reward function and use its predicte…

    Submitted 15 November, 2018; originally announced November 2018.

    Comments: NIPS 2018

  18. arXiv:1805.00899  [pdf, other]

    stat.ML cs.LG

    AI safety via debate

    Authors: Geoffrey Irving, Paul Christiano, Dario Amodei

    Abstract: To make AI systems broadly useful for challenging real-world tasks, we need them to learn complex human goals and preferences. One approach to specifying complex goals asks humans to judge during training which agent behaviors are safe and useful, but this approach can fail if the task is too complicated for a human to directly judge. To help address this concern, we propose training agents via se…

    Submitted 22 October, 2018; v1 submitted 2 May, 2018; originally announced May 2018.

    Comments: 24 pages, 6 figures

  19. arXiv:1701.06972  [pdf, other]

    cs.AI cs.LG cs.LO

    Deep Network Guided Proof Search

    Authors: Sarah Loos, Geoffrey Irving, Christian Szegedy, Cezary Kaliszyk

    Abstract: Deep learning techniques lie at the heart of several significant AI advances in recent years including object recognition and detection, image captioning, machine translation, speech recognition and synthesis, and playing the game of Go. Automated first-order theorem provers can aid in the formalization and verification of mathematical theorems and play a crucial role in program analysis, theory r…

    Submitted 24 January, 2017; originally announced January 2017.

    Journal ref: In Thomas Eiter and David Sands, editors, 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning (LPAR-21). EPiC Series in Computing, vol. 46, pages 85-105, EasyChair, 2017. ISSN 2398-7340

  20. arXiv:1606.04442  [pdf, other]

    cs.AI cs.LG cs.LO

    DeepMath - Deep Sequence Models for Premise Selection

    Authors: Alex A. Alemi, Francois Chollet, Niklas Een, Geoffrey Irving, Christian Szegedy, Josef Urban

    Abstract: We study the effectiveness of neural sequence models for premise selection in automated theorem proving, one of the main bottlenecks in the formalization of mathematics. We propose a two stage approach for this task that yields good results for the premise selection task on the Mizar corpus while avoiding the hand-engineered features of existing state-of-the-art models. To our knowledge, this is t…

    Submitted 26 January, 2017; v1 submitted 14 June, 2016; originally announced June 2016.

  21. arXiv:1605.08695  [pdf, other]

    cs.DC cs.AI

    TensorFlow: A system for large-scale machine learning

    Authors: Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng

    Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs,…

    Submitted 31 May, 2016; v1 submitted 27 May, 2016; originally announced May 2016.

    Comments: 18 pages, 9 figures; v2 has a spelling correction in the metadata

  22. arXiv:1603.04467  [pdf, other]

    cs.DC cs.LG

    TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

    Authors: Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, et al. (15 additional authors not shown)

    Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational de…

    Submitted 16 March, 2016; v1 submitted 14 March, 2016; originally announced March 2016.

    Comments: Version 2 updates only the metadata, to correct the formatting of Martín Abadi's name

  23. arXiv:1404.0743  [pdf, other]

    cs.DC

    Pentago is a First Player Win: Strongly Solving a Game Using Parallel In-Core Retrograde Analysis

    Authors: Geoffrey Irving

    Abstract: We present a strong solution of the board game pentago, computed using exhaustive parallel retrograde analysis in 4 hours on 98304 ($3 \times 2^{15}$) threads of NERSC's Cray Edison. At $3.0 \times 10^{15}$ states, pentago is the largest divergent game solved to date by two orders of magnitude, and the only example of a nontrivial divergent game solved using retrograde analysis. Unlike previous re…

    Submitted 3 April, 2014; v1 submitted 2 April, 2014; originally announced April 2014.

  24. arXiv:1308.1986  [pdf, other]

    cs.CG

    A deterministic pseudorandom perturbation scheme for arbitrary polynomial predicates

    Authors: Geoffrey Irving, Forrest Green

    Abstract: We present a symbolic perturbation scheme for arbitrary polynomial geometric predicates which combines the benefits of Emiris and Canny's simple randomized linear perturbation scheme with Yap's multiple infinitesimal scheme for general predicates. Like the randomized scheme, our method accepts black box polynomial functions as input. For nonmaliciously chosen predicates, our method is as fast as t…

    Submitted 8 August, 2013; originally announced August 2013.

    Comments: 15 pages, 2 figures

    MSC Class: 68U05; ACM Class: I.3.5