
Showing 1–8 of 8 results for author: Siddarth, D

  1. arXiv:2406.07814  [pdf, other]

    cs.AI cs.CL cs.HC

    Collective Constitutional AI: Aligning a Language Model with Public Input

    Authors: Saffron Huang, Divya Siddarth, Liane Lovitt, Thomas I. Liao, Esin Durmus, Alex Tamkin, Deep Ganguli

    Abstract: There is growing consensus that language model (LM) developers should not be the sole deciders of LM behavior, creating a need for methods that enable the broader public to collectively shape the behavior of LM systems that affect them. To address this need, we present Collective Constitutional AI (CCAI): a multi-stage process for sourcing and integrating public input into LMs, from identifying a t…

    Submitted 11 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; K.4.2

    Journal ref: Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. 1395-1417

  2. arXiv:2307.03718  [pdf, other]

    cs.CY cs.AI

    Frontier AI Regulation: Managing Emerging Risks to Public Safety

    Authors: Markus Anderljung, Joslyn Barnhart, Anton Korinek, Jade Leung, Cullen O'Keefe, Jess Whittlestone, Shahar Avin, Miles Brundage, Justin Bullock, Duncan Cass-Beggs, Ben Chang, Tantum Collins, Tim Fist, Gillian Hadfield, Alan Hayes, Lewis Ho, Sara Hooker, Eric Horvitz, Noam Kolt, Jonas Schuett, Yonadav Shavit, Divya Siddarth, Robert Trager, Kevin Wolf

    Abstract: Advanced AI models hold the promise of tremendous benefits for humanity, but society needs to proactively manage the accompanying risks. In this paper, we focus on what we term "frontier AI" models: highly capable foundation models that could possess dangerous capabilities sufficient to pose severe risks to public safety. Frontier AI models pose a distinct regulatory challenge: dangerous capabilit…

    Submitted 7 November, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Update July 11th: - Added missing footnote back in. - Adjusted author order (mistakenly non-alphabetical among the first 6 authors) and adjusted affiliations (Jess Whittlestone's affiliation was mistagged and Gillian Hadfield had SRI added to her affiliations) Updated September 4th: Various typos

  3. arXiv:2305.15324  [pdf, other]

    cs.AI

    Model evaluation for extreme risks

    Authors: Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, Jade Leung, Daniel Kokotajlo, Nahema Marchal, Markus Anderljung, Noam Kolt, Lewis Ho, Divya Siddarth, Shahar Avin, Will Hawkins, Been Kim, Iason Gabriel, Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, Allan Dafoe

    Abstract: Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities. Further progress in AI development could lead to capabilities that pose extreme risks, such as offensive cyber capabilities or strong manipulation skills. We explain why model evaluation is critical for addressing extreme risks. Developers must be able to identify danger…

    Submitted 22 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Fixed typos; added citation

    ACM Class: K.4.1

  4. arXiv:2303.12642  [pdf]

    cs.AI cs.CY cs.LG

    Democratising AI: Multiple Meanings, Goals, and Methods

    Authors: Elizabeth Seger, Aviv Ovadya, Ben Garfinkel, Divya Siddarth, Allan Dafoe

    Abstract: Numerous parties are calling for the democratisation of AI, but the phrase is used to refer to a variety of goals, the pursuit of which sometimes conflict. This paper identifies four kinds of AI democratisation that are commonly discussed: (1) the democratisation of AI use, (2) the democratisation of AI development, (3) the democratisation of AI profits, and (4) the democratisation of AI governanc…

    Submitted 7 August, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: V2 Changed second author affiliation; added citation to section 5.2; edit to author contribution statement; V3 camera ready version for conference proceedings. Minor content changes in response to reviewer comments

  5. arXiv:2303.11074  [pdf, ps, other]

    cs.CY

    Generative AI and the Digital Commons

    Authors: Saffron Huang, Divya Siddarth

    Abstract: Many generative foundation models (or GFMs) are trained on publicly available data and use public infrastructure, but 1) may degrade the "digital commons" that they depend on, and 2) do not have processes in place to return value captured to data producers and stakeholders. Existing conceptions of data rights and protection (focusing largely on individually-owned data and associated privacy concer…

    Submitted 20 March, 2023; originally announced March 2023.

  6. arXiv:2105.13515  [pdf, ps, other]

    cs.CY

    Vaccine Credential Technology Principles

    Authors: Divya Siddarth, Vi Hart, Bethan Cantrell, Kristina Yasuda, Josh Mandel, Karen Easterbrook

    Abstract: The historically rapid development of effective COVID-19 vaccines has policymakers facing evergreen public health questions regarding vaccination records and verification. Governments and institutions around the world are already taking action on digital vaccine certificates, including guidance and recommendations from the European Commission, the WHO, and the Biden Administration. These could be…

    Submitted 27 May, 2021; originally announced May 2021.

  7. arXiv:2008.05300  [pdf]

    cs.CR

    Who Watches the Watchmen? A Review of Subjective Approaches for Sybil-resistance in Proof of Personhood Protocols

    Authors: Divya Siddarth, Sergey Ivliev, Santiago Siri, Paula Berman

    Abstract: Most current self-sovereign identity systems may be categorized as strictly objective, consisting of cryptographically signed statements issued by trusted third party attestors. This failure to provide an input for subjectivity accounts for a central challenge: the inability to address the question of "Who verifies the verifier?". Instead, these protocols outsource their legitimacy to mechanisms b…

    Submitted 13 October, 2020; v1 submitted 26 July, 2020; originally announced August 2020.

  8. arXiv:2008.03263  [pdf, other]

    cs.SI physics.soc-ph

    COVID, BLM, and the polarization of US politicians on Twitter

    Authors: Anmol Panda, Divya Siddarth, Joyojeet Pal

    Abstract: We mapped the tweets of 520 US Congress members, focusing on analyzing their engagement with two broad topics: first, the COVID-19 pandemic, and second, the recent wave of anti-racist protest. We find that, in discussing COVID-19, Democrats frame the issue in terms of public health, while Republicans are more likely to focus on small businesses and the economy. When looking at the discourse around…

    Submitted 7 August, 2020; originally announced August 2020.

    Comments: 8 pages, 6 figures