apartresearch

Artificial intelligence will change the world. Our mission is to ensure this happens safely and to the benefit of everyone.

Apart facilitates new research in AI safety, towards reducing societal-scale risks from the technology.

We combine a community focus with a drive for high-quality security research.


Read more about our work:

  • Our Research — Foundational research for safe and beneficial advanced AI
  • Apart Lab — Our research fellowship program for aspiring researchers in AI safety
  • Apart Sprints — Weekend-long research sprints and hackathons for AI security and governance

Pinned

  1. interpretability-starter Public

    🧠 Starter templates for doing interpretability research

    76 2

  2. Neuron2Graph Public

    Tools for exploring Transformer neuron behaviour, including input pruning and diversification.

    Jupyter Notebook 23 5

  3. deepdecipher Public

    🦠 DeepDecipher: An open source API to MLP neurons

    Rust 9

  4. specificityplus Public

    👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"

    Python 20 4

  5. Integer_Addition Public

    ✱ Understanding the underlying learning dynamics of simple tasks in Transformer networks

    Jupyter Notebook 18

  6. readingwhatwecan Public

    📚📚📚📚📚📚📚📚📚 Reading everything

    CSS 15 5

Repositories

Showing 10 of 41 repositories
  • readingwhatwecan Public

    📚📚📚📚📚📚📚📚📚 Reading everything

    CSS 15 5 1 0 Updated Sep 12, 2025
  • darkbench-ai Public
    TypeScript 2 0 0 0 Updated Jul 22, 2025
  • public_images Public

    Public images, logos, from Apart that need a public url

    0 0 0 0 Updated Jun 5, 2025
  • prompt-worms Public

    🎣 Breaking and entering through language model memory and context

    Python 3 MIT 0 4 0 Updated May 27, 2025
  • DarkBench Public Forked from sjawhar/DarkBench

    Benchmarking Dark Patterns in LLMs (ICLR 2025)

    Python 13 MIT 9 0 1 Updated Mar 29, 2025
  • partner-site Public

    A website for partners to engage with Apart.

    HTML 0 0 0 0 Updated Feb 9, 2025
  • Interpreting-Learned-Feedback-Patterns Public

    ✱ Interpreting learned feedback patterns in large language models

    Jupyter Notebook 4 MIT 2 7 0 Updated Jan 8, 2025
  • team-sync-lab Public
    TypeScript 0 0 0 0 Updated Nov 24, 2024
  • seqcont_circuits Public

    ✱ Interpreting how similar sequence continuation tasks share internal representations ✱

    Jupyter Notebook 2 MIT 2 1 0 Updated Nov 9, 2024
  • 3cb Public

    3cb: Catastrophic Cyber Capabilities Benchmarking of Large Language Models

    Python 13 4 2 1 Updated Oct 30, 2024
