@aisilab

AI Safety & Interpretability Lab

Popular repositories

  1. arbiter

    Run HuggingFace models through freeform questions and judge responses with an LLM.

    Python · 4 stars

  2. diffing-toolkit

    Forked from science-of-finetuning/diffing-toolkit

    A toolkit providing a range of model-diffing techniques, including a UI for visualizing them interactively.

    Python · 1 star · 1 fork

  3. aisilab.github.io

    Website of the AI Safety & Interpretability Lab at SDU

    HTML · 1 star

  4. Superadditive-cooperation-LLMs

    Forked from pippot/Superadditive-cooperation-LLMs

    A study of superadditive cooperation between Large Language Model agents in an Iterated Prisoner's Dilemma tournament

    Python

  5. Prolog-as-a-Tool

    Forked from niklasmellgren/grpo-prolog-inference

    Reinforcement fine-tuning of LLMs with GRPO to generate Prolog code for symbolic reasoning and inference

    Jupyter Notebook

