imitation ★ PINNED
Clean PyTorch implementations of imitation and reward learning algorithms
Python ★ 1.8k 1y ago
overcooked_ai ★ PINNED
A benchmark environment for fully cooperative human-AI performance.
Jupyter Notebook ★ 982 1y ago
rlsp ★ PINNED
Reward Learning by Simulating the Past
Python ★ 46 7y ago
adversarial-policies ★ PINNED
Find best-response to a fixed policy in multi-agent RL
Python ★ 288 4y ago
evaluating-rewards ★ PINNED
Library to compare and evaluate reward functions
Python ★ 69 2y ago
human_aware_rl ★ PINNED
Code for "On the Utility of Learning about Humans for Human-AI Coordination"
Python ★ 112 3y ago
tensor-trust
A prompt injection game to collect data for robust ML research
Python ★ 71 1y ago
overcooked-demo
Web application where humans can play Overcooked with AI agents.
JavaScript ★ 60 3y ago
tensor-trust-data
Dataset for the Tensor Trust project
Jupyter Notebook ★ 49 2y ago
seals
Benchmark environments for reward modelling and imitation learning algorithms.
Python ★ 46 2y ago
eirli
An Empirical Investigation of Representation Learning for Imitation (EIRLI), NeurIPS'21
Python ★ 37 3y ago
ranking-challenge
Testing ranking algorithms to improve social cohesion
Python ★ 32 1y ago
learning-from-human-preferences
Reproduction of OpenAI and DeepMind's "Deep Reinforcement Learning from Human Preferences"
Python ★ 31 5y ago
leela-interp
Code for "Evidence of Learned Look-Ahead in a Chess-Playing Neural Network"
Jupyter Notebook ★ 29 2y ago
atari-irl
No description.
Python ★ 28 7y ago
population-irl
(Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards
Python ★ 27 7y ago
deep-rlsp
Code accompanying "Learning What To Do by Simulating the Past", ICLR 2021.
Python ★ 27 5y ago
learning_biases
Infer how suboptimal agents are suboptimal while planning, for example if they are hyperbolic time discounters.
Jupyter Notebook ★ 25 5y ago
human_ai_robustness
No description.
Python ★ 22 6y ago
overcooked-hAI-exp
Overcooked-AI Experiment Psiturk Demo (for MTurk experiments)
JavaScript ★ 13 5y ago
better-adversarial-defenses
Training in bursts for defending against adversarial policies
Python ★ 10 5y ago
interpreting-rewards
Experiments in applying interpretability techniques to learned reward functions.
Jupyter Notebook ★ 10 5y ago
HighJax
Highway driving simulation in JAX for Reinforcement Learning research
Rust ★ 8 2mo ago
nn-clustering-pytorch
Checking the divisibility of neural networks, and investigating the nature of the pieces networks can be divided into.
Python ★ 6 3y ago
recon-email
Script for automatically creating the reconnaissance email.
HTML ★ 5 4y ago
assistance-games
Supporting code for Assistance Games as a Framework paper
Python ★ 4 4y ago
reward-preprocessing
Preprocessing reward functions to make them more interpretable
Python ★ 4 4y ago
multiagent-competition ⑂
Code for the paper "Emergent Complexity via Multi-agent Competition"
Python ★ 4 4y ago
PARETO
Dataset for the paper: Political Neutrality as Balanced Approval: A Large-Scale Human Evaluation of AI Responses
Jupyter Notebook ★ 3 28d ago
reducing-exploitability
No description.
Python ★ 3 3y ago
stable-baselines3 ⑂
PyTorch version of Stable Baselines, improved implementations of reinforcement learning algorithms.
Python ★ 3 4y ago
dmc2gym ⑂
OpenAI Gym wrapper for the DeepMind Control Suite
Python ★ 2 4y ago
derail
Supporting code for diagnostic seals paper
Python ★ 2 5y ago
multi-agent
No description.
Python ★ 2 7y ago
minerl ⑂
MineRL Competition for Sample Efficient Reinforcement Learning - Python Package
Python ★ 2 5y ago
ranking-challenge-perspective
Prosocial Ranking Challenge Perspective Ranker
Jupyter Notebook ★ 1 1y ago
reward-function-interpretability
No description.
Jupyter Notebook ★ 1 2y ago
simulation-awareness
(experimental) RL agents should be more aligned if they do not know whether they are in simulation or in the real world
Python ★ 1 8y ago
sacred ⑂
Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.
Python ★ 1 4y ago
logical-active-classification
Use active learning to classify data represented as boundaries of regions in parameter space where a parametrised logical formula holds.
Python ★ 1 7y ago
ilqr ⑂
Iterative Linear Quadratic Regulator with auto-differentiable dynamics models
Python ★ 1 7y ago
rl-baselines3-zoo ⑂
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
Python ★ 1 3y ago
cs294-149-fa18-notes
LaTeX Notes from the Fall 2018 version of CS294-149: AGI Safety and Control
TeX ★ 1 7y ago
rc-submission-dante ⑂
PRC: Testing ranking algorithms to improve social cohesion
JavaScript ★ 0 1y ago
rc-submission-civirank ⑂
PRC: Civirank submission
★ 0 1y ago
ray ⑂
A fast and simple framework for building and running distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Python ★ 0 3y ago
sgf-viewer
A simple webpage that can visualize an SGF string encoded as a URL fragment.
CSS ★ 0 3y ago
katago-driver-bug-repro
Docker files to help reproduce bug described in https://forums.developer.nvidia.com/t/kernel-oops-null-pointer-dereference-when-closing-cuda-application-katago/211270/3
Dockerfile ★ 0 4y ago
pytorch-summary ⑂
Model summary in PyTorch similar to `model.summary()` in Keras
Python ★ 0 4y ago
slack-diskbot
Low disk space alerts posted to Slack
Python ★ 0 4y ago
malmo ⑂
Project Malmo is a platform for Artificial Intelligence experimentation and research built on top of Minecraft. We aim to inspire a new generation of research into challenging new problems presented by this unique environment.
Java ★ 0 5y ago
gym ⑂
A toolkit for developing and comparing reinforcement learning algorithms.
Python ★ 0 6y ago
baselines ⑂
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Python ★ 0 6y ago
interactive-behaviour-design
No description.
Python ★ 0 7y ago
scenario_runner ⑂
Traffic scenario definition and execution engine
Python ★ 0 7y ago
coiltraine ⑂
Training framework for conditional imitation learning
Python ★ 0 7y ago
carla-autoware ⑂
Integration of AutoWare AV software with the CARLA simulator
Python ★ 0 7y ago
interactive-behaviour-design-basicfetch
No description.
Python ★ 0 7y ago
interactive-behaviour-design-gym
No description.
Python ★ 0 7y ago
interactive-behaviour-design-baselines
No description.
HTML ★ 0 7y ago