-
trlx ★ PINNED
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Python ★ 4.7k 2y ago
-
cheese ★ PINNED
Used for adaptive human-in-the-loop evaluation of language and embedding models.
Python ★ 306 3y ago
-
OpenELM ★ PINNED
Evolution Through Large Models
Python ★ 741 2y ago
-
DRLX ★ PINNED
Diffusion Reinforcement Learning Library
Python ★ 195 2y ago
-
Code-Pile
This repository contains all the code for collecting large scale amounts of code from GitHub.
Python ★ 110 3y ago
-
autocrit
A repository for transformer critique learning and generation
Python ★ 89 2y ago
-
InstructGPT
For experiments involving InstructGPT. Currently used for documenting open research questions.
★ 71 3y ago
-
squeakily
A library for squeakily cleaning and filtering language datasets.
Jupyter Notebook ★ 50 3y ago
-
Algorithm-Distillation-RLHF
No description.
Python ★ 35 3y ago
-
decontamination
This repository contains code for cleaning your training data of benchmark data to help combat data snooping.
Jupyter Notebook ★ 28 3y ago
-
treasure_trove
No description.
Jupyter Notebook ★ 22 2y ago
-
nmmo-environment ⑂
Neural MMO - A Massively Multiagent Environment for Artificial Intelligence Research
Python ★ 15 2y ago
-
CodeReviewSE
Stuff related to scraping the Code Review StackExchange
Python ★ 12 3y ago
-
pilev2 ⑂
No description.
Python ★ 11 3y ago
-
magicarp-v2
magiCARP is an API used for crossencoder training.
Python ★ 9 2y ago
-
nmmo-baselines ⑂
Baselines for Neural MMO -- new users should treat this repo as a starter project
Python ★ 7 1y ago
-
ArchitextRL
No description.
Python ★ 7 3y ago
-
Polygraph
RLHF Mechanistic Interpretability and Deception
★ 6 2y ago
-
data-preparation ⑂
Code used for sourcing and cleaning the BigScience ROOTS corpus
E ★ 4 3y ago
-
FastChat ⑂
An open platform for training, serving, and evaluating large language model based chatbots.
★ 4 3y ago
-
AutoPaperclipMaximizer
👀
★ 3 3y ago
-
sft
No description.
Python ★ 2 3y ago
-
contriever ⑂
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
Python ★ 2 3y ago
-
maxtext ⑂
A simple, performant and scalable Jax LLM!
Python ★ 1 3y ago
-
diversity_metrics
No description.
Jupyter Notebook ★ 1 2y ago
-
goosebox
sandboxed eval server for running code snippets
★ 1 3y ago
-
tinypar ⑂
No description.
Python ★ 0 2y ago