RedPajama-Data

Python · ★ 5.0k · updated 17d ago

The RedPajama-Data repository contains code for preparing large datasets for training large language models.
