Together ORG

@togethercomputer

98 repos
1.1k followers
0 following

Python 69%
Go 9%
Jupyter Notebook 6%
TypeScript 6%
Elixir 6%

All public repos (98)

OpenChatKit

OpenChatKit is a toolkit for running and fine-tuning open-source conversational AI models on your own hardware, including a 20-billion-parameter chat model and support for retrieval-augmented responses from custom documents.

Python ★ 9.0k 2y ago
RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python ★ 5.0k 17d ago
MoA

Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models

Python ★ 2.9k 1y ago
together-cookbook

A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.

Jupyter Notebook ★ 1.1k 11d ago
stripedhyena

Repository for StripedHyena, a state-of-the-art beyond-Transformer architecture

Python ★ 433 2y ago
open_deep_research

Together Open Deep Research

Python ★ 374 1y ago
open-data-scientist

Open AI data scientist agent that automates complex data analysis tasks using the ReAct framework. Execute Python code locally or in the cloud, upload datasets, and generate detailed analytical reports with minimal setup.

Python ★ 187 5mo ago
OpenDataHub

No description.

★ 128 3y ago
EinsteinArena-new-SOTA

New state-of-the-art bounds for open problems

Jupyter Notebook ★ 122 2mo ago
finetuning

Finetune Llama-3-8b on the MathInstruct dataset

Python ★ 117 1y ago
redpajama.cpp ⑂

Extends the original llama.cpp repo to support the RedPajama model.

C ★ 117 1y ago
Llama-2-7B-32K-Instruct

No description.

Python ★ 84 2y ago
together-python

The Official Python Client for Together's API

Python ★ 81 1mo ago
Dragonfly

No description.

Python ★ 81 1y ago
llamaindex-chatbot

A RAG Chatbot with Next.js, Together.ai and Llama Index

TypeScript ★ 72 1y ago
aurora

No description.

Python ★ 67 1mo ago
together-typescript

The official Together AI TypeScript library.

TypeScript ★ 66 2d ago
flash-attention-3 ⑂

Fast and memory-efficient exact attention

★ 33 1y ago
skills

Skills to help your coding agents use Together AI products.

Python ★ 32 2d ago
saw-int4

Official implementation of the paper "System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving"

Shell ★ 27 2mo ago
reviewing-agents

No description.

Python ★ 20 6mo ago
xorl

XoRL

Python ★ 11 1d ago
together-py

No description.

Python ★ 10 1d ago
CREST

CREST is a training-free test-time steering framework that discovers cognitive heads via simple offline calibration and then rotates activations during decoding to guide the model’s reasoning—preserving norms to avoid per-model hyperparameter tuning. This improves accuracy and reduces tokens across models and datasets.

Python ★ 10 5mo ago
transformers_port ⑂

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python ★ 7 1y ago
open-models-api ▣

No description.

Python ★ 6 3y ago
diffusers ⑂

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Python ★ 4 1mo ago
Decentralized_Training ⑂

No description.

★ 4 3y ago
SMiR

synthetic data pipeline for multi-image reasoning

Python ★ 3 1y ago
UniversalSD ⑂

Universal Stable Diffusion Pipeline(s) with Flash Attention

Python ★ 3 1y ago
FT_Llama2 ⑂

Transformer related optimization, including BERT, GPT

C++ ★ 3 1y ago
together-sandbox

SDKs and CLIs for working with Together Sandboxes

Python ★ 2 2d ago
ssd ⑂

A lightweight inference engine supporting speculative speculative decoding (SSD).

★ 2 19d ago
together-chat ⑂

Streamlit Component, for a Chatbot UI

★ 2 2y ago
vllm-ttgi ⑂

A high-throughput and memory-efficient inference and serving engine for LLMs

Python ★ 2 2y ago
gpt-neox ⑂

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

Python ★ 2 3y ago
together-go

Official Together AI Go library

Go ★ 1 2d ago
stytch-elixir

Elixir Client for the Stytch B2B SaaS authentication API

Elixir ★ 1 12d ago
sprocket

Sprocket SDK for building inference workers on Together Dedicated Containers

Python ★ 1 1mo ago
autoscaler ⑂

Autoscaling components for Kubernetes

★ 1 5mo ago
llm-awq-ttgi ⑂

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python ★ 1 2y ago
langchain ⑂

⚡ Building applications with LLMs through composability ⚡

★ 1 1y ago
vllm ⑂

A high-throughput and memory-efficient inference and serving engine for LLMs

Python ★ 1 1y ago
flash-attention ⑂

Fast and memory-efficient exact attention

Python ★ 1 2y ago
FT_Redpajama ⑂

Transformer related optimization, including BERT, GPT

C++ ★ 1 2y ago
H3 ⑂

Together port to run H3

Assembly ★ 1 2y ago
Port_FasterTransformer ⑂

Transformer related optimization, including BERT, GPT

C++ ★ 1 3y ago
FT_Bloomchat ⑂

Transformer related optimization, including BERT, GPT

C++ ★ 1 3y ago
native_hf_models-slim ⑂

No description.

Python ★ 1 3y ago
detect_agent

No description.

Python ★ 0 2d ago
InferenceX ⑂

Open Source Continuous Inference Benchmark Research Platform Kimi K2.6, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3

Python ★ 0 5d ago
sglang-private ⑂

SGLang is a high-performance serving framework for large language models and multimodal models.

Python ★ 0 2mo ago
elixir-common

Shared modules and utilities for Elixir services at Together

Elixir ★ 0 13d ago
tinker-cookbook ⑂

Post-training with Tinker

★ 0 15d ago
archipelago

Archipelago: eval framework for AI agents on professional services tasks

Python ★ 0 23d ago
terraform-provider-together

No description.

Go ★ 0 2d ago
ib-kubernetes ⑂

No description.

Go ★ 0 2d ago
SearchScales

No description.

★ 0 1mo ago
slurm-operator ⑂

This project provides a framework that runs Slurm in Kubernetes.

Go ★ 0 24d ago
together-kubelogin

No description.

Go ★ 0 3d ago
keep-talking

fast bpe tokenizer

Rust ★ 0 4mo ago
OpenHands

Private fork of https://github.com/All-Hands-AI/OpenHands.git

Python ★ 0 1mo ago
k8s-netperf ⑂

Running Networking Performance Tests against K8s

★ 0 1mo ago
DeepGEMM ⑂

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

★ 0 29d ago
llama-stack ⑂

Model components of the Llama Stack APIs

★ 0 1y ago
sglang-mla-rotation ⑂

SGLang is a high-performance serving framework for large language models and multimodal models.

★ 0 2mo ago
xorl-wheels

Pre-compiled wheels

★ 0 20d ago
Mooncake ⑂

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

★ 0 2mo ago
xorl-sglang

No description.

Python ★ 0 2mo ago
xorl-client

XoRL Client

Python ★ 0 8d ago
together-VeOmni ⑂

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

★ 0 3mo ago
together-nccl-tests ⑂

NCCL Tests

★ 0 1mo ago
together-dgxc-benchmarking ⑂

DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and software combinations.

★ 0 22d ago
slurm-deb-packages ⑂

Slurm debian packages

Dockerfile ★ 0 3d ago
rllm-public ⑂

Democratizing Reinforcement Learning for LLMs

★ 0 7mo ago
genai-bench ⑂

Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

★ 0 29d ago
torchtune ⑂

PyTorch native post-training library

★ 0 9mo ago
core-dump-handler ⑂

Save core dumps from a Kubernetes Service or RedHat OpenShift to an S3 protocol compatible object store

★ 0 9mo ago
OLMo-snapshot ⑂

Modeling, training, eval, and inference code for OLMo

★ 0 10mo ago
gorilla ⑂

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python ★ 0 11mo ago
Kokoro-FastAPI ⑂

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching

Python ★ 0 11mo ago
torchtitan ⑂

A PyTorch native platform for training generative AI models

★ 0 11mo ago
k8s_gateway-fork ⑂

A CoreDNS plugin to resolve all types of external Kubernetes resources

★ 0 1y ago
k8s_gateway ⑂

A CoreDNS plugin to resolve all types of external Kubernetes resources

Go ★ 0 1y ago
kubevirt-full ⑂ ▣

Kubernetes Virtualization API and runtime in order to define and manage virtual machines.

★ 0 1y ago
kubevirt ⑂

Kubernetes Virtualization API and runtime in order to define and manage virtual machines.

Go ★ 0 3mo ago
sriov-network-operator ⑂

Operator for provisioning and configuring SR-IOV CNI plugin and device plugin

Go ★ 0 2mo ago
flyte ⑂

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

★ 0 1y ago
freeipa ⑂

Mirror of FreeIPA, an integrated security information management solution

★ 0 1y ago
cluster-api-provider-kubevirt ⑂

Cluster API Provider for KubeVirt

★ 0 1y ago
maas-ansible-playbook ⑂

An Ansible playbook for installing and configuring MAAS

★ 0 1y ago
accelerate ⑂

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

★ 0 1y ago
js-eventsource ⑂

EventSource client for Node.js and Browser (polyfill)

JavaScript ★ 0 1y ago
Sequoia ⑂

scalable and robust tree-based speculative decoding algorithm

★ 0 2y ago
TensorRT-LLM ⑂

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

★ 0 1y ago
helm ⑂

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).

Python ★ 0 1y ago
lm-evaluation-harness ⑂

A framework for few-shot evaluation of autoregressive language models.

★ 0 2y ago
FasterTransformer ⑂

Transformer related optimization, including BERT, GPT

C++ ★ 0 2y ago
