OpenChatKit
No description.
Python ★ 9.0k 2y ago
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Python ★ 5.0k 17d ago
MoA
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
Python ★ 2.9k 1y ago
together-cookbook
A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.
Jupyter Notebook ★ 1.1k 11d ago
stripedhyena
Repository for StripedHyena, a state-of-the-art beyond-Transformer architecture
Python ★ 433 2y ago
open_deep_research
Together Open Deep Research
Python ★ 374 1y ago
open-data-scientist
Open AI data scientist agent that automates complex data analysis tasks using the ReAct framework. Execute Python code locally or in the cloud, upload datasets, and generate detailed analytical reports with minimal setup.
Python ★ 187 5mo ago
OpenDataHub
No description.
★ 128 3y ago
EinsteinArena-new-SOTA
New state-of-the-art bounds for open problems
Jupyter Notebook ★ 122 2mo ago
finetuning
Finetune Llama-3-8b on the MathInstruct dataset
Python ★ 117 1y ago
redpajama.cpp ⑂
Extends the original llama.cpp repo to support the RedPajama model.
C ★ 117 1y ago
Llama-2-7B-32K-Instruct
No description.
Python ★ 84 2y ago
together-python
The Official Python Client for Together's API
Python ★ 81 1mo ago
Dragonfly
No description.
Python ★ 81 1y ago
llamaindex-chatbot
A RAG Chatbot with Next.js, Together.ai and Llama Index
TypeScript ★ 72 1y ago
aurora
No description.
Python ★ 67 1mo ago
together-typescript
The official Together AI TypeScript library.
TypeScript ★ 66 2d ago
flash-attention-3 ⑂
Fast and memory-efficient exact attention
★ 33 1y ago
skills
Skills to help your coding agents use Together AI products.
Python ★ 32 2d ago
saw-int4
Official implementation of the paper "System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving"
Shell ★ 27 2mo ago
reviewing-agents
No description.
Python ★ 20 6mo ago
xorl
XoRL
Python ★ 11 1d ago
together-py
No description.
Python ★ 10 1d ago
CREST
CREST is a training-free test-time steering framework that discovers cognitive heads via simple offline calibration and then rotates activations during decoding to guide the model’s reasoning—preserving norms to avoid per-model hyperparameter tuning. This improves accuracy and reduces tokens across models and datasets.
Python ★ 10 5mo ago
transformers_port ⑂
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python ★ 7 1y ago
open-models-api ▣
No description.
Python ★ 6 3y ago
diffusers ⑂
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
Python ★ 4 1mo ago
Decentralized_Training ⑂
No description.
★ 4 3y ago
SMiR
Synthetic data pipeline for multi-image reasoning
Python ★ 3 1y ago
UniversalSD ⑂
Universal Stable Diffusion Pipeline(s) with Flash Attention
Python ★ 3 1y ago
FT_Llama2 ⑂
Transformer related optimization, including BERT, GPT
C++ ★ 3 1y ago
together-sandbox
SDKs and CLIs for working with Together Sandboxes
Python ★ 2 2d ago
ssd ⑂
A lightweight inference engine supporting speculative speculative decoding (SSD).
★ 2 19d ago
together-chat ⑂
Streamlit Component, for a Chatbot UI
★ 2 2y ago
vllm-ttgi ⑂
A high-throughput and memory-efficient inference and serving engine for LLMs
Python ★ 2 2y ago
gpt-neox ⑂
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Python ★ 2 3y ago
together-go
Official Together AI Go library
Go ★ 1 2d ago
stytch-elixir
Elixir Client for the Stytch B2B SaaS authentication API
Elixir ★ 1 12d ago
sprocket
Sprocket SDK for building inference workers on Together Dedicated Containers
Python ★ 1 1mo ago
autoscaler ⑂
Autoscaling components for Kubernetes
★ 1 5mo ago
llm-awq-ttgi ⑂
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Python ★ 1 2y ago
langchain ⑂
⚡ Building applications with LLMs through composability ⚡
★ 1 1y ago
vllm ⑂
A high-throughput and memory-efficient inference and serving engine for LLMs
Python ★ 1 1y ago
flash-attention ⑂
Fast and memory-efficient exact attention
Python ★ 1 2y ago
FT_Redpajama ⑂
Transformer related optimization, including BERT, GPT
C++ ★ 1 2y ago
H3 ⑂
Together port to run H3
Assembly ★ 1 2y ago
Port_FasterTransformer ⑂
Transformer related optimization, including BERT, GPT
C++ ★ 1 3y ago
FT_Bloomchat ⑂
Transformer related optimization, including BERT, GPT
C++ ★ 1 3y ago
native_hf_models-slim ⑂
No description.
Python ★ 1 3y ago
detect_agent
No description.
Python ★ 0 2d ago
InferenceX ⑂
Open Source Continuous Inference Benchmark Research Platform Kimi K2.6, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soon™ TPUv6e/v7/Trainium2/3
Python ★ 0 5d ago
sglang-private ⑂
SGLang is a high-performance serving framework for large language models and multimodal models.
Python ★ 0 2mo ago
elixir-common
Shared modules and utilities for Elixir services at Together
Elixir ★ 0 13d ago
tinker-cookbook ⑂
Post-training with Tinker
★ 0 15d ago
archipelago
Archipelago: eval framework for AI agents on professional services tasks
Python ★ 0 23d ago
terraform-provider-together
No description.
Go ★ 0 2d ago
ib-kubernetes ⑂
No description.
Go ★ 0 2d ago
SearchScales
No description.
★ 0 1mo ago
slurm-operator ⑂
This project provides a framework that runs Slurm in Kubernetes.
Go ★ 0 24d ago
together-kubelogin
No description.
Go ★ 0 3d ago
keep-talking
Fast BPE tokenizer
Rust ★ 0 4mo ago
OpenHands
Private fork of https://github.com/All-Hands-AI/OpenHands.git
Python ★ 0 1mo ago
k8s-netperf ⑂
Running Networking Performance Tests against K8s
★ 0 1mo ago
DeepGEMM ⑂
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
★ 0 29d ago
llama-stack ⑂
Model components of the Llama Stack APIs
★ 0 1y ago
sglang-mla-rotation ⑂
SGLang is a high-performance serving framework for large language models and multimodal models.
★ 0 2mo ago
xorl-wheels
Pre-compiled wheels
★ 0 20d ago
Mooncake ⑂
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
★ 0 2mo ago
xorl-sglang
No description.
Python ★ 0 2mo ago
xorl-client
XoRL Client
Python ★ 0 8d ago
together-VeOmni ⑂
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
★ 0 3mo ago
together-nccl-tests ⑂
NCCL Tests
★ 0 1mo ago
together-dgxc-benchmarking ⑂
DGXC Benchmarking provides recipes in ready-to-use templates for evaluating performance of specific AI use cases across hardware and software combinations.
★ 0 22d ago
slurm-deb-packages ⑂
Slurm debian packages
Dockerfile ★ 0 3d ago
rllm-public ⑂
Democratizing Reinforcement Learning for LLMs
★ 0 7mo ago
genai-bench ⑂
Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.
★ 0 29d ago
torchtune ⑂
PyTorch native post-training library
★ 0 9mo ago
core-dump-handler ⑂
Save core dumps from a Kubernetes Service or RedHat OpenShift to an S3 protocol compatible object store
★ 0 9mo ago
OLMo-snapshot ⑂
Modeling, training, eval, and inference code for OLMo
★ 0 10mo ago
gorilla ⑂
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
Python ★ 0 11mo ago
Kokoro-FastAPI ⑂
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
Python ★ 0 11mo ago
torchtitan ⑂
A PyTorch native platform for training generative AI models
★ 0 11mo ago
k8s_gateway-fork ⑂
A CoreDNS plugin to resolve all types of external Kubernetes resources
★ 0 1y ago
k8s_gateway ⑂
A CoreDNS plugin to resolve all types of external Kubernetes resources
Go ★ 0 1y ago
kubevirt-full ⑂ ▣
Kubernetes Virtualization API and runtime in order to define and manage virtual machines.
★ 0 1y ago
kubevirt ⑂
Kubernetes Virtualization API and runtime in order to define and manage virtual machines.
Go ★ 0 3mo ago
sriov-network-operator ⑂
Operator for provisioning and configuring SR-IOV CNI plugin and device plugin
Go ★ 0 2mo ago
flyte ⑂
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
★ 0 1y ago
freeipa ⑂
Mirror of FreeIPA, an integrated security information management solution
★ 0 1y ago
cluster-api-provider-kubevirt ⑂
Cluster API Provider for KubeVirt
★ 0 1y ago
maas-ansible-playbook ⑂
An Ansible playbook for installing and configuring MAAS
★ 0 1y ago
accelerate ⑂
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
★ 0 1y ago
js-eventsource ⑂
EventSource client for Node.js and Browser (polyfill)
JavaScript ★ 0 1y ago
Sequoia ⑂
scalable and robust tree-based speculative decoding algorithm
★ 0 2y ago
TensorRT-LLM ⑂
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
★ 0 1y ago
helm ⑂
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110).
Python ★ 0 1y ago
lm-evaluation-harness ⑂
A framework for few-shot evaluation of autoregressive language models.
★ 0 2y ago
FasterTransformer ⑂
Transformer related optimization, including BERT, GPT
C++ ★ 0 2y ago