-
lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Python ★ 3.8k 1mo ago
-
llm_distillation_playbook
Best practices for distilling large language models.
Jupyter Notebook ★ 633 2y ago
-
lora_bakeoff
No description.
Python ★ 21 1y ago
-
json-mode-benchmark
No description.
Jupyter Notebook ★ 7 2y ago
-
neuropod ⑂
A uniform interface to run deep learning models from multiple frameworks
★ 3 4y ago
-
punica ⑂
Serving multiple LoRA finetuned LLM as one
Cuda ★ 2 2y ago
-
dask-bigquery ⑂
No description.
Python ★ 1 4y ago
-
dask-sql ⑂
Distributed SQL Engine in Python using Dask
Python ★ 1 4y ago
-
litellm ⑂
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]
Python ★ 0 2h ago
-
seldon-core ⑂ ▣
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
HTML ★ 0 9mo ago
-
volcano-apis ⑂ ▣
The API (CRD) of Volcano
Go ★ 0 2y ago
-
volcano ⑂ ▣
A Cloud Native Batch System (Project under CNCF)
Go ★ 0 2y ago
-
huggingface_hub ⑂
The official Python client for the Huggingface Hub.
Python ★ 0 2y ago
-
llama_index ⑂
LlamaIndex (GPT Index) is a data framework for your LLM applications
Python ★ 0 2y ago
-
langchain ⑂
⚡ Building applications with LLMs through composability ⚡
Python ★ 0 2y ago
-
kubernetes-image-puller ⑂
Kubernetes Image Puller is used for caching images on a cluster. It creates a DaemonSet downloading and running the relevant container images on each node.
★ 0 3y ago
-
PyBump ⑂
Bump version in Helm Chart.yaml and setup.py files
★ 0 3y ago
-
server ⑂
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
★ 0 3y ago
-
last-successful-commit-action ⑂
GitHub action for identifying the last successful commit for a given workflow and branch.
★ 0 5y ago