gitmyhub

instructor

Python ★ 13k updated 6d ago

structured outputs for llms

Instructor is a Python library that makes AI models return structured, validated data instead of raw text. You define a data shape, and it handles extraction, validation, and retries automatically.

Python · Pydantic · TypeScript · Ruby · Go · Elixir · Rust · setup: easy · complexity 2/5

Instructor is a Python library that makes it simpler to get structured, predictable data back from AI language models. Instead of receiving raw text that you then have to parse and validate, you define a data shape using Python models (from a library called Pydantic) and pass it to Instructor, which handles extracting the correct fields, validating the result, and retrying if the AI makes a mistake.

The problem it solves is common when building AI applications: language models produce free-form text, but your code usually wants a specific structure, like a user name and age, or a list of products with prices. Getting that reliably means writing JSON schema definitions, parsing the response, checking for missing fields, and re-running on failures. Instructor wraps that entire process into a single clean call.
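To make the manual work concrete, here is a minimal standard-library sketch of the parse, check, and retry loop described above. It is not Instructor's implementation. The `ask_model` callable, the `User` shape, and the retry prompt wording are all hypothetical stand-ins chosen for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class User:
    name: str
    age: int


def parse_user(raw: str) -> User:
    """Parse model output, raising ValueError that describes any problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in ("name", "age") if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not isinstance(data["name"], str) or not isinstance(data["age"], int):
        raise ValueError("name must be a string and age an integer")
    return User(name=data["name"], age=data["age"])


def extract_user(
    ask_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> User:
    """Call the model; on an invalid reply, ask again with the error appended."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = ask_model(current_prompt)
        try:
            return parse_user(raw)
        except ValueError as exc:
            current_prompt = (
                f"{prompt}\n\nYour previous reply was invalid ({exc}). "
                "Reply with JSON only."
            )
    raise RuntimeError(f"no valid response after {max_attempts} attempts")


# Hypothetical stand-in for a real model: one bad reply, then a good one.
replies = iter(["Sure! The user is Ada.", '{"name": "Ada", "age": 36}'])
user = extract_user(lambda prompt: next(replies), "Extract the user from: Ada is 36.")
print(user)
```

Every new data shape needs its own version of `parse_user`, which is the repetitive part a schema-driven library removes.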

The library works with most major AI providers using the same code pattern. You configure a client pointing at OpenAI, Anthropic, Google, Groq, or a locally running model, then make a single call supplying your data shape and your message. Instructor connects to the provider, applies the structured output method that provider supports, validates the result against your definition, and returns a ready-to-use Python object.

Additional features include streaming partial results as they are generated (so your interface can update progressively), extracting nested or list-type structures, and setting a retry count so that invalid responses are automatically sent back to the model with error feedback. These features handle edge cases that raw JSON mode or manual approaches tend to miss.

Instructor is available for Python, TypeScript, Ruby, Go, Elixir, and Rust, each with its own documentation. The Python version is the original and most complete. The project is open source under the MIT license and reports over three million monthly downloads.

Where it fits