kreuzberg-dev.r-universe.dev
R-universe package registry for the Kreuzberg organization.
Packages
| Package | Source Repository | Description |
|---------|-------------------|-------------|
| kreuzberg | kreuzberg-dev/kreuzberg | Extract text and metadata from 90+ file formats |
| htmltomarkdown | kreuzberg-dev/html-to-markdown | High-performance HTML to Markdown converter |
Installation
```r
# Install kreuzberg
install.packages("kreuzberg", repos = "https://kreuzberg-dev.r-universe.dev")

# Install htmltomarkdown
install.packages("htmltomarkdown", repos = "https://kreuzberg-dev.r-universe.dev")
```
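When `repos` points only at the R-universe URL, `install.packages()` looks for dependencies in that repository alone. A minimal sketch of one way around this, assuming the packages depend on something hosted on CRAN, is to pass both repositories. The CRAN mirror URL and the helper name `install_from_kreuzberg` are illustrative assumptions, not part of either package.

```r
# Repository list: the Kreuzberg R-universe first, then a CRAN mirror so that
# dependencies hosted on CRAN can still be resolved.
# The CRAN mirror URL is an assumption; substitute your preferred mirror.
kreuzberg_repos <- c(
  kreuzberg = "https://kreuzberg-dev.r-universe.dev",
  CRAN = "https://cloud.r-project.org"
)

# Hypothetical helper (not part of either package): install one or more
# packages using the combined repository list.
install_from_kreuzberg <- function(pkgs) {
  install.packages(pkgs, repos = kreuzberg_repos)
}

# Example call, commented out so that sourcing this file has no side effects:
# install_from_kreuzberg(c("kreuzberg", "htmltomarkdown"))
```

Defining the repository vector once keeps both installs pointed at the same sources. The same vector could also be set session-wide with `options(repos = kreuzberg_repos)`.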
Links
- kreuzberg (Rust, ★ 8.5k): A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via CLI, REST API, or MCP server.
- html-to-markdown (HTML, ★ 777): High-performance, CommonMark-compliant HTML to Markdown converter. Maintained by the Kreuzberg team. Kreuzberg is a fast, polyglot document intelligence engine with a Rust core. It extracts structured data from 56+ document formats using streaming parsers and built-in OCR.
- tree-sitter-language-pack (Rust, ★ 399): Comprehensive tree-sitter grammar compilation with polyglot bindings: Rust, Python, Node.js, Go, Java, Ruby, Elixir, PHP, C#, WASM, Dart, Kotlin-Android, Swift, Zig, and CLI. 306+ languages.
- liter-llm (Rust, ★ 211): Universal LLM API client with 142+ providers and 11 native language bindings, powered by a Rust core.
- kreuzcrawl (Rust, ★ 113): High-performance web crawling engine with bindings for 11 languages.
- alef (Rust, ★ 77): Generate fully typed, lint-clean language bindings for Rust libraries across 16 languages.
- kreuzberg-cloud (Rust, ★ 16): Cloud-native document extraction platform, available as SaaS at kreuzberg.dev or self-hosted on any Kubernetes cluster. 90+ formats, REST API, webhooks. Built on Kreuzberg.
- kreuzberg-surrealdb (Python, ★ 16): Extract, chunk, and embed documents from 88+ formats directly into SurrealDB.
- ai-rulez (★ 6): No description.
- langchain-kreuzberg (Python, ★ 5): LangChain document loader for Kreuzberg.
- paddle-to-onnx (C++, ★ 4): No description.
- plugins (Shell, ★ 3): Curated document-intelligence plugins for coding agents: kreuzberg, kreuzcrawl, kreuzberg-cloud.
- .github (★ 2): Kreuzberg is a fast, polyglot document intelligence engine with a Rust core. It extracts structured data from 97+ document formats using streaming parsers and built-in OCR. Designed for RAG pipelines, batch workloads, and production deployments.
- kreuzberg-txtai (Python, ★ 2): Kreuzberg integration for txtai, providing a drop-in Textractor replacement and a custom pipeline.
- kreuzberg-cloud-sdk (Go, ★ 2): Official client SDKs (Python, TypeScript, Go) for the Kreuzberg Cloud document-processing API.
- pre-commit-hooks (Shell, ★ 1): No description.
- llama-index-kreuzberg (Python, ★ 1): LlamaIndex reader and node parser integrations for kreuzberg, offering 88+ format document extraction with element-aware splitting.
- kreuzberg-spring-ai (Java, ★ 1): Spring AI DocumentReader integration for the Kreuzberg document extraction engine.
- kreuzberg-dev.r-universe.dev (★ 1): R-universe repository for Kreuzberg.dev.
- kreuzberg-crewai (Python, ★ 1): Extract text and metadata from 88+ document formats (PDF, DOCX, XLSX, HTML, images with OCR, and more) directly from your CrewAI agents.
- homebrew-tap (Ruby, ★ 0): No description.
- actions (Python, ★ 0): Shared GitHub Actions.
- test_documents (HTML, ★ 0): Various test documents used in Kreuzberg libraries for testing and benchmarking purposes.
- orp (fork, ★ 0): A Lightweight Framework for Building ONNX Runtime Pipelines with ort.
- gline-rs-fork (fork, ★ 0): Inference engine for GLiNER models, in Rust.
- prek (fork, Rust, ★ 0): ⚡ A fast Git hook manager written in Rust, designed as a drop-in alternative to pre-commit, reimagined.