reader
Convert any URL to an LLM-friendly input with a simple prefix: https://r.jina.ai/
An open-source tool that converts any web page, PDF, or Office document into clean markdown when you prepend a prefix to its URL, so AI models can read and reason about real content without page noise and without a login.
Reader is an open-source tool from Jina AI that turns any web page, PDF, or document into clean text that AI models can work with. The idea is simple: prepend https://r.jina.ai/ to any URL, and Reader fetches the page, strips away the noise, and returns structured markdown. You do not need an account or API key to start using it.
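As a rough illustration of the prefix idea, here is a minimal Python sketch using only the standard library. The helper names `reader_url` and `fetch_markdown` are hypothetical and are not part of Reader itself; the only detail taken from the description above is the `https://r.jina.ai/` prefix.

```python
from urllib.request import urlopen

# Prefix described above; everything else in this sketch is illustrative.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Return the Reader URL for target_url by prepending the prefix."""
    return READER_PREFIX + target_url


def fetch_markdown(target_url: str, timeout: float = 30.0) -> str:
    """Fetch target_url through Reader and return the response body as text.

    Performs a real network request; no account or API key is sent.
    """
    with urlopen(reader_url(target_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_markdown("https://example.com")` should return the page as markdown text, assuming the hosted service is reachable from your network.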
The tool handles more than just web pages. PDFs hosted anywhere are parsed automatically. Word, Excel, and PowerPoint files can be uploaded directly or linked by URL. Images get a short text caption so that AI models without vision support can still reason about them. Under the hood, Reader chooses between headless Chrome (for JavaScript-heavy pages) and a lightweight curl-based fetcher, depending on which suits the page better.
The search side of the project works through a companion service at s.jina.ai. Pass it any query and it fetches the top five web results, visits each one, and returns their full text rather than just titles and snippets. That means the AI model reading those results gets real article content, not search-engine previews.
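A search request can be sketched the same way. This example assumes the query is passed URL-encoded in the path after `https://s.jina.ai/`; that URL shape is an assumption, not something stated above, so check the project documentation before relying on it. The helper name `search_url` is hypothetical.

```python
from urllib.parse import quote

# Companion search endpoint named above.
SEARCH_PREFIX = "https://s.jina.ai/"


def search_url(query: str) -> str:
    """Build a search URL for query.

    Assumption: the query goes in the URL path, percent-encoded.
    safe="" makes quote() encode "/" and "?" as well, so the whole
    query stays a single path segment.
    """
    return SEARCH_PREFIX + quote(query, safe="")
```

The resulting URL can be fetched with any HTTP client; per the description above, the response contains the full text of the top results rather than snippets.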
For developers who want more control, Reader accepts request headers that adjust its behavior: you can target specific elements on a page with a CSS selector, set a timeout, cap the number of output tokens, or choose whether to use the browser renderer or the lightweight fetcher. There is an interactive code builder on the project website that lets you explore the available options before writing any code.
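The options above might be set from Python roughly as follows. The header names used here (`X-Target-Selector`, `X-Timeout`, `X-Token-Budget`, `X-Engine`) and the accepted engine values are assumptions that should be verified against the project documentation or the interactive code builder. The helpers `reader_headers` and `build_reader_request` are hypothetical.

```python
from typing import Dict, Optional
from urllib.request import Request

READER_PREFIX = "https://r.jina.ai/"


def reader_headers(
    css_selector: Optional[str] = None,
    timeout_seconds: Optional[int] = None,
    token_budget: Optional[int] = None,
    engine: Optional[str] = None,
) -> Dict[str, str]:
    """Map the options to request headers, skipping any left unset.

    All header names below are assumed, not confirmed by this document.
    """
    headers: Dict[str, str] = {}
    if css_selector is not None:
        headers["X-Target-Selector"] = css_selector  # limit output to matching elements
    if timeout_seconds is not None:
        headers["X-Timeout"] = str(timeout_seconds)  # seconds to wait for the page
    if token_budget is not None:
        headers["X-Token-Budget"] = str(token_budget)  # cap on output tokens
    if engine is not None:
        headers["X-Engine"] = engine  # e.g. "browser"; the value set is an assumption
    return headers


def build_reader_request(target_url: str, **options) -> Request:
    """Build (but do not send) a Reader request for target_url."""
    return Request(READER_PREFIX + target_url, headers=reader_headers(**options))
```

Passing the returned `Request` to `urllib.request.urlopen` would send it. Keeping request construction separate from sending makes the header mapping easy to inspect and test without network access.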
This repository is the open-source version of the same code running on the live service. The hosted SaaS adds a storage layer that is not included here, but you can run Reader locally in a stateless mode or with optional object-storage caching.
Where it fits
- Feed any article or web page to a language model as clean text by prepending https://r.jina.ai/ to its URL; no account or API key is needed.
- Convert a PDF hosted online into structured markdown so an AI model can answer questions about its content.
- Build a web-research pipeline that fetches the full text of top search results for any query using the s.jina.ai companion endpoint.