gitmyhub

pdfGPT

Python · ★ 7.2k · updated 3mo ago

PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!

A Python app that lets you upload a PDF and ask questions about it. It finds the most relevant sections and sends them to OpenAI to generate precise answers, with optional page-number citations.

Python · OpenAI API · Universal Sentence Encoder · Docker · setup: moderate · complexity 2/5

pdfGPT is a Python application that lets you ask questions about a PDF document and get answers generated by an AI model. You upload a PDF file or provide a URL to one, then type a question, and the application finds the most relevant sections of the document and sends them to OpenAI to generate a precise answer. The author claims it was one of the earliest open-source systems of this kind, first released in 2021, and argues it remains more accurate than many later alternatives because of its simple architecture.

The technical approach works like this: the application splits the PDF into small chunks of about 150 words each. It then generates a numerical representation (called an embedding) of each chunk using a deep learning encoder called the Universal Sentence Encoder. When you ask a question, the application generates an embedding of your question and uses a nearest-neighbor search to find the five chunks most similar to it. Those five chunks are inserted into a prompt sent to OpenAI, which generates the final answer. The responses can include page number citations in square brackets so you can locate the source in the original document.
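The retrieval steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names (`chunk_words`, `embed`, `top_k`, `build_prompt`) and the prompt wording are hypothetical, and `embed` is a bag-of-words stand-in so the sketch runs without downloading the Universal Sentence Encoder. The real application would use the encoder's dense vectors in its place.

```python
import math
import re
from collections import Counter


def chunk_words(text, page, size=150):
    """Split one page's text into chunks of at most `size` words, tagged with the page."""
    words = text.split()
    return [
        {"page": page, "text": " ".join(words[i:i + size])}
        for i in range(0, len(words), size)
    ]


def embed(text):
    """Stand-in embedding (assumption): word counts, NOT the Universal Sentence Encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, chunks, k=5):
    """Brute-force nearest-neighbor search: the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)[:k]


def build_prompt(question, hits):
    """Insert the retrieved chunks into a prompt; the wording here is hypothetical."""
    context = "\n\n".join(f"[Page {c['page']}] {c['text']}" for c in hits)
    return (
        "Answer the question using only the excerpts below. "
        "Cite page numbers in square brackets.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
```

The string returned by `build_prompt` is what would be sent to the OpenAI model. Tagging each chunk with its page number at split time is what makes the bracketed citations possible later.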

One design choice that distinguishes this project from some alternatives is that it does not use a vector database or a third-party orchestration library like LangChain. The embeddings are saved to a file on disk and reloaded on subsequent queries. The application supports OpenAI GPT models including GPT-3.5 Turbo and GPT-4.
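Saving embeddings to disk instead of a vector database can be as simple as the sketch below. It assumes a cache keyed by a hash of the PDF's bytes and stored as JSON. The file layout, the `load_or_compute` name, and the `embedding_cache` directory are illustrative choices, not details taken from the project.

```python
import hashlib
import json
from pathlib import Path


def cache_path(pdf_bytes, cache_dir):
    """Derive a cache file name from the PDF contents (hypothetical naming scheme)."""
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    return Path(cache_dir) / f"{digest}.json"


def load_or_compute(pdf_bytes, chunks, embed_fn, cache_dir="embedding_cache"):
    """Return one embedding per chunk, reusing the on-disk copy when it exists."""
    path = cache_path(pdf_bytes, cache_dir)
    if path.exists():
        # Cache hit: skip the encoder entirely.
        return json.loads(path.read_text(encoding="utf-8"))
    # Cache miss: embed every chunk once, then persist the result.
    vectors = [embed_fn(chunk) for chunk in chunks]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(vectors), encoding="utf-8")
    return vectors
```

Here `embed_fn` is any callable that maps a chunk of text to a list of floats. With a cache like this, asking a second question about the same PDF only requires embedding the question itself.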

A Docker Compose file is included for running the application in a container. A live demo is hosted on Hugging Face Spaces. The project is MIT-licensed and open to contributors, though the README notes that the documentation has not been kept fully up to date.

Where it fits