ws-1-llama-index-rag

HTML ★ 13 updated 25d ago

W01: LlamaIndex + Pydantic — RAG

A workshop demo project showing how to build an AI question-answering system that retrieves answers from three databases at once: a SQL database, a vector search store, and a graph database.

Python, LlamaIndex, PostgreSQL, Qdrant, Neo4j, FastAPI, Docker. Setup: moderate. Complexity: 4/5.

This repository contains a demonstration project called DataOps Knowledge Hub, which is a system that answers questions by pulling information from multiple types of databases at once. The technique is called RAG (Retrieval-Augmented Generation), which means an AI language model is paired with a retrieval layer so it can look up real data before answering rather than relying only on what it learned during training.

The system connects to three different storage backends. A PostgreSQL relational database handles factual and transactional queries using a text-to-SQL approach, where natural-language questions are translated into database queries automatically. A Qdrant vector database handles semantic search over documents and logs, finding content that is conceptually similar to a question even when exact keywords do not match. A Neo4j graph database handles relationship and lineage queries, which are questions about how entities connect to each other.
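The division of labor among the three backends can be illustrated with a small routing sketch. This is not the repository's code: the keyword lists, `choose_backend`, `answer`, and the stub retrievers are all hypothetical, and a real LlamaIndex pipeline would more likely let the language model pick the backend. It only shows the idea that one question goes to the store best suited to it.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical keyword hints, chosen only for illustration.
GRAPH_HINTS = ("lineage", "depends on", "upstream", "downstream", "related to")
SQL_HINTS = ("how many", "count", "total", "average", "sum of")


def choose_backend(question: str) -> str:
    """Pick 'graph', 'sql', or 'vector' from simple keyword matches."""
    q = question.lower()
    if any(hint in q for hint in GRAPH_HINTS):
        return "graph"   # relationship and lineage questions -> Neo4j
    if any(hint in q for hint in SQL_HINTS):
        return "sql"     # factual and aggregate questions -> PostgreSQL
    return "vector"      # everything else -> semantic search in Qdrant


@dataclass
class Answer:
    backend: str
    context: str


# Stub retrievers standing in for real database clients (assumptions).
def sql_stub(question: str) -> str:
    return "rows from PostgreSQL (stub)"


def vector_stub(question: str) -> str:
    return "similar documents from Qdrant (stub)"


def graph_stub(question: str) -> str:
    return "connected entities from Neo4j (stub)"


RETRIEVERS: Dict[str, Callable[[str], str]] = {
    "sql": sql_stub,
    "vector": vector_stub,
    "graph": graph_stub,
}


def answer(question: str) -> Answer:
    """Route the question, then fetch context from the chosen backend."""
    backend = choose_backend(question)
    return Answer(backend=backend, context=RETRIEVERS[backend](question))
```

Calling `answer("How many pipelines failed yesterday?")` would select the SQL stub, while a question with none of the hint phrases falls through to vector search. Keyword matching is used here only because it is easy to read; it is brittle compared with model-driven routing.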

The technology stack includes LlamaIndex for the retrieval orchestration, Pydantic for data validation, FastAPI for the API layer, and Docker to run all the services together. Setup involves copying an environment file, adding an OpenAI API key, and running one command to start everything.
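Since setup depends on an OpenAI API key being present in the environment, a startup check can fail early when it is missing. The sketch below is an assumption, not the project's code: it uses a standard-library dataclass as a stand-in for the Pydantic validation the project relies on, `Settings` and `load_settings` are hypothetical names, and `OPENAI_API_KEY` is the conventional variable name rather than one confirmed by the README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Validated configuration; a stdlib stand-in for a Pydantic model."""
    openai_api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the API key from the environment and reject empty values."""
    # OPENAI_API_KEY is assumed to be the variable name in the env file.
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise ValueError(
            "OPENAI_API_KEY is missing or empty; add it to the environment file"
        )
    return Settings(openai_api_key=key)
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary, for example `load_settings({"OPENAI_API_KEY": "sk-test"})`.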

The project is labeled as Workshop 1 of a training program called AIDE Brasil Formation, so it is primarily a learning exercise demonstrating how to combine these tools rather than a finished product. The README is brief and does not go into deeper usage details.

Where it fits

Learn how to build a RAG system that translates natural-language questions into SQL and queries a relational database.
Experiment with combining vector search, graph traversal, and SQL in a single AI answer pipeline.
Use this as a starting template for a question-answering app that retrieves from multiple storage backends at once.
