gitmyhub

Finance-DeepSeek

Python ★ 35 updated 28d ago

基于 DeepSeek-R1-Distill-Qwen-1.5B 的金融领域推理增强问答系统

A finance question-answering system that runs a compact AI model on a single consumer GPU, showing its step-by-step reasoning alongside answers and supporting OpenAI-compatible API calls.

PythonPyTorchHuggingFaceFAISSDockersetup: hardcomplexity 4/5

Finance-DeepSeek is a question-answering system built for finance topics. It runs on a single consumer GPU (an RTX 4060 with 8GB of memory) and gives answers that include the reasoning steps the model took to reach them, not just a final answer. The README is written in Chinese.

The system is built on top of DeepSeek-R1-Distill-Qwen-1.5B, which is a compact AI model derived from a much larger 671-billion-parameter model through a process called knowledge distillation. The idea is that the smaller model inherits some of the larger model's reasoning behavior without requiring the same hardware. The base model is downloaded automatically from HuggingFace on first run.

Two techniques work together to improve answer quality. The first is QLoRA, a method for fine-tuning the model on finance-specific data without needing a lot of GPU memory. The second is RAG (retrieval-augmented generation), where relevant documents are searched and fed into the prompt before the model generates an answer. The document index is built using FAISS and a financial text embedding model. Users can choose between three modes: answering from the model alone, answering with retrieved context, or answering with retrieved context plus structured reasoning output.

The model's responses often contain a thinking section (wrapped in think tags) before the final answer. The system parses this automatically and can stream both parts back to the caller in order, so a front-end can display the reasoning as it arrives. The API follows the OpenAI chat completions format, so tools built for OpenAI's API can talk to it with minimal changes.

Setup involves cloning the repository, installing Python dependencies, and running a data preparation script that generates training data and builds the vector index. Optional fine-tuning can be run locally. A Docker Compose configuration is also included for containerized deployment. The project is MIT licensed.

Where it fits