Add RAG MCP Server with document search and management UI #384
Open
crab182 wants to merge 6 commits into ratfactor:main
Complete self-hosted system for document RAG with MCP server access:
- FastAPI backend with ChromaDB vector store and sentence-transformers
- MCP server (SSE + Streamable HTTP) with API key authentication
- React web UI for document management, search, SMB browsing, and config
- Docker Compose orchestration for Unraid deployment
- SMB share browsing and document ingestion from LAN shares
https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd
- ChromaDB list_collections: handle 0.5.x API returning strings
- MCP SSE: replace dict-based response with asyncio.Queue (race condition)
- MCP streamable: return 204 for notifications (JSON-RPC compliance)
- SMB browser: remove unused low-level imports, improve list_shares error
- Frontend: fix header merge bug in request(), clipboard fallback for HTTP
- Remove unused deps (react-router-dom, lucide-react, markdown, etc.)
- Switch PyPDF2 to pypdf to avoid deprecation warnings
- Remove unused imports (os, shutil) from routers
https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd
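The asyncio.Queue change for the SSE transport can be pictured roughly as follows. This is a minimal sketch, not the code from mcp_server/server.py: the `sessions` registry and the names `open_session`, `post_message`, and `sse_events` are hypothetical, and the `limit` argument exists only so the example terminates. The idea is that each SSE session owns one queue, the message handler puts responses on it, and the event stream awaits them in order, so no shared dict has to be polled or overwritten.

```python
import asyncio
import json

# Hypothetical registry: one queue per SSE session id.
sessions = {}


def open_session(session_id: str) -> asyncio.Queue:
    """Create the per-session queue (call this from inside the event loop)."""
    queue = asyncio.Queue()
    sessions[session_id] = queue
    return queue


async def post_message(session_id: str, response: dict) -> bool:
    """Hand a response to the SSE stream; False if the session is unknown."""
    queue = sessions.get(session_id)
    if queue is None:
        return False
    await queue.put(response)
    return True


async def sse_events(session_id: str, limit: int):
    """Yield up to `limit` SSE-formatted events, in the order they were queued."""
    queue = sessions[session_id]
    for _ in range(limit):
        message = await queue.get()  # suspends until a response is available
        yield f"event: message\ndata: {json.dumps(message)}\n\n"
```

Because `queue.get()` suspends until an item arrives and each item is delivered exactly once, two responses posted in quick succession cannot overwrite each other, which is the kind of race a single dict slot per session is prone to.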
- Move .dockerignore to per-service build contexts (backend, mcp, frontend)
- Add .gitkeep for data/chromadb/ and data/config/ dirs
- Fix .gitignore to track .gitkeep files in data dirs
- Fix nginx.conf: proxy /sse and /messages for full MCP access via frontend
- Fix SMB list_shares: use request body model (password was in URL query)
- Refactor SMB ingest: extract recursive logic into helper function
- Remove unused imports (Optional, struct, uuid, smbprotocol internals)
- Add deployment README with architecture, quick start, and API docs
https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd
Modern Compose v2 emits a warning when the top-level 'version' key is present. Removing it silences the warning with no behavior change. https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd
Pre-install torch from the PyTorch CPU index before sentence-transformers pulls it in. Avoids ~3GB of unnecessary NVIDIA CUDA packages (torch+nvidia-cublas+nvidia-cudnn+triton+...) that were exhausting the default 50GB Unraid docker.img during build. The all-MiniLM-L6-v2 embedding model runs on CPU with no perf concern for this workload. https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd
Authentication & authorization
- All /api/ routes now require a valid API key via Depends()
- Two-tier keys: admin (manage keys/collections/SMB) vs read-only (query only)
- Bootstrap flow: first admin key can be created without auth; all subsequent
operations require authentication
- hmac.compare_digest used throughout for constant-time key comparison
(both backend auth.py and mcp_server/server.py)
- MCP server forwards the client's bearer token to the backend, so the
same key controls access end-to-end
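A rough illustration of the key check described above, using only the standard library. The storage layout (a mapping from key hash to role), the use of SHA-256, and the names `generate_key`, `hash_key`, and `find_role` are assumptions for this sketch; the PR states only that keys are hashed, that they are admin or read-only, and that hmac.compare_digest is used for comparison.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional


def generate_key() -> str:
    """Create a random URL-safe API key (shown to the operator once)."""
    return secrets.token_urlsafe(32)


def hash_key(raw_key: str) -> str:
    """Only the hash is stored; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def find_role(presented: str, stored: Dict[str, str]) -> Optional[str]:
    """Return the role ("admin" or "read-only") for a presented key, or None.

    Each candidate is compared with hmac.compare_digest, and the loop does
    not exit early, so timing reveals little about which entry matched.
    """
    candidate = hash_key(presented)
    role = None
    for stored_hash, stored_role in stored.items():
        if hmac.compare_digest(candidate, stored_hash):
            role = stored_role
    return role
```

An admin-only route would then reject any request where `find_role(...)` returns something other than `"admin"`, while query routes accept either role.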
Input validation
- Collection names restricted to [A-Za-z0-9_-]{1,64} via Pydantic validators
- Filenames sanitized through safe_filename() (NFKC normalize, regex allowlist,
reject dotfiles and path separators)
- safe_join() enforces uploads/deletes stay within the documents base dir,
closing a path-traversal vulnerability on /upload and /delete
- File uploads capped at 100 MB; empty files rejected
- Query n_results bounded 1..50; query string length bounded
- APIKeyCreate name restricted with regex and length
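The filename and path checks listed above might look something like the sketch below. The function names `safe_filename` and `safe_join` and the collection-name pattern come from the commit message; the exact filename allowlist (`FILENAME_RE`) and the error messages are assumptions, and the real validators in the PR may differ.

```python
import re
import unicodedata
from pathlib import Path

# Pattern from the commit message; the trailing "-" is a literal hyphen.
COLLECTION_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")
# Assumed allowlist: must start with a letter or digit, so dotfiles never match.
FILENAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._ -]{0,254}")


def valid_collection(name: str) -> bool:
    return COLLECTION_RE.fullmatch(name) is not None


def safe_filename(name: str) -> str:
    """Normalize first, then validate, so look-alike characters are caught."""
    normalized = unicodedata.normalize("NFKC", name)
    if "/" in normalized or "\\" in normalized:
        raise ValueError("path separators are not allowed")
    if normalized.startswith("."):
        raise ValueError("dotfiles are not allowed")
    if FILENAME_RE.fullmatch(normalized) is None:
        raise ValueError("filename contains disallowed characters")
    return normalized


def safe_join(base: Path, name: str) -> Path:
    """Resolve base/name and refuse any result outside the base directory."""
    base_resolved = base.resolve()
    candidate = (base_resolved / name).resolve()
    if base_resolved not in candidate.parents:
        raise ValueError("path escapes the base directory")
    return candidate
```

Normalizing before validating matters: a fullwidth solidus (U+FF0F) becomes an ordinary `/` under NFKC, so checking the raw string alone would miss it.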
Transport & CORS
- CORS allowlist via CORS_ALLOWED_ORIGINS env var (default: 192.168.1.52:8902)
- Methods/headers restricted to those actually used
- Origin-header check on MCP endpoints (DNS-rebinding / CSRF defense);
non-browser clients (no Origin header) still allowed
- Security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy,
CSP, Permissions-Policy) added in nginx and via middleware on both services
- nginx server_tokens off
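As a sketch of the Origin-header rule above: requests with no Origin header (typical of non-browser MCP clients) pass, while browser requests must come from an allowlisted origin. The helper names are hypothetical, and treating CORS_ALLOWED_ORIGINS as a comma-separated list of full origins (scheme included) is an assumption; the PR gives only the variable name and its default.

```python
from typing import Iterable, List, Optional


def parse_allowed_origins(raw: str) -> List[str]:
    """Split an assumed comma-separated CORS_ALLOWED_ORIGINS value."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def origin_allowed(origin: Optional[str], allowed: Iterable[str]) -> bool:
    """Allow requests without an Origin header; otherwise require an exact match."""
    if origin is None:
        return True  # non-browser client: no Origin header is sent
    normalized = origin.rstrip("/").lower()
    return normalized in {entry.rstrip("/").lower() for entry in allowed}
```

A DNS-rebinding attack runs in a browser, and the browser attaches the attacking page's Origin to the request, which is why rejecting unknown origins helps even when the request reaches the server over the LAN address.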
Error handling
- Generic 500 handler returns "Internal server error"; full exceptions logged
server-side only
- SMB routes no longer leak exception strings to clients
- MCP tool call failures return a neutral message
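The error-handling pattern above boils down to logging the full exception on the server and handing the client a fixed message. A framework-free sketch follows; the logger name, the `to_client_error` helper, and the `{"detail": ...}` body shape are assumptions, and in the PR this logic sits in the framework's exception handler rather than a standalone function.

```python
import logging
from typing import Dict, Tuple

logger = logging.getLogger("rag.errors")  # hypothetical logger name


def to_client_error(exc: Exception) -> Tuple[int, Dict[str, str]]:
    """Log the full exception server-side; return a neutral status and body."""
    logger.error("Unhandled error while serving request", exc_info=exc)
    return 500, {"detail": "Internal server error"}
```

The returned body never includes `str(exc)`, so details such as SMB hostnames or file paths in an exception message stay in the server log.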
Container hardening
- Backend and MCP containers now run as non-root (uid 10001 / 10002)
- Backend purges build-essential after pip install (smaller attack surface)
- security_opt: no-new-privileges on all three services
- Memory limits set per-service
- MCP config volume mounted read-only (it only needs to validate keys)
Rate limiting
- slowapi added with a 120/minute default on the backend
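To make the 120/minute default concrete, here is a small fixed-window counter written with the standard library. It is not slowapi's implementation or API; the class name, the per-client key, and the injectable clock are all assumptions made so the behaviour can be shown in isolation.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit=120, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._counts = {}   # client key -> (window index, requests seen)

    def allow(self, client_key: str) -> bool:
        window_index = int(self.clock() // self.window_seconds)
        seen_index, count = self._counts.get(client_key, (window_index, 0))
        if seen_index != window_index:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            return False
        self._counts[client_key] = (window_index, count + 1)
        return True
```

A server would call `allow()` once per request, keyed by client address or API key, and answer with HTTP 429 when it returns False.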
Frontend
- API client reads Bearer token from localStorage and attaches it to every
request and upload
- App.jsx gates the UI: bootstrap screen when no keys exist, login screen
otherwise; Sign-out button clears the stored key
- API Keys page now lets the operator mark a key as admin or read-only
https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd
Summary
This PR introduces a complete self-hosted RAG (Retrieval Augmented Generation) system with MCP (Model Context Protocol) server support, designed for deployment on Unraid. The system enables cloud-based LLMs to search local documents via semantic similarity without exposing documents to external services.
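For readers unfamiliar with semantic search, the toy example below shows the ranking step in isolation: text chunks and the query are represented as vectors, and results are ordered by similarity. The two-dimensional vectors are made up for illustration, and cosine similarity is an assumption; in the PR, embeddings come from sentence-transformers and the nearest-neighbour search is delegated to ChromaDB, whose configured distance metric is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, chunks, n_results=5):
    """Return up to n_results (score, text) pairs, best match first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n_results]


# Made-up embeddings; real ones have hundreds of dimensions.
chunks = {
    "backup policy": [1.0, 0.0],
    "lunch menu": [0.0, 1.0],
    "restore steps": [0.8, 0.6],
}
results = top_matches([1.0, 0.0], chunks, n_results=2)
```

Here `results` lists "backup policy" first and "restore steps" second, because their vectors point closest to the query's direction.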
Key Changes
Backend (FastAPI + ChromaDB)
- backend/app/services/rag_engine.py: Semantic document search using sentence-transformers and ChromaDB with configurable chunking and embedding
- backend/app/routers/documents.py: Upload, query, and manage documents across multiple collections with support for PDF, DOCX, XLSX, and text formats
- backend/app/routers/smb.py, backend/app/services/smb_browser.py: Browse and ingest documents from LAN SMB shares with recursive directory support
- backend/app/services/auth.py: API key generation, hashing, and validation for secure access control
- backend/app/services/document_parser.py: Multi-format file parsing (PDF, DOCX, XLSX, code, config files, etc.)
MCP Server (SSE Transport)
- mcp_server/server.py: Server-Sent Events (SSE) based MCP server providing tools for:
  - search_documents: Semantic search with relevance scoring
  - list_collections: Browse available document collections
  - list_documents: List documents in a collection
  - get_server_status: Server health and statistics
Frontend (React + Vite)
- frontend/src/pages/Dashboard.jsx: System overview with document counts and collection statistics
- frontend/src/pages/Documents.jsx: Upload files, manage collections, reindex documents
- frontend/src/pages/Search.jsx: Semantic search interface with result display
- frontend/src/pages/SMBBrowser.jsx: Browse SMB shares and ingest documents with path navigation
- frontend/src/pages/APIKeys.jsx: Create, revoke, and manage API keys for MCP access
- frontend/src/pages/MCPConfig.jsx: Enable/disable MCP server and view connection details
- frontend/src/index.css: Dark theme with comprehensive component styles (cards, forms, tables, badges, etc.)
Infrastructure
- docker-compose.yml: Three-service orchestration (backend, MCP server, frontend)
- deploy.sh: Automated setup with health checks and service validation
- backend/app/config.py: Environment-based settings with persistent JSON config storage
- frontend/nginx.conf: Reverse proxy for API requests and SPA routing
Notable Implementation Details
- all-MiniLM-L6-v2 embedding model by default with configurable alternatives
- /app/data/ volumes
Deployment
The system is containerized and ready for Unraid deployment.
All services include health checks and automatic restart policies.
https://claude.ai/code/session_01ByAJeYptU8ZosBaBVw1rbd