Introduction
RAG Loom provides a production-quality retrieval-augmented generation (RAG) microservice built on FastAPI. This site covers everything required to evaluate the platform, run it locally, integrate it with external systems, and operate it in production.
Executive Snapshot
RAG Loom lets you pair your proprietary data with frontier generative AI, turning static knowledge bases into live, AI-assisted experiences. Within days, business teams move from "we should explore AI" to measurable impact in customer support, research synthesis, and internal enablement.
What Is Retrieval-Augmented Generation?
- Retrieval — surface the right internal answer in milliseconds, straight from your knowledge base.
- Generation — blend that source material into clear, on-brand narratives for customers and teams.
The result is an AI copilot that stays grounded in policy-approved facts and gives stakeholders instant confidence.
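Conceptually, the loop is retrieve-then-generate: find the most relevant source passages, then hand them to a language model as grounding context. The sketch below is a minimal, self-contained illustration of that flow; the keyword-overlap retriever and the stubbed generator are stand-ins for illustration, not RAG Loom's actual vector search or LLM integration.

```python
# Minimal retrieve-then-generate sketch. The keyword-overlap scoring
# and the stubbed generator are illustrative only -- RAG Loom's real
# pipeline uses a vector store and a configurable LLM provider.

KNOWLEDGE_BASE = [
    "Refunds are processed within 5 business days of approval.",
    "Enterprise plans include 24/7 support with a 1-hour SLA.",
    "Data is encrypted at rest with AES-256.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(question: str, context: list[str]) -> str:
    """Stand-in for an LLM call: assemble a grounded prompt."""
    sources = "\n".join(f"- {doc}" for doc in context)
    return f"Answer '{question}' using only these sources:\n{sources}"

print(generate("How fast are refunds?", retrieve("How fast are refunds?")))
```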
Built for the Models You Already Trust
- OpenAI GPT
- Anthropic Claude
- Google Gemini
- Mistral Large
- Ollama (locally hosted models for sensitive workloads in secure private clouds)
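Ollama in particular keeps inference entirely on infrastructure you control. As a quick sanity check that a local model is reachable (independent of RAG Loom itself), you can call Ollama's REST API directly. The sketch below assumes Ollama is running on its default port 11434 with a model such as llama3 already pulled.

```python
import requests  # third-party: pip install requests

# Ollama serves a local REST API on port 11434 by default.
# The model name is an example; use any model you've pulled locally.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Summarise our refund policy in one sentence.",
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])
```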
Why It Matters for Business Leaders
- De-risk AI rollouts — start with explainable, citation-backed answers instead of black-box chatbots.
- Accelerate time to value — launch pilot use cases in days, not quarters, with governance baked in.
- Scale with confidence — observability dashboards surface adoption, accuracy, and compliance trends for every release.
Who This Documentation Serves
- Builders who want to stand up RAG Loom locally and explore the API surface.
- Operators who deploy and monitor the service in production environments.
- Integrators connecting RAG Loom to model providers such as Ollama, OpenAI, or Cohere.
- Contributors extending the project or adapting it to bespoke workflows.
Key Capabilities
- Modular ingestion, retrieval, and generation pipelines with configurable vector stores.
- Support for local and hosted LLM providers (Ollama, OpenAI, Cohere, Hugging Face).
- Production-ready operational tooling, including Dockerised deployment scripts and monitoring.
- Test harnesses and quick-start scripts to accelerate development.
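To make the modularity concrete, the sketch below shows how a pipeline might be assembled from environment configuration. The variable names, defaults, and dispatch logic are hypothetical placeholders; the real settings live in the project's env.example template.

```python
import os

# Hypothetical settings sketch -- the actual variable names live in
# docs/static/files/env.example. These placeholders only show how a
# modular pipeline could be wired from configuration.
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "ollama")  # e.g. ollama | openai | cohere
VECTOR_STORE = os.environ.get("VECTOR_STORE", "chroma")  # assumption: pluggable backend

def build_llm(provider: str) -> str:
    """Dispatch to a provider-specific client (illustrative only)."""
    if provider == "ollama":
        return "local Ollama client"
    if provider in ("openai", "cohere"):
        return f"hosted {provider} client"
    raise ValueError(f"Unknown LLM provider: {provider}")

print(build_llm(LLM_PROVIDER), "+", VECTOR_STORE, "vector store")
```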
How to Navigate the Docs
| Section | When to Read | Highlights |
| --- | --- | --- |
| Getting Started | First-time setup | Prerequisites, local bootstrap, project layout |
| Architecture | Planning & design reviews | High-level system view and component responsibilities |
| API | Building client integrations | Endpoint catalogue, payloads, and curl recipes |
| Operations | Running in production | Scaling strategies, security hardening, troubleshooting guides |
| Integrations | Configuring external services | Ollama integration guidance and performance tips |
Quick Actions
- Copy the environment template: `cp docs/static/files/env.example .env`
- Launch the local stack with the helper script: `./utilscripts/quick_start.sh setup`
- Explore the interactive API docs at `http://localhost:8000/docs`, then try a sample request like the sketch after this list.
- Run the automated tests: `pytest`
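Once the stack is up, a request against the API might look like the following. The endpoint path and payload shape here are assumptions for illustration only; the interactive docs at `http://localhost:8000/docs` are the authoritative schema.

```python
import requests  # third-party: pip install requests

# The /query path and the payload shape below are assumptions for
# illustration; consult http://localhost:8000/docs for the real schema.
resp = requests.post(
    "http://localhost:8000/query",  # hypothetical endpoint
    json={"question": "What is our refund policy?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```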
Project Structure
```
rag-loom/
├── app/                  # FastAPI application code
├── docs/                 # Documentation site (Docusaurus)
│   ├── docs/             # Markdown sources
│   ├── src/              # Presentation components and styles
│   └── static/files/     # Downloadable assets and samples
├── tests/                # Automated test suites
├── utilscripts/          # Operational helper scripts
└── Dockerfile            # Containerised deployment
```
Need help fast? Start with Getting Started to prepare your environment, then move on to Quick Start to launch the service in minutes.