Introduction
RAG Loom provides a production-quality retrieval-augmented generation (RAG) microservice built on FastAPI. This site covers everything required to evaluate the platform, run it locally, integrate it with external systems, and operate it in production.
Executive Snapshot
RAG Loom lets you pair your proprietary data with frontier generative AI, turning static knowledge bases into live, AI-assisted experiences. Within days, business teams move from "we should explore AI" to measurable impact in customer support, research synthesis, and internal enablement.
What Is Retrieval-Augmented Generation?
- Retrieval — surface the right internal answer in milliseconds, straight from your knowledge base.
- Generation — blend that source material into clear, on-brand narratives for customers and teams.
The result is an AI copilot that stays grounded in policy-approved facts and gives stakeholders instant confidence.
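Conceptually, the loop is retrieve-then-generate: find the most relevant source passages, then hand them to a language model as grounding context. The sketch below is a minimal, self-contained illustration of that flow; the keyword-overlap retriever and the stubbed generator are stand-ins for illustration, not RAG Loom's actual vector search or LLM integration.

```python
# Minimal retrieve-then-generate sketch. The keyword-overlap scoring
# and the stubbed generator are illustrative only -- RAG Loom's real
# pipeline uses a vector store and a configurable LLM provider.

KNOWLEDGE_BASE = [
    "Refunds are processed within 5 business days of approval.",
    "Enterprise plans include 24/7 support with a 1-hour SLA.",
    "Data is encrypted at rest with AES-256.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(question: str, context: list[str]) -> str:
    """Stand-in for an LLM call: assemble a grounded prompt."""
    sources = "\n".join(f"- {doc}" for doc in context)
    return f"Answer '{question}' using only these sources:\n{sources}"

print(generate("How fast are refunds?", retrieve("How fast are refunds?")))
```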
Built for the Models You Already Trust
- OpenAI GPT
- Anthropic Claude
- Google Gemini
- Mistral Large
- Ollama (locally hosted models for sensitive workloads in secure private clouds)
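Ollama in particular keeps inference entirely on infrastructure you control. As a quick sanity check that a local model is reachable (independent of RAG Loom itself), you can call Ollama's REST API directly. The sketch below assumes Ollama is running on its default port 11434 with a model such as llama3 already pulled.

```python
import requests  # third-party: pip install requests

# Ollama serves a local REST API on port 11434 by default.
# The model name is an example; use any model you've pulled locally.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Summarise our refund policy in one sentence.",
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["response"])
```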
Why It Matters for Business Leaders
- De-risk AI rollouts — start with explainable, citation-backed answers instead of black-box chatbots.
- Accelerate time to value — launch pilot use cases in days, not quarters, with governance baked in.
- Scale with confidence — observability dashboards surface adoption, accuracy, and compliance trends for every release.
Who This Documentation Serves
- Builders who want to stand up RAG Loom locally and explore the API surface.
- Operators who deploy and monitor the service in production environments.
- Integrators connecting RAG Loom to model providers such as Ollama, OpenAI, or Cohere.
- Contributors extending the project or adapting it to bespoke workflows.
Key Capabilities
- Modular ingestion, retrieval, and generation pipelines with configurable vector stores.
- Support for local and hosted LLM providers (Ollama, OpenAI, Cohere, Hugging Face).
- Production-ready operational tooling, including Dockerised deployment scripts and monitoring.
- Test harnesses and quick-start scripts to accelerate development.
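To make the modularity concrete, the sketch below shows how a pipeline might be assembled from environment configuration. The variable names, defaults, and dispatch logic are hypothetical placeholders; the real settings live in the project's env.example template.

```python
import os

# Hypothetical settings sketch -- the actual variable names live in
# docs/static/files/env.example. These placeholders only show how a
# modular pipeline could be wired from configuration.
LLM_PROVIDER = os.environ.get("LLM_PROVIDER", "ollama")  # e.g. ollama | openai | cohere
VECTOR_STORE = os.environ.get("VECTOR_STORE", "chroma")  # assumption: pluggable backend

def build_llm(provider: str) -> str:
    """Dispatch to a provider-specific client (illustrative only)."""
    if provider == "ollama":
        return "local Ollama client"
    if provider in ("openai", "cohere"):
        return f"hosted {provider} client"
    raise ValueError(f"Unknown LLM provider: {provider}")

print(build_llm(LLM_PROVIDER), "+", VECTOR_STORE, "vector store")
```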
How to Navigate the Docs
| Section | When to Read | Highlights |
| --- | --- | --- |
| Getting Started | First-time setup | Prerequisites, local bootstrap, project layout |
| Architecture | Planning & design reviews | High-level system view and component responsibilities |
| API | Building client integrations | Endpoint catalogue, payloads, and curl recipes |
| Operations | Running in production | Scaling strategies, security hardening, troubleshooting guides |
| Integrations | Configuring external services | Ollama integration guidance and performance tips |
Quick Actions
- Copy the environment template: `cp docs/static/files/env.example .env`
- Launch the local stack with the helper script: `./utilscripts/quick_start.sh setup`
- Explore the interactive API docs at `http://localhost:8000/docs`, then try a sample request like the sketch after this list.
- Run the automated tests: `pytest`
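Once the stack is up, a request against the API might look like the following. The endpoint path and payload shape here are assumptions for illustration only; the interactive docs at `http://localhost:8000/docs` are the authoritative schema.

```python
import requests  # third-party: pip install requests

# The /query path and the payload shape below are assumptions for
# illustration; consult http://localhost:8000/docs for the real schema.
resp = requests.post(
    "http://localhost:8000/query",  # hypothetical endpoint
    json={"question": "What is our refund policy?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```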
Project Structure
```
rag-loom/
├── app/                  # FastAPI application code
├── docs/                 # Documentation site (Docusaurus)
│   ├── docs/             # Markdown sources
│   ├── src/              # Presentation components and styles
│   └── static/files/     # Downloadable assets and samples
├── tests/                # Automated test suites
├── utilscripts/          # Operational helper scripts
└── Dockerfile            # Containerised deployment
```
Need help fast? Start with Getting Started to prepare your environment, then move on to Quick Start to launch the service in minutes.