Infrastructure

Configuration Loading Order

./utilscripts/quick_start.sh simply activates the virtual environment and launches uvicorn app.main:app. Runtime configuration is resolved inside app/core/config.py using Pydantic Settings with load_dotenv():

  1. Environment variables present in the shell take highest priority.
  2. Values from a project .env file (for example the one copied from docs/static/files/env.example) override code defaults.
  3. If neither is provided, the hard-coded defaults in Settings are used.

This means the service will run with sensible defaults out of the box, but any value you place in .env or export before running the service immediately replaces the default without editing code.
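The precedence above can be sketched in plain standard-library Python. The real resolution happens inside Pydantic Settings in app/core/config.py; load_dotenv_file and resolve below are illustrative helpers, not the project's API:

```python
import os

def load_dotenv_file(path):
    """Parse KEY=VALUE lines from a .env file into a dict (comments and blanks skipped)."""
    values = {}
    try:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # no .env file: fall through to code defaults
    return values

def resolve(name, default, dotenv_values, environ=os.environ):
    """Shell environment wins, then the .env file, then the hard-coded default."""
    if name in environ:
        return environ[name]
    return dotenv_values.get(name, default)
```

For example, `resolve("SERVICE_PORT", "8000", {"SERVICE_PORT": "9000"}, environ={})` yields the .env value, while an exported SERVICE_PORT in the shell would override both.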

Environment Variables and Defaults

Four primary groups of settings control how the platform behaves. Start with .env to override the defaults shown here.

Platform & Runtime

| Variable | Default | Purpose |
| --- | --- | --- |
| PROJECT_NAME | RAG Loom API | Branding for generated docs and metadata. |
| VERSION | 0.2.0 | API/version banner returned by / and /health. |
| API_V1_STR | /api/v1 | Prefix for all routed endpoints. |
| CHUNK_SIZE | 1000 | Default characters per chunk during ingestion. |
| CHUNK_OVERLAP | 200 | Overlap between consecutive chunks. |
| MAX_FILE_SIZE | 10485760 | Maximum upload size (bytes). |
| SERVICE_PORT | 8000 | Port bound by Uvicorn. |
| SERVICE_HOST | 0.0.0.0 | Listen address (0.0.0.0 to expose externally). |
| LOG_LEVEL | INFO | Log verbosity for FastAPI/Uvicorn. |
| DEBUG | False | Enables additional debug output. |
| RELOAD | False | Auto-reload flag for local development. |
| WORKER_PROCESSES | 4 | Number of Uvicorn worker processes. |
| MAX_CONCURRENT_REQUESTS | 100 | Back-pressure guard for FastAPI. |
| REQUEST_TIMEOUT | 300 | Maximum request processing time (seconds). |
| ENABLE_METRICS | True | Expose Prometheus /metrics. |
| ENABLE_TRACING | False | Placeholder for future tracing integrations. |
| CORS_ORIGINS | ["http://localhost:3000", "http://127.0.0.1:3000"] | Allow-listed front-end origins. |
| DATABASE_URL | sqlite:///./rag_platform.db | Metadata database (SQLite by default). |
| UPLOAD_DIR | ./uploads | Temp storage for incoming files. |
| PROCESSED_DIR | ./processed | Location for processed artifacts. |
| CACHE_DIR | ./cache | General cache directory. |
| LOGS_DIR | ./logs | Runtime log directory. |
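CHUNK_SIZE and CHUNK_OVERLAP interact as a sliding window: each chunk starts CHUNK_SIZE - CHUNK_OVERLAP characters after the previous one, so consecutive chunks share their boundary text. A minimal sketch of that behaviour (chunk_text is a hypothetical helper, not the actual ingestion code):

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters, each window
    starting chunk_size - chunk_overlap characters after the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With the defaults, a 2000-character document becomes three chunks: two full 1000-character windows plus a 400-character tail, with 200 characters repeated at each boundary.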

Vector Store & Retrieval

These variables determine where embeddings are stored and how retrieval behaves. They should mirror the backend you deploy in Docker Compose or managed infrastructure.

| Variable | Default | Purpose |
| --- | --- | --- |
| VECTOR_STORE_TYPE | chroma | Vector backend (chroma, qdrant, or redis). |
| CHROMA_PERSIST_DIRECTORY | ./chroma_db | On-disk location for embedded Chroma. |
| QDRANT_URL | http://localhost:6333 | Qdrant endpoint. |
| QDRANT_API_KEY | None | Auth token when Qdrant security is enabled. |
| REDIS_URL | redis://localhost:6379 | Redis connection string with RediSearch. |
| EMBEDDING_MODEL | sentence-transformers/all-MiniLM-L6-v2 | Default embedding model identifier. |
| EMBEDDING_DIM | 384 | Dimensionality expected by the vector store. |
| TOP_K | 5 | Number of results returned by retrieval. |
| SIMILARITY_THRESHOLD | 0.7 | Minimum similarity score before fallback rules apply. |
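TOP_K and SIMILARITY_THRESHOLD combine roughly as a post-filter on scored hits: drop anything below the threshold, then keep at most TOP_K of the rest, best-first. This is an illustrative sketch (select_results is a hypothetical name; the service's actual fallback rules live in its retrieval layer):

```python
def select_results(scored, top_k=5, similarity_threshold=0.7):
    """Filter (doc_id, score) pairs: keep hits at or above the threshold,
    sorted best-first, capped at top_k entries."""
    kept = [hit for hit in scored if hit[1] >= similarity_threshold]
    kept.sort(key=lambda hit: hit[1], reverse=True)
    return kept[:top_k]
```

Raising SIMILARITY_THRESHOLD can therefore return fewer than TOP_K results, which is when the fallback rules mentioned above would kick in.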

LLM Providers

Choose the provider that matches your deployment targets. Only the keys required by the selected provider need to be populated.

| Variable | Default | Purpose |
| --- | --- | --- |
| LLM_PROVIDER | ollama | Active adapter (ollama, openai, cohere, huggingface). |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama daemon address. |
| OLLAMA_MODEL | gemma2:2b | Default Ollama model tag. |
| OLLAMA_NUM_PARALLEL | 2 | Concurrency hint for Ollama requests. |
| OPENAI_API_KEY | None | Required when LLM_PROVIDER=openai. |
| OPENAI_MODEL | gpt-3.5-turbo | Default OpenAI chat model. |
| COHERE_API_KEY | None | Required when LLM_PROVIDER=cohere. |
| COHERE_MODEL | command-xlarge | Cohere generation model. |
| HUGGINGFACE_API_KEY | None | Required for private Hugging Face models. |
| HUGGINGFACE_MODEL | google/flan-t5-large | Transformers pipeline model. |
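The "only the keys required by the selected provider" rule can be expressed as a small validation table. This is a hedged sketch, not the service's startup code: missing_keys and REQUIRED_KEYS are hypothetical names, and Hugging Face's key requirement depends on whether the chosen model is private:

```python
# Which env vars must be non-empty for each LLM_PROVIDER value.
REQUIRED_KEYS = {
    "ollama": [],                      # local daemon, no API key
    "openai": ["OPENAI_API_KEY"],
    "cohere": ["COHERE_API_KEY"],
    "huggingface": [],                 # HUGGINGFACE_API_KEY only for private models
}

def missing_keys(provider, env):
    """Return the required keys that are absent or empty for this provider."""
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown LLM_PROVIDER: {provider}")
    return [k for k in REQUIRED_KEYS[provider] if not env.get(k)]
```

A check like this at startup turns a misconfigured provider into an immediate, explicit error instead of a failed request later.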

Security & Access Control

Enable these when you deploy beyond trusted environments.

| Variable | Default | Purpose |
| --- | --- | --- |
| ENABLE_AUTH | False | Toggle authentication middleware. |
| SECRET_KEY | your_production_secret_key_here | Signing key used when auth is enabled. |
| ACCESS_TOKEN_EXPIRE_MINUTES | 30 | Token lifetime for auth flows. |
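To illustrate how SECRET_KEY and ACCESS_TOKEN_EXPIRE_MINUTES fit together, here is a standard-library HMAC token sketch. It is not the service's actual auth scheme; it only demonstrates the two roles these settings play, signing and expiry:

```python
import base64
import hashlib
import hmac
import json
import time

def _b64e(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode()

def _b64d(text: str) -> bytes:
    return base64.urlsafe_b64decode(text.encode())

def make_token(subject, secret_key, expire_minutes=30, now=None):
    """Sign a minimal payload with HMAC-SHA256; expiry comes from expire_minutes."""
    now = int(now if now is not None else time.time())
    payload = json.dumps({"sub": subject, "exp": now + expire_minutes * 60}).encode()
    sig = hmac.new(secret_key.encode(), payload, hashlib.sha256).digest()
    return _b64e(payload) + "." + _b64e(sig)

def verify_token(token, secret_key, now=None):
    """Return the claims if the signature checks out and the token is unexpired."""
    payload_b64, _, sig_b64 = token.partition(".")
    payload = _b64d(payload_b64)
    expected = hmac.new(secret_key.encode(), payload, hashlib.sha256).digest()
    if not hmac.compare_digest(_b64d(sig_b64), expected):
        return None
    claims = json.loads(payload)
    current = now if now is not None else time.time()
    return claims if current < claims["exp"] else None
```

A token signed with a different SECRET_KEY, or checked after ACCESS_TOKEN_EXPIRE_MINUTES have elapsed, verifies to None; this is why rotating the placeholder default before production matters.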

Tip: copy docs/static/files/env.example to .env and adjust only the values you need. Everything else will automatically fall back to the defaults above.

Production Environment Template

The repository ships with env.production at the project root as a ready-made template for hardened deployments. It pins sensible production choices—such as VECTOR_STORE_TYPE=qdrant, the Ollama provider defaults, and conservative worker limits. To adopt it:

  1. Duplicate the file (cp env.production .env) and populate any empty secrets such as QDRANT_API_KEY or OAuth tokens.
  2. Adjust values that depend on your hosting (for example QDRANT_URL, REDIS_URL, or CORS_ORIGINS).
  3. Restart the service; Settings in app/core/config.py will ingest the overrides automatically.

Every key in env.production maps directly to an attribute in Settings. The defaults shown above originate from app/core/config.py, so you can verify behaviour or introduce new configuration flags in a single place while keeping documentation in sync.

Local Infrastructure via Docker Compose

For local development the repository includes docker-compose.infra.yml, which provisions Qdrant and Ollama (pre-configured with the gemma2:2b model). The helper script ./utilscripts/dev-infra.sh orchestrates the workflow:

./utilscripts/dev-infra.sh up       # copy env.production -> .env, pull images, start services, preload Gemma 2B
./utilscripts/dev-infra.sh status   # inspect container state
./utilscripts/dev-infra.sh logs     # tail logs from both services
./utilscripts/dev-infra.sh down     # stop and remove the containers

The script performs the following steps:

  • Copies env.production to .env (creating it if necessary) and enforces the Ollama/Qdrant connection parameters expected by the FastAPI service.
  • Ensures the required Docker images are downloaded, then calls docker compose -f docker-compose.infra.yml up -d.
  • Waits for the Qdrant (http://localhost:6333/health) and Ollama (http://localhost:11434/api/tags) endpoints to respond.
  • Downloads the gemma2:2b model via Ollama's REST API (blocking until complete) so subsequent requests succeed immediately.
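The readiness-wait step above can be sketched as a generic polling loop. This is illustrative only; wait_until_ready and http_ok are hypothetical helpers, not part of dev-infra.sh:

```python
import time
import urllib.request

def wait_until_ready(probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() until it returns True or the attempts are exhausted."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay)
    return False

def http_ok(url):
    """True if a GET to the URL returns HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False
```

In use, something like `wait_until_ready(lambda: http_ok("http://localhost:6333/health"))` blocks until Qdrant answers, mirroring what the script does before pulling the model.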

After the script reports success, you can launch the API (./utilscripts/quick_start.sh start) or run tests—the .env file now points at the containerised services.