Viswamedha Nalabotu 43618ee3f4 Fixed readme errors

2026-03-22 12:42:33 +00:00

Deployment Topologies

This page compares local and distributed deployment shapes.

Local Development Topology

Purpose: fast iteration and debugging.

App services run via compose/dev/docker-compose.yml
Django, Celery, Redis, Postgres, Node, and inference can run together
Suitable for feature work and integration checks

Distributed Topology (VPS + GPU Node)

Purpose: production-like separation of concerns.

VPS node: web app, orchestration, API, websocket handling, task queue, database
GPU node: dedicated inference service (chat + embeddings + chunking helpers)
Request direction is primarily VPS -> GPU for model tasks
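As a rough illustration of the VPS -> GPU direction, the sketch below builds (but does not send) an HTTP request for one model task. The task names follow the GPU node's responsibilities listed above; the URL paths, the JSON payload shape, and the helper name build_model_request are assumptions made for illustration, not the project's actual API.

```python
import json
import urllib.request

# Task names mirror the GPU node's responsibilities (chat, embeddings, chunking).
# The URL paths are hypothetical; substitute the inference service's real routes.
GPU_TASK_PATHS = {
    "chat": "/chat",
    "embeddings": "/embeddings",
    "chunking": "/chunk",
}


def build_model_request(
    base_url: str, task: str, payload: dict, auth_header: str
) -> urllib.request.Request:
    """Build (but do not send) a VPS -> GPU request for one model task."""
    if task not in GPU_TASK_PATHS:
        # Anything else (sessions, queueing, database work) stays on the VPS node.
        raise ValueError(f"Not a GPU-node task: {task!r}")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + GPU_TASK_PATHS[task],
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Precomputed "Basic ..." value for the GPU node's HTTP Basic Auth.
            "Authorization": auth_header,
        },
    )
```

A caller on the VPS side could then send the request with urllib.request.urlopen(req, timeout=...), choosing a timeout so that a slow or unreachable GPU node does not tie up web or worker processes.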

Why Split Nodes?

Keeps model latency and VRAM pressure away from user and session services
Allows independent scaling of orchestration and inference
Improves operational clarity around failures and bottlenecks

Operational Notes

Confirm the inference host, port, and protocol values in the runtime container environment
Set INFERENCE_USERNAME and INFERENCE_PASSWORD: the GPU node requires HTTP Basic Auth on all endpoints
Confirm the pgvector extension is enabled in the target database
Restrict role flow generation permissions to trusted user types
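The inference-related checks above could be automated at startup with something like the following sketch. INFERENCE_USERNAME and INFERENCE_PASSWORD come from the notes above; INFERENCE_PROTOCOL, INFERENCE_HOST, and INFERENCE_PORT are hypothetical variable names standing in for whatever the runtime container actually uses, and load_inference_config is an illustrative helper, not part of the project.

```python
import base64
import os
from dataclasses import dataclass

# INFERENCE_USERNAME / INFERENCE_PASSWORD are named in the notes above.
# The other three names are assumptions; match them to the real container env.
REQUIRED_VARS = (
    "INFERENCE_PROTOCOL",
    "INFERENCE_HOST",
    "INFERENCE_PORT",
    "INFERENCE_USERNAME",
    "INFERENCE_PASSWORD",
)


@dataclass(frozen=True)
class InferenceConfig:
    base_url: str
    auth_header: str


def load_inference_config(env=None) -> InferenceConfig:
    """Validate inference settings and precompute the Basic Auth header."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing inference settings: " + ", ".join(missing))

    protocol = env["INFERENCE_PROTOCOL"]
    if protocol not in ("http", "https"):
        raise RuntimeError(f"Unsupported inference protocol: {protocol!r}")
    port = int(env["INFERENCE_PORT"])  # raises ValueError if not numeric

    # HTTP Basic Auth: base64("username:password") prefixed with "Basic ".
    credentials = f"{env['INFERENCE_USERNAME']}:{env['INFERENCE_PASSWORD']}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return InferenceConfig(
        base_url=f"{protocol}://{env['INFERENCE_HOST']}:{port}",
        auth_header=f"Basic {token}",
    )
```

Failing fast on a missing or malformed value surfaces configuration mistakes when the container starts rather than on the first model request.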