1.9 KiB
1.9 KiB
Distributed Runtime Flow
Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets.
1) MCP Surface (Django-side tool layer)
This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates.
Typical tool intents:
search_knowledge(query, role_uuid)update_progress(session context)get_role_context(role_uuid)list_training_files(role_uuid)
Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
2) Orchestrator (Channels consumer + async control loop)
The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle.
Typical interaction path:
- User sends message over WebSocket
- Orchestrator builds/updates context
- Orchestrator calls inference endpoint
- Model requests tool calls when needed
- Orchestrator executes tool calls and continues generation
- Streamed/assembled response returns to user
This is the central control plane for session continuity, tool usage, and response streaming.
3) GPU Inference Pipe (passive engine)
The GPU service is designed as a passive inference engine:
- Receives prompts/inference payloads
- Produces chat/embedding outputs
- Does not initiate calls back into the VPS
Using OpenAI-style request/response patterns keeps integration predictable.
Interface Summary
| Component | Typical Path / Endpoint | Role |
|---|---|---|
| MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
| Orchestrator | apps/onboarding/consumers/ |
Coordination + streaming |
| GPU Inference | gpu_server.py HTTP endpoints |
Generation + embeddings |