# Distributed Runtime Flow

Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets.

## 1) MCP Surface (Django-side tool layer)

This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates.

Typical tool intents:

- `search_knowledge(query, role_uuid)`
- `get_user_progress(user/session context)`
- `update_session_state(session_uuid, patch)`

Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.

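The translation step can be pictured as a small dispatch table that maps a tool name to a handler. The sketch below is illustrative only: `TOOL_REGISTRY`, `dispatch_tool_call`, the result envelope, and the in-memory `_SESSIONS` dict are hypothetical stand-ins for the real Django queries and vector lookups, which this document does not show.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory stand-in for session rows stored via the Django ORM.
_SESSIONS: Dict[str, Dict[str, Any]] = {}


def search_knowledge(query: str, role_uuid: str) -> List[Dict[str, str]]:
    # A real handler would run a vector lookup scoped to role_uuid.
    return [{"role_uuid": role_uuid, "snippet": f"stub result for {query!r}"}]


def update_session_state(session_uuid: str, patch: Dict[str, Any]) -> Dict[str, Any]:
    # A real handler would load and save a model instance instead.
    state = _SESSIONS.setdefault(session_uuid, {})
    state.update(patch)
    return dict(state)


# Assumed registry shape: tool name -> plain Python callable.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "search_knowledge": search_knowledge,
    "update_session_state": update_session_state,
}


def dispatch_tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one model tool call into a handler invocation."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": handler(**arguments)}
    except TypeError as exc:
        # Raised for missing or unexpected arguments; note that it would
        # also catch a TypeError raised inside the handler itself.
        return {"ok": False, "error": str(exc)}
```

Returning an error envelope instead of raising is one possible design choice: it lets the caller feed a failed tool call back to the model as ordinary tool output.
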
## 2) Orchestrator (Channels consumer + async control loop)

The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle.

Typical interaction path:

1. User sends message over WebSocket
2. Orchestrator builds/updates context
3. Orchestrator calls inference endpoint
4. Model requests tool calls when needed
5. Orchestrator executes tool calls and continues generation
6. Streamed/assembled response returns to user

This is the central control plane for session continuity, tool usage, and response streaming.

## 3) GPU Inference Pipe (passive engine)

The GPU service is designed as a passive inference engine:

- Receives prompts/inference payloads
- Produces chat/embedding outputs
- Does not initiate calls back into the VPS

Using OpenAI-style request/response patterns keeps integration predictable.

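As an illustration of what "OpenAI-style" could mean on the wire, the sketch below builds a chat request body and reads the reply from a response of the usual `choices[0].message.content` shape. It assumes `gpu_server.py` mirrors that shape, which this document does not confirm. The helper names and the `local-model` identifier are hypothetical.

```python
import json
from typing import Any, Dict, List


def build_chat_request(
    messages: List[Dict[str, Any]],
    model: str = "local-model",  # hypothetical model identifier
    stream: bool = False,
) -> bytes:
    """Serialize an OpenAI-style chat request body as UTF-8 JSON."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return json.dumps(payload).encode("utf-8")


def extract_reply(response_body: bytes) -> str:
    """Pull the assistant text out of an OpenAI-style chat response."""
    data = json.loads(response_body)
    return data["choices"][0]["message"]["content"]
```

Because the GPU service never calls back into the VPS, the orchestrator is the only side that needs helpers like these. The engine just answers whatever payload it receives.
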
## Interface Summary

| Component | Typical Path / Endpoint | Role |
| :--- | :--- | :--- |
| MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
| Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
| GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |

## Navigation

- [Application Structure (Detailed)](application-structure.md)
- [Deployment Topologies](deployment-topologies.md)
- [Project README](../README.md)