Updated readme and added subdocs

This commit is contained in:
Viswamedha Nalabotu 2026-02-27 02:09:54 +00:00
parent 4ac57a38d0
commit b5f25411f2
5 changed files with 358 additions and 66 deletions

269
README.md
View file

@ -1,66 +1,132 @@
# Dynavera: Distributed Agentic Onboarding System # Dynavera: Distributed Agentic Onboarding System
Dynavera is a multi-agent AI platform designed to automate role-specific onboarding. The system utilizes a distributed architecture to separate application logic from high-latency LLM inference, employing the Model Context Protocol (MCP) for internal data retrieval and Retrieval-Augmented Generation (RAG). Dynavera is a multi-agent onboarding platform that combines role-specific training flows, retrieval from organization documents, and LLM-powered guidance. The system is intentionally distributed so that app orchestration and heavy inference can run independently.
Repository: https://git.cs.bham.ac.uk/projects-2025-26/vxn217
---
## Table of Contents
- [At a Glance](#at-a-glance)
- [Inspector & Supervisor Notes](#inspector--supervisor-notes)
- [Screenshots](#screenshots)
- [System Architecture (High-Level)](#system-architecture-high-level)
- [Project Goals](#project-goals)
- [Tech Stack](#tech-stack)
- [Repository Guide](#repository-guide)
- [Evaluation Credentials](#evaluation-credentials)
- [Recommended Evaluation Walkthrough](#recommended-evaluation-walkthrough)
- [Local Setup (Cross-Platform)](#local-setup-cross-platform)
- [Common Commands](#common-commands)
- [Additional Documentation](#additional-documentation)
---
## At a Glance
Dynavera focuses on one question: **how do we deliver onboarding that is role-aware, context-aware, and operationally practical?**
The platform does this by combining:
- A Django management layer for accounts, roles, sessions, and APIs
- An agentic orchestration loop over WebSockets for responsive interactions
- A retrieval layer using pgvector and organization-provided documents
- A GPU inference service for chat completions, embeddings, and chunking support
---
## Inspector & Supervisor Notes
Primary locations relevant to technical quality, architecture reasoning, and evaluation:
- Setup, context, and high-level flow: this `README.md`
- Architecture notes: `docs/`
- Orchestration runtime: `apps/onboarding/consumers.py`
- Retrieval bridge and tool routing: `apps/onboarding/mcp.py`
- Ingestion and vectorization pipeline: `apps/knowledge/tasks.py`
- Inference service entrypoint: `gpu_server.py`
Evaluation-relevant themes represented in the codebase:
- Role-scoped onboarding generation and progression
- Retrieval grounding through uploaded training files
- Separation of management services and inference services
- End-to-end flow from upload to onboarding completion
---
## Screenshots
Placeholder slots for final screenshots.
### Home Page
![Home Page Placeholder](docs/images/home-page-placeholder.png)
### Organization Page
![Organization Page Placeholder](docs/images/organization-page-placeholder.png)
### Onboarding Loading / Generation State
![Onboarding Loading Placeholder](docs/images/onboarding-loading-placeholder.png)
### Onboarding Content Flow
![Onboarding Flow Placeholder](docs/images/onboarding-flow-placeholder.png)
---
## System Architecture (High-Level)
At a high level, Dynavera is split into a management side and an inference side. The orchestrator coordinates user interaction, tool calls, and model responses between the two.
![High Level System Architecture](docs/high-level-system-architecture.png)
For the fuller architecture narrative (runtime flow and component placement), see:
- [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
--- ---
## Project Goals ## Project Goals
- [x] Distributed Orchestration: Implementation of a dual-node system (VPS/GPU) to manage real-time user interaction and heavy computational inference independently. - [x] Distributed orchestration across VPS and GPU nodes
- [x] Context-aware onboarding with RAG (semantic chunking + vector search)
- [x] Context-Aware Training: Development of a RAG pipeline that utilizes semantic chunking and vector similarity search to provide role-specific guidance. - [x] Stateful agent workflow over WebSockets
- [x] Automated ingestion from role training documents (PDF/TXT)
- [x] Agentic Workflow: Utilizing an orchestrator to manage stateful conversations, tool calls, and user progress tracking via WebSockets.
- [x] Automated Ingestion: Creating a pipeline for converting raw organizational documents (PDF/TXT) into searchable vector embeddings.
---
## System Architecture
The application is split into two primary layers:
### Management Layer (VPS)
* **Framework**: Django 5.x with Django Channels for WebSocket management.
* **Database**: PostgreSQL with the pgvector extension for semantic storage.
* **Task Queue**: Celery and Redis for asynchronous document processing and ingestion.
* **Internal Routing**: `apps/onboarding/mcp.py` serves as the Model Context Protocol router, bridging the agent to the PostgreSQL vector store.
### Intelligence Layer (GPU Node)
* **Inference Server**: `gpu_server.py` (FastAPI) located in the root, exposing endpoints for LLM chat completions and embeddings.
* **Semantic Processor**: Custom logic within the inference server for smart chunking that detects topic shifts in text to optimize retrieval accuracy.
--- ---
## Tech Stack ## Tech Stack
* **Backend**: Django, Django REST Framework, Django Channels. - **Backend**: Django, Django REST Framework, Django Channels
* **Frontend**: Vue 3, Vite, Pinia. - **Frontend**: Vue 3, Vite, Pinia
* **Database**: PostgreSQL (pgvector). - **Database**: PostgreSQL with pgvector
* **AI/ML**: FastAPI, OpenAI-compatible API structures, Sentence-Transformers. - **AI/ML**: FastAPI, Sentence Transformers, llama.cpp-compatible serving
* **Infrastructure**: Docker, Redis, Celery. - **Infra**: Docker, Redis, Celery
--- ---
## Application Structure ## Repository Guide
* **apps.accounts**: Manages User, Organization, and Role models, including invite-based onboarding logic. Key areas in the repo:
* **apps.knowledge**: Handles the RAG pipeline, including TrainingFile management and RoleRagDocument vector storage.
* **apps.onboarding**: Contains the core logic for the onboarding experience: - `apps/accounts`: user model, organization/role ownership, membership flows
* `consumers.py`: The Agent Orchestrator managing WebSocket handshakes and session loops. - `apps/knowledge`: file ingestion, chunking pipeline, vector document persistence
* `mcp.py`: The internal router for Model Context Protocol tool execution. - `apps/onboarding`: role flows, sessions, websocket orchestration, MCP-style tool routing
* `models.py`: Stores AgentConfig (prompts/tools) and OnboardingSession state. - `config/`: settings, API/ASGI routing, environment wiring
* **gpu_server.py**: The entry point for the Intelligence Layer, handling embedding generation and LLM inference. - `compose/`: development and production deployment manifests
- `gpu_server.py`: inference and embedding service
For a more detailed breakdown:
- [Application Structure (Detailed)](docs/application-structure.md)
--- ---
## Instructions for Evaluation ## Evaluation Credentials
The system is currently pre-loaded with demonstration data from internal configuration files.
### Access Credentials
| Role | Email | Password | | Role | Email | Password |
| :--- | :--- | :--- | | :--- | :--- | :--- |
@ -68,36 +134,107 @@ The system is currently pre-loaded with demonstration data from internal configu
| **Manager** | haleisaac@example.com | password | | **Manager** | haleisaac@example.com | password |
| **User** | j.thompson@example.com | password | | **User** | j.thompson@example.com | password |
### Recommended Technical Walkthrough Manager registration code: `MANAGER2026`
To verify the integration of the Knowledge Pipeline and the Agentic Orchestrator, follow these steps:
1. **Environment Setup**: Navigate to https://fyp.viswamedha.com. *
2. **Document Ingestion**: Log in as the **Manager** (haleisaac@example.com). Navigate to the **University of Birmingham** organization. Upload a PDF relevant to a specific role.
3. **Vectorization**: Observe the ingestion status. The system will extract text, send it to the GPU node for semantic chunking, and store the resulting 1536-dimension vectors in PostgreSQL.
4. **Agent Interaction**: Access the **Role Onboarding** interface. Initiate a session.
5. **Retrieval Verification**: This will query the agent regarding specific details within the uploaded PDF. The agent in `consumers.py` will trigger a tool call via `mcp.py`, retrieve the relevant document chunks, and provide a contextualized response via onboarding pages.
*Note: If the website that I hosted is not accessible, please set up the project locally by following the instructions in the Usage section below.
--- ---
## Usage ## Recommended Evaluation Walkthrough
1. Clone the repository. 1. Open https://fyp.viswamedha.com
2. Copy the `.env.example` file to `.env` or create a new `.env` file based on `.env.template`, and change the necessary environment variables. * 2. Log in as **Manager** and open the target organization
3. Deploy via Docker Compose: `docker compose -f compose/dev/docker-compose.yml --env-file .env up -d` in the root directory. 3. Upload a role-relevant document (PDF recommended)
4. Access the frontend at the configured port (usually `localhost:8000`). 4. Wait for ingestion and embedding completion
5. Start role onboarding and trigger generation
6. Check if responses are grounded in uploaded material
7. Optionally review progress details and logs
* Note: If you use a different secret key, when the fyp-django-dev container starts, you will need to execute the following command to reset all accounts to default passwords of "admin" for admin users and "password" for manager and user accounts: If the hosted deployment is unavailable, local setup is documented below.
---
## Local Setup (Cross-Platform)
### Prerequisites
- Docker Engine / Docker Desktop
- NVIDIA drivers + NVIDIA Container Toolkit (for GPU inference)
### 1) Clone
```bash
git clone https://git.cs.bham.ac.uk/projects-2025-26/vxn217
cd vxn217
```
### 2) Create `.env`
**PowerShell**
```powershell
Copy-Item .env.template .env
```
**CMD**
```cmd
copy .env.template .env
```
**macOS/Linux**
```bash
cp .env.template .env
```
Then update `.env` values for your environment.
### 3) Start services (development)
```bash
docker compose -f compose/dev/docker-compose.yml --env-file .env up -d --build
```
### 4) Access endpoints
- App: http://localhost:8000
### 5) Optional: reset seeded passwords
```bash ```bash
docker exec -it fyp-django-dev python manage.py reset_passwords docker exec -it fyp-django-dev python manage.py reset_passwords
``` ```
### Warnings Reset defaults:
* The development compose is used here to allow HMR and easier debugging. Please only use this file. - Admin users: `admin`
* Ensure that a GPU is available and CUDA drivers are properly installed for the inference server to function. - Manager and user accounts: `password`
* I have tested this on an RTX 3060 with 12GB VRAM, so I am not sure if it will work on other GPUs.
* There is no guarantee that it will load on a CPU-only machine as the batch size and model parameters are configured for GPU inference. ---
## Common Commands
Stop services:
```bash
docker compose -f compose/dev/docker-compose.yml --env-file .env down
```
Tail logs:
```bash
docker compose -f compose/dev/docker-compose.yml --env-file .env logs -f
```
Run migrations:
```bash
docker exec -it fyp-django-dev python manage.py migrate
```
---
## Additional Documentation
- [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
- [Application Structure (Detailed)](docs/application-structure.md)
- [Deployment Topologies](docs/deployment-topologies.md)

View file

@ -0,0 +1,64 @@
# Application Structure (Detailed)
This page expands on where responsibilities live in the codebase.
## Core Apps
### `apps.accounts`
Handles identity and tenancy concerns:
- User model and role flags
- Organization ownership and membership
- Role assignment and invite flows
### `apps.knowledge`
Handles ingestion and retrieval data prep:
- Upload and tracking of training files
- Content extraction and chunking pipeline
- Embedding persistence in role-scoped vector documents
### `apps.onboarding`
Handles the agentic onboarding runtime:
- Session and flow models
- WebSocket consumer orchestrator
- Tool routing (MCP-style handler)
- Flow/session APIs for frontend integration
## Infrastructure Modules
### `config/*`
Framework-level config and wiring:
- Django settings
- URL/API routing
- ASGI/Channels entry points
- Celery config
### `compose/*`
Environment-specific deployment configuration:
- Development compose stack
- Production compose stack
- Inference compose profile
### `gpu_server.py`
Inference service entry point:
- Chat completions endpoint
- Embeddings endpoint
- Semantic chunking endpoint
- Health checks and model lifecycle
## Navigation
- [Distributed Runtime Flow](distributed-runtime-flow.md)
- [Deployment Topologies](deployment-topologies.md)
- [Project README](../README.md)

View file

@ -0,0 +1,37 @@
# Deployment Topologies
This page compares local and distributed deployment shapes.
## Local Development Topology
Purpose: fast iteration and debugging.
- App services run via `compose/dev/docker-compose.yml`
- Django, Celery, Redis, Postgres, Node, and inference can run together
- Suitable for feature work and integration checks
## Distributed Topology (VPS + GPU Node)
Purpose: production-like separation of concerns.
- **VPS node**: web app, orchestration, API, websocket handling, task queue, database
- **GPU node**: dedicated inference service (chat + embeddings + chunking helpers)
- Request direction is primarily **VPS -> GPU** for model tasks
## Why Split Nodes?
- Keeps model latency/VRAM pressure away from user/session services
- Allows independent scaling of orchestration and inference
- Improves operational clarity around failures and bottlenecks
## Operational Notes
- Confirm inference host/port values in runtime container env
- Confirm pgvector extension is enabled in target database
- Keep role flow generation permissions constrained to trusted user types
## Navigation
- [Distributed Runtime Flow](distributed-runtime-flow.md)
- [Application Structure (Detailed)](application-structure.md)
- [Project README](../README.md)

View file

@ -0,0 +1,54 @@
# Distributed Runtime Flow
Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets.
## 1) MCP Surface (Django-side tool layer)
This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates.
Typical tool intents:
- `search_knowledge(query, role_uuid)`
- `get_user_progress(user/session context)`
- `update_session_state(session_uuid, patch)`
Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
## 2) Orchestrator (Channels consumer + async control loop)
The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle.
Typical interaction path:
1. User sends message over WebSocket
2. Orchestrator builds/updates context
3. Orchestrator calls inference endpoint
4. Model requests tool calls when needed
5. Orchestrator executes tool calls and continues generation
6. Streamed/assembled response returns to user
This is the central control plane for session continuity, tool usage, and response streaming.
## 3) GPU Inference Pipe (passive engine)
The GPU service is designed as a passive inference engine:
- Receives prompts/inference payloads
- Produces chat/embedding outputs
- Does not initiate calls back into the VPS
Using OpenAI-style request/response patterns keeps integration predictable.
## Interface Summary
| Component | Typical Path / Endpoint | Role |
| :--- | :--- | :--- |
| MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
| Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
| GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |
## Navigation
- [Application Structure (Detailed)](application-structure.md)
- [Deployment Topologies](deployment-topologies.md)
- [Project README](../README.md)

Binary file not shown.

After

Width:  |  Height:  |  Size: 67 KiB