Updated readme and added subdocs

This commit is contained in:
Viswamedha Nalabotu 2026-02-27 02:09:54 +00:00
parent 4ac57a38d0
commit b5f25411f2
5 changed files with 358 additions and 66 deletions

269
README.md
View file

@ -1,66 +1,132 @@
# Dynavera: Distributed Agentic Onboarding System
Dynavera is a multi-agent AI platform designed to automate role-specific onboarding. The system utilizes a distributed architecture to separate application logic from high-latency LLM inference, employing the Model Context Protocol (MCP) for internal data retrieval and Retrieval-Augmented Generation (RAG).
Dynavera is a multi-agent onboarding platform that combines role-specific training flows, retrieval from organization documents, and LLM-powered guidance. The system is intentionally distributed so that app orchestration and heavy inference can run independently.
Repository: https://git.cs.bham.ac.uk/projects-2025-26/vxn217
---
## Table of Contents
- [At a Glance](#at-a-glance)
- [Inspector & Supervisor Notes](#inspector--supervisor-notes)
- [Screenshots](#screenshots)
- [System Architecture (High-Level)](#system-architecture-high-level)
- [Project Goals](#project-goals)
- [Tech Stack](#tech-stack)
- [Repository Guide](#repository-guide)
- [Evaluation Credentials](#evaluation-credentials)
- [Recommended Evaluation Walkthrough](#recommended-evaluation-walkthrough)
- [Local Setup (Cross-Platform)](#local-setup-cross-platform)
- [Common Commands](#common-commands)
- [Additional Documentation](#additional-documentation)
---
## At a Glance
Dynavera focuses on one question: **how do we deliver onboarding that is role-aware, context-aware, and operationally practical?**
The platform does this by combining:
- A Django management layer for accounts, roles, sessions, and APIs
- An agentic orchestration loop over WebSockets for responsive interactions
- A retrieval layer using pgvector and organization-provided documents
- A GPU inference service for chat completions, embeddings, and chunking support
---
## Inspector & Supervisor Notes
Primary locations relevant to technical quality, architecture reasoning, and evaluation:
- Setup, context, and high-level flow: this `README.md`
- Architecture notes: `docs/`
- Orchestration runtime: `apps/onboarding/consumers.py`
- Retrieval bridge and tool routing: `apps/onboarding/mcp.py`
- Ingestion and vectorization pipeline: `apps/knowledge/tasks.py`
- Inference service entrypoint: `gpu_server.py`
Evaluation-relevant themes represented in the codebase:
- Role-scoped onboarding generation and progression
- Retrieval grounding through uploaded training files
- Separation of management services and inference services
- End-to-end flow from upload to onboarding completion
---
## Screenshots
Placeholder slots for final screenshots.
### Home Page
![Home Page Placeholder](docs/images/home-page-placeholder.png)
### Organization Page
![Organization Page Placeholder](docs/images/organization-page-placeholder.png)
### Onboarding Loading / Generation State
![Onboarding Loading Placeholder](docs/images/onboarding-loading-placeholder.png)
### Onboarding Content Flow
![Onboarding Flow Placeholder](docs/images/onboarding-flow-placeholder.png)
---
## System Architecture (High-Level)
At a high level, Dynavera is split into a management side and an inference side. The orchestrator coordinates user interaction, tool calls, and model responses between the two.
![High Level System Architecture](docs/high-level-system-architecture.png)
For the fuller architecture narrative (runtime flow and component placement), see:
- [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
---
## Project Goals
- [x] Distributed Orchestration: Implementation of a dual-node system (VPS/GPU) to manage real-time user interaction and heavy computational inference independently.
- [x] Context-Aware Training: Development of a RAG pipeline that utilizes semantic chunking and vector similarity search to provide role-specific guidance.
- [x] Agentic Workflow: Utilizing an orchestrator to manage stateful conversations, tool calls, and user progress tracking via WebSockets.
- [x] Automated Ingestion: Creating a pipeline for converting raw organizational documents (PDF/TXT) into searchable vector embeddings.
---
## System Architecture
The application is split into two primary layers:
### Management Layer (VPS)
* **Framework**: Django 5.x with Django Channels for WebSocket management.
* **Database**: PostgreSQL with the pgvector extension for semantic storage.
* **Task Queue**: Celery and Redis for asynchronous document processing and ingestion.
* **Internal Routing**: `apps/onboarding/mcp.py` serves as the Model Context Protocol router, bridging the agent to the PostgreSQL vector store.
### Intelligence Layer (GPU Node)
* **Inference Server**: `gpu_server.py` (FastAPI) located in the root, exposing endpoints for LLM chat completions and embeddings.
* **Semantic Processor**: Custom logic within the inference server for smart chunking that detects topic shifts in text to optimize retrieval accuracy.
- [x] Distributed orchestration across VPS and GPU nodes
- [x] Context-aware onboarding with RAG (semantic chunking + vector search)
- [x] Stateful agent workflow over WebSockets
- [x] Automated ingestion from role training documents (PDF/TXT)
---
## Tech Stack
* **Backend**: Django, Django REST Framework, Django Channels.
* **Frontend**: Vue 3, Vite, Pinia.
* **Database**: PostgreSQL (pgvector).
* **AI/ML**: FastAPI, OpenAI-compatible API structures, Sentence-Transformers.
* **Infrastructure**: Docker, Redis, Celery.
- **Backend**: Django, Django REST Framework, Django Channels
- **Frontend**: Vue 3, Vite, Pinia
- **Database**: PostgreSQL with pgvector
- **AI/ML**: FastAPI, Sentence Transformers, llama.cpp-compatible serving
- **Infra**: Docker, Redis, Celery
---
## Application Structure
## Repository Guide
* **apps.accounts**: Manages User, Organization, and Role models, including invite-based onboarding logic.
* **apps.knowledge**: Handles the RAG pipeline, including TrainingFile management and RoleRagDocument vector storage.
* **apps.onboarding**: Contains the core logic for the onboarding experience:
* `consumers.py`: The Agent Orchestrator managing WebSocket handshakes and session loops.
* `mcp.py`: The internal router for Model Context Protocol tool execution.
* `models.py`: Stores AgentConfig (prompts/tools) and OnboardingSession state.
* **gpu_server.py**: The entry point for the Intelligence Layer, handling embedding generation and LLM inference.
Key areas in the repo:
- `apps/accounts`: user model, organization/role ownership, membership flows
- `apps/knowledge`: file ingestion, chunking pipeline, vector document persistence
- `apps/onboarding`: role flows, sessions, websocket orchestration, MCP-style tool routing
- `config/`: settings, API/ASGI routing, environment wiring
- `compose/`: development and production deployment manifests
- `gpu_server.py`: inference and embedding service
For a more detailed breakdown:
- [Application Structure (Detailed)](docs/application-structure.md)
---
## Instructions for Evaluation
The system is currently pre-loaded with demonstration data from internal configuration files.
### Access Credentials
## Evaluation Credentials
| Role | Email | Password |
| :--- | :--- | :--- |
@ -68,36 +134,107 @@ The system is currently pre-loaded with demonstration data from internal configu
| **Manager** | haleisaac@example.com | password |
| **User** | j.thompson@example.com | password |
### Recommended Technical Walkthrough
To verify the integration of the Knowledge Pipeline and the Agentic Orchestrator, follow these steps:
1. **Environment Setup**: Navigate to https://fyp.viswamedha.com. *
2. **Document Ingestion**: Log in as the **Manager** (haleisaac@example.com). Navigate to the **University of Birmingham** organization. Upload a PDF relevant to a specific role.
3. **Vectorization**: Observe the ingestion status. The system will extract text, send it to the GPU node for semantic chunking, and store the resulting 1536-dimension vectors in PostgreSQL.
4. **Agent Interaction**: Access the **Role Onboarding** interface. Initiate a session.
5. **Retrieval Verification**: This will query the agent regarding specific details within the uploaded PDF. The agent in `consumers.py` will trigger a tool call via `mcp.py`, retrieve the relevant document chunks, and provide a contextualized response via onboarding pages.
*Note: If the website that I hosted is not accessible, please set up the project locally by following the instructions in the Usage section below.
Manager registration code: `MANAGER2026`
---
## Usage
## Recommended Evaluation Walkthrough
1. Clone the repository.
2. Copy the `.env.example` file to `.env` or create a new `.env` file based on `.env.template`, and change the necessary environment variables. *
3. Deploy via Docker Compose: `docker compose -f compose/dev/docker-compose.yml --env-file .env up -d` in the root directory.
4. Access the frontend at the configured port (usually `localhost:8000`).
1. Open https://fyp.viswamedha.com
2. Log in as **Manager** and open the target organization
3. Upload a role-relevant document (PDF recommended)
4. Wait for ingestion and embedding completion
5. Start role onboarding and trigger generation
6. Check if responses are grounded in uploaded material
7. Optionally review progress details and logs
* Note: If you use a different secret key, when the fyp-django-dev container starts, you will need to execute the following command to reset all accounts to default passwords of "admin" for admin users and "password" for manager and user accounts:
If the hosted deployment is unavailable, local setup is documented below.
---
## Local Setup (Cross-Platform)
### Prerequisites
- Docker Engine / Docker Desktop
- NVIDIA drivers + NVIDIA Container Toolkit (for GPU inference)
### 1) Clone
```bash
git clone https://git.cs.bham.ac.uk/projects-2025-26/vxn217
cd vxn217
```
### 2) Create `.env`
**PowerShell**
```powershell
Copy-Item .env.template .env
```
**CMD**
```cmd
copy .env.template .env
```
**macOS/Linux**
```bash
cp .env.template .env
```
Then update `.env` values for your environment.
### 3) Start services (development)
```bash
docker compose -f compose/dev/docker-compose.yml --env-file .env up -d --build
```
### 4) Access endpoints
- App: http://localhost:8000
### 5) Optional: reset seeded passwords
```bash
docker exec -it fyp-django-dev python manage.py reset_passwords
```
### Warnings
Reset defaults:
* The development compose is used here to allow HMR and easier debugging. Please only use this file.
* Ensure that a GPU is available and CUDA drivers are properly installed for the inference server to function.
* I have tested this on an RTX 3060 with 12GB VRAM, so I am not sure if it will work on other GPUs.
* There is no guarantee that it will load on a CPU-only machine as the batch size and model parameters are configured for GPU inference.
- Admin users: `admin`
- Manager and user accounts: `password`
---
## Common Commands
Stop services:
```bash
docker compose -f compose/dev/docker-compose.yml --env-file .env down
```
Tail logs:
```bash
docker compose -f compose/dev/docker-compose.yml --env-file .env logs -f
```
Run migrations:
```bash
docker exec -it fyp-django-dev python manage.py migrate
```
---
## Additional Documentation
- [Distributed Runtime Flow](docs/distributed-runtime-flow.md)
- [Application Structure (Detailed)](docs/application-structure.md)
- [Deployment Topologies](docs/deployment-topologies.md)

View file

@ -0,0 +1,64 @@
# Application Structure (Detailed)
This page expands on where responsibilities live in the codebase.
## Core Apps
### `apps.accounts`
Handles identity and tenancy concerns:
- User model and role flags
- Organization ownership and membership
- Role assignment and invite flows
### `apps.knowledge`
Handles ingestion and retrieval data prep:
- Upload and tracking of training files
- Content extraction and chunking pipeline
- Embedding persistence in role-scoped vector documents
### `apps.onboarding`
Handles the agentic onboarding runtime:
- Session and flow models
- WebSocket consumer orchestrator
- Tool routing (MCP-style handler)
- Flow/session APIs for frontend integration
## Infrastructure Modules
### `config/*`
Framework-level config and wiring:
- Django settings
- URL/API routing
- ASGI/Channels entry points
- Celery config
### `compose/*`
Environment-specific deployment configuration:
- Development compose stack
- Production compose stack
- Inference compose profile
### `gpu_server.py`
Inference service entry point:
- Chat completions endpoint
- Embeddings endpoint
- Semantic chunking endpoint
- Health checks and model lifecycle
## Navigation
- [Distributed Runtime Flow](distributed-runtime-flow.md)
- [Deployment Topologies](deployment-topologies.md)
- [Project README](../README.md)

View file

@ -0,0 +1,37 @@
# Deployment Topologies
This page compares local and distributed deployment shapes.
## Local Development Topology
Purpose: fast iteration and debugging.
- App services run via `compose/dev/docker-compose.yml`
- Django, Celery, Redis, Postgres, Node, and inference can run together
- Suitable for feature work and integration checks
## Distributed Topology (VPS + GPU Node)
Purpose: production-like separation of concerns.
- **VPS node**: web app, orchestration, API, websocket handling, task queue, database
- **GPU node**: dedicated inference service (chat + embeddings + chunking helpers)
- Request direction is primarily **VPS -> GPU** for model tasks
## Why Split Nodes?
- Keeps model latency/VRAM pressure away from user/session services
- Allows independent scaling of orchestration and inference
- Improves operational clarity around failures and bottlenecks
## Operational Notes
- Confirm inference host/port values in runtime container env
- Confirm pgvector extension is enabled in target database
- Keep role flow generation permissions constrained to trusted user types
## Navigation
- [Distributed Runtime Flow](distributed-runtime-flow.md)
- [Application Structure (Detailed)](application-structure.md)
- [Project README](../README.md)

View file

@ -0,0 +1,54 @@
# Distributed Runtime Flow
Dynavera behaves like a streaming agentic system rather than a simple CRUD app. Runtime responsibility is split into three buckets.
## 1) MCP Surface (Django-side tool layer)
This is the tool-facing layer that lets the model request structured actions such as retrieval and session updates.
Typical tool intents:
- `search_knowledge(query, role_uuid)`
- `get_user_progress(user/session context)`
- `update_session_state(session_uuid, patch)`
Conceptually, this layer translates model tool calls into standard Django queries and vector lookups.
## 2) Orchestrator (Channels consumer + async control loop)
The orchestrator lives in the WebSocket runtime and coordinates each user request lifecycle.
Typical interaction path:
1. User sends message over WebSocket
2. Orchestrator builds/updates context
3. Orchestrator calls inference endpoint
4. Model requests tool calls when needed
5. Orchestrator executes tool calls and continues generation
6. Streamed/assembled response returns to user
This is the central control plane for session continuity, tool usage, and response streaming.
## 3) GPU Inference Pipe (passive engine)
The GPU service is designed as a passive inference engine:
- Receives prompts/inference payloads
- Produces chat/embedding outputs
- Does not initiate calls back into the VPS
Using OpenAI-style request/response patterns keeps integration predictable.
## Interface Summary
| Component | Typical Path / Endpoint | Role |
| :--- | :--- | :--- |
| MCP Surface | Internal Django tool handlers (and/or MCP endpoint) | Data/tool translation |
| Orchestrator | `apps.onboarding.consumers` | Coordination + streaming |
| GPU Inference | `gpu_server.py` HTTP endpoints | Generation + embeddings |
## Navigation
- [Application Structure (Detailed)](application-structure.md)
- [Deployment Topologies](deployment-topologies.md)
- [Project README](../README.md)

Binary file not shown.

After

Width:  |  Height:  |  Size: 67 KiB