
AI / ML · 2023-04
Enterprise AI Platform
The company needed AI capabilities but couldn't send data to external providers — strict compliance requirements meant everything had to stay on-premises. I designed a fully private LLM platform running Llama 3 70B locally via Ollama, with Qdrant for vector storage and a multi-agent architecture that routes each query to a specialized agent.
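The routing layer described above can be sketched as a simple dispatcher: each specialized agent advertises the terms it handles, and incoming queries go to the best-matching agent, with a general agent as fallback. This is a minimal illustration, not the platform's actual code — the agent names, keyword matching, and `Agent` structure are all assumptions for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    keywords: set[str]            # terms that route a query to this agent (assumed scheme)
    handle: Callable[[str], str]  # stub standing in for the agent's LLM/RAG pipeline

def route(query: str, agents: list[Agent], fallback: Agent) -> Agent:
    """Pick the agent with the most keyword overlap; fall back if none match."""
    tokens = set(query.lower().split())
    best = max(agents, key=lambda a: len(a.keywords & tokens))
    return best if best.keywords & tokens else fallback
```

A usage sketch with hypothetical agents:

```python
docs = Agent("docs", {"policy", "handbook"}, lambda q: f"[docs] {q}")
sql = Agent("sql", {"revenue", "report", "query"}, lambda q: f"[sql] {q}")
general = Agent("general", set(), lambda q: f"[general] {q}")

chosen = route("show the revenue report", [docs, sql], general)
print(chosen.name)  # routes to "sql" under this toy keyword scheme
```

In production a router like this would more likely use embedding similarity or an LLM classifier, but the dispatch shape stays the same.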
Key Highlights
- The key challenge was achieving acceptable inference speeds on standard enterprise hardware — after optimizing the inference pipeline, we reached sub-2-second response times
- The platform now serves roughly 400 employees across the organization, with zero data leaving the premises
Technology Stack
AI
Llama 3 70B · Ollama · On-premise AI · Multi-agent · RAG
Infrastructure
Qdrant · Docker · Enterprise Security
Core
Python
