
AI / ML · 2023-04
Enterprise AI Platform
The company needed AI capabilities but couldn't send data to external providers — strict compliance requirements meant everything had to stay on-premises. I designed a fully private LLM platform running Llama 3 70B locally via Ollama, with Qdrant for vector storage and a multi-agent architecture that routes each query to a specialized agent.
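The routing layer described above can be sketched as a simple dispatcher: each specialized agent advertises the terms it handles, and incoming queries go to the best-matching agent, with a general agent as fallback. This is a minimal illustration, not the platform's actual code — the agent names, keyword matching, and `Agent` structure are all assumptions for the example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    keywords: set[str]            # terms that route a query to this agent (assumed scheme)
    handle: Callable[[str], str]  # stub standing in for the agent's LLM/RAG pipeline

def route(query: str, agents: list[Agent], fallback: Agent) -> Agent:
    """Pick the agent with the most keyword overlap; fall back if none match."""
    tokens = set(query.lower().split())
    best = max(agents, key=lambda a: len(a.keywords & tokens))
    return best if best.keywords & tokens else fallback
```

A usage sketch with hypothetical agents:

```python
docs = Agent("docs", {"policy", "handbook"}, lambda q: f"[docs] {q}")
sql = Agent("sql", {"revenue", "report", "query"}, lambda q: f"[sql] {q}")
general = Agent("general", set(), lambda q: f"[general] {q}")

chosen = route("show the revenue report", [docs, sql], general)
print(chosen.name)  # routes to "sql" under this toy keyword scheme
```

In production a router like this would more likely use embedding similarity or an LLM classifier, but the dispatch shape stays the same.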
Key Highlights
- The key challenge was achieving acceptable inference speeds on standard enterprise hardware — after optimizing the inference pipeline, we reached sub-2-second response times
- The platform now serves roughly 400 employees across the organization, with zero data leaving the premises
Technology Stack
AI
Llama 3 70B · Ollama · On-premise AI · Multi-agent · RAG
Infrastructure
Qdrant · Docker · Enterprise Security
Core
Python
