I build production-grade agentic AI systems: self-correcting LangGraph agents, multi-tenant RAG pipelines, NL-to-SQL pipelines, and autonomous research workflows that make real decisions rather than just retrieving text. Every system runs in Docker, streams through Kafka, caches in Redis, and is designed to handle real users, not notebook demos.
Production NL-to-SQL system for Excel Texas Wireless — non-technical admins, supervisors, and agents query a live PostgreSQL transaction database using plain English. Schema metadata vectorized with text-embedding-3-small and stored in Qdrant; a 5-node LangGraph pipeline handles schema retrieval, GPT-4o-mini SQL generation, validation with a conditional retry loop, asyncpg execution, and response formatting. Role-based access control enforced at SQL query level — WHERE clauses injected programmatically per role before execution. Redis provides a semantic cache (cosine similarity, 0.92 threshold) with role-scoped keys preventing cross-user data leakage. Full audit trail via query_logs. 35+ tests covering SQL injection, cache isolation, and RBAC bypass. 5 containers orchestrated with Docker Compose health checks.
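The role-scoped semantic cache described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the deployed Redis code: the class name `RoleScopedSemanticCache`, the per-role bucket layout, and the toy two-dimensional embeddings are all assumptions made for the example. Only the cosine-similarity metric and the 0.92 threshold come from the description.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

SIMILARITY_THRESHOLD = 0.92  # threshold quoted in the project description


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class RoleScopedSemanticCache:
    """Hypothetical in-memory stand-in for the Redis cache: one bucket per role."""

    threshold: float = SIMILARITY_THRESHOLD
    _buckets: dict[str, list[tuple[list[float], str]]] = field(default_factory=dict)

    def put(self, role: str, embedding: list[float], answer: str) -> None:
        """Store an answer under the role that produced it."""
        self._buckets.setdefault(role, []).append((embedding, answer))

    def get(self, role: str, embedding: list[float]) -> Optional[str]:
        """Return the most similar cached answer for this role, or None on a miss."""
        # Only the caller's own bucket is searched, so a hit can never return
        # an answer that was computed under another role's row filters.
        best_score, best_answer = 0.0, None
        for cached_embedding, answer in self._buckets.get(role, []):
            score = cosine_similarity(embedding, cached_embedding)
            if score >= self.threshold and score > best_score:
                best_score, best_answer = score, answer
        return best_answer
```

In this sketch, a supervisor's cached answer is reachable only through `get("supervisor", ...)`; the same question asked under the `agent` role misses the cache and falls through to the full pipeline, where the role's own filters apply.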
Production RAG system serving Amazon, Flipkart, and Myntra from a single deployment — hard tenant isolation via separate Qdrant collections, no shared index. Policy PDFs ingested asynchronously through Kafka; a 5-node LangGraph pipeline handles routing, Qdrant retrieval, FlashRank cross-encoder reranking, GPT-4o-mini generation, and citation building. Redis provides a semantic cache (cosine similarity, 0.92 threshold) and per-session conversation memory. A feedback loop collects thumbs up/down via Kafka, stores ratings in PostgreSQL, and auto-rewrites underperforming prompts without redeployment. Evaluated with RAGAS (faithfulness + answer relevancy). 9 containers, fully orchestrated with Docker Compose health checks.
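One way to picture the hard tenant isolation described above is a lookup that maps each tenant to its own collection and refuses anything else. The collection names, `collection_for`, and `UnknownTenantError` below are hypothetical, chosen for illustration; only the three tenants and the one-collection-per-tenant rule come from the description.

```python
# Hypothetical registry: one Qdrant collection name per tenant, nothing shared.
TENANT_COLLECTIONS = {
    "amazon": "policies_amazon",
    "flipkart": "policies_flipkart",
    "myntra": "policies_myntra",
}


class UnknownTenantError(ValueError):
    """Raised when a request names a tenant that has no registered collection."""


def collection_for(tenant_id: str) -> str:
    """Resolve a tenant to its dedicated collection, failing closed on unknown IDs."""
    key = tenant_id.strip().lower()
    try:
        return TENANT_COLLECTIONS[key]
    except KeyError:
        # No default collection: an unrecognised tenant gets an error, never
        # a fallback index that could expose another tenant's documents.
        raise UnknownTenantError(
            f"no collection registered for tenant {tenant_id!r}"
        ) from None
```

Failing closed is the design choice worth noting here: with no default collection, a malformed or spoofed tenant ID cannot be routed to another tenant's data.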
Production-grade multi-agent code review system built on LangGraph. Submitted code enters a 5-node stateful graph: an orchestrator validates the submission and checks a SHA256 cache, then fans out to three parallel specialist agents (bug detection, code quality, and security analysis), all running GPT-4o-mini concurrently. A synthesizer node merges the parallel outputs into a structured JSON report with per-category scores. SHA256 caching in PostgreSQL eliminates duplicate LLM calls for identical code submissions. Tenacity retry logic handles transient OpenAI failures. Fully containerised with Docker Compose; FastAPI backend with Pydantic v2 request validation.
Agentic RAG system that ingests YouTube video transcripts and answers
questions strictly grounded in transcript content, with no claims beyond the source material.
Transcripts are chunked, embedded with OpenAI text-embedding-3-small,
and stored in Qdrant. A LangGraph agent orchestrates retrieval,
applies a constrained generation prompt that refuses to answer beyond the transcript context,
and returns cited responses via a FastAPI backend.
Built with langchain_qdrant.QdrantVectorStore after migrating from deprecated
qdrant_client APIs — demonstrates real-world dependency management on a moving ecosystem.
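A constrained generation prompt of the kind described above could be assembled along these lines. The wording of the instructions, the `TranscriptChunk` fields, and the refusal sentence are illustrative assumptions, not the project's actual prompt.

```python
from dataclasses import dataclass

# Hypothetical refusal text the model is told to return when context is missing.
REFUSAL = "I can't answer that from the transcript."


@dataclass(frozen=True)
class TranscriptChunk:
    """One retrieved transcript excerpt (field names are assumptions)."""

    video_id: str
    start_seconds: int
    text: str


def build_grounded_prompt(question: str, chunks: list[TranscriptChunk]) -> str:
    """Build a prompt that restricts the answer to the numbered excerpts."""
    # Number each excerpt so the model can cite it as [n].
    context = "\n".join(
        f"[{i}] ({chunk.video_id} @ {chunk.start_seconds}s) {chunk.text}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer using ONLY the numbered transcript excerpts below.\n"
        "Cite the excerpts you used as [n] after each claim.\n"
        f'If the excerpts do not contain the answer, reply exactly: "{REFUSAL}"\n\n'
        f"Transcript excerpts:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the excerpts gives the model something concrete to cite, and a fixed refusal string makes out-of-scope answers easy to detect downstream.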
Open to backend AI engineering roles — production RAG systems, multi-agent architectures, and event-driven AI pipelines.