HELLO, I'M
Nikhil Juluri
Master's student in Computer Science at UIC specializing in AI, Machine Learning, and High-Performance Inference. Passionate about building scalable GraphRAG pipelines and optimizing LLM workflows. Professional experience at Deloitte and SodexoMagic delivering enterprise-scale AI and data solutions.


About Me
Hi, I'm Nikhil Juluri, an AI and Machine Learning Engineer pursuing my Master's in Computer Science at the University of Illinois Chicago (GPA 3.8). I have a passion for building high-performance AI systems and scalable data pipelines. My journey started with a Bachelor's in Electronics and Communication from CBIT, Hyderabad, followed by over three and a half years at Deloitte, where I delivered production-grade enterprise solutions in the financial domain.
Currently, as a Graduate Research Assistant at UIC, I develop memory-efficient Python pipelines using PyArrow and multiprocessing, achieving a 35% memory reduction in large-scale data workflows. I recently worked as a Data Analyst at SodexoMagic, where I cut operational reporting time by 40% and implemented private LLM workflows for audit-style validation with 87% detection accuracy.
I specialize in building Retrieval-Augmented Generation (RAG) systems, graph-powered incident intelligence, and high-performance inference frameworks. I am particularly interested in the intersection of Graph Neural Networks and Large Language Models, having developed GraphRAG systems that outperform standard dense retrieval methods.
My technical toolkit includes PyTorch, LangChain, Neo4j, vLLM, and AWS. I am driven by the challenge of translating complex research into production-ready solutions, whether it's optimizing token throughput for concurrent workloads or engineering trust-aware assistants for complex document corpora.
Featured Projects
Showcasing advanced capabilities in GenAI, Large Language Models, and MLOps infrastructure.
Lazarus – Clinical AI Platform for Drug Repurposing
- Architected a full-stack clinical AI platform that transforms failed drug assets into ranked repurposing hypotheses using a FastAPI + React/Vite control plane, PostgreSQL operational ledger, and Neo4j biomedical knowledge graph.
- Built a typed 9-agent LLM orchestration DAG with 14 persisted reasoning steps per run, generating auditable outputs across hypothesis generation, skeptical review, evidence curation, trial strategy, effort estimation, and impact scoring.
- Engineered real-time WebSocket streaming with polling fallback, enabling operators to monitor live agent traces, confidence scores, human-review escalations, portfolio rankings, and executive-ready PDF blueprint generation.
TrustLayer – Trust-Aware RAG Research Assistant
- Built a trust-aware research assistant for local research-paper corpora by engineering an end-to-end RAG pipeline that indexes 5,000+ evidence chunks with Chroma vector search.
- Improved answer reliability by implementing hybrid and corrective retrieval, combining dense embeddings with BM25 sparse search and cross-encoder reranking.
- Increased transparency and reduced unsupported responses by 40% through verification-based abstention and an interactive dashboard for evidence visualization.
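The hybrid retrieval step above can be illustrated with a minimal sketch. It assumes the dense and BM25 result lists are merged with reciprocal rank fusion (RRF); the fusion method TrustLayer actually uses is not stated here, and the function name, chunk ids, and the constant `k=60` are hypothetical choices for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each chunk earns 1 / (k + rank) from every list it appears in.
    RRF is an assumed fusion rule, not necessarily the project's.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical chunk ids returned by each retriever, best hit first.
dense_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

In a pipeline like the one described, the fused list would then be passed to a cross-encoder for reranking; that step is omitted here because it depends on a specific model.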
BugOrbit – Graph-Powered Incident Intelligence
- Designed and built a graph-powered incident intelligence platform that transforms raw production telemetry into structured incidents and root-cause analysis.
- Engineered a FastAPI and Neo4j pipeline to normalize noisy observability payloads and persist service dependencies as a live graph with <200 ms ingestion latency.
- Developed an interactive React dashboard for live incident monitoring and dependency-graph exploration, reducing mean investigation time by 30%.
GraphRAG for Multi-Hop Question Answering
- Built an end-to-end GraphRAG system for multi-hop QA, indexing 10,000 examples into 263,113 text chunks with dense retrieval and hybrid graph construction.
- Designed a hybrid graph-retrieval pipeline with query-aware GraphSAGE and PCST-based evidence selection to improve multi-document reasoning.
- Outperformed dense-retrieval baselines on downstream answer quality across evaluation sets.
PulseGrid (Kairos) – Real-Time Disaster Response Optimization
- Designed a real-time, graph-based decision-making system on Neo4j for resource dispatch and routing during disasters, supporting sub-100 ms updates.
- Implemented multi-step routing using priority queues, Dijkstra's algorithm, and Yen's k-shortest paths, with the Gale-Shapley algorithm for stable responder matching.
- Decreased responder deployment time by 45-50% while providing real-time route animations and ETA tracking with sub-second instruction updates.
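The responder-matching step above can be sketched with the Gale-Shapley algorithm. This is an illustrative version only: it assumes equal numbers of responders and sites and complete preference lists on both sides, and the unit and site names are hypothetical, not taken from PulseGrid.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching {proposer: receiver}.

    Assumes both sides have the same size and every preference
    list ranks all members of the other side.
    """
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next index to propose to
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p                        # receiver was unmatched
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p                        # receiver prefers the newcomer
            free.append(current)
        else:
            free.append(p)                        # rejected; will try the next choice

    return {p: r for r, p in engaged.items()}


# Hypothetical preferences, e.g. ordered by travel time and by capability.
responders = {"unit_a": ["site_1", "site_2"], "unit_b": ["site_1", "site_2"]}
sites = {"site_1": ["unit_b", "unit_a"], "site_2": ["unit_a", "unit_b"]}

matching = gale_shapley(responders, sites)
print(matching)
```

Both units prefer site_1, but site_1 prefers unit_b, so unit_a is matched to site_2 and no pair would rather swap.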
High-Performance LLM Inference Framework
- Built a high-performance LLM inference framework improving token generation throughput by 30-45% and reducing latency by 25% through optimized dynamic batching.
- Designed benchmarking pipelines to analyze latency distribution (p50/p95), throughput, and GPU memory utilization across multiple model configurations.
- Enabled systematic performance tuning and identification of inference bottlenecks under concurrent workloads using vLLM and CUDA.
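The p50/p95 latency figures mentioned above can be computed in several ways. A minimal sketch, assuming the nearest-rank percentile definition and using made-up latency samples (neither is taken from the actual benchmarking pipeline):

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up per-request latencies in milliseconds, with two slow outliers.
latencies_ms = [12, 15, 11, 14, 90, 13, 16, 12, 15, 200]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"p50={p50} ms, p95={p95} ms")
```

Reporting p95 alongside p50 matters under concurrent load because a few slow requests barely move the median but dominate the tail.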
Hackathon Achievements
Recognized at major hackathons for building advanced AI platforms and clinical R&D systems.
Microsoft HackWithChicago Finalist
Finalist at the Microsoft HackWithChicago hackathon for building BugOrbit, a graph-powered incident intelligence platform.
WildHacks (Northwestern University) Top 25
Selected as one of the top 25 projects at WildHacks by Northwestern University for PulseGrid, a real-time disaster response optimization system.
HackPrinceton Spring 2026 Sponsor-Track Runner-Up
Runner-up in the sponsor track at HackPrinceton Spring 2026 for Lazarus, an autonomous clinical R&D swarm.
My Experience
My professional journey in software and AI development.
Graduate Research Assistant
University of Illinois Chicago
- Developed memory-efficient Python pipelines using PyArrow and multiprocessing for ingesting multi-gigabyte transaction-style and financial telemetry flat files; achieved a memory reduction of 35% and an ingestion throughput increase of 40%.
- Developed private LLM workflows with GPT and Llama models for code translation and audit-style validation; achieved an accuracy increase of 87% for anomaly detection and a 2x reduction in debugging time.
- Implemented MLOps pipelines with MLflow for deployment on AWS SageMaker and Google Vertex AI for traceable and compliant model deployments; achieved a 30% reduction in release turnaround times.
Data Analyst
SodexoMagic
- Created interactive dashboards in Power BI and Tableau linked to SQL databases for operational KPIs, resulting in a 40% reduction in reporting time and faster decision-making.
- Preprocessed data from SQL and NoSQL (JSON) sources using Python and Pandas, transforming unstructured records into structured datasets for repeated analysis.
- Automated data processing with Python scripts and scheduled SQL jobs, saving about 3–4 hours of manual reporting per week.
Software Engineer II — Machine Learning Engineer
Deloitte
- Built enterprise GenAI advisory assistants using RAG over financial statements and market APIs with LangChain and vector databases, reducing manual analysis time by 40% while ensuring factual accuracy (0.91 groundedness).
- Improved relevance of investment product recommendations and portfolio suggestions by 25% through LoRA/QLoRA fine-tuning and context-aware prompt engineering, while optimizing token utilization and inference cost.
- Deployed scalable GenAI services on AWS Bedrock and SageMaker with provisioned throughput models, orchestrated through FastAPI microservices and integrated with Lambda, S3, and EC2 to maintain 99.9% system uptime.
Software Engineer I — Machine Learning Engineer
Deloitte
- Built credit risk prediction pipelines using ETL, feature engineering, and XGBoost with 5-fold cross-validation (AUC 0.86), improving loan approval accuracy by 18%.
- Developed hybrid investment recommendation systems using collaborative filtering and XGBoost ranking, increasing CTR by 15% through precision-optimized ranking.
- Built a fraud detection pipeline using Isolation Forest and Gradient Boosting, reducing false positives by 25%, and deployed Dockerized FastAPI APIs on AWS Lambda for real-time monitoring.
My Education
University of Illinois Chicago
Master of Science in Computer Science
Chaitanya Bharathi Institute of Technology
Bachelor of Engineering in Electronics and Communication
My Skills
Technical proficiency across various domains.
Programming & Data
Machine Learning
GenAI & NLP
Systems & MLOps
Certifications & Awards
Get In Touch
Let's work together and create something extraordinary.