About My Research

I'm a Senior Full Stack Developer specializing in Generative AI and Large Language Models, with extensive experience across healthcare, logistics, and AI domains. My journey began at the University at Buffalo, where I earned my MS in Engineering Science and worked as a Graduate Teaching Assistant for the Machine Learning with Search Engines course. My research foundation includes multimodal search engines that combine visual embeddings with NLP for hybrid retrieval, achieving a 35% improvement in result relevance over traditional keyword search.

Currently, I lead development of AI-powered applications at Mastery Logistics Systems, building LLM-powered logistics optimization tools, RAG systems, and vector search solutions. My work spans healthcare platforms at UPMC and analytics pipelines at Cigna. My latest research explores Small Language Models in Agentic AI systems, investigating how SLMs can replace LLMs for cost and efficiency gains.

Technical Expertise
🤖

AI/ML & Search

Deep expertise in Large Language Models, RAG systems, multimodal AI, and information retrieval

LLMs, RAG, NLP, Computer Vision, Transformers, PyTorch, TensorFlow, scikit-learn, LangChain, Hugging Face, Elasticsearch, Vector DBs (Pinecone, FAISS, Chroma), Information Retrieval
💻

Programming & Frontend

Expertise in multiple programming languages and modern frontend technologies

Python, JavaScript, TypeScript, Java, Go, C++, C#, React, Next.js, HTML5, CSS3, Tailwind CSS, Bootstrap
⚙️

Backend & Architecture

Building scalable backend systems with modern architecture patterns and APIs

Node.js, Django, Flask, GraphQL, REST, gRPC, WebSocket, Event-Driven Architecture, Microservices
☁️

Cloud & DevOps

Deploying and managing applications in cloud environments with CI/CD pipelines

AWS (SageMaker, Bedrock, Lambda, EC2, RDS, S3), GCP (BigQuery), Azure (AI Studio, Cosmos DB), Docker, Kubernetes, Terraform, MLflow, Jenkins, GitHub Actions
📊

Data & Databases

Building data pipelines and managing various database systems for large-scale applications

PostgreSQL, MySQL, MongoDB, DynamoDB, Redis, Kafka, Data Pipelines, Analytics Platforms
Research & AI Projects

Small Language Models in Agentic AI

Aug 2025 – Present

As an individual contributor, I explored how small language models (SLMs) can replace large language models (LLMs) in agentic systems, building on Belcak et al., 2025 (arXiv:2506.02153 [cs.AI], DOI: 10.48550/arXiv.2506.02153). My work combined a literature study with practical side experiments, including fine-tuning open-source SLMs (2–7B parameters) with LoRA/QLoRA on small synthetic datasets for tool calling and structured outputs.

Technologies: SLMs (2-7B), LoRA/QLoRA, PyTorch, Python, Prompt Engineering, Agentic AI, MetaGPT, Open Operator

Impact: Benchmarked SLMs locally on a consumer GPU and compared their latency and cost with LLM APIs, demonstrating that 40–70% of agent calls could be offloaded to SLMs.

Outcome: This independent research and hands-on validation supported the position that SLM-first architectures can deliver significant cost and efficiency gains in real-world agentic AI applications.
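
As a rough illustration of the cost side of that comparison, the sketch below estimates total spend when a fraction of agent calls is routed to a local SLM instead of an LLM API. The function name and the per-call prices are hypothetical placeholders chosen for the example; they are not measurements from this project.

```python
def blended_cost(total_calls: int, slm_fraction: float,
                 llm_cost_per_call: float, slm_cost_per_call: float) -> float:
    """Total cost when slm_fraction of calls go to the SLM and the rest to the LLM."""
    if not 0.0 <= slm_fraction <= 1.0:
        raise ValueError("slm_fraction must be between 0 and 1")
    slm_calls = total_calls * slm_fraction
    llm_calls = total_calls - slm_calls
    return slm_calls * slm_cost_per_call + llm_calls * llm_cost_per_call


# Hypothetical per-call prices in dollars (assumptions, not benchmark results).
LLM_COST = 0.010
SLM_COST = 0.001

baseline = blended_cost(10_000, 0.0, LLM_COST, SLM_COST)
for fraction in (0.4, 0.7):  # the 40-70% offload range quoted above
    cost = blended_cost(10_000, fraction, LLM_COST, SLM_COST)
    saving = 1 - cost / baseline
    print(f"offload {fraction:.0%}: ${cost:.2f} total, {saving:.0%} saved")
```

Under these assumed prices the saving grows linearly with the offload fraction, so the realized benefit depends mainly on how many calls an SLM can handle at acceptable quality.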

Intelligent Visual Search Engine

Jan 2022 – May 2023

Designed a multimodal search engine for the 'Machine Learning with Search Engines' course, enabling image-to-image and image-to-text retrieval using ResNet-based embeddings and cosine similarity search in Elasticsearch.

Technologies: ResNet, Elasticsearch, NLP, Python, PyTorch, Multimodal AI

Impact: Combined visual embeddings with NLP-based caption search for hybrid retrieval, improving result relevance by 35% over keyword-only search.

Outcome: Successfully demonstrated the effectiveness of multimodal approaches in search engine technology, leading to improved user experience and search accuracy.
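
The hybrid retrieval idea can be sketched in a few lines. This is a minimal in-memory illustration, not the project's implementation: the original system used ResNet embeddings and Elasticsearch, whereas here the embeddings are toy two-dimensional vectors, the caption score is a simple word-overlap stand-in, and the `alpha` weight and all function names are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_overlap(query, caption):
    """Fraction of query words that appear in the caption (stand-in for text search)."""
    query_words = set(query.lower().split())
    caption_words = set(caption.lower().split())
    return len(query_words & caption_words) / len(query_words) if query_words else 0.0


def hybrid_rank(query_vec, query_text, items, alpha=0.5):
    """Rank item ids by a weighted blend of visual and caption scores, best first."""
    scored = []
    for item in items:
        visual = cosine_similarity(query_vec, item["embedding"])
        text = keyword_overlap(query_text, item["caption"])
        scored.append((alpha * visual + (1 - alpha) * text, item["id"]))
    return [item_id for _, item_id in sorted(scored, reverse=True)]


# Toy catalog with made-up embeddings and captions.
catalog = [
    {"id": "cat", "embedding": [0.0, 1.0], "caption": "a black cat indoors"},
    {"id": "dog", "embedding": [1.0, 0.0], "caption": "a brown dog on grass"},
]
print(hybrid_rank([0.9, 0.1], "brown dog", catalog))
```

Setting `alpha` to 1.0 reduces this to pure visual similarity and 0.0 to caption-only matching, which makes it easy to compare the hybrid ranking against either signal alone.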

AI-Powered Logistics Optimization

Oct 2024 – Present

Building a generative AI system for shipment scheduling and route planning, with vector search for contextual retrieval. Integrating RAG pipelines with logistics microservices for real-time decision-making.

Technologies: LLMs, RAG, Vector Search, AWS Bedrock, SageMaker, Python, React

Impact: Reducing planning time by 60% and boosting throughput by 25% through AI-driven optimization.

Outcome: Creating production-ready AI systems that directly impact business operations and efficiency.

Document Intelligence & Summarization

Oct 2024 – Present

Developing an NLP platform to summarize shipment contracts and compliance documents using transformer-based models and RAG pipelines.

Technologies: LLMs, RAG, Transformers, NLP, Python, LangChain

Impact: Reducing manual document review time by 55% and improving compliance turnaround speed.

Outcome: Automating complex document processing tasks to improve operational efficiency and accuracy.

Made with Angular + Tailwind; AI by LLMs/RAG

© Sampath Garimella 2025