Data Scientist and Machine Learning Engineer
Master’s in Data Science
(Fall 2024 - Spring 2026)
(GPA: 4.0/4.0)
Relevant Coursework: Machine Learning, Data Science, Probability and Statistics, Algorithms for Data Science, Big Data Systems, Data Representation and Modelling
Assisted in developing a legal LLM solution to produce accurate judgments tailored to the Indian legal system.
Refined digital infrastructure for India's largest banks, reducing loading time by 10%.
A quick map of the tools and ideas I reach for most often.
Developed an ensemble meta-classifier to predict minority-class defaulters in imbalanced datasets using CatBoost and One-Class SVM as base models, with a Random Forest meta-learner dynamically switching between them based on feature patterns.
Part of the R&D team that built a predictive maintenance system to forecast potential failures in heavy machinery and estimate remaining useful life using real-time sensor data and ML techniques.
Built an interactive AI chatbot powered by LangChain ReAct agents and Groq LLMs, capable of routing
queries across multiple tools – Wikipedia for general knowledge, ArXiv for research papers, and
DuckDuckGo for real-time web search – and falling back to Llama-3 for free-form reasoning. The app
features a Streamlit UI with chat bubbles, persistent conversation history, and live streaming of
the agent’s thoughts and tool calls.
Live app
Built a Streamlit-based RAG app that lets users upload any PDF and ask natural language questions
grounded in its content. The pipeline uses PyPDFLoader for text extraction, RecursiveCharacterTextSplitter
for intelligent chunking, OpenAI text-embedding-3-small for embeddings, and an in-memory
Chroma vector store for retrieval, orchestrated via LangChain Expression Language (LCEL) into a
Retriever → Prompt → LLM → Output Parser chain running Llama 3.1 8B on Groq.
Live app
When I'm not wrestling with data, I play badminton and watch Formula 1.