Results-driven Senior Data Scientist with an Integrated B.Tech and M.Tech in IT from IIITM Gwalior. Currently driving quantitative portfolio optimization and enterprise data architecture at QuantumStreet AI. Expertise encompasses developing advanced Expected Return (ER) models, engineering Point-in-Time data systems, and deploying auto-scaling MLOps infrastructure. Highly skilled in Python, SQL, predictive machine learning, GenAI, and NLP architecture, alongside hands-on experience building stateful Agentic RAG applications. Recognized as 'Performer of the Year' for three consecutive years for successfully architecting high-impact, production-level solutions.
QuantumStreet AI
Bengaluru
Enterprise Data Architecture & Expected Return (ER) Modeling
Orchestrated a resilient data pipeline migration from S&P 500 to LSEG within Snowflake, ensuring continuity for quantitative modeling; optimized SQL joins to evaluate 5K equities in a few seconds and established automated cloud cost-reduction alerts.
Engineered a production-grade Point-in-Time (PiT) data system leveraging time-travel concepts to accurately map corporate actions and historical restatements, eliminating look-ahead bias in ML models.
Researched and engineered ∼150 custom fundamental alpha signals; achieved ∼23% RMSE reduction and outperformed baseline in ∼70% of production scenarios through rigorous feature selection and cross-validation.
Formulated a custom CatBoost loss function tailored to directional return prediction, delivering ∼27% improvement in directional accuracy over standard objectives.
Integrated short- and long-term ER models into a monthly ETF portfolio, executing fundamental strategies that surpassed the S&P 500 (SPY) benchmark by 0.5% monthly and captured a maximum 4% annual excess return.
Quantitative Portfolio Optimization & NLP Strategies
Spearheaded a thematic sector-rotation strategy on the S&P 500 by applying NLP pipelines to SEC filings and earnings call transcripts; dynamically selected ∼200 equities per cycle, generating ∼5% annualized alpha in rigorous walk-forward backtesting.
Combined factor-based portfolio construction with NLP-derived signals to improve signal diversity and reduce factor crowding in long-short equity portfolios.
Auto-Scaling MLOps & Infrastructure
Architected end-to-end auto-scaling pipelines spanning data ingestion, feature computation, model training, and inference serving; reduced training cycle times by 30–40% through parallelization and resource-aware scheduling.
Implemented automated drift-detection and retraining triggers, enabling continuous model adaptation and sustaining a 2–3% performance lift over static baselines in production.
Advanced Analytics & Sentiment Analysis
Built a real-time financial news scraping and classification bot powered by the OpenAI API, autonomously detecting and quantifying litigation risks, M&A activity, and adverse market sentiment to generate actionable event-driven signals.
Aplazo
BNPL Fraud & Risk
Engineered a BNPL fraud detection model extracting ∼300 custom features, achieving a 65% GINI coefficient, a 10% lift over the legacy LightGBM baseline.
Optimized BNPL risk decisioning algorithms, slashing early-stage delinquency (DOB22) by 38% to a record low of 6.17% and driving substantial annualized cost savings.
Integrated B.Tech – M.Tech
CGPA: 8.54/10
Received the Merit Award for academic excellence, ranking in the Top 4% of the graduating batch.
Ranked in the Top 5 percentile in GATE 2022, demonstrating exceptional analytical and foundational engineering aptitude.
Architected an Adaptive RAG pipeline with dynamic query routing across vector retrieval, web search, and direct LLM generation via real-time intent classification, with an automated retrieval grading and query rewriting loop that retries or falls back to web search when retrieved context is insufficient.
Built a ReAct agent with multi-step tool-calling for document Q&A over a multi-document vector store, supporting batch ingestion of PDFs and text files with incremental chunk merging so all uploaded documents remain jointly retrievable across sessions.
Designed a persistent session management layer with full chat history storage and resume-by-ID, paired with a real-time reasoning visualization UI that streams live pipeline steps (query analysis, retrieval, grading, rewriting, and generation) as they execute.
Recognized as 'Performer of the Year' for three consecutive years for architecting high-impact, production-level solutions.
Secured the Best Acting Award in Nukkad Natak (Street Play) and led top-tier performances at Mood Indigo.