Hi, I'm Piyush. 👋

I'm a Data Scientist who builds smart, human-centered AI systems.

Currently working as a Lead Data Scientist at EXL, I help bridge the gap between cutting-edge LLM research and real-world commercial viability.

Get to know me

Enterprise Data Science

I specialize in transitioning enterprise operations from isolated ML models to scalable "Compound AI Systems." My expertise combines technical execution in Generative AI, NLP, and Predictive Modeling with the leadership required to translate probabilistic risk into business strategy.

With 6 years of experience, I excel at solving the "Privacy-Utility" paradox—building Sovereign AI architectures for regulated industries and automating complex decision-making to drive sustained revenue growth.

Predictive Engines

XGBoost & statistical forecasting models.

GenAI & LLMs

Designing multi-agent LangGraph workflows.

Sovereign NLP

Air-gapped deployment for strict data compliance.

MLOps & Cloud

AWS SageMaker & production scaling.

Technical Expertise

Languages

  • Python
  • SQL
  • Pydantic / FastAPI

AI & ML

  • LLMs (GPT-4o, Llama-3)
  • LangGraph / RAG
  • XGBoost & Predictive
  • spaCy / NLP

MLOps

  • Docker / Kubernetes
  • CI/CD & Git
  • MLflow
  • Model Monitoring

Cloud & DB

  • AWS SageMaker
  • Neo4j (Graph DB)
  • Vector DBs (FAISS)
  • PostgreSQL
• GPT-4o / Llama-3 • LangGraph / CrewAI • Python / FastAPI • GraphRAG (Neo4j) • AWS SageMaker • XGBoost • Solr / Semantic Search • Docker / Kubernetes

Architecture & Business Impact

Connecting complex ML workflows to measurable business ROI. Here are high-level views of two core systems I have designed and deployed.

1. Agentic Business Intelligence (BI) Platform

The Challenge: Business leaders were waiting days for ad-hoc reports from engineering teams. I designed an autonomous "Analyst Agent" to democratize data access for non-technical leadership.

User: "Forecast Q3 Churn"
→ LangGraph Orchestrator (Router)
→ SQL Agent | FAISS (Vector) | Web Search
→ LLM Synthesis & Executive Report
💡 Insight Learned: Business leaders trust AI more when it shows its "scratchpad." By enabling the agent to log its sub-queries (e.g., "Writing SQL for table X"), we increased user adoption by 60%.
Business Value: Cut executive query time from 4 hours to <30 seconds and automated 45% of standard BI reporting.
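
To illustrate the routing and "scratchpad" idea described above, here is a minimal sketch in plain Python. It is not the deployed system: the tool functions, the keyword rules in ROUTES, and the string-joining "synthesis" step are all hypothetical stand-ins (a production router would more likely call an LLM and LangGraph nodes).

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the three tools; real ones would query a
# database, a vector index, and a web search API.
def sql_agent(query: str) -> str:
    return f"[sql] rows for: {query}"

def vector_search(query: str) -> str:
    return f"[vector] passages for: {query}"

def web_search(query: str) -> str:
    return f"[web] results for: {query}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "sql": sql_agent,
    "vector": vector_search,
    "web": web_search,
}

# Illustrative keyword rules (an assumption, not the real routing logic).
ROUTES: Dict[str, List[str]] = {
    "sql": ["forecast", "churn", "revenue"],
    "vector": ["policy", "document", "contract"],
    "web": ["competitor", "news", "market"],
}

def route(query: str) -> List[str]:
    """Pick every tool whose keywords appear in the query; default to vector search."""
    q = query.lower()
    chosen = [tool for tool, words in ROUTES.items() if any(w in q for w in words)]
    return chosen or ["vector"]

def run(query: str) -> Tuple[str, List[str]]:
    """Call the routed tools and keep a human-readable log of each step."""
    scratchpad: List[str] = []
    evidence: List[str] = []
    for tool in route(query):
        scratchpad.append(f"Calling {tool} tool for: {query}")
        evidence.append(TOOLS[tool](query))
    report = " | ".join(evidence)  # placeholder for the LLM synthesis step
    return report, scratchpad

if __name__ == "__main__":
    report, steps = run("Forecast Q3 Churn")
    for step in steps:
        print(step)
    print(report)
```

Returning the scratchpad alongside the report is the design choice that matters here: the user sees which sub-steps were taken, not just the final answer.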

2. Semantic Enterprise Discovery Platform

The Challenge: Legacy keyword search in Apache Solr was causing audit bottlenecks. I built a custom UI and an NLP engine that pulls results from Solr and layers contextual understanding on top of them for semantic retrieval.

Custom Search UI (User Query)
→ NLP Context Engine (Doc2Vec / spaCy)
→ Pulls data from Apache Solr (Legacy Database)
💡 Insight Learned: Enhancing an existing system (Solr) with an NLP wrapper is often far more cost-effective and scalable than ripping out and replacing legacy infrastructure.
Business Value: Reduced manual audit verification time by 25% and improved stakeholder requirement precision by 40%.
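
The "NLP wrapper over keyword search" pattern can be sketched as a re-ranking step: take the documents the keyword engine already returned and reorder them by vector similarity to the query. The sketch below is a plain-Python illustration under stated assumptions; the document IDs and two-dimensional vectors are made up, and in practice the vectors would come from an embedding model such as Doc2Vec.

```python
import math
from typing import Dict, List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rerank(
    query_vec: Sequence[float],
    keyword_hits: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Reorder keyword-search hits by semantic similarity to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in keyword_hits.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    # Toy vectors (hypothetical); a real system would embed query and documents.
    query_vec = [1.0, 0.0]
    keyword_hits = {
        "doc_a": [0.0, 1.0],
        "doc_b": [1.0, 0.1],
        "doc_c": [0.5, 0.5],
    }
    for doc_id, score in rerank(query_vec, keyword_hits):
        print(doc_id, round(score, 3))
```

Because the legacy index still does the retrieval, the semantic layer only has to score a small candidate set, which is what keeps this approach cheap compared with a full replacement.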

Professional Experience

EXL Service

Lead Data Scientist

Oct 2023 – Present

Leading a team of data scientists to architect scalable "Compound AI Systems," transitioning operations from isolated models to integrated GenAI and predictive workflows.

  • LangGraph Multi-Tenant SaaS Platform: Architected a privacy-first extraction and reconciliation platform utilizing LangGraph and NLP. Engineered dynamic Pydantic schema engines to enable self-service client onboarding and strict data isolation across global tenants.
  • LLMOps & Sovereign GenAI Pipelines: Spearheaded the organizational adoption of Generative AI by deploying standardized LLM-as-a-Judge evaluation pipelines and rigorous PII guardrails (Microsoft Presidio) to ensure secure and compliant delivery cycles.
  • Compound Agentic Workflows (SQL+Vector): Engineered an autonomous "Analyst Agent" combining SQL, Vector Databases, and Web Search (Tavily) via LangGraph to synthesize multi-modal data and automate complex ad-hoc business intelligence reporting.
  • Hybrid GraphRAG Intelligence (Neo4j): Built a Hybrid Knowledge Graph and Retrieval-Augmented Generation engine combining Neo4j with state-of-the-art LLMs to map multi-hop relational dependencies and uncover hidden vulnerabilities in complex supply chain networks.
  • XGBoost & Quantile Regression Predictive Engine: Developed a robust Predictive Revenue Assurance Engine utilizing XGBoost and Quantile Regression to forecast cash flow risks and generate probabilistic risk-spreads for proactive financial intervention.
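
As a small illustration of the quantile regression idea in the last bullet, the sketch below implements the standard pinball (quantile) loss and a simple "risk spread" between a high and a low quantile forecast. This is a generic textbook formulation in plain Python, not the engine described above; the function names and the sample numbers are hypothetical.

```python
from typing import List, Sequence

def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float], q: float) -> float:
    """Mean pinball loss for quantile q: under-prediction costs q, over-prediction costs 1 - q."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        diff = actual - predicted
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)

def risk_spread(low_quantile: Sequence[float], high_quantile: Sequence[float]) -> List[float]:
    """Width of the predicted interval per observation (e.g. P90 minus P10)."""
    return [high - low for low, high in zip(low_quantile, high_quantile)]

if __name__ == "__main__":
    # Hypothetical cash-flow figures, for illustration only.
    actual = [100.0, 120.0]
    p90_forecast = [110.0, 115.0]
    p10_forecast = [90.0, 100.0]
    print(pinball_loss(actual, p90_forecast, 0.9))
    print(risk_spread(p10_forecast, p90_forecast))
```

A model trained to minimise this loss at q = 0.9 is pushed toward forecasts that the actual value stays below about 90% of the time, and a wide spread between quantiles flags cases where the forecast is uncertain.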

Tata Consultancy Services (TCS)

Data Scientist

July 2019 – June 2023
  • XGBoost Dual-Risk Predictive Engine: Architected a composite risk intelligence system utilizing ensemble modeling and explainable AI to proactively flag supply chain vulnerabilities by simultaneously analyzing behavioral supplier reliability and historical inventory volatility.
  • Sovereign NLP & Air-Gapped Pipelines: Orchestrated an end-to-end NLP pipeline using advanced topic modeling (BERTopic) and data orchestration tools to autonomously cluster and triage high-risk safety signals while strictly adhering to GDPR/HIPAA privacy regulations.
  • Microservice Architecture & Validation: Championed a Continuous Service Improvement (CSI) strategy by designing a standardized, proactive ML microservice architecture that integrated "Intelligent Defaults" and real-time validation into legacy enterprise applications.
  • Enterprise AI Enablement: Spearheaded an internal program, architecting interactive dashboards and business-centric runbooks to demystify machine learning concepts for executive leadership and drive cross-functional adoption.

Education

Master of Science in Machine Learning & AI

Liverpool John Moores University

Distinction | Outstanding Achiever

Post Graduate Diploma in Machine Learning & AI

IIIT Bangalore

GPA: 9.05

Bachelor of Technology in Computer Science

Bharati Vidyapeeth's College of Engineering

GPA: 8.31

Awards & Certifications

Career Awards

  • Stellar Performer Award (EXL)
    Two-time recipient for exceptional leadership and contributions to GenAI solutions.
  • Innovation Champions Award (EXL)
    Awarded for innovative strategic approaches in GenAI.
  • Awards for Excellence & On The Spot (TCS)
    Conferred for consistent outstanding performance and impactful delivery.

Certifications

  • NVIDIA LLM Certification
    Validated expertise in building, training, and deploying large language models.
  • Assembly of Data Scientist MLOps
    Demonstrated proficiency in end-to-end MLOps practices for scalable machine learning.

Thought Leadership & Research

NCREEE 19

Assessment of Classification Algorithms using Low-Level Features for MIR

Evaluated classification algorithms utilizing low-level features for Music Information Retrieval, comparing their performance to text and image-based methods.

AICAI'2019

Biclustering Algorithms: Analyzing Statistical Symmetry

Analyzed and compared several biclustering algorithms, focusing on statistical symmetry and interdependencies among multiple features.

ICICC, IJARCSSE

Intrusion Detection Systems using Artificial Immune System

Investigated the development of an Intrusion Detection System (IDS) inspired by natural immune systems, utilizing the concept of negative selection.

Ready to build something together?

Whether you're looking for technical leadership, exploring a new AI project, or just want to chat about Agentic GenAI—let's connect.