ML Engineer • LLM Systems • Applied Research

I build LLM systems that ship: agents, retrieval, and evaluation.

M.S. Data Science @ NYU (’26)

Portrait of Deepanshu Mody

I work on agentic workflows, RAG pipelines, and evaluation infrastructure, aiming for systems that are reliable, fast, and measurable.

Focus: LLM systems, RAG, agents, tokenization, interpretability
Graduation: May 2026
Based in: New York, NY

Selected work

Representative systems and research threads.

Production RAG system (Azure ML) • 1,200 manuals

Hybrid retrieval (BM25 + dense + cross-encoder) with INT8 inference: 92.0 token-level F1 at sub-200 ms p50 latency.

Multi-agent workflow + KG analytics (LangGraph + Neo4j)

End-to-end tool use; measured reliability, usage, latency; graph analytics for drug-target-indication signals.

Tokenization optimization (MCMC / RL) • MiniPile

Explored global optimization beyond greedy BPE; analyzed compression/entropy tradeoffs.


Experience

Industry roles spanning LLM systems, retrieval, and systems engineering.

Jun 2025 - Aug 2025

Statistics & AI/ML Intern

Pfizer • Boston, MA
  • Built a LangGraph multi-agent workflow (Gemini Flash, DeepSeek-R1) with end-to-end tool use; evaluated latency, usage, and reliability.
  • Extended and productionized the system with a Neo4j knowledge graph for drug-target-indication analytics; implemented evidence-weighted relationships and graph analytics (PageRank, community clustering).

Jul 2023 - Jul 2024

Data Scientist | Software Engineer - Data & AI

Incedo Inc. • Gurugram, India
  • Built and owned the backend for a production-grade LangChain RAG document-QA system on Azure ML Studio, serving 1,200 technical manuals; delivered 92.0 token-level F1 with sub-200 ms p50 latency using INT8 inference and hybrid BM25 + dense + cross-encoder retrieval.

Jan 2023 - Jun 2023

Software Engineering Intern

Kinara AI (now part of NXP Semiconductors) • Hyderabad, India
  • Prototyped a RISC-V vector extension and LLVM backend; implemented scatter/gather intrinsics, achieving 1.7x GEMM throughput and 34% lower ResNet-50 latency in cycle-accurate simulations.

Research

Independent and academic work spanning interpretability, tokenization, imaging, and GNNs.

Sep 2025 - Present

Industry-Affiliated Capstone (Advisor: Dr. Chris Tanner)

Kensho • Remote
  • Designed and evaluated MCMC- and RL-based approaches for globally optimizing BPE tokenization (entropy + compression objectives) on the MiniPile corpus.
  • Analyzed tradeoffs against standard greedy tokenization baselines.

Apr 2025 - Jun 2025

Graduate Research Assistant

NYU Grossman School of Medicine • New York, NY
  • Curated a longitudinal imaging cohort of ~2k abdominal CT scans from ~80k patients with acute pancreatitis; linked 3-year outcomes and built a DICOM-to-NIfTI pipeline with automated PHI stripping.

Jun 2022 - Dec 2022

Research Intern (Advisor: Dr. Daisuke Kihara)

Purdue University • West Lafayette, IN
  • Developed GNN models (GCN, GNN-DTI) for RNA metal-ion binding, improving ROC-AUC by 6.2 percentage points over a CNN baseline on 6.4k PDB structures.
  • Built a GPU-accelerated PyG stack on SLURM and a DGL graph-builder that cut preprocessing time 5x and streamed 1.1M edges/s, enabling overnight 128-config sweeps.

Education

Training in data science and computer science.

Aug 2024 - May 2026

New York University

M.S., Data Science • New York, NY
  • Research Mentor - Roaring Cubs Collective

Aug 2018 - Jun 2023

Birla Institute of Technology and Science, Pilani

B.E. + M.S. Integrated Dual Degree, Computer Science and Biological Sciences • Pilani, India