Overview
As a Staff/Principal-level Machine Learning Engineer and technical lead, I design, ship, and operate production ML and GenAI systems end-to-end, turning ambiguous goals into measurable roadmaps, reliable architectures, and scalable execution. I hold a Bachelor's Degree in Computer Science from Columbia University in the City of New York. I am a U.S. citizen and do not require visa sponsorship.
I specialize in building high-impact, reusable foundations that make multiple teams faster: modern retrieval and ranking stacks, LLM/VLM-powered assistants, and distributed agentic systems with strong evaluation and safety guardrails. My work emphasizes practical optimization: cost-aware model routing, caching and batching, efficient serving (CPU/GPU), and compression techniques such as distillation and quantization to meet strict p95/p99 latency, reliability, and cost constraints in real production environments.
I am known for cross-functional leadership and technical decision-making that scales: aligning product, data, platform/SRE, privacy/security, and research stakeholders through crisp design docs, clear success metrics, and fast feedback loops. I value inclusive, high-trust teams and bring a calm, metrics-driven approach to building systems that are secure, observable, and maintainable over the long term.
My focus is generative and agentic AI: retrieval + reranking, RAG, tool-use orchestration, evaluation harnesses, and safety guardrails, paired with systems-level optimization (batching/caching, quantization/distillation, GPU efficiency, and cost-aware routing) to hit strict latency, reliability, and cost targets.
TECHNICAL SKILLS
- Software
- Programming: Python, Java, C, C++, Rust, Ruby, SQL, NoSQL, RESTful APIs, GraphQL, unit and integration testing
- Application: React.js, JavaScript, TypeScript, Swift, HTML, CSS, Django, Flask, Node.js, Express, Selenium
- Cloud: GCP, AWS, Azure, DigitalOcean, Netlify
- Tools: Git, Vim, VS Code, Splunk, Confluence, Jira, Bitbucket, GitHub Actions, Docker, Kubernetes, Linux, Shell Scripting
- Machine Learning
- Deep Learning: Agentic AI, PyTorch, TensorFlow, Keras, ML Recommender and Ranking Systems, Large Language Models (LLMs), Generative AI, RAG, LangChain, Multimodal Learning, Transformers, BERT, T5, Scikit-learn, NLP, NLTK, Knowledge Graphs
- Data: Pandas, NumPy, SciPy, PyTest, Spark, Hadoop, MapReduce, Tableau, Avro, Parquet, Data Parallelism, Model Parallelism, Hybrid Parallelism, Quantization
- MLOps: Airflow, MLflow, AutoML, Continuous ML, YARN, Kubeflow, Jenkins, Argo, CircleCI, GPU Scaling, vLLM, Distillation
- ML Systems: GRPO, SWiRL, DPO, PPO, Kafka, ZooKeeper, etcd, SHAP, LIME, NVIDIA NeMo, NVIDIA Inference Server, CUDA
Education
Columbia University
Bachelor's Degree, Computer Science (May 2020)
Coursework
- Artificial Intelligence with Python
- Natural Language Processing with Python
- Advanced Programming with C/C++
- Algorithmic Trading with Python (audit)
- Data Structures with Java
- Cloud Computing and Big Data in AWS, GCP, and Azure with Python, JavaScript, HTML/CSS
- Introduction to Cryptography
- Fundamentals of Computer Systems
- Linear Algebra
- Building a Technology Startup
- Computer Science Theory
Stanford University
Certificate, Machine Learning Specialization in Supervised, Unsupervised, and Advanced ML Algorithms (April 2023)
Coursework
- Supervised Learning: Regression and Classification
- Build machine learning models in Python using popular machine learning libraries NumPy & scikit-learn
- Build & train supervised machine learning models for prediction & binary classification tasks, including linear regression & logistic regression
- Unsupervised Learning: Clustering, Anomaly Detection, Recommender Systems, Deep Reinforcement Learning, Collaborative Filtering, Content-Based Deep Learning
- Use unsupervised learning techniques, including clustering and anomaly detection
- Build recommender systems with a collaborative filtering approach and a content-based deep learning method
- Build a deep reinforcement learning model
- Advanced Machine Learning Algorithms: Multi-Class Classification in Neural Networks with TensorFlow, Best Practices in Machine Learning Development, Random Forests, Boosted Trees, Regression Trees, XGBoost
- Build and train a neural network with TensorFlow to perform multi-class classification
- Apply best practices for machine learning development so that your models generalize to data and tasks in the real world
- Build and use decision trees and tree ensemble methods, including random forests and boosted trees
Experience
Amazon
Tech Lead Machine Learning Engineer in Artificial General Intelligence Customization (AGI-C) Team (January 2025 - January 2026) Boston, Massachusetts - Full-time
- Integrated GRPO, DPO, SWiRL, and RLVR into a unified RLHF stack to align LLM-generated inverse design code with verified metamaterial ground truth, enabling zero-shot generalization, physics-constrained reasoning, and unsupervised fine-tuning of instruction-following LLMs used by MIT researchers to accelerate synthesis validation for over 3 million material candidates
- Engineered a batched, parallelized, and highly distributed reward evaluation pipeline across the 800TB MetaGen materials database, reducing runtime from 82 hours to 57 seconds for over 10 million completions, with >98% throughput efficiency using PyTorch, HuggingFace trl, and CUDA-aware sharded evaluation on multi-node clusters
- Achieved a >5,000x speedup in inference-time reward computation with >92.3% top-1 structural match accuracy using RLVR-based relative scoring, enabling scalable RLHF-style fine-tuning on commodity 16GB VRAM hardware via LoRA and 4-bit quantization, supporting models from 350M to 1.3B parameters
- Developed a high-fidelity LLM evaluation framework for multi-turn scientific reasoning and code trace validation, analyzing over 2.1 billion tokens across 12 reasoning task types
- Used Hopfield episodic memory layers, text diffusion decoding, and Tree of Thought-based prompt engineering to enforce logical and unit-consistent reasoning chains, boosting multi-hop pass rate by 94% and trace accuracy by 88%
- Improved inverse design success rate from 46% to 97%, while reducing GPU-hour cost per training cycle by ~98%, demonstrating the real-world feasibility of modern RLHF and alignment techniques in high-throughput scientific GenAI pipelines
- Enabled unsupervised fine-tuning workflows across domains, using self-consistency checks, rule-based constraints, and reward function introspection to curate alignment signals without human labeling, automating preference modeling for LLM self-alignment at scale
- Deployed Self-Adapting Language Models to dynamically adjust reasoning behavior and output formatting based on prompt context, reducing domain-specific hallucinations by 87% and improving structured generation pass@1 by 83% in physics and materials applications
American Express
Senior Lead Machine Learning Engineer (May 2022 - January 2025) Phoenix, Arizona - Full-time
- Led an ML engineering team and ML projects across multiple teams to enhance AI/ML system security through LLM vulnerability assessments and the design of scalable machine learning architecture, integrating advanced safety controls and stress-testing methodologies, increasing system resilience by 49% and reducing incident response time by 75%
- Led the design and implementation of a scalable real-time credit card fraud detection system using Generative AI and advanced deep learning techniques, processing millions of transactions per second across global regions, improving fraud detection accuracy by 30%, reducing false positives by 20%, and ensuring compliance with PCI DSS and GDPR through cross-functional collaboration with data engineering, software development, infrastructure, and compliance teams
- Led the design and implementation of a large-scale, real-time Fake News Detection system, collaborating across multiple teams and external vendors to handle over 200,000 articles daily; achieved a 92% true positive detection rate, outperforming human accuracy by 2x, and reduced misinformation-related risks to brand reputation by 40%
- Led NLP topic modeling for error classification, leveraging BERTopic and BERT embeddings to automate unstructured error classification in marketing campaigns, resulting in a 70% reduction in manual error identification time and 30% faster campaign approvals; coordinated across multiple teams, including marketing, product, engineering, and compliance, driving significant operational efficiency improvements through a scalable, real-time infrastructure
- Led the development of a large-scale Recommender and Ranking Deep Learning System for personalized credit card and banking product offers, driving a 40% increase in customer engagement and a 20% boost in new customer acquisitions through cross-functional collaboration and integration of advanced machine learning techniques, including LLM-based insights and hybrid collaborative filtering
- Led the development of a scalable NER solution using HuggingFace Transformers for KYC compliance, reducing manual data extraction effort by 80%, improving compliance accuracy by 50%, and accelerating customer onboarding by 40% through cross-functional collaboration across engineering, legal, compliance, and product teams
- Led the development of a GPT-3-powered Text-to-SQL system, enabling non-technical users to generate SQL queries independently, reducing query translation time by 80%, increasing data-driven insights by 40%, and improving cross-functional collaboration with product, data science, and compliance teams