Private Draft

The 29 personas behind AI

We’ve organized every stage and persona in the AI supply chain, informed by real recruiting at frontier companies. Click any row to see matching profiles from our talent graph.

Shaped by Industry Experts
Kumar Chellapilla, VPE
Jennifer Anderson, VPE / Stanford PhD
Thuan Pham, CTO
Akash Garg, CTO
Linghao Zhang, Research Engineer
Wayne Chang, Early FB Engineer
Indrajit Khare, EM & Head of Product

ML Platform

Builds ML dev infrastructure

Known as: ML Platform Engineer, ML Infrastructure Engineer, Platform Engineer, Data Engineer (ML), Feature Store Engineer

Builds internal platforms for ML development: experiment tracking, model registries, config management, workflow orchestration, data infrastructure, and evaluation systems. Development-time infrastructure that makes ML teams productive — distinct from production-time systems (Model Operations, Serving Infrastructure).

Specializations

Experiment Tracking & Model Registry

Experiment lineage, artifact versioning, reproducibility tooling, and the model catalog that tracks what was trained, how, and with what data. Promotion workflows move validated models from research to staging to production. At scale, this includes provenance tracking (which dataset version, which config, which code commit produced a given checkpoint) and comparison tooling that lets researchers evaluate runs side-by-side.
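The promotion workflow and provenance tracking described above can be sketched in a few lines. All names here (ModelRegistry, the stage list, the provenance fields) are illustrative, not a real registry API:

```python
# Minimal sketch of a model registry with staged promotion.
# Stage names and provenance fields are illustrative, not a real API.
from dataclasses import dataclass, field

STAGES = ["research", "staging", "production"]

@dataclass
class ModelVersion:
    name: str
    version: int
    # Provenance: which data, config, and code produced this checkpoint.
    dataset_version: str
    config_hash: str
    code_commit: str
    stage: str = "research"

@dataclass
class ModelRegistry:
    _versions: dict = field(default_factory=dict)

    def register(self, mv: ModelVersion) -> None:
        self._versions[(mv.name, mv.version)] = mv

    def promote(self, name: str, version: int) -> str:
        """Move a validated model one stage forward along STAGES."""
        mv = self._versions[(name, version)]
        idx = STAGES.index(mv.stage)
        if idx == len(STAGES) - 1:
            raise ValueError(f"{name} v{version} is already in production")
        mv.stage = STAGES[idx + 1]
        return mv.stage

registry = ModelRegistry()
registry.register(ModelVersion("ranker", 3, dataset_version="ds-2024-06",
                               config_hash="a1b2c3", code_commit="9f8e7d"))
registry.promote("ranker", 3)          # research -> staging
stage = registry.promote("ranker", 3)  # staging -> production
```

Real systems (MLflow Model Registry, internal equivalents) add gating checks before each promotion; the point is that a stage transition is an auditable event tied to provenance, not a file copy.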
Config Management & Workflow Orchestration

Hyperparameter management, training DAGs, pipeline scheduling, dependency management, and retry/recovery for multi-step ML workflows. Turns ad-hoc training scripts into repeatable, auditable pipelines. Includes job scheduling across heterogeneous compute (GPU/TPU/CPU), resource allocation, and the orchestration layer that coordinates data preparation, training, evaluation, and model registration into coherent workflows.
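The core of that orchestration layer is dependency-ordered execution with retry. A minimal sketch, using a toy four-step pipeline (the step names and retry policy are made up for illustration):

```python
# Minimal sketch of a training DAG runner: run each step after its
# dependencies, retrying failures up to max_retries. Illustrative only.
from graphlib import TopologicalSorter

def run_dag(steps, deps, max_retries=2):
    """steps: name -> callable; deps: name -> set of prerequisite names."""
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        for attempt in range(max_retries + 1):
            try:
                results[name] = steps[name]()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # exhausted retries; surface the failure
    return results

# A toy pipeline: prepare data -> train -> evaluate -> register.
log = []
steps = {
    "prepare":  lambda: log.append("prepare"),
    "train":    lambda: log.append("train"),
    "eval":     lambda: log.append("eval"),
    "register": lambda: log.append("register"),
}
deps = {"train": {"prepare"}, "eval": {"train"}, "register": {"eval"}}
run_dag(steps, deps)
```

Production orchestrators (Airflow, Flyte, internal schedulers) add persistence, distributed execution, and resource-aware scheduling on top of exactly this ordering-plus-retry core.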
Data Infrastructure

Feature stores, batch processing (Spark, lakehouse architectures), streaming pipelines, and the data plumbing that feeds both training and serving. Online/offline feature consistency, point-in-time correctness, schema evolution for ML datasets, and real-time feature computation. Different hiring profile from experiment tooling — more data engineering, less ML-specific — but organizationally lives under ML Platform at most companies.
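Point-in-time correctness is the subtle requirement here: a training example may only see feature values observed at or before its label timestamp, never a future update. A sketch with made-up timestamps and values:

```python
# Sketch of a point-in-time-correct feature lookup: for each label time,
# take the latest feature value at or before that time, never a future one.
import bisect

def point_in_time_join(feature_log, label_times):
    """feature_log: list of (timestamp, value), sorted by timestamp.
    Returns the as-of feature value for each label timestamp."""
    ts = [t for t, _ in feature_log]
    out = []
    for t in label_times:
        i = bisect.bisect_right(ts, t) - 1   # last update at or before t
        out.append(feature_log[i][1] if i >= 0 else None)
    return out

# Feature updated at t=1 and t=5; labels observed at t=3 and t=6.
joined = point_in_time_join([(1, 0.2), (5, 0.9)], [3, 6])
# The t=3 label sees 0.2, not the future t=5 update — avoiding label leakage.
```

Feature stores generalize this as-of join across entities and feature tables; getting it wrong silently leaks future information into training data.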
Evaluation Infrastructure

Eval harnesses, benchmark orchestration, scoring pipelines, human-eval tooling, result dashboards, and regression tracking across model versions. Provides the execution layer for evaluations — the Evaluation & Benchmarking persona designs methodology and benchmarks; this function builds and maintains the infrastructure that runs them at scale.
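Regression tracking across model versions reduces to comparing per-metric scores against a baseline with a tolerance. A minimal sketch, with illustrative metric names and thresholds:

```python
# Sketch of regression tracking: flag metrics where a candidate model
# scores worse than the baseline by more than a tolerance. Illustrative.
def regression_check(baseline_scores, candidate_scores, tolerance=0.01):
    """Return {metric: (baseline, candidate)} for regressions past tolerance."""
    regressions = {}
    for metric, base in baseline_scores.items():
        cand = candidate_scores.get(metric)
        if cand is not None and base - cand > tolerance:
            regressions[metric] = (base, cand)
    return regressions

baseline = {"exact_match": 0.81, "f1": 0.88}
candidate = {"exact_match": 0.82, "f1": 0.84}
flagged = regression_check(baseline, candidate)
# f1 dropped past tolerance and is flagged; exact_match improved and is not.
```

The infrastructure around this — running benchmarks per model version, persisting scores, and rendering the diff on a dashboard — is what this function owns; the choice of metrics and tolerances belongs to the Evaluation & Benchmarking persona.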
[1] Substrate

[2] Compute (Primary)

Builds the experiment and workflow infrastructure that research and engineering teams run on.

[3] Intelligence (Secondary)

Owns experiment tracking, model registries, and evaluation infrastructure that make ML development productive.

[4] Systems (Secondary)

Provides orchestration and lifecycle management for models moving toward production.

[5] Distribution
Nathan Pazavich (Meta)
Experiment lifecycle

Standardizes experiments, artifacts, lineage, and reproducibility so research iterates without losing the plot.

Zak Elwood (Uber)
Workflow orchestration

Turns ad-hoc training scripts into repeatable DAGs with dependency management and guardrails.

Sandra Masha (Databricks)
Eval infrastructure

Builds eval harnesses, dashboards, and human-eval tooling; methodology ownership stays with the Evaluation persona.

Early-Stage: Occasional
Growth: Common
Enterprise: Primary

Demand grows with ML team size. Early-stage companies rely on off-the-shelf tooling (W&B, MLflow); growth-stage and later companies build internal platforms.

Let’s Find Your Next Builder

If you’re hiring at the AI frontier, let’s talk.