Private Draft

The 29 personas behind AI

We’ve organized every stage and persona in the AI supply chain, informed by real recruiting at frontier companies. Click any row to see matching profiles from our talent graph.

Shaped by Industry Experts
Kumar Chellapilla, VPE
Jennifer Anderson, VPE / Stanford PhD
Thuan Pham, CTO
Akash Garg, CTO
Linghao Zhang, Research Engineer
Wayne Chang, Early FB Engineer
Indrajit Khare, EM & Head of Product

ML Platform

Builds ML dev infrastructure

Known as: ML Platform Engineer, ML Infrastructure Engineer, Platform Engineer, Data Engineer (ML), Feature Store Engineer

Builds internal platforms for ML development: experiment tracking, model registries, config management, workflow orchestration, data infrastructure, and evaluation systems. Development-time infrastructure that makes ML teams productive — distinct from production-time systems (Model Operations, Serving Infrastructure).

Specializations

Experiment Tracking & Model Registry

Experiment lineage, artifact versioning, reproducibility tooling, and the model catalog that tracks what was trained, how, and with what data. Promotion workflows move validated models from research to staging to production. At scale, this includes provenance tracking (which dataset version, which config, which code commit produced a given checkpoint) and comparison tooling that lets researchers evaluate runs side-by-side.
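The promotion workflow and provenance tracking described above can be sketched in a few lines. All names here (ModelRegistry, the stage list, the provenance fields) are illustrative, not a real registry API:

```python
# Minimal sketch of a model registry with staged promotion.
# Stage names and provenance fields are illustrative, not a real API.
from dataclasses import dataclass, field

STAGES = ["research", "staging", "production"]

@dataclass
class ModelVersion:
    name: str
    version: int
    # Provenance: which data, config, and code produced this checkpoint.
    dataset_version: str
    config_hash: str
    code_commit: str
    stage: str = "research"

@dataclass
class ModelRegistry:
    _versions: dict = field(default_factory=dict)

    def register(self, mv: ModelVersion) -> None:
        self._versions[(mv.name, mv.version)] = mv

    def promote(self, name: str, version: int) -> str:
        """Move a validated model one stage forward along STAGES."""
        mv = self._versions[(name, version)]
        idx = STAGES.index(mv.stage)
        if idx == len(STAGES) - 1:
            raise ValueError(f"{name} v{version} is already in production")
        mv.stage = STAGES[idx + 1]
        return mv.stage

registry = ModelRegistry()
registry.register(ModelVersion("ranker", 3, dataset_version="ds-2024-06",
                               config_hash="a1b2c3", code_commit="9f8e7d"))
registry.promote("ranker", 3)          # research -> staging
stage = registry.promote("ranker", 3)  # staging -> production
```

Real systems (MLflow Model Registry, internal equivalents) add gating checks before each promotion; the point is that a stage transition is an auditable event tied to provenance, not a file copy.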
Config Management & Workflow Orchestration

Hyperparameter management, training DAGs, pipeline scheduling, dependency management, and retry/recovery for multi-step ML workflows. Turns ad-hoc training scripts into repeatable, auditable pipelines. Includes job scheduling across heterogeneous compute (GPU/TPU/CPU), resource allocation, and the orchestration layer that coordinates data preparation, training, evaluation, and model registration into coherent workflows.
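The core of that orchestration layer is dependency-ordered execution with retry. A minimal sketch, using a toy four-step pipeline (the step names and retry policy are made up for illustration):

```python
# Minimal sketch of a training DAG runner: run each step after its
# dependencies, retrying failures up to max_retries. Illustrative only.
from graphlib import TopologicalSorter

def run_dag(steps, deps, max_retries=2):
    """steps: name -> callable; deps: name -> set of prerequisite names."""
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        for attempt in range(max_retries + 1):
            try:
                results[name] = steps[name]()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # exhausted retries; surface the failure
    return results

# A toy pipeline: prepare data -> train -> evaluate -> register.
log = []
steps = {
    "prepare":  lambda: log.append("prepare"),
    "train":    lambda: log.append("train"),
    "eval":     lambda: log.append("eval"),
    "register": lambda: log.append("register"),
}
deps = {"train": {"prepare"}, "eval": {"train"}, "register": {"eval"}}
run_dag(steps, deps)
```

Production orchestrators (Airflow, Flyte, internal schedulers) add persistence, distributed execution, and resource-aware scheduling on top of exactly this ordering-plus-retry core.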
Data Infrastructure

Feature stores, batch processing (Spark, lakehouse architectures), streaming pipelines, and the data plumbing that feeds both training and serving. Online/offline feature consistency, point-in-time correctness, schema evolution for ML datasets, and real-time feature computation. Different hiring profile from experiment tooling — more data engineering, less ML-specific — but organizationally lives under ML Platform at most companies.
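Point-in-time correctness is the subtle requirement here: a training example may only see feature values observed at or before its label timestamp, never a future update. A sketch with made-up timestamps and values:

```python
# Sketch of a point-in-time-correct feature lookup: for each label time,
# take the latest feature value at or before that time, never a future one.
import bisect

def point_in_time_join(feature_log, label_times):
    """feature_log: list of (timestamp, value), sorted by timestamp.
    Returns the as-of feature value for each label timestamp."""
    ts = [t for t, _ in feature_log]
    out = []
    for t in label_times:
        i = bisect.bisect_right(ts, t) - 1   # last update at or before t
        out.append(feature_log[i][1] if i >= 0 else None)
    return out

# Feature updated at t=1 and t=5; labels observed at t=3 and t=6.
joined = point_in_time_join([(1, 0.2), (5, 0.9)], [3, 6])
# The t=3 label sees 0.2, not the future t=5 update — avoiding label leakage.
```

Feature stores generalize this as-of join across entities and feature tables; getting it wrong silently leaks future information into training data.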
Evaluation Infrastructure

Eval harnesses, benchmark orchestration, scoring pipelines, human-eval tooling, result dashboards, and regression tracking across model versions. Provides the execution layer for evaluations — the Evaluation & Benchmarking persona designs methodology and benchmarks; this function builds and maintains the infrastructure that runs them at scale.
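Regression tracking across model versions reduces to comparing per-metric scores against a baseline with a tolerance. A minimal sketch, with illustrative metric names and thresholds:

```python
# Sketch of regression tracking: flag metrics where a candidate model
# scores worse than the baseline by more than a tolerance. Illustrative.
def regression_check(baseline_scores, candidate_scores, tolerance=0.01):
    """Return {metric: (baseline, candidate)} for regressions past tolerance."""
    regressions = {}
    for metric, base in baseline_scores.items():
        cand = candidate_scores.get(metric)
        if cand is not None and base - cand > tolerance:
            regressions[metric] = (base, cand)
    return regressions

baseline = {"exact_match": 0.81, "f1": 0.88}
candidate = {"exact_match": 0.82, "f1": 0.84}
flagged = regression_check(baseline, candidate)
# f1 dropped past tolerance and is flagged; exact_match improved and is not.
```

The infrastructure around this — running benchmarks per model version, persisting scores, and rendering the diff on a dashboard — is what this function owns; the choice of metrics and tolerances belongs to the Evaluation & Benchmarking persona.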
[1] Substrate

[2] Compute (Primary)

Builds the experiment and workflow infrastructure that research and engineering teams run on.

[3] Intelligence (Secondary)

Owns experiment tracking, model registries, and evaluation infrastructure that make ML development productive.

[4] Systems (Secondary)

Provides orchestration and lifecycle management for models moving toward production.

[5] Distribution
Nathan Pazavich (Meta)
Experiment lifecycle

Standardizes experiments, artifacts, lineage, and reproducibility so research iterates without losing the plot.

Zak Elwood (Uber)
Workflow orchestration

Turns ad-hoc training scripts into repeatable DAGs with dependency management and guardrails.

Sandra Masha (Databricks)
Eval infrastructure

Builds eval harnesses, dashboards, and human-eval tooling; methodology ownership stays with the Evaluation persona.

Early-Stage: Occasional
Growth: Common
Enterprise: Primary

Demand grows with ML team size. Early-stage companies rely on off-the-shelf tooling (W&B, MLflow); growth-stage and later companies build internal platforms.

Let’s Find Your Next Builder

If you’re hiring at the AI frontier, let’s talk.