ML feature platform
Shared catalog, batch + online serving, and lineage you can trust.
Principal contributor
Feature store and serving path that standardized experimentation and reduced duplicate pipelines.
Funding & structure
Employer / product org
Data / ML platform
Why
Model quality was suffering because pipelines were duplicated and the same feature name carried different definitions across teams; what was missing was a single path from experiment to production serving.
Pain points
- Duplicated Spark jobs and slow feedback loops between training and serving.
- The same feature name meant different things in different notebooks.
- Staleness and drift were hard to observe once models hit live traffic.
Overview
Data scientists were rebuilding similar feature pipelines for each model. The platform provided a shared catalog, batch and online serving paths, and lineage so teams could reuse features and trust what was live in production.
Architecture
Batch computation produced versioned feature sets synced to online stores under freshness SLAs. A gRPC layer served low-latency reads to models; shared client libraries kept training and serving pointed at the same logical feature definitions.
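The batch-to-online path above can be sketched in a few lines. This is a hypothetical illustration, not the platform's actual code: `FeatureSet`, `OnlineStore`, and the field names are invented, and a real sync would write to a managed key-value store rather than an in-process dict. The shape it shows is the real idea: versioned sets, newest-version-wins sync, and a freshness SLA evaluated at read time.

```python
import time
from dataclasses import dataclass

@dataclass
class FeatureSet:
    name: str                # logical feature set name
    version: int             # monotonically increasing batch version
    computed_at: float       # epoch seconds of the producing batch run
    freshness_sla_s: float   # max allowed age before reads count as stale
    rows: dict               # entity_id -> feature values

class OnlineStore:
    def __init__(self):
        self._sets = {}

    def sync(self, fs: FeatureSet):
        # Keep only the newest version per logical feature set name.
        current = self._sets.get(fs.name)
        if current is None or fs.version > current.version:
            self._sets[fs.name] = fs

    def read(self, name: str, entity_id: str, now=None):
        # Serve the value and report whether it violates the freshness SLA.
        fs = self._sets[name]
        now = time.time() if now is None else now
        stale = (now - fs.computed_at) > fs.freshness_sla_s
        return fs.rows.get(entity_id), stale
```

Returning staleness alongside the value, rather than refusing the read, lets callers decide per model whether a stale feature is acceptable.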
Diagrams
Batch to online
Technical deep dive
Python services, Spark for large-scale transforms, and gRPC for low-latency serving on GCP. The emphasis was reproducibility: the same logical feature resolves to the same definition whether it is evaluated in a notebook, a batch job, or a live request.
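The reproducibility point can be made concrete with a small registry sketch. Everything here is illustrative (the decorator, `avg_order_value`, and the two compute paths are assumptions, not the platform's API), but it captures the principle: one registered definition, executed identically by the batch path and the online path.

```python
# One logical definition, two execution paths.
FEATURE_DEFS = {}

def feature(name):
    """Register a transform under a logical feature name."""
    def register(fn):
        FEATURE_DEFS[name] = fn
        return fn
    return register

@feature("avg_order_value")
def avg_order_value(orders):
    # orders: list of order totals for one entity
    return sum(orders) / len(orders) if orders else 0.0

def compute_batch(rows):
    # Batch path: apply every registered definition to each entity's raw data.
    return {eid: {name: fn(raw) for name, fn in FEATURE_DEFS.items()}
            for eid, raw in rows.items()}

def compute_online(raw):
    # Serving path: same definitions, one entity at request time.
    return {name: fn(raw) for name, fn in FEATURE_DEFS.items()}
```

Because both paths iterate the same `FEATURE_DEFS`, a notebook experiment and a live request cannot silently diverge on what a feature name means.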
What I did
- Designed batch computation and sync to online stores with freshness SLAs.
- Collaborated on the gRPC serving layer and client libraries for model teams.
- Helped define governance: ownership, versioning, and deprecation of features.
- Improved observability for serving latency and feature staleness.
Outcomes
- Fewer one-off pipelines; teams reused features instead of re-implementing them.
- Clearer path from experiment to production serving.
- Reduced surprises when models moved from offline eval to live traffic.
Operational metrics for freshness and serving latency were tracked per feature family.
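A minimal sketch of that per-family tracking, under assumed names (`staleness_report`, the family keys, and the SLA thresholds are all hypothetical): given the last successful sync time per feature family, compute each family's age and flag SLA breaches.

```python
def staleness_report(last_synced, sla_s, now):
    """Per-family staleness summary.

    last_synced: family -> epoch seconds of last successful sync
    sla_s:       family -> max allowed age in seconds
    now:         current epoch seconds
    """
    report = {}
    for family, ts in last_synced.items():
        age = now - ts
        # Families without a configured SLA are never flagged as breaching.
        report[family] = {
            "age_s": age,
            "breach": age > sla_s.get(family, float("inf")),
        }
    return report
```

In practice a report like this would feed a metrics pipeline and alerting, so a stale family surfaces before a model's live-traffic quality degrades.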
Want to go deeper on architecture, trade-offs, or a similar build?
Get in touch