ZeroToOne Engineering

01111  00111  01110   0110   010010  0110    0110   01  10  00111
   01  01     01  01 01  01    01   01  01  01  01  010 10  01
  01   0111   01110  01  01    01   01  01  01  01  01 010  0111
 01    01     01 01  01  01    01   01  01  01  01  01  10  01
01111  00111  01  01  0110     01    0110    0110   01  10  00111

$ echo "building the future of ad tech"

Deep dives into architecture, data engineering, and the tools we build.

What We Learned Building Our ML Platform

Apr 23, 2026 · Platform Team

Six ideas about feature stores, AST-generated ETLs, content-addressed artifacts, and control planes, distilled from building a semantic layer for ML.

Recent Posts

Airflow DAG Bundles: Managing DAGs Across Teams Without Helm Upgrades

Apr 15, 2026 · Manoj Babu Katragadda, Meghanath Macha · Platform Team

How we use S3 DAG bundles, a sidecar sync pattern, and the bundle watcher to onboard new pipelines with zero downtime.

airflow kubernetes data-engineering dag-bundles

Building a Composable ETL Framework for Spark

Apr 2, 2026 · Manoj Babu Katragadda, Meghanath Macha · Platform Team

How we replaced bespoke PySpark scripts with a config-driven, hook-based framework inspired by Rust's composition model.

spark data-engineering architecture python iceberg

Taming S3 Shuffle at Scale

Mar 30, 2026 · Manoj Babu Katragadda, Meghanath Macha · Platform Team

How we fixed the GET request explosion, prefix throttling, and threading edge cases that emerge when S3 shuffle meets production scale.

spark s3-shuffle performance data-engineering scale

Debugging a Spark Job with 🔥DataFlint

Mar 27, 2026 · Platform Team, Meghanath Macha · Platform Team

How an open-source Spark UI replacement helped us find data skew, partition bloat, and shuffle spill.

spark dataflint performance data-engineering

Spark on Spot with S3 Shuffle

Mar 25, 2026 · Manoj Babu Katragadda, Meghanath Macha · Platform Team

How S3 shuffle lets us run Spark executors entirely on spot instances with Karpenter, cutting compute costs by 70-85%.

spark kubernetes spot-instances karpenter s3-shuffle cost-optimization

Running Spark on Kubernetes: An Ops-First Approach

Mar 19, 2026 · Manoj Babu Katragadda, Meghanath Macha · Platform Team

How we run production Spark jobs on Kubernetes with one SparkApp YAML, pre-baked images, and sub-10-second warm starts.

spark kubernetes data-engineering airflow developer-experience