
Staff Software Engineer, Data Infrastructure


Zocdoc

New York, NY; USA Remote
Hybrid
Posted April 1, 2026

Job Description

Our Mission

Healthcare should work for patients, but it doesn’t. In their time of need, they call down outdated insurance directories. Then wait on hold. Then wait weeks for the privilege of a visit. Then wait in a room solely designed for waiting. Then wait for a surprise bill. In any other consumer industry, the companies delivering such a poor customer experience would not survive. But in healthcare, patients lack market power. Which means they are expected to accept the unacceptable.


Zocdoc’s mission is to give power to the patient. To do that, we’ve built the leading healthcare marketplace that makes it easy to find and book in-person or virtual care in all 50 states, across 200+ specialties and 12k+ insurance plans. By giving patients the ability to see and choose, we give them power. In doing so, we can make healthcare work like every other consumer sector, where businesses compete for customers, not the other way around. In time, this will drive quality up and prices down.


We’re 18 years old and the leader in our space, but we are still just getting started. If you like solving important, complex problems alongside deeply thoughtful, driven, and collaborative teammates, read on.


Your Impact on our Mission:

As a Staff Software Engineer (Data Infrastructure), you’ll lead the design and development of the software that underpins our data platform: ingestion frameworks, execution services, orchestration, metadata, data governance, and developer experience. Your focus is building the APIs, libraries, and services that make data producers and consumers effective, while optimizing reliability, performance, and spend on AWS.

You’ll enjoy this role if you are…

  • A builder of platforms and frameworks (not just point pipelines).
  • Comfortable with distributed systems abstractions (compute scheduling, storage layout, back‑pressure, retries, idempotency).
  • Excited by lakehouse tech and modern data contracts, and you want to create self‑service for hundreds of use cases.

Your day-to-day is…

  • Design and ship platform services for ingestion, transformation, orchestration, and metadata (e.g., service‑backed interfaces for Dagster/Airflow, lineage, quality, and data contracts).
  • Build execution & scheduling capabilities for Spark/SQL jobs (queuing, prioritization, retries, resource isolation on EMR/EKS/Databricks), focusing on throughput and developer experience.
  • Implement lakehouse features (Delta/Iceberg): schema evolution, partitioning, compaction, vacuum, snapshotting, ACID guarantees, and table‑format governance.
  • Optimize Snowflake and other warehouses: cost controls, query profiling/pruning, workload isolation, RBAC; expose safe self‑service patterns.
  • Deliver SDKs, CLIs, and templates that standardize how teams build reliable data products; enable CI/CD for data and contract testing.
  • Work across AWS (S3, EMR/EKS, Glue/Athena, Lambda, Kinesis/MSK) with IaC (Terraform) and strong observability (Datadog/CloudWatch).

You’ll be successful in this role if you have…

  • 8+ years building backend/platform software with Python/Scala/Java and strong SQL; proven track record designing distributed systems.
  • Deep experience with Spark (Databricks or EMR/EKS) and AWS data services; solid grasp of scheduler/executor behavior and performance tradeoffs.
  • Hands‑on data warehouse optimization (Snowflake ideal; others welcome).
  • Experience building platform APIs/SDKs that other engineers adopt; excellent collaboration and technical leadership.

Bonus if you have…

  • Experience at petabyte‑scale data platforms, distributed big data compute, or lakehouse engines (Delta/Iceberg).
  • Familiarity with metadata/governance tech (Unity Catalog, Collibra, Lake Formation). 


Benefits:
