Senior Data Engineer

Xealth

Compensation

$155,000 - $225,000/year

Seattle, WA (On-site)
Posted March 18, 2026

Job Description

Our Mission & Culture

At Xealth, we're revolutionizing healthcare by leveraging data and automation to empower care providers (building on EHRs such as Epic and Cerner) to seamlessly prescribe, deliver, and monitor digital health for patients. We are a detail-oriented team, committed to maintaining the highest standards while moving with agility and impact.

We are a highly skilled, collaborative, and passionate group, applying our expertise to improve health outcomes for millions. We believe in shared ownership and are looking for a self-starting team player, driven to pioneer the next generation of intelligent, automated data insights.

This role offers a unique opportunity to join a data engineering team to advance our data processing pipelines and analytics product offering. There is a strong preference for this person to sit in the Seattle office; however, we are open to candidates in other locations within the United States.

What You'll Own and Deliver (Responsibilities)

As a core member of our data engineering team, you will design, build, and scale the services that power Xealth’s analytics and reporting capabilities. You’ll apply solid computer science fundamentals to solve complex problems in distributed systems, data modeling, and data pipelines.

  • Data Modeling: Apply expert-level data modeling and design, using dimensional modeling and denormalization techniques suited to analytic workloads.
  • Data Ingestion: Consume and process high-volume bounded and unbounded data, build robust Change Data Capture (CDC) mechanisms, and ingest data from API calls and webhooks (a CDC merge sketch follows this list).
  • Pipeline Design & Orchestration: Design, build, and optimize high-volume, real-time streaming data pipelines using PySpark and Databricks (a minimal streaming sketch follows this list).
  • Scalability & Maintenance: Maintain and scale large data lake pipelines, ensuring high performance and cost-efficiency.
  • Unit Testing & Quality Assurance: Write comprehensive unit and integration tests for data pipelines to ensure code quality and production reliability (an example test follows this list).
  • Cross-Functional Collaboration: Partner with product managers and EHR specialists to translate clinical user behaviors into rich, analytical datasets, unlocking critical insights that drive evidence-based improvements in healthcare processes. 
  • Technical Leadership: Contribute to code reviews, system design discussions, and technical decisions that raise the engineering bar across the team.
  • Automation and AI in Development: Use AI-assisted coding tools like GitHub Copilot to streamline development, increase quality, and accelerate delivery.
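
The sketches below illustrate, in rough strokes, the kind of work these bullets describe. First, for the CDC point: a minimal PySpark upsert of a captured change batch into a Delta table. The table path, key column, and schema are hypothetical, not Xealth's actual data model.

    from pyspark.sql import SparkSession
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical batch of captured changes (e.g., from a CDC feed or webhook payloads).
    changes = spark.createDataFrame(
        [(1, "prescribed", "2026-03-18T10:00:00Z"),
         (2, "delivered", "2026-03-18T11:00:00Z")],
        ["order_id", "status", "updated_at"],
    )

    # Upsert into the target Delta table, keyed on order_id (path is hypothetical).
    target = DeltaTable.forPath(spark, "/lake/silver/orders")
    (target.alias("t")
        .merge(changes.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())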
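
Second, a minimal streaming sketch: read a stream of event files, aggregate in event-time windows, and write continuously to a Delta sink. The source path, schema, and checkpoint location are assumptions for illustration.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical stream of JSON event files landing in cloud storage.
    events = (spark.readStream.format("json")
        .schema("event_id STRING, program STRING, event_time TIMESTAMP")
        .load("/lake/landing/events/"))

    # Count events per program in 5-minute windows, tolerating 10 minutes of late data.
    counts = (events
        .withWatermark("event_time", "10 minutes")
        .groupBy(F.window("event_time", "5 minutes"), "program")
        .count())

    # Continuously append the windowed aggregates to a Delta table.
    (counts.writeStream.format("delta")
        .outputMode("append")
        .option("checkpointLocation", "/lake/checkpoints/event_counts")
        .start("/lake/gold/event_counts"))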
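
Third, for the testing point, a representative pytest-style unit test that exercises one transformation in isolation against a local Spark session; the transformation and fixture data are invented for illustration.

    import pytest
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    def dedupe_latest(df):
        # Keep only the most recent row per order_id (hypothetical transformation).
        w = Window.partitionBy("order_id").orderBy(F.col("updated_at").desc())
        return df.withColumn("rn", F.row_number().over(w)).filter("rn = 1").drop("rn")

    @pytest.fixture(scope="session")
    def spark():
        return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

    def test_dedupe_latest_keeps_newest_row(spark):
        df = spark.createDataFrame(
            [(1, "a", 1), (1, "b", 2), (2, "c", 1)],
            ["order_id", "status", "updated_at"],
        )
        out = dedupe_latest(df).orderBy("order_id").collect()
        assert [(r.order_id, r.status) for r in out] == [(1, "b"), (2, "c")]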

The Expertise You'll Bring (Requirements)

We’re looking for a data engineer with strong computer science fundamentals: someone who’s comfortable reasoning about systems, data, and code structure at scale, and who’s excited to apply those skills in healthcare.

Core Technical Competencies:

  • Data Engineering Expertise: 5+ years of professional experience building production-grade data pipelines and applications, with expert proficiency in Python, PySpark, and SQL; familiarity with JavaScript is a plus. You must have solid hands-on experience with modern, massively parallel data processing systems.
  • CS Fundamentals: Deep understanding of algorithms and data structures, with a specific focus on distributed computing principles (concurrency, partitioning, shuffling) necessary for processing large-scale datasets.
  • Optimization & Troubleshooting: Proficient in diagnosing complex failures in distributed processing jobs (e.g., Spark executor errors, memory leaks, data skew) using logs, distributed tracing, and performance metrics (a skew-diagnosis sketch follows this list).
  • Modern SQL and NoSQL Database Design: Deep practical knowledge of open table formats, such as Delta Lake. Proficiency with common big data file formats, including Apache Parquet and Apache Avro.
  • Infrastructure as Code (IaC): Experience implementing IaC principles and tools.
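
As a small illustration of the troubleshooting expectation, the sketch below surfaces data skew by counting rows per join key; a handful of keys holding most of the rows is the classic signature behind straggler tasks. Table and column names are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical Delta table involved in a slow join.
    df = spark.read.format("delta").load("/lake/silver/orders")

    # Rows per join key, heaviest first; extreme concentration indicates skew.
    df.groupBy("customer_id").count().orderBy(F.desc("count")).show(20)

Typical remedies range from enabling Spark's adaptive skew-join handling (spark.sql.adaptive.skewJoin.enabled) to salting hot keys before the join.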