Staff Data Engineer
Able
Job Description
Back in 2012, we were a group of engineers and designers who decided we wanted to build things—so we did. Able started as an engineering and product hub building for a portfolio of early-stage startups. We built many relationships while developing products that were thoughtful, effective, and genuinely useful. But, since then, we’ve grown… and so has our ambition.
Now, we’re entering our next chapter—defined by applied AI. AI is a powerful force in the end-to-end software development cycle, and we’re creating practices that allow us to deliver software faster and more effectively than traditional approaches, creating meaningful value for our partners. Today, our builder mindset is driving us to become an AI-native organization across every function. We’re still evolving, and that’s part of the opportunity. If you want to build, learn, and tackle challenges alongside an ambitious team, let’s build together.
This position is 100% remote within LatAm.
About the Role
We’re looking for a Staff Data Engineer to design and build scalable data systems that power analytics and decision-making. You’ll define how data is captured, build reliable pipelines, and ensure data is accurate, accessible, and ready to use.
What We’re Looking For
Day-to-Day Responsibilities
- Design, build, and operate a Databricks medallion lakehouse architecture (Bronze/Silver/Gold layers) using Delta Live Tables to support ingestion, transformation, and serving of clinical, behavioral, and operational data across a multi-country digital health platform (a simplified sketch of this pattern follows this list)
- Architect and maintain scalable data pipelines on AWS (S3, Glue, Lambda, Kinesis, MSK/Kafka) that ingest data from diverse sources including FHIR-based clinical systems, remote patient monitoring devices, mobile applications, and third-party vendor APIs — ensuring reliability, idempotency, and observability at scale
- Implement multi-country data isolation and governance leveraging Databricks Unity Catalog, enforcing data residency requirements across different countries (e.g., the US, EU, and the Kingdom of Saudi Arabia) and integrating policy-as-code consent enforcement (e.g., Open Policy Agent) aligned with regulatory requirements and guidelines (e.g., HIPAA, GDPR) — see the catalog-isolation sketch after this list
- Partner with platform, compliance, and analytics teams to define and enforce data quality standards, lineage tracking, schema evolution strategies, and tamper-evident audit logging across all tiers of the lakehouse
- Support clinical data interoperability by implementing and maintaining FHIR-to-OMOP mapping pipelines, enabling downstream analytics, population health reporting, and AI/ML feature engineering on harmonized datasets (see the FHIR-to-OMOP sketch after this list)
- Optimize data platform performance, cost, and reliability through partitioning strategies, compaction, caching, cluster sizing, and monitoring — targeting SLAs appropriate for a patient-facing healthcare platform operating at scale (e.g., 1M+ patients across a dozen markets)
- Contribute to certification and compliance readiness (e.g., ISO 27001, SOC 2 Type 2) by maintaining documentation, change control processes, and validation artifacts for all data infrastructure components
- Collaborate on real-time and event-driven architectures integrating Kafka-based streaming with the medallion layers and workflow orchestration, supporting adaptive patient journey logic and near-real-time analytics
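To make the scope above concrete, here is a simplified, illustrative sketch of the kind of Delta Live Tables pipeline the first responsibility describes: streaming remote-monitoring events from Kafka into a Bronze table and promoting them to a validated Silver table. The topic name, event schema, and table names are placeholders, not our production configuration.

```python
import dlt
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

# Assumed shape of a remote-monitoring event; the real schema would be richer.
EVENT_SCHEMA = StructType([
    StructField("patient_id", StringType()),
    StructField("metric", StringType()),
    StructField("value", DoubleType()),
    StructField("recorded_at", TimestampType()),
])

@dlt.table(name="bronze_device_events", comment="Raw monitoring events as received from Kafka.")
def bronze_device_events():
    return (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "<msk-bootstrap-brokers>")  # placeholder
        .option("subscribe", "device-events")                          # assumed topic name
        .option("startingOffsets", "earliest")
        .load()
        .select(
            F.col("key").cast("string").alias("event_key"),
            F.from_json(F.col("value").cast("string"), EVENT_SCHEMA).alias("event"),
            F.col("timestamp").alias("ingested_at"),
        )
    )

@dlt.table(name="silver_device_events", comment="Parsed, deduplicated events ready for analytics.")
@dlt.expect_or_drop("valid_patient", "patient_id IS NOT NULL")
def silver_device_events():
    return (
        dlt.read_stream("bronze_device_events")
        .select("event.*", "ingested_at")
        .withWatermark("recorded_at", "1 day")
        .dropDuplicates(["patient_id", "metric", "recorded_at"])
    )
```

Delta Live Tables handles checkpointing, retries, and dependency ordering between the two tables, which is part of why it anchors our ingestion and serving layers.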
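Similarly, the multi-country isolation work typically starts with a catalog-per-market layout in Unity Catalog. The sketch below expresses the idea as SQL issued from a notebook; catalog names, storage locations, and group names are placeholders, and in practice this would be managed through infrastructure-as-code rather than ad-hoc statements.

```python
from pyspark.sql import SparkSession

# Assumes a Databricks workspace attached to Unity Catalog.
spark = SparkSession.builder.getOrCreate()

statements = [
    # One catalog per market, pinned to in-region storage to satisfy residency requirements.
    "CREATE CATALOG IF NOT EXISTS health_eu  MANAGED LOCATION 's3://example-health-eu-prod/'",
    "CREATE CATALOG IF NOT EXISTS health_ksa MANAGED LOCATION 's3://example-health-ksa-prod/'",
    # Grant access only to the groups operating in that market; privileges inherit downward.
    "GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG health_eu  TO `eu-data-engineers`",
    "GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG health_ksa TO `ksa-data-engineers`",
]

for stmt in statements:
    spark.sql(stmt)
```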
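For the interoperability responsibility, here is a deliberately simplified sketch of the FHIR-to-OMOP direction of travel: flattening parsed FHIR Observation resources into an OMOP-style MEASUREMENT table. The source and lookup table names are assumptions, and a real mapping would cover many more resource types and edge cases.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Assumed inputs: a Bronze table of parsed FHIR Observation resources and a
# LOINC-to-OMOP concept lookup maintained alongside the warehouse.
observations = spark.table("bronze.fhir_observation")
concept_map = spark.table("ref.loinc_to_omop_concept")  # columns: loinc_code, measurement_concept_id

measurement = (
    observations.select(
        F.regexp_replace(F.col("subject.reference"), "^Patient/", "").alias("person_source_id"),
        F.col("code.coding")[0]["code"].alias("loinc_code"),
        F.col("valueQuantity.value").cast("double").alias("value_as_number"),
        F.col("valueQuantity.unit").alias("unit_source_value"),
        F.to_timestamp("effectiveDateTime").alias("measurement_datetime"),
    )
    .join(concept_map, on="loinc_code", how="left")
)

measurement.write.mode("overwrite").saveAsTable("gold.omop_measurement")
```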
Required Skills & Experience
- 8+ years of data engineering experience, with deep hands-on expertise in Databricks (Delta Lake, Unity Catalog, DLT), AWS data services, Python/Spark, and streaming frameworks — preferably within healthcare, life sciences, or other highly regulated industries
- Strong proficiency with AWS data services such as S3, Glue, Lambda, Kinesis, Redshift, Athena, and IAM — with experience architecting end-to-end data pipelines in AWS-native or hybrid environments
- Advanced Python and PySpark/Spark development skills for batch and streaming ETL/ELT pipeline development, data transformation, and data quality enforcement (illustrated by the data-quality sketch after this list)
- Experience with streaming and event-driven architectures using Kafka (Amazon MSK or Confluent), including integration with lakehouse ingestion layers
- Proven ability to implement data governance frameworks including data lineage, schema evolution, access controls, cataloging, and audit logging at enterprise scale
- Strong understanding of data modeling for both analytical and operational workloads
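As a taste of the hands-on side of these requirements, here is a minimal, illustrative batch data-quality gate in PySpark in which rows failing validation are quarantined rather than silently dropped. The table names and rules are placeholders, not a prescribed implementation.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

vitals = spark.table("silver.device_vitals")  # assumed Silver table of device readings

# Illustrative validation rules; real checks would come from a shared quality spec.
rules = (
    F.col("patient_id").isNotNull()
    & F.col("recorded_at").isNotNull()
    & F.col("value").between(0, 500)
)

valid = vitals.filter(rules)
quarantined = vitals.filter(~rules).withColumn("quarantined_at", F.current_timestamp())

valid.write.mode("append").saveAsTable("gold.device_vitals")
quarantined.write.mode("append").saveAsTable("quarantine.device_vitals_rejects")
```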