Mid-Level

Data Engineer (4-month contract)

Outrider

Remote
Posted April 24, 2026

Job Description

The Company

Outrider is a software company automating distribution yards with electric, self-driving trucks. Our system eliminates hazardous, repetitive manual tasks while improving safety and efficiency. Outrider’s mission is to drive the rapid adoption of sustainable freight transportation. We are a private company founded in 2018 and backed by NEA, 8VC, Koch Disruptive Technologies, NVIDIA, and other top-tier investors. Our customers are Fortune 200 companies, and our autonomous trucks are already running in distribution yards. For more information, visit www.outrider.ai.

Overview

We are seeking a Data Engineer with strong Python and distributed processing skills to design, build, and maintain scalable data pipelines and the infrastructure behind our analytics products. This role focuses on delivering reliable, performant data systems by developing robust ETL/ELT workflows, optimizing data storage and processing, and ensuring data availability and quality at scale.

Key Responsibilities

  • Design, build, and maintain scalable ETL/ELT pipelines that ingest, transform, and deliver data across the organization.
  • Develop and optimize distributed data processing jobs using Python for large-scale data transformation and aggregation.
  • Architect and manage PostgreSQL schemas, tables, indexes, and query performance to support downstream analytics and reporting.
  • Build and maintain Python-based data workflows to orchestrate, validate, and deliver data reliably across environments.
  • Monitor and improve data quality, freshness, and completeness through automated checks, alerting, and observability tooling (a minimal example of such a check follows this list).
  • Design and manage cloud-based data infrastructure on AWS.
  • Partner with data analysts and stakeholders to translate requirements into well-modeled, maintainable data products.
  • Maintain documentation for pipelines, data models, data lineage, and infrastructure.
  • Troubleshoot pipeline failures and data issues, providing timely root-cause analysis and remediation.
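
To give a flavor of the automated checks mentioned above, here is a minimal Python sketch of a data-freshness check. The table name, timestamp column, threshold, and connection string are illustrative placeholders, not Outrider's actual schema or tooling.

    import datetime as dt

    import psycopg2

    # Hypothetical freshness check: flag a table whose newest row is too old.
    # Table, column, threshold, and connection string are placeholders.
    FRESHNESS_THRESHOLD = dt.timedelta(hours=6)

    def check_freshness(conn, table: str, ts_column: str) -> bool:
        """Return True if the table's newest record is within the freshness window."""
        with conn.cursor() as cur:
            cur.execute(f"SELECT MAX({ts_column}) FROM {table}")
            latest = cur.fetchone()[0]
        if latest is None:
            return False  # an empty table counts as stale
        # Assumes ts_column is TIMESTAMPTZ, so the driver returns an aware datetime.
        return dt.datetime.now(dt.timezone.utc) - latest <= FRESHNESS_THRESHOLD

    if __name__ == "__main__":
        conn = psycopg2.connect("dbname=analytics")  # placeholder connection string
        if not check_freshness(conn, "yard_events", "ingested_at"):
            raise SystemExit("yard_events is stale: trigger an alert")

A check like this would typically run on a schedule (for example, via AWS Step Functions or Dagster, per the preferred qualifications below) and feed the alerting and observability tooling described above.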

Required Qualifications

  • Experience: 3+ years of professional experience in data engineering.
  • PostgreSQL: Schema design, indexing strategies, query optimization, and performance tuning.
  • Python: Pipeline development, data validation, and orchestration frameworks.
  • Distributed Processing and Storage: Hands-on production experience with tools such as AWS Athena or Apache Spark.
  • ETL/ELT: Proven experience designing and implementing pipelines in production.
  • Cloud: AWS (S3, EKS, Glue, Athena).
  • Data Modeling: Dimensional modeling, data warehousing patterns, and reproducible transformations.
  • Engineering Practices: Git workflows, code reviews, testing, and CI/CD.
  • LLM Agents: Ability to use LLM-based agents effectively to increase output.

Preferred Qualifications

  • Experience with workflow orchestration tools (e.g., AWS Step Functions, Prefect, Dagster).
  • Familiarity with modern data stack concepts: ELT patterns, data lakehouse architecture, semantic layers, and governance.
  • Experience implementing automated data quality frameworks and pipeline observability.
  • Exposure to streaming or near-real-time data processing (e.g., Kafka, Spark Streaming, Pub/Sub).

What Success Looks Like

  • Reliable pipelines: Pipelines are performant and trusted by downstream consumers.
  • Maintainable infrastructure: Data systems are well-documented, cost-efficient, and easy to extend.
  • Proactive quality: Data issues are detected early, with clear ownership and fast resolution paths.
  • Consistent delivery: Data is correctly modeled and delivered on time across all environments.

Tags: Python, Go, Rust, AWS, AI, Data, Analytics, Product, Design