Mid-Level

Data Platform Engineer

Abnormal Security

Remote - USA
Posted March 27, 2026

Job Description

About the Role

You’ll build, operate, and evolve the end-to-end data platform that powers analytics, automation, and AI use cases. This is a hands-on role spanning cloud infrastructure, ingestion/ETL, and data modeling across a Medallion (bronze/silver/gold) architecture. You’ll partner directly with stakeholders to turn messy source data into trusted datasets, metrics, and data products.

Who you are

  • Pragmatic Builder: You write clear SQL/Python, ship durable systems, and leave pipelines more reliable than you found them.
  • Data-Savvy Generalist: You’re comfortable moving up and down the stack (cloud, pipelines, warehousing, and BI) and picking the right tool for the job.
  • Fundamentals-first & Customer-Centric: You apply strong data modeling principles and optimize the analyst/stakeholder experience through consistent semantics and trustworthy reporting.
  • Low-Ego, High-Ownership Teammate: You take responsibility for outcomes, seek feedback openly, and will roll up your sleeves to move work across the finish line.
  • High-Energy Communicator: You’re comfortable presenting, facilitating discussions, and getting in front of stakeholders to drive clarity and alignment.
  • Self-Starter: You unblock yourself, drive decisions, and follow through on commitments; you bring a strong work ethic and invest in continuous learning.

What you will do 

  • Ingestion & ETL: Build reusable ingestion and ETL frameworks (Python and Spark) for APIs, databases, and unstructured/semi-structured sources; handle JSON/Parquet and evolving schemas (see the ingestion sketch after this list).
  • Medallion Architecture: Own and evolve Medallion layers (bronze/silver/gold) for key domains with clear lineage, metadata, and ownership.
  • Data Modeling & Marts: Design dimensional models and gold marts for core business metrics; ensure consistent grain and definitions.
  • Analytics Enablement: Maintain semantic layers and partner on BI dashboards (Sigma or similar) so metrics are certified and self-serve.
  • Reliability & Observability: Implement tests, freshness/volume monitoring, alerting, and runbooks; perform incident response and root-cause analysis (RCA) for data issues (a monitoring sketch follows this list).
  • Warehouse & Performance: Administer and tune the cloud data warehouse (Snowflake or similar): compute sizing, permissions, query performance, and cost controls.
  • Standardization & Automation: Build paved-road patterns (templates, operators, CI checks) and automate repetitive tasks to boost developer productivity.
  • AI Readiness: Prepare curated datasets for AI/ML/LLM use cases (feature sets, embeddings prep) with appropriate governance.
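
For a flavor of the ingestion work, here is a minimal PySpark sketch of a bronze-layer load: raw JSON landed as append-only Parquet with basic lineage columns. The S3 paths and names are illustrative assumptions, not Abnormal Security's actual framework.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze_ingest").getOrCreate()

def ingest_json_to_bronze(source_path: str, bronze_path: str) -> None:
    """Land raw JSON as append-only Parquet in the bronze layer."""
    df = (
        spark.read.json(source_path)  # schema inferred, new fields included
        .withColumn("_source_file", F.input_file_name())    # lineage metadata
        .withColumn("_ingested_at", F.current_timestamp())
    )
    df.write.mode("append").parquet(bronze_path)
    # Downstream silver jobs can reconcile evolving schemas at read time:
    # spark.read.option("mergeSchema", "true").parquet(bronze_path)

# Hypothetical paths for illustration only:
ingest_json_to_bronze("s3://example-raw/events/", "s3://example-bronze/events/")
```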
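
And a similarly hedged sketch of a freshness/volume check. It assumes a hypothetical `run_query` helper that executes SQL against the warehouse and returns one row; a production version would route alerts to paging/chat rather than raising.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(hours=6)   # assumed SLA for this table
MIN_DAILY_ROWS = 1_000               # assumed volume floor

def check_table_health(run_query, table: str) -> None:
    """Fail loudly when a table is stale or suspiciously small."""
    # `run_query` is a hypothetical helper returning (max_updated_at, row_count)
    max_ts, row_count = run_query(
        f"SELECT MAX(updated_at), COUNT(*) FROM {table} "
        "WHERE updated_at >= CURRENT_DATE"
    )
    if max_ts is None or datetime.now(timezone.utc) - max_ts > FRESHNESS_SLA:
        raise RuntimeError(f"{table} is stale: last update at {max_ts}")
    if row_count < MIN_DAILY_ROWS:
        raise RuntimeError(f"{table} volume anomaly: {row_count} rows today")
```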

Must Haves 

  • 3–5+ years hands-on data engineering experience; strong SQL and Python; experience building data pipelines end-to-end in production.
  • Strong cloud fundamentals (AWS preferred; other major clouds acceptable): object storage, IAM concepts, logging/monitoring, and managed compute.
  • Experience building and operating production ETL pipelines with reliability basics: retries, backfills, idempotency, incremental processing patterns (e.g., SCDs, late-arriving data), and clear operational ownership (docs/runbooks); see the upsert sketch after this list.
  • Solid understanding of Medallion / layered architecture concepts (bronze/silver/gold or equivalent) and experience working within each layer.
  • Strong data modeling fundamentals (dimensional modeling/star schema): can define grain, build facts/dimensions, and support consistent metrics (see the star-schema sketch after this list).
  • Working experience in a modern cloud data warehouse (Snowflake or similar): can write performant SQL and understand core warehouse concepts.
  • Hands-on dbt experience: building and maintaining models, writing core tests (freshness/uniqueness/RI), and contributing to documentation; ability to work in an established dbt project.
  • Experience with analytics/BI tooling (Sigma, Looker, Tableau, etc.) and semantic layer concepts; ability to support stakeholders and troubleshoot issues end-to-end.
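
To make the reliability expectations concrete, here is a self-contained sketch of an idempotent incremental load, using Python's bundled SQLite driver as a stand-in for a warehouse `MERGE`: re-running a batch (a retry or backfill) is a no-op, and a late-arriving row wins only if it is newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)"
)

def load_batch(rows):
    """Upsert a batch; safe to re-run (idempotent) and late-data aware."""
    conn.executemany(
        """
        INSERT INTO orders (order_id, status, updated_at)
        VALUES (?, ?, ?)
        ON CONFLICT(order_id) DO UPDATE SET
            status = excluded.status,
            updated_at = excluded.updated_at
        WHERE excluded.updated_at > orders.updated_at  -- newest version wins
        """,
        rows,
    )
    conn.commit()

batch = [(1, "shipped", "2026-03-01"), (2, "pending", "2026-03-02")]
load_batch(batch)
load_batch(batch)                            # retry/backfill: no duplicates
load_batch([(2, "shipped", "2026-03-03")])   # late-arriving update applies
print(conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
# [(1, 'shipped', '2026-03-01'), (2, 'shipped', '2026-03-03')]
```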
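
On the modeling side, a minimal star-schema sketch showing how a declared grain drives the design: one fact row per order line, keyed to dimensions. SQLite again stands in for the warehouse, and all table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,  -- e.g. 20260327
    calendar_date TEXT NOT NULL
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
-- Declared grain: one row per order line. Every measure (quantity,
-- net_amount) must be additive and true at exactly this grain.
CREATE TABLE fact_order_line (
    order_id    INTEGER NOT NULL,
    line_number INTEGER NOT NULL,
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    quantity    INTEGER NOT NULL,
    net_amount  REAL    NOT NULL,
    PRIMARY KEY (order_id, line_number)  -- the grain, enforced
);
""")
```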

Nice to Have 

  • Snowflake administration depth: warehouse sizing and cost management, advanced performance tuning, clustering strategies, and designing RBAC models
  • Advanced governance & security patterns: masking policies, row-level security, and least-privilege frameworks as a primary implementer/owner
  • Strong Spark/PySpark proficiency: deep tuning/optimization