Mid-Level

Data Engineer II 

Expedia

India - Gurgaon
On-site
Posted April 20, 2026

Job Description

Expedia Group brands power global travel for everyone, everywhere. We design cutting-edge tech to make travel smoother and more memorable, and we create groundbreaking solutions for our partners. Our diverse, vibrant, and welcoming community is essential in driving our success.

Why Join Us?

To shape the future of travel, people must come first. Guided by our Values and Leadership Agreements, we foster an open culture where everyone belongs and differences are celebrated, and we know that when one of us wins, we all win.

We provide a full benefits package, including exciting travel perks, generous time off, parental leave, a flexible work model (with some pretty cool offices), and career development resources, all to fuel our employees' passion for travel and ensure a rewarding career journey. We’re building a more open world. Join us.

Team Description
The Metrics Platform team builds and operates a scalable, multi-tenant data platform that powers analytics, experimentation, and decision-making across the organization. The platform handles large-scale distributed workloads, supports intelligent backfills, and provides reliable, cost-efficient data processing with strong observability and governance.
The team sits at the intersection of data engineering, platform engineering, and emerging AI capabilities, focusing on building systems that are not just scalable, but also intelligent and self-optimizing.

Role Summary
This role is for a hands-on data engineer who understands how distributed systems behave under load—not just how to use them.
You will design and optimize large-scale Spark workloads, deeply understand lineage across data and compute layers, and build platform capabilities that are reliable, observable, and efficient. You’ll also work with AI-driven workflows (including agentic systems) to improve platform intelligence, automation, and developer productivity.

What You’ll Do

  • Design, build, and operate scalable data processing pipelines using Spark with a strong focus on performance, cost, and reliability

  • Analyze and optimize Spark jobs at a deep level (execution plans, shuffle behavior, partitioning strategies, memory management, skew handling)

  • Understand and leverage data lineage (logical and physical) across pipelines to enable debugging, impact analysis, and intelligent recomputation/backfills

  • Build and manage orchestration workflows using Airflow, including dependency management, retries, backfills, and failure recovery

  • Design and implement REST APIs and platform services with clear contracts, observability, and access control

  • Define data models, schemas, and contracts that support multi-domain and multi-tenant use cases

  • Contribute to platform-level capabilities such as compute abstraction, lineage-driven processing, and cost optimization

  • Implement strong operational practices: monitoring, alerting, CI/CD, automated testing, and runbooks

  • Integrate AI/ML-driven solutions (including agentic workflows) to improve pipeline optimization, anomaly detection, and developer workflows



Minimum Qualifications

  • Bachelor’s degree in Computer Science or related field, or equivalent practical experience

  • 3+ years of experience in data engineering or backend/platform engineering

  • Strong hands-on experience with Apache Spark in production environments

  • Experience with workflow orchestration tools such as Airflow

  • Experience designing and building RESTful services and APIs

  • Solid understanding of data modeling, distributed systems, and system design fundamentals

  • Experience owning production systems with responsibility for reliability and performance



Preferred Qualifications

  • Experience working with large-scale distributed data processing systems

  • Familiarity with performance tuning and optimization techniques for data pipelines

  • Understanding of data lineage, data quality, and data governance concepts

  • Experience building or contributing to shared data platforms or reusable data services

  • Strong system design fundamentals, including API design and data modeling

  • Experience with observability practices such as monitoring, alerting, and debugging production systems

  • Exposure to workflow orchestration and scheduling tools

  • Familiarity with AI/ML concepts or experience working with AI-enabled systems or tools

  • Ability to work across teams and contribute to platform-level improvements

Accommodation requests

If you need assistance with any part of the application or recruiting process due to a disability or other physical or mental health condition, please reach out to our Recruiting Accommodations Team through the Accommodation Request.

We are proud to be named a Best Place to Work on Glassdoor in 2024 and to be recognized for our award-winning culture by organizations like Forbes, TIME, Disability:IN, and others.

Expedia Group's family of brands includes: Brand Expedia®, Hotels.com®, Expedia® Partner Solutions, Vrbo®, trivago®, Orbitz®, Travelocity®, Hotwire®, Wotif®, ebookers®, CheapTickets®, Expedia Group™ Media Solutions, Expedia Local Expert®, CarRentals.com™, and Expedia Cruises™. © 2024 Expedia, Inc. All rights reserved. Trademarks and logos are the property of their respective owners. CST: 2029030-50

Employment opportunities and job offers at Expedia Group will always come from Expedia Group’s Talent Acquisition and hiring teams. Never provide sensitive, personal information to someone unless you’re confident who the recipient is. Expedia Group does not extend job offers via email or any other messaging tools to individuals with whom we have not made prior contact. Our email domain is @expediagroup.com. The official website to find and apply for job openings at Expedia Group is careers.expediagroup.com/jobs.

Expedia is committed to creating an inclusive work environment with a diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, gender, sexual orientation, national origin, disability or age.