Lead / Manager

Engineering Manager - Platform Reliability

Databricks

London, United Kingdom
On-site
Posted April 8, 2026

Job Description

P-1535

At Databricks, we are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers — and customer obsessed — we leap at every opportunity to solve technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

The Lakebase Platform Reliability team's footprint spans multiple stacks, systems, and stakeholders. It includes AI-powered tooling and workflows for customer management, real-time observability during incidents, monitoring and auditing systems that underpin compliance requirements, and customer-facing operational APIs and maintenance workflows. You'll contribute to the wider platform mission: building resource management infrastructure, reliable distributed services, and internal tools that help Databricks engineers operate confidently across clouds and environments.

The impact you will have:

  • Hire great engineers to build an outstanding team.
  • Support engineers in their career development by providing clear feedback, and develop engineering leaders.
  • Ensure high technical standards by instituting processes (architecture reviews, testing) and culture (engineering excellence).
  • Work with engineering and product leadership to build a long-term roadmap.
  • Coordinate execution and collaborate across teams to unblock cross-cutting projects.
  • Build resource management infrastructure that powers the big data and machine learning workloads on the Databricks platform in a scalable, secure, and cloud-agnostic way.
  • Lead development of reliable, scalable services and client libraries that work with massive amounts of data on the cloud, across geographic regions and cloud providers.
  • Build tools to allow Databricks engineers to operate their services across different clouds and environments.
  • Build services, products and infrastructure at the intersection of machine learning and distributed systems.

What we look for:

  • 5+ years of Engineering experience and 2+ years of Engineering Management experience.
  • Experience with large-scale distributed services and the processes around testing, monitoring, and SLAs.
  • Ability to align multiple stakeholders on competing priorities.
  • Ability to balance short-term delivery against long-term stability.
  • BS (or higher) in Computer Science or a related field.

About Databricks

Databricks is the data and AI company. More than 10,000 organizations worldwide (including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500) rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.