Mid-Level

Site Reliability Engineer

Apple

Bengaluru
On-site
Posted April 13, 2026

Job Description

Summary

Our people and their ideas encourage innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it. At Apple, new ideas have a way of becoming phenomenal products, services, and customer experiences very quickly. Every single day, people do amazing things at Apple. Do you want to be part of a team that builds cutting-edge software services, a team that is continually innovating and is proud of making a difference? If so, bring your passion and talent and join us to be part of something big and amazing.

Join the AI and Data Platforms team at Apple, where we build and manage cloud-based data platforms handling petabytes of data at scale. We are looking for a passionate Software Engineer specializing in reliability engineering for data platforms, with a strong understanding of data and ML systems.

Description

As a Data Platform SRE, you will be responsible for developing and operating our big data platform using open source or other solutions to support critical applications such as analytics, reporting, and AI/ML apps. This includes optimizing performance and cost, automating operations, and identifying and resolving production issues to ensure the best data platform experience.

Minimum Qualifications

Experience: 5+ years in software site reliability engineering or software development roles.
Programming: Proficient in at least one of Python, Golang, or Java. Skilled at coding for distributed systems and developing resilient data pipelines.
Cloud Platforms: Hands-on experience with at least one major cloud platform (AWS, Azure, or Google Cloud Platform).

Preferred Qualifications

Expertise in designing, building, and operating critical, large-scale distributed systems with a focus on low latency, fault tolerance, and high availability.
Experience contributing to open source projects is a plus.
Experience with multiple public cloud infrastructures, managing multi-tenant Kubernetes clusters at scale, and debugging Kubernetes/Spark issues.
Experience with workflow and data pipeline orchestration tools (e.g., Airflow, dbt).
Understanding of data modeling and data warehousing concepts.
Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs).
Data Structures & Algorithms: Strong foundation and application experience.
Distributed Systems: Solid understanding and hands-on experience managing at least one distributed system (e.g., Kafka, Spark, Flink).
Solid understanding of software engineering best practices, including the full development lifecycle, secure coding, and experience building reusable frameworks or libraries.
Problem Solving: Demonstrated ability to independently troubleshoot and resolve complex technical issues.
Creative Thinking: A track record of proposing and implementing innovative solutions to technical challenges.