Senior Software Engineer - SRE
Roku
Job Description
Teamwork makes the stream work.
Roku is changing how the world watches TV
Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers.
From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.
About the Role:
We are seeking a talented and experienced Senior Software Engineer in Site Reliability Engineering (SRE) to join our dynamic team. The ideal candidate will have a strong background in SRE practices, cloud infrastructure management, and automation. If you have a consistent track record of architecting and building large-scale systems, enjoy solving intriguing system challenges at internet scale, are innovative at heart, and balance learning, organizing, and building with a drive to make an impact, this role might be a great fit for you!
What you’ll be doing:
- Design & Infrastructure
  - Contribute to postmortem culture by facilitating comprehensive, blameless post-incident reviews that identify root causes, contributing factors, and actionable remediation items. Track incident trends to identify systemic issues and prioritize reliability improvements.
  - Implement chaos engineering practices to proactively identify failure modes, validate system resilience, and build confidence in recovery procedures. Conduct game days and disaster recovery exercises.
- SRE Process & Principles Implementation
  - Deploy and evolve SRE practices across the organization by establishing core SRE principles, frameworks, and methodologies. Define and implement service reliability practices, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets, to balance innovation velocity with system reliability.
  - Manage Error Budgets as a m