Senior Site Reliability Engineer
Coalition
Compensation
$135,900 - $215,000/year
Job Description
About us
Coalition is the world's first Active Insurance provider designed to help prevent digital risk before it strikes. Founded in 2017, Coalition combines comprehensive insurance coverage and innovative cybersecurity tools to help businesses manage and mitigate potential cyberattacks.
Opportunities to make an impact with bold thinking are real—and happening daily at Coalition.
About the role
We are looking for a Senior Site Reliability Engineer to join our Platform SRE team. In this role, you will build and operate the infrastructure, tools, and "paved roads" that empower our developers to deliver scalable, secure, and reliable software with speed and confidence.
You'll work across the entire stack—from infrastructure automation and observability to developer enablement and system reliability. You will be a key collaborator with software engineering and security teams, helping to evolve our Infrastructure as Code (IaC), enhance CI/CD pipelines, and scale our internal developer platform. We value pragmatism and engineering excellence, primarily using Python, Go, and AWS to reduce toil and build self-service capabilities.
Responsibilities
- Infrastructure Automation: Design, build, and scale production environments using AWS and Terraform, driving architectural decisions that improve long-term maintainability and reliability.
- System Reliability: Lead efforts to improve platform resilience through failure-based testing, automated recovery strategies, and proactive capacity planning.
- Developer Enablement: Own the design and delivery of reusable platform components and self-service tools that streamline the developer experience and reduce cross-team toil.
- Observability: Define and evolve observability standards across the platform, including system metrics, distributed tracing, and SLO frameworks.
- Project Ownership: Own projects end to end—from initial scoping and effort estimation through detailed planning, execution, and successful rollout.
- Mentorship & Standards: Mentor engineers across the team, uphold high infrastructure quality, and actively shape the best practices and standards used by the organization.
- Collaboration: Engage in technical design discussions, providing guidance and feedback while adapting strategies based on team input and evolving requirements.
Skills and Qualifications
- 6+ years of experience in SRE, DevOps, Cloud Engineering, or Software Development roles
- Hands-on experience operating production environments in AWS
- Proficiency in Go or Python, with experience building production-grade automation, tooling, or libraries
- Strong experience with Terraform
- Experience with container orchestration platforms like ECS or Kubernetes
- Familiar