Manager, Site Reliability Engineering
CVS Health
Job Description
We’re building a world of health around every individual, shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold themselves accountable and prioritize safety and quality in everything they do. Join us and be part of something bigger – helping to simplify health care one person, one family and one community at a time.
Role Summary
This role leads Site Reliability Engineering as a strategic engineering function, balancing service stability and delivery velocity through automation, observability, and disciplined operational practices. The SRE Manager is accountable for ensuring the reliability, availability, performance, and operational readiness of business‑critical platforms and applications.
The SRE Manager partners closely with Engineering, Platform, and DevSecOps teams to enable safe, scalable, and resilient software delivery.
Responsibilities
SRE Leadership & Culture
Define and drive a clear SRE vision aligned with engineering strategy and business priorities.
Build and sustain a culture of blameless incident management, continuous improvement, and operational ownership.
Lead, mentor, and develop SRE engineers across reliability engineering, automation, infrastructure, and observability.
Ensure role clarity, capacity planning, skill coverage, and knowledge sharing across the SRE organization.
Reliability Strategy & Service Management
Establish and govern SLIs, SLOs, and error budgets for supported services.
Partner with application teams to translate business expectations into measurable reliability targets.
Prioritize reliability work using data‑driven insights and error‑budget policies.
Ensure Tier‑1 and Tier‑2 applications meet availability, performance, and resilience standards.
Incident Management & Operational Excellence
Own end‑to‑end incident management, including on‑call readiness, escalation, and communications.
Lead post‑incident reviews and ensure actionable follow‑ups are tracked to closure.
Reduce operational toil through automation and standardization.
Serve as the single point of accountability for production stability and operational readiness.
Observability, Monitoring & Automation
Drive best‑in‑class observability practices across logs, metrics, traces, and alerts.
Ensure alert quality, signal‑to‑noise optimization, and actionable dashboards.
Champion automation across CI/CD, infrastructure provisioning, scaling, and recovery workflows.
Partner with DevOps and Platform teams to modernize tooling and operational frameworks.
Release Readiness & Engineering Enablement
Partner with engineering teams to embed SRE into design, development, and release planning.
Ensure production readiness standards are met before go‑live (monitoring, rollback, capacity, security).
Enable safe deployment patterns (blue‑green, canary, feature flags).
Support high‑confidence releases without compromising reliability.
Cross‑Functional Collaboration
Act as a trusted partner to Engineering, Platform, Architecture, Security, and Product leadership.
Provide clear reliability insights and operational risk visibility to senior leadership.
Influence architectural decisions to improve system resilience and scalability.
Required Qualifications
Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
5+ years of experience in SRE, DevOps, Platform Engineering, or Production Engineering roles.
3+ years of people management experience leading high‑performing engineering teams.
Strong hands‑on experience with cloud platforms, container orchestration, CI/CD, monitoring, and incident management.
Proven ability to define and operate SLO‑driven reliability programs at scale.
Preferred Qualifications
7+ years of SRE experience.
Experience supporting Tier‑1, large‑scale, business‑critical systems.
Deep understanding of Kubernetes, cloud‑native architectures, and distributed systems.
Strong background in automation, infrastructure as code, and observability platforms.
Experience working in regulated or compliance‑driven environments.
Ability to communicate complex operational topics clearly to executive audiences.
Education
Bachelor’s degree or equivalent experience
Anticipated Weekly Hours
40
Time Type
Full time
Pay Range
The typical pay range for this role is:
$92,700.00 - $185,400.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above.
Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve, and we are committed to fostering a workplace where every colleague feels valued and feels that they belong.
Great benefits for great people
We take pride in our comprehensive and competitive mix of pay and benefits – investing in the physical, emotional and financial wellness of our colleagues and their families to help them be the healthiest they can be. In addition to our competitive wages, our great benefits include:
Affordable medical plan options, a 401(k) plan (including matching company contributions), and an employee stock purchase plan.
No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching.
Benefit solutions that address the different needs and preferences of our colleagues including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility.
For more information, visit https://jobs.cvshealth.com/us/en/benefits
We anticipate the application window for this opening will close on: 04/07/2026
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.