About the role
We’re building a world of health around every individual, shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold themselves accountable and prioritize safety and quality in everything they do. Join us and be part of something bigger: helping to simplify health care one person, one family and one community at a time.
Position Summary
The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, availability, performance, and operational scalability of the myPBM platform. This role applies software engineering practices to operations, with a focus on automation, observability, incident management, and continuous improvement to support the stable, scalable delivery of client-facing services.
The SRE partners closely with DevOps, Engineering, Infrastructure, and Security teams to balance system reliability with delivery velocity while maintaining compliance with enterprise standards.
*We prefer that this person work hybrid in Richardson, TX; Northbrook, IL; or Scottsdale, AZ.
Primary Responsibilities
1. Reliability Engineering and Operations
Ensure high availability, resiliency, and performance of myPBM applications and infrastructure.
Define and manage SLIs, SLOs, and SLAs for critical services.
Monitor production systems and proactively identify issues before customer impact.
Lead incident response, triage, and root cause analysis (RCA).
Drive continuous improvement to reduce repeat incidents and operational toil.
2. Monitoring, Observability, and Alerting
Implement and maintain end-to-end observability across UI, APIs, and infrastructure layers.
Build and manage monitoring solutions using:
AppDynamics (APM, RUM, synthetic monitoring)
Splunk (logs, dashboards, and error tracking)
Design actionable alerts and escalation workflows using tools such as xMatters and MIR3.
Standardize dashboards and ensure data accuracy and visibility.
Continuously optimize alerting to reduce noise and improve signal quality.
3. DevSecOps and Release Engineering
Support and enhance CI/CD pipelines, including GitHub Actions and enterprise pipeline solutions.
Enforce deployment guardrails, release governance, and production readiness checks.
Support build and deployment failure triage and rollback strategies.
Partner with development teams to improve deployment reliability and automation.
Ensure adherence to change management (CAB/SNOW) and release policies.
4. Infrastructure Engineering and Platform Stability
Manage and support cloud infrastructure, including AKS, compute, storage, and networking.
Ensure platform health, capacity monitoring, and performance optimization.
Support infrastructure provisioning and environment setup.
Drive disaster recovery (DR) readiness and failover validation, including RTO and RPO objectives.
Enable application onboarding onto standardized enterprise platforms.
5. Security and Compliance
Implement continuous security monitoring and vulnerability remediation.
Manage secrets, certificates, and identity integration, including IAM onboarding.
Ensure compliance with CVS security standards, audit requirements, and production readiness controls.
Enforce shift-left security practices in CI/CD pipelines.
6. Incident Management and Support Model
Participate in 24x7 on-call rotation and incident response.
Partner with Production Support to resolve incidents.
Ensure monitoring and alerting gaps are identified and closed.
Maintain incident documentation and improve standard operating procedures.
Support the full issue detection, triage, resolution, and prevention lifecycle.
7. Automation and Continuous Improvement
Automate repetitive operational tasks to reduce toil.
Implement infrastructure as code (IaC) practices.
Continuously improve deployment pipelines, monitoring, and observability.
Enable predictive insights and proactive issue prevention.
8. Collaboration and Platform Enablement
Work closely with engineering, DevOps, infrastructure, and security teams.
Enable a shared ownership model for reliability and operations.
Provide guidance on production readiness and operational best practices.
Required Qualifications
5+ years of experience in site reliability engineering, DevOps, or platform engineering, including the following:
Monitoring and observability tools such as Splunk and AppDynamics
Cloud platforms, preferably Azure, including AKS and Kubernetes
CI/CD pipelines such as GitHub Actions, Jenkins, or similar tools
Strong understanding of incident management and root cause analysis; monitoring, alerting, and logging practices; and infrastructure and networking fundamentals.
Scripting experience with Python, Bash, or PowerShell.
Preferred Qualifications
Experience in healthcare or other regulated environments.
Knowledge of site reliability engineering principles, including SLIs, SLOs, and error budgets.
Familiarity with DevSecOps practices and compliance requirements.
Experience supporting large-scale distributed systems.
Education
Bachelor's degree or equivalent experience.
Anticipated Weekly Hours
40
Time Type
Full time
Pay Range
The typical pay range for this role is:
$92,700.00 - $203,940.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above.
Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve, and we are committed to fostering a workplace where every colleague feels valued and feels that they belong.
Great benefits for great people
We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families.
Additional details about available benefits are provided during the application process and on Benefits Moments.
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.
Aplyr's read
CVS Health is a healthcare giant blending retail pharmacy with insurance services, ideal for those interested in diverse healthcare roles and innovation.
What's promising
- CVS Health's integration of pharmacy and insurance offers diverse career paths.
- Strong focus on healthcare innovation with initiatives like HealthHUB locations.
- Extensive national presence provides job stability and opportunities for relocation.
What to watch
- Recent layoffs in certain divisions raise concerns about job security.
- High-pressure retail environment may lead to employee burnout.
- Complex organizational structure can slow decision-making processes.
Why CVS Health
- CVS Health's acquisition of Aetna uniquely positions it in both retail and insurance sectors.
- HealthHUB stores offer a distinctive model combining retail and healthcare services.
- CVS Caremark provides a robust platform for pharmacy benefits management.
Aplyr’s read is generated by AI from public sources.
About CVS Health
CVS Health is a healthcare company that provides a range of services including pharmacy benefits management, retail pharmacy, and health insurance services.