Staff Site Reliability Engineer

Blink Health

Remote
Posted March 9, 2026

Job Description

Company Overview:

Blink Health is the fastest-growing healthcare technology company that builds products to make prescriptions accessible and affordable to everybody. Our two primary products – BlinkRx and Quick Save – remove traditional roadblocks within the current prescription supply chain, resulting in better access to critical medications and improved health outcomes for patients.

BlinkRx is the world’s first pharma-to-patient cloud that offers a digital concierge service for patients who are prescribed branded medications. Patients benefit from transparent low prices, free home delivery, and world-class support on this first-of-its-kind centralized platform. With BlinkRx, never again will a patient show up at the pharmacy only to discover that they can’t afford their medication, their doctor needs to fill out a form for them, or the pharmacy doesn’t have the medication in stock. 

We are a highly collaborative team of builders and operators who invent new ways of working in an industry that historically has resisted innovation. Join us!

Responsibilities

  • Establish and evolve SRE best practices across the organization, including reliability principles, error budgets, incident response, postmortems, and operational readiness standards.

  • Define and drive observability strategy for system health, performance, and reliability, including SLIs/SLOs, alerting quality, dashboards, and service health indicators.

  • Design and implement software-driven solutions within the infrastructure domain, automating manual processes and eliminating operational complexity and toil.

  • Act as a technical leader and force multiplier, helping set priorities and influencing decision-making across core cloud infrastructure, reliability tooling, and platform architecture.

  • Take ownership of large, ambiguous initiatives, driving them from concept to delivery while aligning stakeholders across engineering, security, and product.

  • Combine deep knowledge of software development, infrastructure, and security to improve platform resilience, scalability, performance, and compliance.

  • Proactively identify systemic risks and reliability gaps, recommending and leading platform upgrades and architectural improvements before they become incidents.

  • Partner with engineering teams to improve developer workflows, tooling, and operational maturity, increasing productivity while reducing cognitive load.

  • Provide technical mentorship, architecture guidance, and high-quality design and code reviews for engineers across infrastructure and product teams.

  • Lead by example in documentation and knowledge sharing, ensuring systems and processes are well understood and not dependent on individual ownership.

  • Participate in and help mature incident response, escalation practices, and post-incident learning across the organization.

Desired Experience

  • Bachelor’s or Master’s degree in Computer Science or equivalent practical experience.

  • 7+ years of experience in site reliability engineering, infrastructure engineering, or platform engineering roles, with demonstrated impact at scale.

Reliability & Troubleshooting

  • Expert-level, methodical troubleshooting across the entire stack, from application to kernel to network.

  • Strong command-line proficiency and deep expertise in Linux systems and operating system fundamentals.

  • Advanced understanding of networking concepts including load balancing, proxies, DNS, TCP/IP, NAT, and service-to-service communication.