Lead / Manager

Engineering Manager, Site Reliability Engineering

Nayya

Compensation

$174,000 - $210,000/year

New York, NY (Hybrid)
Posted March 16, 2026

Job Description

About Nayya

Founded in 2019, Nayya is on a mission to connect people’s most important information so they can thrive in their health and wealth. Powered by AI and advanced analytics, Nayya’s platform transforms complex benefits experiences into intuitive, seamless, and ongoing interactions that meet people’s real-world needs. As a trusted platform and partner to leading employers, benefits solutions, and HR tech providers, Nayya unlocks long-term value by helping employees live more resilient lives. Backed by strategic investors like ICONIQ, Felicis Ventures, SemperVirens, Workday Ventures, MetLife Nextgen Ventures, and ADP Ventures, Nayya is ushering in the future of health and wealth for all.

 

Job Summary

We are looking for a passionate and driven Engineering Manager, Site Reliability Engineering to lead our SRE team at Nayya. In this role, you will combine strong technical expertise with people leadership to build a high-performing team that ensures the reliability, scalability, and performance of our platform. This is a hands-on leadership role where you will actively contribute to design, write code, and engage directly in incident response alongside your team.

As an Engineering Manager at a fast-paced, growth-stage startup, you will be a key partner to engineering, product, and data leadership, setting the technical direction for infrastructure and operations while developing the people and processes that make it all work. We are seeking a leader who thrives in an environment that prioritizes impatience, excellence, resilience, and courage: someone who is excited about leading teams that make an immediate impact while pushing the boundaries of what’s possible.

You will own the roadmap for reliability and infrastructure, drive strategic decisions, and foster a culture of collaboration, continuous improvement, and technical excellence across the organization.

Key Responsibilities

People Leadership & Team Building

  • Build and lead a high-performing SRE team by hiring, onboarding, and retaining top engineering talent.
  • Provide regular coaching, mentorship, and career development support to direct reports, helping engineers grow into senior technical and leadership roles.
  • Conduct meaningful performance reviews, set clear goals, and create individual development plans aligned with team and company objectives.
  • Foster a team culture rooted in ownership, psychological safety, collaboration, and continuous learning.

Technical Strategy & Execution

  • Define and drive the SRE roadmap in partnership with engineering, product, and data leadership, ensuring alignment with business priorities. 
  • Directly contribute to the design and implementation of highly available systems while guiding the team's technical approach.
  • Establish and evolve standards for infrastructure as code, observability, CI/CD, incident management, and performance tuning.
  • Partner with software engineering teams to embed reliability practices into the software development lifecycle, including SLIs, SLOs, and error budgets.

Cross-Functional Collaboration

  • Serve as the primary point of contact for SRE across the organization, translating technical reliability concepts into business impact for non-technical stakeholders.
  • Collaborate with product, software engineering, and data teams to define and implement best practices for reliability, performance, and scalability.
  • Represent the SRE team in planning and prioritization discussions, advocating for infrastructure investments and AI enablement.

Operational Excellence

  • Own and continuously improve incident management processes, including on-call rotations, escalation procedures, and blameless postmortems.
  • Balance rapid delivery with system stability, ensuring reliable deployment pipelines and minimal downtime.
  • Drive a data-informed approach to reliability by establishing and tracking key metrics, SLIs, and error budgets.
  • Adapt quickly to evolving business needs and emerging technologies