Staff Software Engineer, Observability

Gusto

San Francisco, CA

Hybrid

Posted March 27, 2026

Job Description

About Gusto

At Gusto, we're on a mission to grow the small business economy. We handle the hard stuff—like payroll, health insurance, 401(k)s, and HR—so owners can focus on their craft and customers. With teams in Denver, San Francisco, and New York, we’re proud to support more than 400,000 small businesses across the country, and we’re building a workplace that represents and celebrates the customers we serve. Learn more about our Total Rewards philosophy.

Staff Observability Engineer

Gusto’s Reliability Engineering team enables our product teams to build impactful products by building secure, resilient, and accessible systems, using tools like AWS, Terraform, Datadog, and Kubernetes.

Our mission is to create a world where work empowers a better life, and it starts right here at Gusto. That’s why we’re committed to building a collaborative and inclusive workplace, both physically and virtually. Learn more about our Total Rewards philosophy.

What You’ll Do

Shape the engineering organization's standards around observability.
Own and evolve the observability platform, including distributed logging, metrics, and tracing infrastructure.
Build AI-native capabilities to automatically detect anomalies, diagnose failures, and accelerate root cause analysis.
Create powerful developer experiences through dashboards, notebooks, and interactive debugging tools.
Drive reliability automation with intelligent alerting, diagnostics, and incident response systems.
Partner across engineering teams to embed observability and reliability best practices.
Mentor engineers and influence reliability culture across the organization.

What We’re Looking For

Strategic systems thinker who identifies high-impact opportunities and builds scalable solutions.
Experience operating large-scale distributed systems in production, especially logging platforms or time-series databases.
Strong fundamentals in systems, networking, and cloud infrastructure such as Kubernetes and AWS.
Thrive in ambiguous environments and roll up your sleeves to solve unscoped problems end to end.
Product mindset or full-stack instincts, and excitement about building real tools engineers love to use.
Strong communicator who can align technical and non-technical stakeholders.
Bonus if you have built or contributed to observability ecosystems such as OpenTelemetry or Prometheus.

Required Experience

Have 8+ years of relevant industry experience building and operating large-scale observability or monitoring infrastructure.
Have experience implementing or operating observability platforms such as Datadog, Sentry, Splunk, or similar.
Have strong software engineering proficiency in at least one of Ruby, Python, or TypeScript.