Staff Software Engineer, Observability

CoreWeave

Compensation

$188,000 - $250,000/year

Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA
Hybrid
Posted March 25, 2026

Job Description

CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence. Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability. Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025. Learn more at www.coreweave.com.

CoreWeave is the AI Hyperscaler™, delivering a cloud platform of cutting-edge services powering the next wave of AI. Our technology provides enterprises and leading AI labs with the most performant, efficient, and resilient solutions for accelerated computing. Since 2017, CoreWeave has operated a growing footprint of data centers covering every region of the US and across Europe. CoreWeave was ranked as one of the TIME100 most influential companies of 2024.

As the leader in the industry, we thrive in an environment where adaptability and resilience are key. Our culture offers career-defining opportunities for those who excel amid change and challenge. If you’re someone who thrives in a dynamic environment, enjoys solving complex problems, and is eager to make a significant impact, CoreWeave is the place for you. Join us, and be part of a team solving some of the most exciting challenges in the industry.  

CoreWeave powers the creation and delivery of the intelligence that drives innovation.

About the role:

We are seeking a highly experienced Staff Software Engineer to lead our efforts in building, maintaining, and optimizing highly scalable, reliable, and secure systems.

The Observability team is responsible for deploying and maintaining critical infrastructure at CoreWeave, including our logging, tracing, and metrics platforms, as well as the pipelines that feed them.

Key Responsibilities:

  • Lead and mentor engineers, fostering a culture of collaboration and continuous improvement.
  • Scale logging, tracing, and metrics platforms to support a global datacenter footprint.
  • Develop and refine monitoring and alerting to enhance system reliability.
  • Advise engineers across CoreWeave on optimal usage of Observability systems.
  • Automate interactions with CoreWeave’s Compute Infrastructure layer.
  • Manage production clusters and ensure development teams follow best practices for deployments.

Required Qualifications:

  • 7+ years of experience in Software Engineering, Site Reliability Engineering, DevOps, or a related field.
  • Deep expertise across all observability pillars using tools like ClickHouse, Elastic, Loki, VictoriaMetrics, Prometheus, Thanos, and/or Grafana.
  • Expertise in Kubernetes, containerization, and microservices architectures.
  • Proven track record of leading incident management and post-mortem analysis.
  • Excellent problem-solving, analytical, and communication skills.

Preferred Qualifications:

  • Experience running and scaling observability tools as a cloud provider.
  • Experience administering large-scale Kubernetes clusters.
  • Deep understanding of data-streaming systems.

The base salary range for this role is $188,000 to $250,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).
