Staff Site Reliability Engineer, Platform

Kentik

Compensation

$165,000 - $200,000/year

Remote – United States
Posted February 13, 2026

Job Description

Who we are

Kentik is the network intelligence platform for modern infrastructure teams. Unlike traditional monitoring and observability tools, we demystify complex network operations, enabling organizations to deliver applications and innovation at scale. Built by network experts to make critical insight accessible to every engineer, Kentik is the real-time source of truth that understands every network in context — from data center to cloud to the internet. This single platform unifies and correlates cloud, device, flow, and synthetic data to turn telemetry into action. Market leaders like Akamai, Booking.com, Dropbox, and Zoom rely on Kentik to run, manage, and optimize their networks.

What we do

Kentik is looking for an experienced software engineer to join our Infrastructure team. This team is in charge of the software stack that powers Kentik - from configuration management and orchestration, to datastores and data pipelines, developer experience and internal observability. We are an international group of collaborative, experienced developers and operations practitioners, with broad and deep knowledge of networks, systems and applications.

If you're a senior engineer looking to move to a staff+ role, this will be a great opportunity! You will get to work with the rest of engineering, as well as product management and field engineers.

What you'll do

You will work on a very broad and diverse set of problems and technologies critical to the smooth operation of Kentik, the productivity and happiness of other engineers and the growth of our Engineering organization.

  • Build self-service, declarative, and API-driven infrastructure components in Go and Node.js
  • Contribute to our internal deployment tooling (mostly Python CLI tools) and to our service orchestration platform, which is based on Envoy, Nomad, and other HashiCorp components
  • Help formulate and execute our strategy for datastores such as Postgres, Kafka, and Redis (reliability, performance, overhead, capacity planning, …)
  • Improve the reliability of our services, with code and testing improvements as well as internal advocacy and education
  • Mentor junior team members
  • Create and update technical documentation for infrastructure
  • Be on the on-call escalation path for services owned by the team

What you'll bring

Studies have shown that some candidates tend to apply to jobs only if they meet 100% of the qualifications. We encourage you to apply if you meet most of the criteria - even if you don’t match all of the qualifications, your skills and experience could be valuable in this role!

  • 8+ years of relevant experience
  • Passion for building and providing amazing tools and platforms to other engineers
  • Strong coding skills in Go or Python (alternatively: server-side JavaScript, Ruby, Java …)
  • Significant experience with data ecosystems and tools, cloud or on-prem
  • An SRE mindset and the intent to build reliable, easy-to-operate systems

Nice to haves:

  • Familiarity with Temporal (or similar workflow engines) for managing workflow execution, and experience with durable execution
  • Most of our systems run on Linux bare-metal hosts managed with Puppet, so any experience with that is a plus

Our tech stack

  • Our core data engine and platform are primarily written in Go
  • We use Node.js + Express for application serving, and React as our primary UI framework
  • We also use some JS and Python for tooling/scripting
  • In addition to our own database, we use Postgres, Kafka, MySQL, and Redis
  • Internal and public APIs expose both REST/JSON and gRPC endpoints
  • HAProxy and Envoy for API traffic routing and balancing
  • GitHub for source control, PRs, and issues
  • Jenkins for automated builds

What we offer

Kentik is a fully remote company that operates globally. We seek professionals that will help us thrive as an org
