Engineering Manager, AI & Data Infrastructure

Decagon

San Francisco
On-site
Posted April 23, 2026

Job Description

About Decagon

Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.

Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.

We’re building a future where customer experiences shift from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.

We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.

About the Team

The Infrastructure team builds and operates the foundations that power Decagon: platform, model inference, compute, data, and developer experience. We partner closely with product, research, and applied AI teams to deliver high-scale, low-latency systems with clear SLOs and great developer ergonomics.

We organize around two focus areas:

  • Platform: The foundational cloud stack — networking, compute, storage, security, and infrastructure-as-code — ensuring reliability, scale, and cost efficiency. CI/CD, paved paths, and core services that make shipping fast, safe, and consistent across teams.

  • ML & Data: Streaming/batch data platforms powering analytics/BI and customer-facing telemetry, including for customer-managed and on-prem environments. Realtime databases that enable low-latency agents. GPU and model-serving platforms for LLM inference with multi-provider routing.

Our mission is to deliver magical support experiences — AI agents working alongside humans to resolve issues quickly and accurately.

About the Role

We're looking for a hands-on Engineering Manager to lead the AI & Data Infrastructure team. This is a deeply technical player/coach role that sits at the core of how Decagon's agents think, respond, and learn. You'll lead the team responsible for the data and inference systems that every agent interaction depends on — from the streaming and batch pipelines that power analytics and customer-facing telemetry, to the realtime databases that back low-latency agent behavior, to the GPU and model-serving platforms that route LLM inference across multiple providers.

You'll stay close to the code and systems — reviewing designs, participating in incident response, and contributing directly when it helps the team move faster. You'll also lead by example on AI-assisted engineering, setting the standard for how the team uses AI coding tools to ship higher-quality work more quickly.

You'll hire and develop a high-performing team while partnering closely with Research, Product Engineering, Platform, and customer-facing teams to make shipping fast and safe — across our primary cloud as well as the single-tenant and on-prem environments we operate for regulated enterprise customers. Success requires strong people leadership, crisp execution across concurrent enterprise and research commitments, and the technical depth to make sound architectural calls under real constraints.

In this role, you will

  • Build, lead, and develop a high-performing team of data and ML infrastructure engineers, including hiring, coaching, and performance management.

  • Own the technical strategy and roadmap for Decagon's AI & Data Infrastructure — streaming/batch data, realtime databases, and the GPU and model-serving stack powering LLM inference.

  • Stay hands-on: review designs and PRs with depth, lead architecture for hard problems, and contribute code when the team needs it.

  • Drive architecture for high-throughput data systems and low-latency inference, including multi-provider LLM routing and CDC pipelines at scale.

Go · AWS · GCP · Azure · Kubernetes · AI · Data · Analytics · Product · Design