Applied AI Engineer, Inference
Weights & Biases
Compensation
$165,000 - $242,000/year
Job Description
What You'll Do
Description of the team:
The Inference team is responsible for delivering high-performance model serving capabilities that meet the needs of real production workloads. We work at the intersection of model behavior, serving systems, hardware, and customer requirements to improve throughput, latency, reliability, and quality across our inference stack.
About the role:
We are looking for an Applied AI Engineer to help us understand, measure, and improve the real-world performance of our inference platform. In the near term, this role will focus on building and running rigorous benchmarks, profiling model and system behavior, identifying bottlenecks, and driving targeted optimizations for both platform-wide and customer-specific workloads. This role is intentionally scoped around applied performance work in support of the Inference organization. Initial responsibilities center on benchmarking, optimization, and workload-driven research rather than broad ownership of frontier model research agendas. Over time, the scope of the role is expected to broaden as the team and product mature.
- Build and maintain benchmarking workflows that measure latency, throughput, quality regressions, and cost across priority models and serving configurations.
- Benchmark our inference stack against realistic customer workloads and external provider baselines to identify performance gaps and improvement opportunities.
- Prof