About the role

Aplyr's Quick Take

This role is for an engineer building compiler infrastructure for AI workloads. You'll design and optimize systems that run AI programs efficiently across heterogeneous hardware, rather than working on traditional language compilers. It's a hands-on individual contributor position.

Good fit

Ideal candidates have a strong background in compiler design or runtime systems, with experience in AI or machine learning environments. You should be comfortable with complex technical challenges and have a collaborative working style.

Worth noting

The role centers on reasoning about execution behavior (IR transformations, scheduling, memory locality, kernel composition) and is explicitly not a traditional language compiler or backend code generation role, which may disappoint candidates expecting that kind of work. The company is working on large-scale inference across heterogeneous hardware, which may appeal to engineers interested in AI infrastructure.

About Us

Gimlet is building the next generation of AI infrastructure: large-scale AI datacenters and the orchestration platform that coordinates them.

The future of AI will require vastly more compute than exists today. But as AI workloads become more complex and new hardware architectures emerge, simply deploying more GPUs isn't enough. The challenge is making increasingly diverse compute work together.

Gimlet's platform intelligently partitions and routes workloads across heterogeneous hardware, enabling step-function improvements in performance and efficiency. Customers deploy through production-grade APIs without needing to think about hardware selection, placement, or optimization.

We work with foundation labs, hyperscalers, and AI-native companies to power production workloads at massive scale and help define the infrastructure layer for the future of AI.

About the role

Gimlet Labs is seeking a Member of Technical Staff focused on compiler infrastructure for ML execution systems, spanning IR transformations, runtime systems, kernel orchestration, scheduling, and serving optimization.

You will help build the execution stack that transforms modern AI workloads into efficient programs running across heterogeneous hardware. The work spans runtime systems, compiler infrastructure, scheduling, memory movement, kernel orchestration, and serving optimization for large-scale inference workloads.

This is not a traditional language compiler or backend code generation role. We are looking for engineers who think deeply about execution behavior: IR transformations, runtime optimization, scheduling, memory locality, kernel composition, distributed execution, and heterogeneous serving infrastructure.

https://gimletlabs.ai/blog/low-latency-spec-decode-corsair

What you will work on

Design and implement compiler and runtime pipelines for large-scale AI inference workloads
Build and evolve IR transformations, lowering passes, and execution optimizations across graph, tensor, and kernel representations
Optimize execution for latency, throughput, memory efficiency, and heterogeneous hardware utilization
Develop scheduling, partitioning, and kernel orchestration strategies across accelerators and serving runtimes
Work on execution systems spanning compiler infrastructure, runtime behavior, memory movement, and kernel dispatch
Integrate new model architectures, execution patterns, and serving optimizations into the stack
Collaborate closely with systems, runtime, and kernel engineers to ensure correctness and performance across the full execution pipeline

You may be a good fit if you have

Strong systems and performance engineering fundamentals
Experience building compiler systems, compiler-adjacent infrastructure, or execution/runtime systems
Experience implementing IR transformations, compiler passes, lowering logic, or code generation systems
Ability to reason about execution behavior, memory systems, scheduling, and hardware efficiency
Strong software engineering skills in C++ and/or Python

Strong candidates may also have

Experience with MLIR, LLVM, XLA, TVM, Triton, or similar compiler/runtime infrastructure
Experience optimizing ML inference or serving workloads
Familiarity with runtime systems, kernel dispatch, launch APIs, or memory allocators
Experience working with GPUs, AI accelerators, or heterogeneous hardware systems
Experience profiling and debugging performance-critical systems
Familiarity with scheduling, partitioning, or kernel-level optimizations

What Makes Gimlet Different

At Gimlet, you will work on infrastructure problems that span the full stack of modern AI systems. Our team operates across datacenters, networking, distributed systems, compilers, runtimes, orchestration, and performance engineering to build the foundation for the next generation of AI infrastructure.

As an early member of the team, you will have significant ownership, work alongside highly technical engineers, and help shape both the systems we build and how we scale the company.

We value people who are excited to work across domains, take ownership of meaningful problems, and build technology that enables the next generation of AI.

Skills & Tags

python go ai backend data product design

Aplyr's read

Gimlet Labs is an AI-focused company pushing the boundaries of productivity and creativity, attracting talent passionate about cutting-edge technology and innovation.
Synthesized from recent postings & public sources

What's promising

• Gimlet Labs is at the forefront of AI-driven productivity tools, promising significant industry impact.
• The company offers diverse roles in AI research, attracting top technical talent.
• Strong focus on innovation and creative solutions in AI applications.

What to watch

• High competition in the AI sector may challenge Gimlet Labs' market position.
• Rapid technological changes require constant adaptation and innovation.
• Limited public information about company culture and work-life balance.

Why Gimlet Labs

• Gimlet Labs specializes in enhancing productivity through AI, setting it apart from general AI firms.
• Focus on creativity in AI solutions differentiates its product offerings.
• Emphasis on technical roles indicates a strong commitment to research and development.

Aplyr’s read is generated by AI from public sources.