Machine Learning Engineer
CZ Biohub
Job Description
Biohub is the first large-scale initiative bringing frontier AI models, massive compute, and frontier experimental capabilities under one roof. We're building a general-purpose system to accelerate scientific discovery, integrating frontier AI models, biological foundation models, and lab capabilities, with the ultimate goal of curing disease. Our technology powers scientists around the world, translating AI capabilities into tools that accelerate research everywhere.
Biohub operates one of the largest AI compute clusters dedicated to biology, spanning three frontier research institutes with some of the world's leading biologists. We're not a startup trying to find product-market fit, and we're not a pharma company optimizing a pipeline. We're building frontier AI for fundamental science, as open science, at a scale no one else is attempting. This is a unique moment for scientific acceleration. These are among the hardest and most impactful problems you can choose to work on, and we move at a pace that meets this moment.
Our research spans:
- Frontier molecular modeling, from protein language models (e.g., ESM) to structure prediction (e.g., ESMFold) and beyond
- Scaled biological foundation models trained on some of the largest GPU clusters dedicated to science
- Imaging foundation models trained across the world's largest microscopy datasets
- Reasoning and agentic systems that connect frontier LLMs with biological foundation models
- Mechanistic interpretability of biological foundation models: extracting new biological knowledge directly from model weights
- Scientific data at unprecedented scale: AI systems to collect, curate, and learn from some of the richest biological datasets ever assembled
Join our Team!
As an ML Engineer, you'll join some of the strongest infrastructure engineers in AI, building the systems that connect everything together. The infrastructure problems you solve directly determine what science becomes possible.
What You'll Do
- Build and maintain pre-training infrastructure across thousands of GPUs.
- Design and optimize GPU-native data loading pipelines for scientific data workloads at petabyte scale.
- Build I/O and pipeline systems for biological data unlike anything in standard AI: microscopy volumes, transcriptomics, spatial genomics.
- Define the abstractions that researchers will build on for years.
- Own the ML lifecycle: artifact tracking, fine-tuning pipelines, monitoring, and production reliability.
- Build the DevOps and tooling that make every engineer and researcher more productive.
- Deploy Biohub's technology, powering the tools scientists use worldwide.
What You'll Bring
We're looking for engineers who've built infrastructure for large-scale ML systems and are energized by problems that don't have existing solutions yet.
- Hands-on PyTorch: custom training loops, distributed training, or low-level performance work
- GPU-native data I/O and large-scale tensor formats (Zarr, HDF5, TensorStore)
- Distributed computing frameworks (Spark, Dask, Ray)
- Docker and Kubernetes
- A track record of building systems that other engineers and researchers depend on
- Experience in build