About the role

Aplyr's Quick Take

This role is for a hands-on Staff ML Engineer focused on leading machine learning initiatives within the Test Engine team at Buildkite. You'll be responsible for defining the technical strategy and building models to optimize test selection based on code changes, while also mentoring other engineers.

Good fit

Ideal candidates have extensive experience in machine learning and software engineering, with a strong background in predictive modeling. A collaborative working style and the ability to influence technical direction will help you thrive in this role.

Worth noting

The position emphasizes a shift in testing strategies, moving away from traditional full test suites to a more efficient, targeted approach. This could be a unique opportunity for those interested in shaping the future of testing in CI/CD environments.

Push a one-line fix. Then watch CI grind through forty minutes of tests, ninety-five percent of which never had a chance of touching what you changed. You already know the handful that mattered. The test suite doesn't — so it runs everything, every time, just in case.

That "just in case" is the most expensive habit in software delivery. Every engineering team pays it, because the alternative — knowing which tests actually matter for a given change — has been too hard to get right.

We're building the team that gets it right. This role sits at the centre of it.

The problem worth solving

Test Engine already ingests billions of test runs. We can see the tests, the code underneath them, and how the two move together — at a scale very few people ever get to work with. The raw material for the answer is already here. Nobody's turned it into predictions yet.

That's the step to take: for a given change, work out the slice of tests most likely to fail, and run only those. Get it right and teams stop re-running what hasn't changed, and spend that time where it counts — like fixing the two percent of tests most likely to break.

It's a genuinely difficult ML problem — sparse signal, cold-start on new repos, generalising across languages and frameworks, and latency tight enough to sit in the critical path. It's also close to a blank page. There's no ML org above you setting the direction — you'd set it. And not alone: we've just hired another ML engineer, so there's someone to think out loud with from day one.

What you'll own

Machine learning in Test Engine, end-to-end — the strategy, the architecture, and the models running in production.

That means shaping the whole path: pulling features out of code changes and test history, training and evaluating models, building the serving layer that keeps predictions fast, and closing the loop so the system keeps improving. You'd make the trade-offs that matter — accuracy versus latency, what happens when confidence is low — and build the platform underneath so the next model into production is quick and repeatable, not a one-off.
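
To make the trade-off concrete, here is a minimal sketch of one way a selector could behave, not a description of how Test Engine works. It assumes a simple co-failure heuristic: score each test by how often it failed when a given file changed, run the tests above a threshold, and fall back to the full suite when a changed file has too little history to trust. Every name and parameter here (build_cofailure_counts, select_tests, threshold, min_runs) is hypothetical.

```python
from collections import defaultdict


def build_cofailure_counts(history):
    """Count, per file, how often it changed and which tests failed alongside it.

    history: iterable of (changed_files, failed_tests) pairs from past runs.
    """
    file_runs = defaultdict(int)                     # file -> number of runs that changed it
    cofail = defaultdict(lambda: defaultdict(int))   # file -> test -> failures seen together
    for changed_files, failed_tests in history:
        for f in set(changed_files):
            file_runs[f] += 1
            for t in set(failed_tests):
                cofail[f][t] += 1
    return file_runs, cofail


def select_tests(changed_files, all_tests, file_runs, cofail,
                 threshold=0.2, min_runs=3):
    """Return (tests_to_run, used_fallback).

    threshold and min_runs are illustrative knobs, not tuned values.
    """
    # Low confidence: no changed files given, or a changed file has too
    # little history. In that case run everything rather than guess.
    if not changed_files or any(file_runs.get(f, 0) < min_runs for f in changed_files):
        return sorted(all_tests), True

    # Score = highest observed failure rate of the test across the changed files.
    scores = {
        t: max(cofail[f].get(t, 0) / file_runs[f] for f in changed_files)
        for t in all_tests
    }
    ranked = sorted(all_tests, key=lambda t: (-scores[t], t))
    return [t for t in ranked if scores[t] >= threshold], False


# Toy history: a.py has three recorded changes, b.py only one.
history = [
    (["a.py"], ["test_a"]),
    (["a.py"], ["test_a"]),
    (["a.py"], []),
    (["b.py"], ["test_b"]),
]
all_tests = {"test_a", "test_b", "test_c"}
file_runs, cofail = build_cofailure_counts(history)

targeted = select_tests(["a.py"], all_tests, file_runs, cofail)
fallback = select_tests(["b.py"], all_tests, file_runs, cofail)
```

In this toy data a change to a.py selects only test_a, while a change to b.py falls back to the full suite because one prior run is below min_runs. A production system would replace the counting heuristic with a trained model and would need to hold up under the latency, cold-start, and cross-language constraints the role describes.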

✨ The person we're picturing

You've taken ML models the whole way — from rough idea to something running reliably in production, monitored and retrained, owned rather than handed off.

Two things matter more than any specific tool:

You've built ML that generalised. Not one clever model — a repeatable approach that worked across more than one use case.
You're comfortable where the signal is noisy. Classification, ranking, prediction — problems where the data doesn't hand you the answer.

Day to day you'll live in Python and SQL, on AWS, with containerised workloads and data-at-scale tooling (Spark, Flink, or similar). Experience with code analysis, CI/CD systems, or ranking problems is a real head start — a bonus, not a bar.

The one thing we won't budge on: you've shipped and owned ML in production. Prototyped and handed off doesn't count here.

Is this you?

You're likely a strong fit if you:

Get energised by a blank page and want to be the one who fills it
Care more about models working in production than papers about models
Do your best work async, with deep focus and real autonomy

This probably isn't the right role if you:

Want an established ML org around you for direction and review
Prefer research and experimentation over shipping and operating
Need close scaffolding — flat and high-autonomy means less of it

We'd rather you know that now than three interviews in.

Why Buildkite

Frontier work. CI/CD is becoming the next bottleneck in the AI era, and Buildkite is built for that moment.
Real scale. The world's leading engineering teams ship software to over a billion daily users through Buildkite. Your models sit in their critical path.
Ownership. ~150 people, flat structure, and you're the most senior ML person here — influence you don't get where the ML org is three layers deep.
Remote, properly. We've worked this way since 2013 — async, built for deep focus, with genuine overlap across ANZ and US-Pacific.

What happens next

Every application gets a response. If this is the problem you've been wanting to get your hands on, apply now, or reach out with questions first.

Job location

Our Engineering teams are based in the ANZ/PST region. This is a conscious decision, as it allows us to move quickly and minimise fully async work. So whilst Buildkite is a fully remote company, this doesn't mean that we hire in every location. Please be aware that if you're applying from outside of this region, we unfortunately aren't in a position to hire you. Currently Buildkite is not in a position to offer sponsorship.

Equal Opportunity Employer

At Buildkite, we value diversity and celebrate all types of skills, backgrounds, and experiences. We’re dedicated to fostering an inclusive environment and providing reasonable accommodations throughout our recruitment process.

If you need any accommodations or support during the application or interview process, please reach out to us at accommodations@buildkite.com.

Skills & Tags

python aws machine learning ai data product

Aplyr's read

Buildkite empowers software teams with robust CI/CD tools, attracting tech-savvy professionals who thrive on innovation and infrastructure autonomy.
Synthesized from recent postings & public sources

What's promising

• Buildkite allows developers to run builds on their own infrastructure, offering flexibility and control.
• The platform enhances productivity by automating development workflows, reducing manual intervention.
• Buildkite supports cloud integration, enabling scalable and efficient development processes.

What to watch

• Limited public information about Buildkite's financial performance and stability.
• Potentially steep learning curve for new users unfamiliar with CI/CD concepts.
• Dependence on user infrastructure may pose challenges for smaller teams lacking resources.

Why Buildkite

• Buildkite's hybrid model combines local infrastructure control with cloud capabilities.
• Focus on developer autonomy differentiates Buildkite from fully cloud-based CI/CD solutions.
• Strong emphasis on community engagement through roles like Senior Community Engineer.

Aplyr’s read is generated by AI from public sources.

About Buildkite

buildkite.com

Buildkite is a leading platform for continuous integration and continuous delivery (CI/CD) that enables software teams to automate their development workflows. By allowing developers to run builds on their own infrastructure while leveraging the power of the cloud, Buildkite enhances productivity and collaboration across teams.