About the role
Build the data infrastructure for robots operating in the real world.
Robotics is moving from research labs into production across factories, warehouses, vehicles, and field deployments. When robots fail, behave unexpectedly, or need to be improved, engineers rely on data to understand what actually happened.
At Foxglove, we build the observability, visualization, and data infrastructure that makes that possible. Our tools are used by robotics and autonomous systems teams to ingest, store, query, replay, and analyze massive volumes of multimodal sensor data from live systems and from production fleets.
We're looking for an ML Platform Engineer with deep infrastructure instincts to help design, deploy, and scale the systems that power Foxglove's data platform. This is a platform-first role: you'll own the infrastructure layer that makes ML possible in production, not just the models that run on top of it.
You'll be responsible for the reliability, scalability, and performance of the ML platform itself, from inference serving and pipeline orchestration to training infrastructure and evaluation frameworks. The problems are real and urgent: petabyte-scale multimodal robotics data, high-throughput retrieval and embedding pipelines, and the internal ML flywheel that lets our team ship fast. This is a hands-on infrastructure role, not research.
Key Responsibilities
Design, deploy, and operate production inference infrastructure — including model serving, autoscaling, load balancing, and cost optimization across cloud environments
Own the platform architecture for embedding and retrieval pipelines that power semantic search over multimodal robotics data (image, video, point cloud, and timeseries)
Build and maintain the training and evaluation infrastructure that enables rapid iteration on model performance — including job orchestration, experiment tracking, and dataset versioning
Drive cloud infrastructure decisions (AWS/GCP) that directly impact latency, throughput, reliability, and cost at scale
Define platform abstractions and internal tooling that let product engineers ship ML-powered features without needing to manage infrastructure themselves
Evaluate, integrate, and operationalize third-party ML infrastructure components; establish clear build vs. buy frameworks for the team
What We're Looking For
Deep, hands-on experience owning production ML infrastructure: inference serving (e.g., vLLM, Triton, TorchServe), model optimization, orchestration, and cloud cost management
Strong foundation in distributed systems and cloud infrastructure (AWS/GCP) — you think in terms of system reliability, failure modes, and operational burden, not just model accuracy
Experience architecting and operating retrieval systems at scale, including vector databases (e.g., Pinecone, Lance, turbopuffer, pgvector) and embedding pipelines over large, heterogeneous datasets
A platform engineer's mindset: you build systems that other engineers depend on, and you take that responsibility seriously
Proven ability to operate with high ownership — you can make hard infrastructure tradeoffs independently and move fast without breaking things
Strong communication skills; you can explain infrastructure tradeoffs clearly to both ML and non-ML engineers
Bonus Points
Familiarity with fine-tuning and domain adaptation techniques for LLMs or embedding models (e.g., SFT, PEFT)
Familiarity with data mining or hybrid search workflows, especially as applied in robotics, autonomous vehicles, or physical AI
Prior experience building ML platforms, evaluation frameworks, or data management tooling from the ground up
What We Offer
$300 monthly budget towards commuter benefits or building your personal workspace (remote only)
Competitive equity grant in a Series B company
Medical, Dental, Vision, and Term Life insurance coverage at 100% for employees and 75% for dependents
401(k) matching up to 4%
4 weeks vacation, plus holidays and winter break
All-expenses-paid company off-sites twice per year
Why Join Us
Impact: Own growth at a fast-growing, high-leverage moment for the company.
Mission: Accelerate the development of the next generation of robotics and embodied AI.
Team: Work with world-class engineers, designers, and researchers passionate about open-source and developer tools.
Ownership: Drive initiatives end-to-end, with high autonomy and visibility.
Aplyr's read
Foxglove is a niche software firm specializing in robotics tools, attracting engineers passionate about data visualization and autonomous systems.
What's promising
- Foxglove's focus on robotics tools positions it at the cutting edge of autonomous technology.
- The company offers roles that emphasize data visualization, appealing to engineers with a strong interest in graphical data representation.
- Foxglove's work in autonomous systems provides opportunities to contribute to innovative, high-impact projects.
What to watch
- Limited public information about Foxglove's market reach and financial stability.
- The niche focus may limit career growth opportunities outside robotics and autonomous systems.
- Potential candidates may face a steep learning curve due to the technical complexity of the projects.
Why Foxglove
- Foxglove specializes in tools that visualize and analyze complex robotics data, a rare focus in the software industry.
- The company's emphasis on forward-deployed engineering roles indicates a commitment to real-world application and client interaction.
- Foxglove's recruitment of specialized roles like Applied ML Engineer highlights its dedication to cutting-edge machine learning integration.
About Foxglove
Foxglove is a software company focused on building tools for robotics and autonomous systems, enabling developers to visualize and analyze data from complex systems.