Infrastructure Engineer
Overland AI
Job Description
About Overland AI
Founded in 2022 and headquartered in Seattle, Washington, Overland AI is transforming land operations for modern defense. The company leverages over a decade of advanced research in robotics and machine learning, as well as a field-test-forward ethos, to deliver combined capabilities for unit commanders. Our OverDrive autonomy stack enables ground vehicles to navigate and operate off-road in any terrain without GPS or direct operator control. Our intuitive OverWatch C2 interface provides commanders with precise coordination capabilities essential for mission success.
Overland AI has secured funding from prominent defense tech investors including 8VC and Point72, and has built trusted partnerships with DARPA, the U.S. Army, Marine Corps, and Special Operations Command. Backed by eight-figure contracts across the Department of Defense, we are strengthening national security by iterating closely with end users engaged in tactical operations.
Role Summary
Overland AI is looking for an experienced Infrastructure Engineer to help design, build, and operate the systems that power our AI model training, experiment management, and robotic deployments. This role spans on-premise environments, cloud infrastructure, networking, and automation. You’ll work hands-on with servers, storage, firewalls, wireless equipment, and high-performance compute resources, while also developing scalable tooling that improves reliability, observability, and developer velocity.
The ideal candidate has 5+ years of experience in infrastructure engineering, DevOps, SRE, or systems engineering, with deep knowledge of on-prem environments, AWS deployments at scale, and modern infrastructure-as-code and automation practices.
What You'll Do
- Build, operate, and evolve on-premise and cloud infrastructure supporting AI/ML development and robotics programs
- Develop CI/CD pipelines using GitLab or GitHub Actions
- Deploy and manage AWS environments including IAM, EC2, VPCs, and S3
- Implement and maintain infrastructure-as-code (Terraform, Ansible, Puppet, Chef, etc.)
- Install, configure, and troubleshoot physical servers, networking equipment, and storage systems
- Support Kubernetes clusters (clusteradm, Kops, EKS) and GitOps workflows (ArgoCD, Flux, Spinnaker)
- Build custom automation and internal infrastructure tooling
- Manage observability stacks (Prometheus/Grafana, ELK, Datadog, etc.)
- Partner closely with engineering teams to ensure reliability, security, and efficient scaling
- Document systems, processes, and runbooks to support local and remote teams
Required Qualifications
Python, Go, Rust, AWS, Kubernetes, machine learning, AI, DevOps, data, design