Mid-Level

Platform Engineer II/III

Zone 5 Technologies

United States
Hybrid
Posted April 21, 2026

Job Description

At Zone 5 Technologies, we're redefining what's possible in unmanned aircraft systems (UAS). Our team of engineers and innovators is developing cutting-edge autonomous solutions that push the boundaries of UAS technology and solve complex challenges that matter.

We're building the future of UAS capabilities, and we're looking for exceptional talent to join us. If you're driven by hard problems, energized by rapid innovation, and ready to make an impact on next-generation flight systems, you belong here.

We are seeking a Platform Engineer to architect and operate scalable compute infrastructure that powers our autonomous vehicle simulation and testing framework. You will build elastic compute systems across AWS and on-premises clusters, enabling engineering teams to rapidly iterate on autonomy algorithms through massive parallel simulation workloads.

Responsibilities:

Elastic Compute Architecture

• Design and implement auto-scaling compute infrastructure for simulation workloads using cloud platforms

• Build and maintain on-premises GPU and CPU clusters for simulation and machine learning training

• Architect hybrid cloud solutions that optimize cost and performance across cloud and local compute resources

• Implement job scheduling and orchestration systems using Kubernetes for thousands of concurrent simulations

• Design storage solutions for large-scale simulation data, logs, and artifacts using cloud and local storage systems

Simulation Platform Development

• Deploy and maintain robotics simulation environments at scale

• Build CI/CD pipelines for automated simulation testing of autonomy software

• Create infrastructure for distributed parameter sweeps, Monte Carlo testing, and regression suites

• Develop monitoring and observability systems for simulation fleet health and resource utilization

• Implement data pipelines for simulation results ingestion, analysis, and visualization

Infrastructure as Code & Automation

• Write and maintain infrastructure as code for reproducible infrastructure deployment

• Build automation tools and CLI utilities to simplify developer access to compute resources

• Implement GitOps workflows for infrastructure changes and configuration management

• Create self-service interfaces for engineers to launch and manage simulation jobs

• Develop cost monitoring and optimization strategies for cloud and on-prem resources

System Operations & Reliability

• Monitor and optimize infrastructure performance, reliability, and cost efficiency

• Troubleshoot complex distributed systems issues across networking, storage, and compute layers

• Implement backup, disaster recovery, and business continuity strategies

• Maintain security best practices including IAM, secrets management, and network isolation

• Collaborate with autonomy, ML, and robotics teams to understand compute requirements and optimize workflows

Network Design & Infrastructure

• Design and implement network architectures for distributed simulation workloads across AWS and on-premises environments

• Configure VPCs, subnets, security groups, and routing for secure, high-performance compute clusters

• Establish hybrid cloud connectivity (VPN, Direct Connect, site-to-site tunnels) between on-premises and cloud resources

• Optimize network performance for large data transfers, multi-node communication, and distributed workloads

• Support internal infrastructure network design and provide technical guidance to engineering programs

• Troubleshoot network issues including latency, packet loss, and connectivity problems across distributed systems

Qualifications:

• Bachelor's in Computer Science, Software Engineering, or related technical field – equivalent industry experience also welcome

• 2-5+ years of experience in platform engineering, DevOps, SRE, or cloud infrastructure roles

• Strong hands-on experience with Kubernetes for container orchestration and workload management

• Experienc
