Staff Cloud Engineer
Harness
Job Description
Harness is the AI Software Delivery Platform company, led by technologist and entrepreneur Jyoti Bansal (founder of AppDynamics, acquired by Cisco for $3.7B). Harness has raised approximately $570M in funding and is valued at $5.5B, backed by leading investors including Goldman Sachs, Menlo Ventures, IVP, Unusual Ventures, Citi Ventures, and more. As AI accelerates code creation, the real bottleneck has shifted to everything after the code – testing, deployments, application security, reliability, compliance, and cost optimization. Harness brings AI and automation to this “outer loop,” helping teams ship software faster while maintaining security and governance throughout the entire software delivery lifecycle.
Powered by Harness AI and the Software Delivery Knowledge Graph, the Harness Platform applies deep context and intelligent automation across the software delivery lifecycle with governance and policy-driven controls embedded throughout the platform.
Over the past year, Harness powered over 185M deployments, 82M builds, 18T flag evaluations, 8M security scans, 9.1B optimized tests, 3T protected API calls, and helped manage $2.8B in cloud spend — enabling customers like United Airlines, Morningstar, and Choice Hotels to accelerate releases by up to 75%, reduce cloud costs by up to 60%, and achieve 10x DevOps efficiency.
With a global team across 26 offices and 25 countries, Harness is shaping the future of AI software delivery — and we’re looking for exceptional talent to help us move even faster.
Position Summary
As a Staff Cloud Engineer at Harness, you will play a pivotal role in designing, building, and maintaining our cloud infrastructure. You will be responsible for ensuring the reliability, scalability, and performance of our systems, incorporating a blend of Cloud Engineering and Site Reliability Engineering (SRE) practices. This role requires a strong technical background, a passion for innovation, and the ability to work collaboratively in a fast-paced environment.
Key Responsibilities
Cloud Infrastructure, Distributed Systems & Platform Engineering:
- Design, build, and manage scalable, secure, and reliable cloud infrastructure using GCP, AWS, or Azure.
- Develop infrastructure-as-code using tools such as Terraform, CloudFormation, or similar.
- Lead the design and evolution of scalable, secure, multi-tenant, multi-region cloud platforms across AWS, GCP, and Azure.
- Architect and build control planes, orchestration systems, and shared platform services used across teams.
- Design and operate highly available, fault-tolerant, and self-healing distributed systems at scale.
- Define and enforce SLO-driven architectures, reliability standards, and resilience strategies.
- Drive infrastructure-as-code and platform abstractions to standardize and simplify deployments.
- Own capacity planning, scalability strategy, and performance optimization.
Observability & Operational Excellence:
- Establish and scale monitoring, logging, and alerting frameworks for proactive issue detection.
- Lead incident response, root cause analysis, and continuous reliability improvements.
- Drive system-wide performance, scalability trade-offs, and efficiency optimizations.
Site Reliability Engineering (SRE):
- Implement SRE practices to ensure the reliability, availability, a