Site Reliability Engineer / Platform Engineer
DevRev
Job Description
About DevRev
At DevRev, we're building the future of work with Computer – your AI teammate. Unlike traditional tools, Computer unifies all your data sources, tools, and workflows into a single AI-ready platform, giving employees real-time insights, proactive suggestions, and powerful agentic actions. It extends your existing software with AI-native apps and agents that work alongside your teams and customers – updating workflows, coordinating across teams, and eliminating repetitive work. We call this Team Intelligence: human-AI collaboration that breaks down silos, brings people back together, and frees you to solve bigger problems. Backed by Khosla Ventures and Mayfield with $150M+ raised, DevRev is trusted by global companies across industries.
About the Role
We are seeking an experienced Site Reliability Engineer / Platform Engineer to join our team and help build and maintain resilient, scalable infrastructure that supports our applications across multiple cloud providers. In this role, you will design and implement infrastructure solutions, automate operational processes, and work closely with development teams to ensure reliable, efficient systems that scale with our business.
Key Responsibilities
- Design, build, and maintain infrastructure across AWS, GCP, and Azure using Infrastructure as Code (IaC) principles
- Implement and optimize CI/CD pipelines using tools like Argo and CircleCI to enable rapid, reliable deployments
- Manage and scale Kubernetes clusters in production environments, ensuring high availability and optimal resource utilization
- Administer and optimize cloud databases including MongoDB, Redis, RDS, and other data stores for performance and reliability
- Develop monitoring, alerting, and observability solutions to identify and resolve issues before they impact users
- Automate routine operational tasks to reduce manual toil and improve system reliability
- Conduct incident response and post-mortem analysis to drive continuous improvement
- Collaborate with development teams to design systems with reliability, scalability, and operational excellence in mind
- Document infrastructure architecture, runbooks, and operational procedures
- Evaluate and implement new tools and technologies to improve platform capabilities
Required Qualifications
- 1-3 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering
- Strong hands-on experience with at least two major cloud providers (AWS, GCP, Azure)
- Proficiency with Kubernetes for container orchestration and management
- Demonstrated expertise with IaC tools (Terraform, CloudFormation, Pulumi, or similar)
- Experience with CI/CD platforms, particularly Argo and/or CircleCI