Senior Site Reliability Engineer

Duetto

United States

On-site

Posted March 31, 2026

Job Description

1. About the Company

Duetto, the industry-leading hospitality revenue management system, helps hotels, resorts, and casinos optimize revenue and boost profit. Our SaaS platform, expanding suite of products, and skilled team have been at the heart of our continued success, and we have ambitious plans for future growth.

Duetto is building the future of hotel revenue strategy. We’re not just another SaaS company — we’re redefining what’s possible for hotels through our category-creating platform, the Revenue & Profit Operating System.

2. Role Summary / Purpose

We are seeking a highly experienced Senior Site Reliability Engineer (SRE) to join our dynamic team. The ideal candidate will have a proven track record of designing, implementing, and maintaining scalable, secure, and highly reliable systems. As a key contributor, you will collaborate with cross-functional teams to drive architecture decisions, implement best practices, and ensure high system availability.

Our technology stack is built on AWS and primarily consists of:

Java
Python
NoSQL
Single-page JavaScript web techniques (jQuery, Backbone, React, and RequireJS)
Patent-pending analytical methods on top of MongoDB
Postgres
Terraform/Terragrunt and Chef for IaC
Datadog and Prometheus
GitHub for source control
GitHub Actions and Jenkins for CI/CD

3. Key Responsibilities

Architect and implement infrastructure solutions to facilitate seamless migration of critical systems while ensuring uptime, reliability, and a high-quality experience for end users.
Design, develop, test, and maintain tools and processes to efficiently manage and operate SaaS products hosted on AWS, with a focus on scalability and automation.
Partner with developers to enhance the reliability, performance, scalability, and security of server and application architectures.
Build and maintain critical components of our infrastructure, emphasizing robustness, security, and high availability to meet demanding service-level expectations.
Foster strong cross-team collaboration by driving engagement, promoting shared goals, and ensuring alignment across technical and non-technical teams.
Lead efforts to ensure systems are secure by default, addressing vulnerabilities proactively and implementing best practices for cybersecurity preparedness.
Be willing to learn and adopt AI in DevOps/SRE workflows.
Be the last line of support for services that thousands of customers (hotels, resorts, casinos, etc.) around the world depend on 24/7.
Troubleshoot on-call incidents to ensure rapid resolution and minimal service disruption. Participate in detailed Root Cause Analysis (RCA) to identify underlying issues and work cross-functionally to implement preventative measures and long-term solutions, ensuring similar problems are avoided in the future.

4. Qualifications

Required:

5+ years of experience in an Ops, DevOps or SRE role.
Experience in System Design and Architecture.
Engineer-level experience with networking and security concepts.
Understanding of the fundamentals behind load-balancing technologies. Experience configuring Layer 7 load balancing is a plus.
Experience collaborating with engineers on architecture decisions.
Experience administering cloud computing services such as AWS (preferred), Azure, or GCP, including working knowledge of permissions structures, multi-account management structures, and single sign-on (SSO).
Experience with AWS ecosystem tools such as AWS IAM, VPC, EC2, ELB, RDS, S3, Lambda, API Gateway, Secrets Manager, KMS, CloudWatch, CloudTrail.
Experience with security compliance certifications such as SOC 2.
Experience working in an environment with a heavy emphasis on a DevOps and service reliability mindset.
Experience provisioning, configuring, administering, and using enterprise monitoring ecosystems such as Prometheus, Grafana, Datadog, or similar.
Experience with CI/CD tools.