Senior
Senior Technical Program Manager, Infrastructure Reliability and Quality (IRQ)
Amazon Data Services, Inc.
Herndon, VA, USA
On-site
Posted April 20, 2026
Job Description
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.
You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
The Data Center Infrastructure Reliability and Quality (IRQ) Team owns the quality and reliability of critical infrastructure equipment across the equipment lifecycle. This includes leading Design for Reliability (DFR) and Design for Quality (DFQ) efforts for AWS Infra New Product Development (NPD), and supporting and sustaining the existing AWS critical infra equipment fleet by identifying systemic equipment issues and driving root cause analysis (RCA) and corrective action to mitigate risk in the AWS fleet.
As a Senior Technical Program Manager for Quality and Reliability for Mechanical and Power Generation Products, you should be an exceptionally strong communicator, both in writing and verbally. You will lead multidisciplinary, highly technical program teams. You should have experience driving quality and reliability initiatives in a complex engineering environment and have worked as a technical project manager on increasingly complex projects. Your experience includes data center infrastructure technologies such as HVAC, power distribution systems, security devices, and controls. You will generate and maintain enhanced reporting, meaningful KPIs, and process/automation improvements to ensure team efficiency and visibility of the portfolio efforts to critical stakeholders and leadership.
You will partner regularly with other engineering teams, the purchasing team, and project execution/delivery teams. You will make strategic decisions regarding specific projects and overall program direction. You will be capable of connecting long-term strategies with the rapid growth patterns of AWS, as well as guiding specific operational needs of the business.
You must be adept at identifying and communicating upcoming risks, issues, and bottlenecks, and be instrumental in resolving those issues, often across multiple departmental boundaries. You must possess strong organization, communication, and team-building skills that foster robust working relationships with both internal and external stakeholders.
Key job responsibilities
This TPM will lead end-to-end quality and reliability programs for mission-critical mechanical and power generation products at AWS data centers — orchestrating complex product strategies, owning program-level metrics and executive reporting, and building compelling cases for internal and external leadership where both speed and quality are non-negotiable.
- Own and execute end-to-end quality and reliability qualification and sustaining programs for highly complex, mission-critical mechanical and power generation products
- Learn and understand the AWS data center lifecycle specific to Data Center Engineering, globally
- Project manage the IRQ Engineering NPD portfolio across scope, schedule, budget, resources, quality, risk, and reporting
- Interface with Quality and Reliability Engineering teams to promote and standardize work-product inputs/outputs globally
- Sync with TPMs and Product Managers to align reliability and quality deliverables to major NPD program milestones
- Develop technical reliability and quality evaluation plans for mechanical and power generation product portfolios with engineering teams
- Work with external partners (reliability labs, OEMs, CMFs) to facilitate testing, qualification, and analysis
- Own product reliability and quality readiness at each NPD gate
- Own supplier readiness for NPDs
- Estimate and manage budget requirements per project; work with NPD teams to allocate resources
- Manage project timelines and report progress bi-weekly to executive leadership and key partners
- Build and present program-level metrics and status to executives on a regular cadence
- Develop business cases to align internal and external leadership on the most effective path for quality and reliability program management
- Drive internal team members and remo