Senior Site Reliability Engineer
Mastercard
Job Description
Our Purpose
Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we’re helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.
Title and Summary
Senior Site Reliability Engineer
Who is Mastercard?
We work to connect and power an inclusive, digital economy that benefits everyone, everywhere, by making transactions safe, simple, smart, and accessible. Using secure data and networks, partnerships, and passion, our innovations and solutions help individuals, financial institutions, governments, and businesses realize their greatest potential. Our decency quotient, or DQ, drives our culture and everything we do inside and outside of our company. We cultivate a culture of inclusion for all employees that respects their individual strengths, views, and experiences. We believe that our differences enable us to be a better team, one that makes better decisions, drives innovation, and delivers better business results.
About the Role
The Authorization Platform refers to both hardware (the physical communications lines of the Mastercard Network and Mastercard interface processors [MIPs]) and software (the Mastercard Network authorization application).
As a Senior Engineer in BizOps, you will play a pivotal role in advancing Mastercard’s authorization observability and operational intelligence. This role focuses on engineering solutions that enhance platform instrumentation, cross-team integration, and real-time alerting across multi-platform environments.
Your responsibilities include building and maintaining impact and blast-radius reporting in partnership with Incident Manager teams to enable rapid delivery of metrics for incidents affecting Transaction Switching Authorization. You’ll also engineer internal SLA/SLO monitoring, dashboarding, and reporting systems, and support customer-facing data enablement for latency and decline inquiries.
You’ll collaborate with BizOps Squad Leads to define and deliver release observability requirements, ensuring robust logging and monitoring for new and enhanced components. You’ll also support SLA/SLO documentation and cross-team reviews to identify and implement improvements.
In addition, you’ll develop custom data solutions for operational authorization needs, including dashboards, alerts, and reporting for BizOps and engineering teams. You’ll also contribute to next-gen data enablement using AI/ML and Mastercard platforms like AI Ops to improve incident detection and prevention.
Finally, you’ll support specialized initiatives to improve the authorization experience across Transaction Switching and Auth-connected programs, addressing customer issues, configuration errors, and data mismatches.
Own application health, performance, and capacity as primary point of contact.
Define monitoring, alerting, and reliability strategies in partnership with development teams to ensure zero/low downtime.
Build dashboards, reports, and data-driven alerts (Splunk/Power BI) for real-time insights.
Design and maintain ETL processes and data pipelines for monitoring and analytics.
Perform incident management, root cause analysis, and drive reliability improvements (SLOs, post-mortems).
Support CI/CD pipelines and promote DevOps best practices and automation.
Optimize system performance, scalability, and resilience across environments.
Automate workflows and reduce manual effort using Python and tooling.
Manage and monitor cloud-based infrastructure (AWS/Azure).
Analyze operational/ITSM data to identify gaps and improve system reliability.
All About You
Strong hands-on experience with Splunk, including building interactive dashboards, creating reports, configuring alerts, and administering Splunk environments (data inputs, indexing, user roles, and access controls).
Proven ability to design and optimize Splunk searches (SPL) for performance, scalability, and actionable insights across business and operational data.
Experience integrating and managing Splunk add-ons, forwarders, and data ingestion pipelines for structured and unstructured data sources.
Solid understanding of DevOps principles, including CI/CD pipelines, infrastructure as code, and configuration management to support reliable and scalable observability systems.
Hands-on experience with Python and relevant libraries for scripting, automation, and data processing within monitoring and analytics workflows.
Strong troubleshooting and system performance analysis skills, with the ability to identify bottlenecks and optimize system reliability using observability tools like Splunk.
Ability to analyze large datasets to drive business decisions; familiarity with predictive analytics and modeling is a plus.
Experience with data visualization tools (including Splunk dashboards and platforms like Power BI) to present insights clearly to stakeholders.
Understanding of ETL processes and experience building or maintaining data pipelines for ingesting, transforming, and analyzing operational data.
Exposure to cloud platforms such as AWS or Azure, including deploying and managing monitoring/analytics solutions in cloud environments.
Strong communication skills, ownership mindset, and ability to work both independently and collaboratively in cross-functional teams.
Growth mindset with an eagerness to explore new tools and technologies, including AI/ML applications to enhance automation, anomaly detection, and system intelligence.
Nice-to-Have (Differentiators)
Background in data engineering, including pipeline design, data modeling, and working with large-scale distributed data systems.
Experience applying AI/ML techniques within Splunk (e.g., anomaly detection, forecasting, or Splunk MLTK).
Corporate Security Responsibility
All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization and, therefore, it is expected that every person working for, or on behalf of, Mastercard is responsible for information security and must:
Abide by Mastercard’s security policies and practices;
Ensure the confidentiality and integrity of the information being accessed;
Report any suspected information security violation or breach; and
Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.