Site Reliability Engineering (SRE)
Riskified
Job Description
About Us
Riskified empowers businesses to unleash ecommerce growth by taking risk off the table. Many of the world’s biggest brands and publicly traded companies selling online rely on Riskified for guaranteed protection against chargebacks, to fight fraud and policy abuse at scale, and to improve customer retention. Developed and managed by the largest team of ecommerce risk analysts, data scientists and researchers, Riskified’s AI-powered fraud and risk intelligence platform analyzes the individual behind each interaction to provide real-time decisions and robust identity-based insights. Riskified is proud to work with incredible companies in virtually all industries including Booking.com, Acer, Gucci, Lorna Jane, GoPro, and many more.
We thrive in a collaborative work setting, alongside great people, to build and enhance products that matter. Abundant opportunities to create and contribute provide us with a sense of purpose that extends beyond ourselves, leaving a lasting impact. These sentiments capture why we choose Riskified every day.
About the Role
As a Site Reliability Engineer (SRE), you’ll join a high-impact R&D team and own the infrastructure that powers Riskified’s real-time decisions. You’ll tackle challenges of architecture, scale, and reliability by designing, building, and operating cloud-native systems, and by extending the tooling that enables fast, safe delivery at scale.
At Riskified, we review millions of transactions daily and make sub-second decisions that sit at the heart of our customers’ critical business flows. We collect and analyze terabytes of textual, behavioral, social, geographical, and other data, using machine learning algorithms to power one of the world’s leading fraud-prevention platforms.
We believe great engineers and tight collaboration are the keys to scale and innovation. Our team is highly innovative and quick to adopt cutting-edge technologies (for example, EKS at scale, Karpenter, Istio, Argo CD, Graviton, Kyverno, advanced observability, and Cloudflare edge patterns).
What You'll Be Doing
- Design, build and manage our development and production cloud-native infrastructure in AWS
- Establish and evolve standards for microservices (IaC modules, Helm charts, policies)
- Build and maintain our product release workflow and continuous integration/delivery systems
- Continuously improve Riskified’s visibility into its systems and applications with advanced monitoring, metrics, and log analytics
- Understand, implement, and automate security controls, governance processes, and compliance validation
- Play a key role in product planning and execution
- Develop internal tools that remove friction and boost developer productivity
- Continuously evaluate and introduce new technologies to improve performance, reliability, and developer velocity
Qualifications
- At least 5 years of prior experience as an SRE/DevOps engineer
- Deep cloud experience (preferably AWS) and strong Infrastructure as Code skills
- Must have cloud-native proficiency, including Kubernetes in production, familiarity with a service mesh (for example, Istio or Linkerd), and provisioning tools (for example, Karpenter)
- Observability expertise with modern monitoring and log analytics
- At least 1 year of programming experience or the ability to demonstrate strong programming proficiency
- Experience with Build/Deploy/Continuous Integration tools
- Ability to prioritize tasks and work independently
- Experience in Go or Node.js (huge advantage)
- History of open-source involvement (issues, pull requests, talks, or meetups) (advantage)
Life at Riskified
We are a fast-growing and dynamic tech company with 750+ team members globally. We value collaboration and innovative thinking. We’re looking for bright, driven, and passionate people to grow with us.
Our Tel Aviv team currently works in a hybrid model of remote and in-office work. We recently moved to our new space in Tel Aviv.