Staff Software Engineer
Wrike
Job Description
About the Role:
Wrike’s Backend Reliability (BRE) team is the backbone of our backend infrastructure and the guardian of our uptime. Our mission is to achieve and sustain 99.99% availability while building the tools, components, and safety nets that the entire engineering organization relies on. As a Senior / Staff Backend Engineer on this team, you won’t just close tickets; you’ll architect core reliability solutions that shape how Wrike scales, performs, and recovers from failure.
Your Impact:
- Design, build, and maintain critical reliability components such as HTTP rate limiters, internal DB schema migration tools, circuit breakers, and distributed Redis-based caching.
- Troubleshoot complex production issues, optimize PostgreSQL usage, and ensure our distributed systems remain performant and stable under high load.
- Lead preliminary investigations during severe production incidents: identify likely root causes, assess impact, and propose mitigation options. The owning team then implements the long-term fixes based on your findings.
- Create scalable, reusable tools and frameworks that help other engineering teams build more resilient services.
- Leverage AI-powered tools and coding agents to accelerate development, analyze architectures, and automate repetitive or error-prone tasks.
- Influence reliability best practices across engineering by sharing knowledge, reviewing designs, and setting high technical standards.
Your Qualifications:
- Strong expertise in Java/JVM and in building scalable, high-performance backend systems; openness to using other languages when appropriate.
- Solid understanding of distributed systems concepts, including high availability, CAP theorem, and fault tolerance.
- Deep experience with relational databases (PostgreSQL) and key–value / non-relational stores (Redis).
- Practical experience with containerization and cloud-native environments, including Docker and Kubernetes.
- Hands-on experience with message brokers such as RabbitMQ or Kafka.
- Ability to work independently with minimal supervision, using critical thinking to question assumptions and validate your own decisions.
- Strong written and spoken English skills suitable for collaborating in an international engineering environment.
Standout Qualities:
- Background in infrastructure engineering or Site Reliability Engineering (SRE), including infrastructure-as-code practices.
- Experience leading technical initiatives, driving cross-team projects, and mentoring other engineers while remaining an individual contributor.
- Familiarity with observability and monitoring stacks (e.g., Graylog, Zabbix, Grafana) and/or data analytics tools such as BigQuery.
- A strong interest in how complex systems fail and a track record of designing them to recover gracefully.
Team Dynamics:
You will join the Backend Reliability (BRE) team, a small, highly specialized, senior group focused solely on Wrike’s reliability. The team operates as an internal “reliability task force,” partnering closely with product and platform engineering teams across the company. You’ll collaborate with other senior engineers who value autonomy, deep technical discussions, and rigorous engineering practices. The culture is ownership-driven: you are trusted to manage your time, make architectural decisions, and drive in