Staff, Backend Engineer - Catalog at Acryl Data

About the role

Aplyr's Quick Take

This role is for a Staff Backend Engineer focused on developing the core framework for DataHub, which involves building scalable systems for metadata ingestion and creating APIs for data connectivity. It's an individual contributor position that requires deep technical expertise in distributed systems and Python.

Good fit

Ideal candidates will have 8+ years of experience in backend engineering, particularly in building production-grade distributed systems. A strong background in Python and API design is essential, along with a proactive problem-solving approach.

Worth noting

The role directly impacts AI systems at scale, which could be a unique opportunity for those looking to influence how data and AI interact in enterprise settings. The company is also heavily involved in the open-source community, which may appeal to those who value collaboration and innovation.

DataHub is an AI & Data Context Platform adopted by over 3,000 enterprises, including Apple, CVS Health, Netflix, and Visa. Developed jointly with a thriving open-source community of 13,000+ members, DataHub's metadata graph provides in-depth context of AI and data assets with best-in-class scalability and extensibility.

The company's enterprise SaaS offering, DataHub Cloud, delivers a fully managed solution with AI-powered discovery, observability, and governance capabilities. Organizations rely on DataHub solutions to accelerate time-to-value from their data investments, ensure AI system reliability, and implement unified governance, enabling AI & data to work together and bring order to data chaos.

The Challenge

As AI and data products become business-critical, enterprises face a metadata crisis:

No unified way to track the complex data supply chain feeding AI systems
Engineering teams struggling with data discovery, lineage, and governance
Organizations needing machine-scale metadata management, not just human-browsable catalogs

Why This Matters

This is where infrastructure meets impact. The metadata layer you'll build will directly power the next generation of AI systems at massive scale. Your code will determine how safely and effectively thousands of organizations deploy AI, affecting millions of users worldwide.

The Role

We're looking for an exceptional Staff Backend Engineer to lead development of DataHub's Platform framework – the core that connects diverse data systems and powers our metadata collection capabilities.

You'll Build

Scalable, fault-tolerant ingestion systems for enterprise-scale metadata
Clean, intuitive APIs for our connector ecosystem
Event-driven architectures for real-time metadata processing
Schema mapping between diverse systems and DataHub's unified model
Versioning systems for AI assets (training data, model weights, embeddings)

You Have

8+ years building production-grade distributed systems
Advanced Python and API design expertise
Experience with high-scale data processing or integration frameworks
Strong systems knowledge and distributed architecture experience
Proven track record solving complex technical challenges
Built and maintained online applications serving live traffic at scale (100+ QPS)
Set up monitoring and alerting for services
Designed indexing, storage, and data architectures to make large-scale data accessible to online services
Designed and scaled distributed systems
Hands-on experience developing in a tight loop with LLMs and applying best practices for scalable LLM development

Languages

One of Java/Scala/Kotlin/C#/Go - very strong nice-to-have / borderline must-have
Python/TypeScript/Node.js - nice-to-have

Technical Skills

AWS
Kubernetes/Docker
CI/CD deployment pipelines
Microservice Architecture

Bonus Points

Experience with DataHub or similar metadata/ETL frameworks (Airflow, Airbyte, dbt)
Open-source contributions
Experience building and maintaining services that make calls to LLMs in order to serve live traffic
Experience fine-tuning LLM-powered applications exposed to end users
Early-stage startup experience

Location and Compensation

Bay Area (hybrid, 3 days in Palo Alto office)

Salary Range: $225,000 to $300,000

Benefits and Perks

We invest in people so they can do their best work and enjoy doing it. Our benefits reflect the way we build: practical, thoughtful, and designed to support long-term growth.

Competitive compensation

We offer salaries that reflect your skills, experience, and the impact you make. You bring value—we make sure you're recognized for it.

Equity for everyone

Every team member receives an ownership stake in the company. When we grow, you grow with us.

Remote Work

All roles are remote unless otherwise specified in the job description. Review the job description to confirm if the role you are interested in is remote or hybrid.

Location flexibility

Home office, coworking space, or something in between? We support your ideal setup. You’ll receive a monthly coworking stipend to use whenever you need a change of pace or in-person collaboration time.

Comprehensive health coverage

Your well-being matters. We cover 99% of medical, dental, and vision premiums for employees and 65% for dependents.

Flexible savings accounts

We offer FSAs to help cover planned or unexpected healthcare costs. You can also opt into a Dependent Care FSA to support family needs.

Support for every path to parenthood

Through Carrot Fertility, we provide inclusive fertility benefits and family-forming support. All U.S. employees have access, regardless of age, gender identity, or family structure.

Time off that works for you

We trust you to take the time you need. Our unlimited PTO and sick leave policy is designed for flexibility, rest, and real life.

Why Join Us

DataHub is at a rare inflection point: we’ve achieved product-market fit, earned the trust of leading enterprises, and secured backing from top-tier investors like Bessemer Venture Partners and 8VC. The context platform market is expected to grow from $1B to $9B in the next five years—and we’re leading the way.

By joining our team, you’ll:

Tackle high-impact challenges at the heart of enterprise AI infrastructure
Ship production systems that power real-world use cases at global scale
Collaborate with a high-caliber team of builders who’ve scaled some of the most influential data tools in the world
Build the next generation of AI-native data systems, including conversational agents, intelligent classification, automated governance, and more

If you're passionate about technology, enjoy working with customers, and want to be part of a fast-growing company changing the industry, we want to hear from you!

Skills & Tags

node python java typescript go rust aws kubernetes

Aplyr's read

Acryl Data is a dynamic player in data management, attracting professionals passionate about improving data accessibility and usability in organizations.
Synthesized from recent postings & public sources

What's promising

• Acryl Data is at the forefront of simplifying data accessibility, a crucial need for modern businesses.
• The company offers roles that focus on both technical and customer-facing skills, providing diverse career opportunities.
• Recent hires indicate a commitment to expanding both technical capabilities and customer support functions.

What to watch

• The niche focus on data management may limit opportunities for those seeking broader tech industry roles.
• As a relatively small player, Acryl Data might face challenges competing with larger, established data firms.
• Limited public information about the company's financial health and long-term stability.

Why Acryl Data

• Acryl Data's platform specifically targets the simplification of data usability, setting it apart from generic data services.
• The company emphasizes a balance between engineering and customer success, reflecting a user-centric approach.
• Acryl Data's focus on catalog and site reliability engineering highlights its commitment to robust data infrastructure.

Aplyr’s read is generated by AI from public sources.