Staff, Backend Engineer - Catalog

Acryl Data

Compensation

$225,000 - $300,000/year

Palo Alto, California, United States
Hybrid
Posted April 10, 2026

Job Description

DataHub is an AI & Data Context Platform adopted by over 3,000 enterprises, including Apple, CVS Health, Netflix, and Visa. Innovated jointly with a thriving open-source community of 13,000+ members, DataHub's metadata graph provides in-depth context of AI and data assets with best-in-class scalability and extensibility.

The company's enterprise SaaS offering, DataHub Cloud, delivers a fully managed solution with AI-powered discovery, observability, and governance capabilities. Organizations rely on DataHub solutions to accelerate time-to-value from their data investments, ensure AI system reliability, and implement unified governance, enabling AI & data to work together and bring order to data chaos.

The Challenge

As AI and data products become business-critical, enterprises face a metadata crisis:

  • No unified way to track the complex data supply chain feeding AI systems
  • Engineering teams struggling with data discovery, lineage, and governance
  • Organizations needing machine-scale metadata management, not just human-browsable catalogs

Why This Matters

This is where infrastructure meets impact. The metadata layer you'll build will directly power the next generation of AI systems at massive scale. Your code will determine how safely and effectively thousands of organizations deploy AI, affecting millions of users worldwide.

The Role

We're looking for an exceptional Staff, Backend Engineer to lead development of DataHub's Platform framework – the core that connects diverse data systems and powers our metadata collection capabilities.

You'll Build

  • Scalable, fault-tolerant ingestion systems for enterprise-scale metadata
  • Clean, intuitive APIs for our connector ecosystem
  • Event-driven architectures for real-time metadata processing
  • Schema mapping between diverse systems and DataHub's unified model
  • Versioning systems for AI assets (training data, model weights, embeddings)

You Have

  • 8+ years building production-grade distributed systems
  • Advanced Python and API design expertise
  • Experience with high-scale data processing or integration frameworks
  • Strong systems knowledge and distributed architecture experience
  • Proven track record solving complex technical challenges
  • Experience building and maintaining online applications serving live traffic at scale (100+ QPS)
  • Experience setting up monitoring and alerting for services
  • Experience designing indexing, storage, and data architectures that make large-scale data accessible to online services
  • Experience designing and scaling distributed systems
  • Hands-on experience developing in a tight loop with LLMs and applying best practices for scalable LLM development
Languages
    node, python, java, typescript, go, rust, aws, kubernetes, docker, ai