Senior Data Scientist, Machine Learning Engineer
Job Description
About Shelf
The enterprise is going agentic — but most AI agents fail when they hit real business complexity. Shelf is changing that.
We’ve built the operating system for agentic AI: a platform that models your policies, workflows, and operational logic into an AI Data Model so agents don’t just respond — they reason. The result? AI that understands how your business actually runs and delivers precise, compliant, auditable outcomes at scale.
Brands like Amazon, Nespresso, HelloFresh, and KeyBank trust Shelf to power AI agents that resolve 85% of cases autonomously, cut handle times by 20–25%, and turn hours-long processes into seconds. We’re partnered with Microsoft, Salesforce, OpenAI, Snowflake, and Databricks — and recognized by Gartner (Cool Vendor) and IDC (Innovator) for our approach.
If you want to sell the infrastructure that makes agentic AI actually work in the enterprise, you’re in the right place.
Our mission is to empower humanity with better answers everywhere.
Summary
The R&D department plays a pivotal role in driving Shelf to disrupt the market. We are looking for machine learning experts who can deliver end to end, blending Python engineering, ML engineering, and pragmatic data science and machine learning research. You will ship end-to-end features, from problem framing and experimentation to service deployment and ongoing operations, quickly and with high quality. Your work will power ML- and LLM-driven services used by top enterprises like Amazon, Mayo Clinic, AmFam, and Nespresso.
This role requires strong Python engineering skills and the ability to deliver robust ML solutions, along with the ML research literacy to choose sound methodologies, define metrics, and evaluate different approaches effectively.
You’ll work in an agile environment, move fast, and own what you ship.
Responsibilities
- Own end-to-end delivery: ideate, research, prototype, productionize, and operate ML-powered services, iterating and shipping frequently
- Stand up robust training/evaluation pipelines: dataset curation, labeling/feedback loops, experiment tracking, offline/online metrics, and A/B testing
- Solve problems using sound methodology, evaluate approaches along with
- Transform ML models and LLM workflows (including RAG) into reusable, versioned, observable production services with CI/CD
- Collaborate with Product Owners to shape our product and requirements
- Conduct and receive code reviews; champion engineering excellence, testing discipline, and documentation
- Leverage AI coding assistants to accelerate development and create internal agents that automate parts of the engineering workflow
- Share learnings through demos, docs, and knowledge sessions; contribute to a culture of continuous improvement
Requirements
- 3+ years of professional experience researching and shipping ML-based solutions, with strong Python skills and a track record of delivering fast without sacrificing quality
- Proven experience owning research problems end-to-end, from initial data analysis, through iterative research phases, to delivery in production
- Practical NLP/LLM experience: transformers, embeddings, prompt design, and evaluation; ability to choose and justify metrics and methodologies
- Strong backend fundamentals: designing RESTful services, schema design, data modeling, and performance tuning for SQL and NoSQL stores
- Data processing skills: pandas/NumPy; experience with batch/stream processing and ETL orchestration (e.g., Airflow, Step Functions)
- Strong English verbal and written communication
As a plus
- LLM ops and safety: eval frameworks (e.g., RAGAS), guardrails, red-teaming, prompt optimization at scale
- Model optimization: quantization, distillation, pruning; GPU/accelerator-aware serving
- Experience with AWS ML stack (SageMaker, Batch, Step Functions, Lambda, SQS/SNS, DynamoDB, ECS, EC2, S3)
- Vector databases and search: Pinecone, Elasticsearch, pgvector, FAISS, or DeepLake
- Background in reinforcement learning, agent frameworks, or autonomous agents
- Publications, ope