Lead / Manager

Machine Learning - Data Scientist Lead

Apple

Sunnyvale
On-site
Posted April 30, 2026

Job Description

Summary

Do you have a passion for computer vision and deep learning? Are you excited by the latest advances in multimodal models? The Video Engineering Data Analytics and Quality group is looking for a technical lead with deep expertise in evaluating machine learning and deep learning models, including foundation models and multimodal systems.

Description

In this role, you will design robust evaluation frameworks, mentor a team of engineers and scientists, and drive alignment across Apple's research, engineering, and product teams. You will combine strong analytical thinking, Python expertise, and a deep understanding of statistical evaluation and data quality. You will also help set the technical direction for how we measure and improve the quality of some of Apple's most exciting AI experiences.

Minimum Qualifications

- BS and a minimum of 10 years of relevant industry experience.
- 4+ years of industry or academic experience in machine learning or data science.
- 2+ years of experience leading technical projects or mentoring junior engineers or scientists.
- Strong experience evaluating supervised, unsupervised, and deep learning models.
- Hands-on experience with LLMs (such as GPT, Claude, or PaLM) and using them as scoring or judging mechanisms.
- Familiarity with multimodal models (such as image + text or video + audio) and their evaluation challenges.
- Proficiency in Python and libraries such as NumPy, pandas, scikit-learn, PyTorch, or TensorFlow.
- Solid understanding of statistical testing, sampling, confidence intervals, and metrics such as precision/recall, BLEU, ROUGE, and FID.
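
For candidates unfamiliar with the evaluation metrics named above, here is a minimal, self-contained sketch of precision, recall, and a confidence interval for accuracy. The labels are invented toy data, and a simple Wald (normal-approximation) interval stands in for whatever interval method a real evaluation would use:

```python
# Toy binary-classification evaluation: precision, recall, and a
# 95% normal-approximation confidence interval for accuracy.
import math

y_true = [1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)  # fraction of predicted positives that are correct
recall = tp / (tp + fn)     # fraction of actual positives that were found

# 95% Wald interval for accuracy: acc +/- 1.96 * sqrt(acc*(1-acc)/n)
n = len(y_true)
acc = sum(t == p for t, p in zip(y_true, y_pred)) / n
half_width = 1.96 * math.sqrt(acc * (1 - acc) / n)
```

In practice these would come from scikit-learn (`precision_score`, `recall_score`) rather than hand-rolled counts; the arithmetic above just makes the definitions explicit.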

Preferred Qualifications

- M.S. or Ph.D. in Computer Science, Statistics, Machine Learning, or a related field.
- Prior experience managing or tech-leading a team of two or more engineers or scientists.
- Experience with open-source evaluation tools such as OpenEval, ELO-based ranking, or LLM-as-a-Judge frameworks.
- Familiarity with prompt engineering, few-shot, or zero-shot evaluation techniques.
- Experience evaluating generative models, such as text or image generation systems.
- Prior contributions to ML benchmarks or public evaluations.
- Comfort with giving and receiving feedback in a collaborative, fast-moving environment.
- Strong communication and documentation skills, with the ability to write technical reports and present to non-technical audiences.
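
The ELO-based ranking mentioned above can be illustrated with a short sketch: pairwise preference judgments (e.g. from human raters or an LLM judge) are folded into per-model ratings via the standard Elo update. Model names and outcome counts here are hypothetical:

```python
# Elo-style ranking of two models from pairwise preference judgments.
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that the model rated r_a is preferred over r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_elo(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift ratings toward the observed preference; the update is zero-sum."""
    gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain

ratings = {"model_a": 1000.0, "model_b": 1000.0}

# Hypothetical judgments: model_a preferred in 8 of 10 pairwise comparisons.
outcomes = ["model_a"] * 8 + ["model_b"] * 2
for winner in outcomes:
    loser = "model_b" if winner == "model_a" else "model_a"
    update_elo(ratings, winner, loser)
```

After processing the judgments, the more frequently preferred model ends up with the higher rating, which is the property leaderboard-style evaluations rely on.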
