About the role
Figure's vision is to deploy autonomous humanoids at a global scale. Our Helix team is looking for an experienced Training Infrastructure Engineer to take our infrastructure to the next level. This role focuses on managing the training cluster and implementing distributed training algorithms, data loaders, and developer tools for AI researchers.
Responsibilities
- Design, deploy, and maintain Figure's training clusters
- Architect, optimize, and maintain scalable deep learning frameworks for training on massive robot datasets
- Collaborate with AI researchers to train new model architectures at large scale
- Implement distributed training, advanced parallelization strategies, and high-performance data loaders to reduce model development cycles
- Profile, identify, and eliminate training bottlenecks at the hardware and software levels to maximize Model FLOPs Utilization (MFU)
- Implement tooling for data processing, model experimentation, and continuous integration
Qualifications
- Strong software engineering fundamentals
- Bachelor's or Master's degree in Computer Science, Robotics, Engineering, or a related field
- Extensive professional experience with Python and PyTorch
- Proven track record of personally scaling and running large-scale training experiments on 800+ GPUs
- Experience managing HPC clusters for deep neural network training
- Minimum of 4 years of professional, full-time experience building reliable backend systems and infrastructure
- Experience contributing to or maintaining open-source distributed training frameworks (Megatron-LM, DeepSpeed, TorchTitan)
- Experience managing cloud infrastructure (AWS, Azure, GCP)
- Experience with job scheduling / orchestration tools (SLURM, Kubernetes, LSF, etc.)
- Experience with configuration management tools (Ansible, Terraform, Puppet, Chef, etc.)
- Deep understanding of CUDA and hands-on experience writing custom GPU kernels to optimize training
The US base salary range for this full-time position is $150,000 to $350,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
Aplyr's read
Figure AI is at the forefront of integrating AI into decision-making, attracting talent in engineering and AI development for innovative solutions.
What's promising
- Figure AI is pioneering AI-driven decision-making across diverse industries.
- The company offers roles in cutting-edge AI and robotics engineering.
- Figure AI's focus on AI integration attracts top-tier engineering talent.
What to watch
- Figure AI operates in a highly competitive AI market.
- The company faces challenges in rapidly evolving AI technologies.
- Limited public information is available about Figure AI's financial stability.
Why Figure AI
- Figure AI specializes in AI-enhanced decision-making processes.
- The company hires for niche roles like Humanoid Robot Operator.
- Figure AI's Helix AI team focuses on advanced AI training methods.
Aplyr’s read is generated by AI from public sources.
About Figure AI
Figure AI is a technology company developing AI-powered autonomous humanoid robots, with the stated vision of deploying them at a global scale.