Mid-Level
Synthetic Data Engineer (AI Data/Training)
Confirmed live in the last 24 hours
Hyphen Connect
Australia
On-site
Posted April 24, 2026
Job Description
We are seeking a talented and innovative Synthetic Data Engineer. In this role, you will design and implement domain-specific synthetic data generation pipelines and maintain the quality of the data that feeds model training loops. Your expertise will directly support data processing and model training within the organization.
Responsibilities:
- Design domain-specific synthetic data generation (SDG) pipelines using self-instruct and constitutional prompting.
- Implement automated quality scoring and de-duplication systems.
- Manage data pipelines that feed directly into supervised fine-tuning (SFT) and direct preference optimization (DPO) training loops.
Qualifications:
- Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
- Deep knowledge of prompt engineering for data generation.
- Familiarity with dataset distillation and bias mitigation.
Tags: ai, data, design
Similar Jobs
Hyphen Connect
Synthetic Data Engineer (AI Data/Training)
Mid-Level
Hong Kong
Hyphen Connect
Synthetic Data Engineer (AI Data/Training)
Mid-Level
Singapore
Hyphen Connect
Synthetic Data Engineer (AI Data/Training)
Mid-Level
China
Hyphen Connect
Synthetic Data Engineer (AI Data/Training)
Mid-Level
Boston, USA
Hyphen Connect
Synthetic Data Engineer (AI Data/Training)
Mid-Level
Seattle, USA
Hyphen Connect
Synthetic Data Engineer (AI Data/Training)
Mid-Level
Oregon, USA