ML Research Platform Engineer (Distributed Training & HPC)

Job Type: Full-time

Location: Singapore, Singapore

Posted: June 04, 2026

Category: business-and-financial-operations

Application Deadline: July 14, 2026

Role Description

Location: Singapore, Hong Kong or Shanghai

About the role
We are looking for a platform engineer to build the infrastructure that powers our next-generation machine learning research. Think: large-scale experimentation, distributed training, and reproducibility.

This is not an applied ML role. You will not be fine-tuning LLMs or building agents. Instead, you will build the systems that enable researchers to train models at scale.

What you will own
Distributed training pipelines for GPU-accelerated workloads (PyTorch, JAX)
Experiment management and model versioning
Resource scheduling on on-premise HPC clusters and cloud (Slurm, Kubernetes)
Observability and debugging for complex training jobs
Data lineage

Interested in this role?

Apply for ML Research Platform Engineer (Distributed Training & HPC) at QNT Partners.