ML Systems Performance Engineer
Toronto, ON, Canada • Posted June 04, 2026
Job Type: Full-time
Location: Toronto, ON
Posted: June 04, 2026
Category: Engineering
Application Deadline: July 14, 2026
Role Description
About The Role
Engineers on the inference performance team operate at the intersection of hardware and software, driving end-to-end model inference speed and throughput. Their work spans low-level kernel performance debugging and optimization, system-level performance analysis, performance modeling and estimation, and the development of tooling for performance projection and diagnostics.
Responsibilities
- Build performance models (kernel-level and end-to-end) to estimate the performance of state-of-the-art and customer ML models.
- Optimize and debug our kernel microcode and compiler algorithms to improve ML model inference speed, throughput, and compute utilization on the Cerebras WSE.
- Debug and understand runtime performance on the system and cluster.
- Develop tools and infrastructure to visualize performance data collected from the Wafer-Scale Engine and our compute cluster.
Requirements
- ...