Performance Engineer for AI Inference

Toronto, ON, Canada • Posted June 01, 2026

Job Type: Full-time

Location: Toronto, ON

Posted: June 01, 2026

Category: Other-General

Application Deadline: July 11, 2026

Role Description

As a Performance Engineer, you will improve AI inference systems, specializing in model evaluation and optimization on wafer-scale technology. You will work with the latest innovations in the field and implement enhancements that make inference more efficient.
This position focuses on bringing state-of-the-art AI models to production through rigorous validation and architectural prototyping. You will develop automation solutions for experimentation and collaborate with cross-functional teams to push the boundaries of AI technology. This is a unique chance to work at the intersection of software and hardware.
Key Responsibilities:
• Prototype and benchmark novel AI methodologies
• Design automation for streamlined experimental workflows
• Collaborate with silicon, runtime, and compiler teams
• Assess and optimize newly released models
Requirements:
• 3+ years in high-performance ML or systems engineering
• Strong grasp of Transformer mathematics and methodologies
• Skilled with AI toolchains and pr...
                

Company: Cerebras