AI Systems Reliability Engineer Position
Toronto, ON, Canada • Posted May 30, 2026
Job Type:
Full-time
Location:
Toronto, ON
Posted:
May 30, 2026
Category:
Other-General
Application Deadline:
July 09, 2026
Role Description
Join us as a Site Reliability Engineer supporting cutting-edge AI technologies. You will use your Linux and automation skills to keep systems reliable and operating effectively in a hybrid setup.
In this role, you will work at the intersection of reliability and customer engineering, validating that our AI systems are production-ready. Working with internal teams, you will tackle complex issues and improve monitoring and automation processes, contributing directly to system performance and reliability.
Key Responsibilities:
• Maintain operational integrity of AI infrastructures
• Troubleshoot issues spanning compute, network, and software
• Collaborate with teams on incident response
• Enhance monitoring and observability frameworks
• Create automation solutions to boost reliability
Requirements:
• Expertise in site reliability or systems engineering
• Advanced Linux troubleshooting capabilities
• Knowledge of observability tools like Prometheus
Interested in this role?
Click the button below to start your application for AI Systems Reliability Engineer Position at Tenstorrent.
Apply Now