Lead - Site Reliability Engineer
Chennai, Tamil Nadu, India • Posted June 30, 2026
Job Type:
Full-time
Location:
Chennai, Tamil Nadu
Posted:
June 30, 2026
Category:
automation, engineering, grafana, kubernetes, lead, linux, python, red, solution, ux
Application Deadline:
August 09, 2026
Role Description
Role Overview
We are looking for a Lead Site Reliability Engineer with 6-7 years of experience to drive reliability, observability, and incident management practices. The ideal candidate will have strong expertise in the Grafana stack, production monitoring, and handling critical incidents in high-availability systems.
Key Responsibilities
- Act as the Incident Commander during production outages, ensuring timely resolution and stakeholder communication
- Lead incident response, triage, root cause analysis (RCA), and postmortems
- Build and enhance observability systems using the Grafana stack (Prometheus, Loki, Tempo)
- Define and manage SLIs, SLOs, and SLAs for critical services
- Develop and maintain monitoring, alerting, and dashboards for proactive issue detection
- Collaborate with Dev, Infra, an...
Company:
FundsIndia