Site Reliability Engineer

Tekshapers

Location

Toronto, Ontario, Canada

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

Job Title: Lead Site Reliability Engineer

Job type: Full Time

Job Location: Toronto, ON

Experience: 9+ years

Skills and Responsibilities:

  • Deep End-to-End Systems Expertise: Strong knowledge of complex, multi-tier environments spanning on-prem and cloud-native systems that support large-scale transaction flows.
  • Advanced Observability and APM Experience: Hands-on expertise with Dynatrace (or similar tools), including instrumentation, monitoring, and troubleshooting of distributed applications.
  • Full-Stack Troubleshooting Capability: Proven ability to diagnose and resolve issues across application, infrastructure, network, and platform layers in end-to-end (E2E) environments.
  • SRE Leadership and Roadmap Execution: Drives and executes SRE roadmap initiatives (e.g., SRE WCCS), including capability assessments, gap analysis, and strategic improvements.
  • Dynatrace SME Skillset (Day 1 Ready): Expertise in DQL, Grail traces, Gen3 dashboards, ActiveGate plugins, SRG workflows, and Business Events.
  • Deep Observability Fundamentals (MELT): Strong command of metrics, events, logs, and traces, with the ability to correlate signals for root cause analysis and performance optimization.
  • Cloud Observability (AWS Focus): Experience with the AWS observability stack (CloudWatch, Application Signals, Lambda, API Gateway, tracing, and logging).
  • Engineering Automation Skills: Strong programming in Python and Node.js; experience with serverless (AWS Lambda, Azure Functions), ECS, and backend integrations.
  • Platform Engineering and SRE Practices: Experience implementing SRE principles (Google SRE) and building platform capabilities such as self-service pipelines, policy-as-code, and instrumentation frameworks.
  • Complex Enterprise Financial Systems Experience: Background in large-scale, highly integrated environments (e.g., financial services), with the ability to design observability for systems with limited visibility (e.g., IBM DataPower) and to monitor AI-driven systems.
