AWS Site Reliability Engineer

Marks Sattin

Location: Glasgow, Scotland, UK

Salary: Not specified

Type: Contract

Posted: Today (via LinkedIn)

Job Description

We’re seeking an AWS Site Reliability Engineer (SRE) with strong incident operations experience to support and improve the reliability of cloud and data platform services across AWS and Snowflake.

This role is hands-on and operationally focused: proactive monitoring, rapid incident response, service restoration, root cause analysis, and automation to improve resilience and reduce MTTR.

What you’ll do

  • Lead incident triage, coordination, and resolution for AWS and Snowflake services in production
  • Monitor and respond to alerts, dashboards, and service health indicators
  • Perform root cause analysis (RCA) and drive post-incident remediation and continuous improvement
  • Create, maintain, and improve runbooks, operational procedures, and on-call readiness
  • Participate in and strengthen on-call rotations (including operational handovers)
  • Automate repetitive operational tasks to reduce toil, improve reliability, and reduce MTTR

What you’ll bring (required)

  • Strong knowledge of AWS, including EC2, S3, IAM, VPC, Lambda, and CloudWatch
  • Experience with Snowflake administration and troubleshooting
  • Familiarity with observability tooling such as CloudWatch, Datadog, Grafana, and/or Splunk
  • Solid understanding of SRE principles: SLIs, SLOs, error budgets, and incident management
  • Scripting/automation skills in Python, Bash, and/or Terraform
