Applicants must be authorized to work in the U.S. without visa sponsorship

Overview

A high-growth fintech company is seeking a Site Reliability Engineer to help scale and operate a globally distributed, highly available cloud platform. This hybrid role in Manhattan focuses on reliability, automation, Kubernetes, and cloud infrastructure across AWS and GCP.

What You’ll Do

Design and maintain scalable cloud infrastructure using Infrastructure-as-Code (Terraform)
Manage and optimize Kubernetes environments using Helm and ArgoCD
Improve deployment workflows through GitOps and automation
Implement monitoring, logging, and alerting using tools such as Splunk and Grafana
Support production systems, lead incident response, and participate in a 24/7 on-call rotation
Develop automation tooling using Python and/or Go
Partner with engineering teams to improve scalability, resiliency, and operational excellence
Troubleshoot Linux systems and networking issues across distributed environments

What We’re Looking For

Strong experience with Kubernetes in production environments
Hands-on experience with AWS and GCP
Expertise in Terraform; experience with Ansible and Terragrunt is a plus
Strong Linux administration and troubleshooting skills
Experience with observability, automation, and cloud-native architectures
Solid networking knowledge and problem-solving skills
Strong communication skills and ability to work in fast-paced Agile teams

Compensation & Benefits

Salary: $140,000 – $170,000 + 20% bonus (achieved in more than 90% of the company’s active years)

Senior Site Reliability Engineer
