Site Reliability Engineer

PT. Indosat Tbk

Location

Jakarta, Indonesia

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

Role Summary

We are seeking a skilled and passionate Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our hybrid and cloud-native infrastructure. You will play a critical role in automating operations, improving system resilience, and supporting mission-critical services running across Kubernetes and cloud environments. This role is ideal for engineers who enjoy solving complex infrastructure challenges, building automation, and improving platform reliability at scale.

Job Description (1/2)

Reliability & System Performance

  • Maintain high availability, scalability, and performance of production systems.
  • Define and monitor SLIs, SLOs, and error budgets to ensure service reliability.
  • Perform root cause analysis, incident response, and postmortem reviews.
  • Implement reliability improvements and proactive failure prevention.

Cloud & Kubernetes Platform Management

  • Manage and optimize workloads running on Google Kubernetes Engine (GKE) and OpenShift.
  • Support multi-cluster and hybrid infrastructure environments.
  • Implement autoscaling and high-availability architectures.

CI/CD, GitOps & Release Engineering

  • Design and maintain CI/CD pipelines using GitLab CI/CD.
  • Implement GitOps deployment workflows using Argo CD.
  • Implement safe deployment strategies including:

🔹 Infrastructure as Code & Automation

  • Provision and manage infrastructure using Terraform / OpenTofu.
  • Develop and maintain Helm charts for Kubernetes deployments.
  • Automate operational tasks using Python scripting to reduce manual toil.

Job Description (2/2)

🔹 Observability, Monitoring & Distributed Tracing

  • Implement centralized logging using Grafana Loki and ELK Stack.
  • Build dashboards and alerts using Grafana and Datadog.
  • Implement distributed tracing using OpenTelemetry to improve system visibility.
  • Improve monitoring coverage and alert accuracy.

🔹 Performance & Load Testing

  • Conduct load and stress testing using tools such as k6, Locust, or JMeter.
  • Analyze performance bottlenecks and implement tuning strategies.
  • Support capacity planning and performance optimization.

🔹 Data Streaming & Integration

  • Support Change Data Capture (CDC) and real-time data streaming pipelines.
  • Work with Confluent Platform / Apache Kafka to ensure reliable event-driven data flow.

🔹 Security & Secret Management

  • Manage secrets securely using Google Cloud Secret Manager, Kubernetes Secrets, and HashiCorp Vault.
  • Implement secure CI/CD and platform access practices.

Education

Bachelor’s degree in Computer Science, Informatics, Information Systems, Electrical Engineering, Mathematics/Statistics, or a related field.

Experience

  • 0–4 years of experience in SRE, DevOps, Cloud Engineering, or Platform Engineering.
  • Hands-on experience supporting production systems and cloud infrastructure.

Technical Skills

  • Strong Linux system administration and networking fundamentals.
  • Hands-on experience with Kubernetes and containerized environments.
  • Experience designing and maintaining CI/CD pipelines.
  • Infrastructure as Code and configuration management experience (Terraform, Ansible).
  • Helm chart development and Kubernetes deployment management.
  • Monitoring, logging, and observability best practices.
  • Programming/scripting skills in Bash and Python (Go is a plus).
  • Familiarity with Google Cloud Platform (GCP).
