Capacity Engineer with Data Engineer

MPower Plus

Location: Remote

Salary: Not specified

Type: Full-time

Posted: Today

via LinkedIn

Job Description

Job Title: Capacity Engineer with Data Engineer

Location: Remote

Please search in the order below:

Tools & Techniques

  • Data Engineering Stack: SQL, Python, Spark, Airflow for data processing and orchestration (3-4 years of experience; slightly less is also acceptable).

  • Monitoring & Observability: Prometheus, Grafana, Datadog.

  • Chaos Engineering: Test system resilience under stress.

  • Infrastructure as Code: Terraform, Ansible, Harness.

Data Engineer with strong Site Reliability Engineering (SRE) expertise in capacity planning. This role ensures our infrastructure scales efficiently to meet user demand, balancing performance with cost. The engineer will forecast growth, analyze usage trends, and automate resource provisioning to prevent outages, over-provisioning, or under-provisioning. In addition, the role requires building robust data pipelines and analytical models to support forecasting and decision-making.

Key Responsibilities

· Data Pipeline Development: Design and maintain ETL/ELT pipelines to collect, transform, and store infrastructure usage data.

· Data Modeling: Build models to analyze system metrics and predict future resource needs.

· Demand Forecasting: Analyze historical usage patterns to predict CPU, memory, and storage requirements.

· Load Testing & Scaling: Simulate traffic spikes to identify bottlenecks and ensure systems scale linearly.

· Cost Efficiency: Optimize resource allocation to avoid unnecessary costs while maintaining service availability.

· Automation: Use Infrastructure as Code (IaC) tools like Terraform to automate scaling and provisioning.

· Architecture Review: Collaborate with software teams to flag single points of failure and ensure resilient service design.

Qualifications

· Strong background in data engineering and SRE practices.

· 10-15 years of experience.

· Hands-on experience with capacity planning, forecasting, and scaling.

· Proficiency in IaC tools (Terraform, Ansible, Harness).

· Experience with data pipelines, ETL/ELT frameworks, and big data tools.

· Familiarity with monitoring/observability platforms (Prometheus, Grafana, Datadog).

· Knowledge of chaos engineering and resilience testing.

· Excellent collaboration and communication skills.
