Data Engineer (Databricks + PySpark)

LTIMindtree

Location

Pune Division, Maharashtra, India

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

We are looking for a skilled Databricks Data Engineer to design, develop, and optimize scalable data pipelines and data platforms. The ideal candidate will have strong expertise in Databricks, SQL, Python, and PySpark, along with experience in implementing CI/CD practices for data engineering workflows.

Primary Skills (Mandatory)

  1. Databricks
     • Strong hands-on experience with Databricks (Data Lakehouse architecture)
     • Expertise in Delta Lake, Unity Catalog, and Databricks Workflows
     • Experience in performance tuning and cost optimization
  2. SQL
     • Advanced SQL skills for data transformation, optimization, and querying
     • Experience in writing complex joins, window functions, and performance tuning
  3. Python
     • Proficient in Python for data engineering tasks and scripting
     • Experience with modular code development and reusable components
  4. PySpark
     • Strong experience in building scalable data pipelines using PySpark
     • Knowledge of Spark optimization techniques (partitioning, caching, etc.)
  5. CI/CD
     • Experience implementing CI/CD pipelines for data workflows
     • Familiarity with tools such as Azure DevOps, GitHub Actions, Jenkins, or similar
     • Knowledge of version control (Git) and automated deployment of data solutions

Secondary Skills

  • AWS
     • Experience working with AWS services such as S3, EC2, IAM, and EMR
  • Apache Airflow
     • Experience in workflow orchestration and scheduling using Airflow
  • AWS Glue
     • Hands-on experience in ETL processing and Glue jobs

Good to Have

  • Terraform
     • Knowledge of Infrastructure as Code (IaC) for provisioning cloud resources
     • Experience in automation and environment setup using Terraform

Key Responsibilities

  • Design and develop scalable data pipelines using Databricks and PySpark
  • Build and optimize ETL/ELT processes for large-scale data processing
  • Collaborate with data architects and stakeholders to define data models and solutions
  • Implement CI/CD pipelines for automated deployment and testing
  • Monitor, troubleshoot, and optimize data workflows for performance and cost
  • Ensure data quality, governance, and security standards are maintained
  • Work closely with cross-functional teams across data, analytics, and business domains

Required Experience

  • 8+ years of experience in data engineering or related roles
  • Hands-on experience in Databricks-based data platform projects
  • Strong understanding of distributed data processing concepts
