Location: Pune Division, Maharashtra, India
Salary: Not specified
Type: Full-time
Posted: Today, via LinkedIn
Job Description
We are looking for a skilled Databricks Data Engineer to design, develop, and optimize scalable data pipelines and data platforms. The ideal candidate will have strong expertise in Databricks, SQL, Python, and PySpark, along with experience in implementing CI/CD practices for data engineering workflows.
Primary Skills (Mandatory)
- Databricks
  - Strong hands-on experience with Databricks (Data Lakehouse architecture)
  - Expertise in Delta Lake, Unity Catalog, and Databricks Workflows
  - Experience in performance tuning and cost optimization
- SQL
  - Advanced SQL skills for data transformation, optimization, and querying
  - Experience writing complex joins and window functions, and tuning query performance
- Python
  - Proficient in Python for data engineering tasks and scripting
  - Experience with modular code development and reusable components
- PySpark
  - Strong experience in building scalable data pipelines using PySpark
  - Knowledge of Spark optimization techniques (partitioning, caching, etc.)
- CI/CD
  - Experience implementing CI/CD pipelines for data workflows
  - Familiarity with tools such as Azure DevOps, GitHub Actions, Jenkins, or similar
  - Knowledge of version control (Git) and automated deployment of data solutions
Secondary Skills
- AWS
  - Experience working with AWS services such as S3, EC2, IAM, and EMR
- Apache Airflow
  - Experience in workflow orchestration and scheduling using Airflow
- AWS Glue
  - Hands-on experience in ETL processing and Glue jobs
Good to Have
- Terraform
  - Knowledge of Infrastructure as Code (IaC) for provisioning cloud resources
  - Experience in automation and environment setup using Terraform
Key Responsibilities
- Design and develop scalable data pipelines using Databricks and PySpark
- Build and optimize ETL/ELT processes for large-scale data processing
- Collaborate with data architects and stakeholders to define data models and solutions
- Implement CI/CD pipelines for automated deployment and testing
- Monitor, troubleshoot, and optimize data workflows for performance and cost
- Ensure data quality, governance, and security standards are maintained
- Work closely with cross-functional teams across data, analytics, and business domains
Required Experience
- 8+ years of experience in data engineering or related roles
- Hands-on experience in Databricks-based data platform projects
- Strong understanding of distributed data processing concepts