Architect and implement enterprise-grade Lakehouse solutions using Databricks.
Design and deliver end-to-end data engineering pipelines, including batch and real-time streaming solutions.
Lead implementation of cloud-based data lakehouse platforms integrating diverse data sources, and of real-time data processing pipelines for operational and analytical use cases.
Develop scalable ETL/ELT pipelines using PySpark, Scala, and SQL.
Implement advanced data modeling solutions including 3NF, dimensional modeling, and enterprise data warehousing strategies.
Design and build incremental data loading frameworks and metadata-driven ingestion pipelines.
Establish data quality frameworks and governance standards.
Implement and manage Unity Catalog, including fine-grained security and access controls.
Leverage Databricks components such as Delta Live Tables, Auto Loader, Structured Streaming, and Databricks Workflows, and integrate with orchestration tools (e.g., Apache Airflow).
Drive CI/CD automation, deployment strategies, and DevOps best practices.
Optimize performance of pipelines, Spark jobs, and compute resources.
Provide architectural guidance and technical leadership across cross-functional teams.
Engage with stakeholders and clients to translate business requirements into scalable technical solutions.

Deep expertise in:

Strong hands-on programming skills in Python, PySpark, Scala, and SQL.

Databricks Developer

Job Description