Data Engineer - Databricks, Python

CG-VAK Software & Exports Ltd.

Location

Chennai, Tamil Nadu, India

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

Role & Responsibilities

Key Responsibilities

  • Architect and implement enterprise-grade Lakehouse solutions using Databricks
  • Design and deliver scalable batch and real-time data pipelines using Apache Spark (PySpark/SQL)
  • Build ETL/ELT pipelines, incremental data loads, and metadata-driven ingestion frameworks
  • Implement and optimize Databricks components: Delta Lake, Delta Live Tables, Autoloader, Structured Streaming, and Workflows
  • Design large-scale data warehousing solutions with 3NF and dimensional modeling
  • Establish data governance, security, and data quality frameworks, including Unity Catalog
  • Lead ML lifecycle management using MLflow and drive AI use cases (RAG, AI/BI)
  • Manage cloud-native deployments on Microsoft Azure and integrate with enterprise systems (e.g., ServiceNow)
  • Drive CI/CD, DevOps practices, and performance optimization of Spark workloads
  • Provide technical leadership, mentor teams, and ensure successful delivery
  • Collaborate with stakeholders to translate business requirements into scalable solutions

Ideal Candidate

  • Strong Databricks Architect Profile with end-to-end Lakehouse ownership
  • Mandatory (Experience 1) – Must have 10+ years of software engineering experience, including at least 5 years in Data Engineering, with hands-on exposure to Databricks and strong ownership of end-to-end data pipeline development
  • Mandatory (Experience 2) – Must have at least 5 years of expertise across the Databricks ecosystem: Delta Lake, Delta Live Tables, Autoloader, Structured Streaming, Workflows, and Unity Catalog
  • Mandatory (Tech skill 1) – Must have worked at architecture level, owning end-to-end design through deployment
  • Mandatory (Tech skill 2) – Must have strong experience with Python and SQL for data processing and Apache Spark for performance tuning & scalability
  • Mandatory (Tech skill 3) – Must have experience in large-scale data warehousing & advanced data modeling (3NF and dimensional) across batch and real-time systems
  • Mandatory (AI Exposure) – Must have at least a basic working understanding of how AI services or tools work
  • Mandatory (Communication) – Must have strong stakeholder management & requirement-gathering experience with US or UK clients
  • Mandatory (Company) – Must come from a B2B IT services or IT consulting background
  • Mandatory (Note 1) – CTC is inclusive of 5% variable
  • Preferred (Tech skill 1) – Azure Databricks or Azure data services experience (project runs on Azure DevOps)
  • Preferred (Tech skill 2) – MLflow or MLOps practices and AI use cases (RAG, AI/BI)
  • Preferred (Tech skill 3) – CI/CD, Databricks Asset Bundles (DABs) or equivalent packaging, Terraform or IaC, reusable deployment templates
  • Preferred (Integrations) – ServiceNow or enterprise system integrations
  • Preferred (Certifications) – Databricks (Data Engineer Associate or Professional, ML or GenAI tracks), Azure or AWS cloud certifications

Perks, Benefits and Work Culture

  • The company provides free AWS and Azure certification training
  • Group medical insurance of ₹5 lakhs is included in the benefits package, along with meal allowances

Skills: CI/CD, Apache, data, architect, enterprise, AWS, Azure
