Senior Data Engineer

Fortune 50 Healthcare

Brooksource

Remote

Overview

Our Fortune 50 Healthcare client is seeking a Senior Data Engineer to support our mission of improving the health and well-being of our members. This role will focus on building scalable, secure, data-centric solutions and compliant data platforms that power analytics, clinical insights, and business decision-making across the enterprise.

The ideal candidate will have strong experience with cloud-based data platforms, Databricks, PostgreSQL, and healthcare data, with a passion for delivering high-quality, trusted data solutions in a regulated environment.

Key Responsibilities

Data Engineering & Platform Development

Design and develop scalable data pipeline solutions using Databricks (Spark) and cloud-native services
Build and optimize ETL/ELT workflows for ingesting structured and unstructured healthcare data (claims, clinical, provider, and member data)
Develop and maintain data models in PostgreSQL and enterprise data warehouses
Support Lakehouse architecture leveraging Databricks, Delta Lake, and cloud storage
Improve performance, reliability, and cost-efficiency of data platforms

Healthcare Data & Compliance

Work with healthcare datasets, including producer/agent, broker, commission, and distribution data, ensuring proper ingestion, normalization, and optimization for analytics and reporting
Ensure compliance with HIPAA, HITECH, and enterprise data governance policies
Implement data security, encryption, masking, and access controls
Maintain data lineage, auditability, and regulatory reporting readiness

Advanced Data Processing

Build real-time and batch pipelines for analytics and operational use cases
Develop data transformations using PySpark and SQL within Databricks
Leverage PostgreSQL for transactional and analytical workloads where applicable
Integrate data from APIs, third-party vendors, and internal systems

Collaboration & Stakeholder Engagement

Partner with business stakeholders to support data-driven initiatives and member acquisition strategies
Translate insurance distribution, agent/producer, and marketing requirements into scalable, high-quality data solutions
Support downstream consumers, including Power BI, marketing analytics teams, and operational reporting stakeholders, by delivering curated, analytics-ready datasets

Technical Leadership

Lead design and architecture discussions for enterprise data solutions
Establish and enforce best practices in data engineering, testing, and CI/CD
Contribute to enterprise data strategy and platform modernization

AI & Advanced Analytics (Databricks Genie)

Leverage Databricks Genie (AI/BI capabilities) to enable natural language querying and democratize data access for business stakeholders
Design and optimize semantic layers and governed datasets that power Genie-driven insights with trusted, high-quality data
Collaborate with stakeholders to translate business questions into AI-assisted analytics workflows using Databricks
Ensure AI outputs are accurate, explainable, and compliant with healthcare data governance and HIPAA requirements
Leverage large language models (LLMs), including Anthropic Claude, to enhance data exploration, automate insight generation, and support conversational analytics use cases
Integrate Genie capabilities with Delta Lake and curated data models to support near real-time insights and decision-making
Partner with data scientists and analytics teams to enhance AI-driven use cases, including producer performance insights, marketing attribution, and member engagement analysis

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
5–8+ years of experience in data engineering
Strong programming skills in Python (PySpark) and advanced SQL
Hands-on experience with:
Databricks (core requirement)
PostgreSQL
Distributed data processing frameworks (Apache Spark)
Experience with cloud platforms (Azure preferred; AWS acceptable)
Proficiency in building and maintaining ETL/ELT pipelines
Strong understanding of data modeling and warehousing concepts

Preferred Qualifications

Experience in the healthcare or insurance industry (payer experience strongly preferred)
Familiarity with healthcare standards (e.g., FHIR, HL7)
Experience with:
Delta Lake / Lakehouse architecture
Orchestration tools (Airflow, Azure Data Factory)
Streaming (Kafka, Event Hubs)
Knowledge of DevOps and CI/CD pipelines (Azure DevOps, GitHub Actions)
Experience supporting machine learning pipelines

Key Skills & Competencies

Deep understanding of data pipelines at scale
Strong experience with the Databricks ecosystem and Spark optimization
Expertise in PostgreSQL performance tuning and schema design
Strong attention to data quality, governance, and compliance
Excellent communication skills, especially with non-technical stakeholders
Ability to work in a highly regulated healthcare environment

Typical Technology Stack

Data Platform: Databricks, Delta Lake
Database: PostgreSQL, Snowflake (optional)
Cloud: Azure, Google, AWS
Languages: Python, SQL
Orchestration: Airflow, Azure Data Factory
Visualization: Power BI
Version Control: Git

KPIs / Success Metrics

Reliability and performance of Databricks pipelines
Data quality and compliance adherence (HIPAA standards)
Time-to-delivery for new data products
Query performance improvements in PostgreSQL and data warehouse systems
Stakeholder adoption and satisfaction
