Data Engineer

Brooksource

Location: Remote
Salary: Not specified
Type: Contract
Posted: Today
via LinkedIn

Job Description

Senior Data Engineer

Fortune 50 Healthcare

Brooksource

Remote

Overview

Our Fortune 50 Healthcare client is seeking a Senior Data Engineer to support our mission of improving the health and well-being of our members. This role will focus on building scalable, secure, data-centric solutions and compliant data platforms that power analytics, clinical insights, and business decision-making across the enterprise.

The ideal candidate will have strong experience with cloud-based data platforms, Databricks, PostgreSQL, and healthcare data, with a passion for delivering high-quality, trusted data solutions in a regulated environment.

Key Responsibilities

Data Engineering & Platform Development

  • Design and develop scalable data pipeline solutions using Databricks (Spark) and cloud-native services
  • Build and optimize ETL/ELT workflows for ingesting structured and unstructured healthcare data (claims, clinical, provider, and member data)
  • Develop and maintain data models in PostgreSQL and enterprise data warehouses
  • Support Lakehouse architecture leveraging Databricks, Delta Lake, and cloud storage
  • Improve performance, reliability, and cost-efficiency of data platforms

Healthcare Data & Compliance

  • Work with healthcare datasets, including producer/agent, broker, commission, and distribution data, ensuring proper ingestion, normalization, and optimization for analytics and reporting
  • Ensure compliance with HIPAA, HITECH, and enterprise data governance policies
  • Implement data security, encryption, masking, and access controls
  • Maintain data lineage, auditability, and regulatory reporting readiness

Advanced Data Processing

  • Build real-time and batch pipelines for analytics and operational use cases
  • Develop data transformations using PySpark and SQL within Databricks
  • Leverage PostgreSQL for transactional and analytical workloads where applicable
  • Integrate data from APIs, third-party vendors, and internal systems

Collaboration & Stakeholder Engagement

  • Partner with business stakeholders to support data-driven initiatives and member acquisition strategies
  • Translate insurance distribution, agent/producer, and marketing requirements into scalable, high-quality data solutions
  • Support downstream consumers, including Power BI, marketing analytics teams, and operational reporting stakeholders, by delivering curated, analytics-ready datasets

Technical Leadership

  • Lead design and architecture discussions for enterprise data solutions
  • Establish and enforce best practices in data engineering, testing, and CI/CD
  • Contribute to enterprise data strategy and platform modernization

AI & Advanced Analytics (Databricks Genie)

  • Leverage Databricks Genie (AI/BI capabilities) to enable natural language querying and democratize data access for business stakeholders
  • Design and optimize semantic layers and governed datasets that power Genie-driven insights with trusted, high-quality data
  • Collaborate with stakeholders to translate business questions into AI-assisted analytics workflows using Databricks
  • Ensure AI outputs are accurate, explainable, and compliant with healthcare data governance and HIPAA requirements
  • Leverage large language models (LLMs), including Anthropic Claude, to enhance data exploration, automate insight generation, and support conversational analytics use cases
  • Integrate Genie capabilities with Delta Lake and curated data models to support near real-time insights and decision-making
  • Partner with data scientists and analytics teams to enhance AI-driven use cases, including producer performance insights, marketing attribution, and member engagement analysis

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field
  • 5–8+ years of experience in data engineering
  • Strong programming in Python (PySpark) and advanced SQL
  • Hands-on experience with:
      • Databricks (core requirement)
      • PostgreSQL
      • Distributed data processing frameworks (Apache Spark)
  • Experience with cloud platforms (Azure preferred; AWS acceptable)
  • Proficiency in building and maintaining ETL/ELT pipelines
  • Strong understanding of data modeling and warehousing concepts

Preferred Qualifications

  • Experience in healthcare or insurance industry (payer experience strongly preferred)
  • Familiarity with healthcare standards (e.g., FHIR, HL7)
  • Experience with:
      • Delta Lake / Lakehouse architecture
      • Orchestration tools (Airflow, Azure Data Factory)
      • Streaming (Kafka, Event Hubs)
  • Knowledge of DevOps and CI/CD pipelines (Azure DevOps, GitHub Actions)
  • Experience supporting machine learning pipelines

Key Skills & Competencies

  • Deep understanding of data pipelines at scale
  • Strong experience with Databricks ecosystem and Spark optimization
  • Expertise in PostgreSQL performance tuning and schema design
  • Strong attention to data quality, governance, and compliance
  • Excellent communication skills, especially with non-technical stakeholders
  • Ability to work in a highly regulated healthcare environment

Typical Technology Stack

  • Data Platform: Databricks, Delta Lake
  • Database: PostgreSQL, Snowflake (optional)
  • Cloud: Azure, Google, AWS
  • Languages: Python, SQL
  • Orchestration: Airflow, Azure Data Factory
  • Visualization: Power BI
  • Version Control: Git

KPIs / Success Metrics

  • Reliability and performance of Databricks pipelines
  • Data quality and compliance adherence (HIPAA standards)
  • Time-to-delivery for new data products
  • Query performance improvements in PostgreSQL and data warehouse systems
  • Stakeholder adoption and satisfaction
