Job Description
Senior Data Engineer
Fortune 50 Healthcare
Brooksource
Remote
Overview
Our Fortune 50 Healthcare client is seeking a Senior Data Engineer to support our mission of improving the health and well-being of our members. This role will focus on building scalable, secure, data-centric solutions and compliant data platforms that power analytics, clinical insights, and business decision-making across the enterprise.
The ideal candidate will have strong experience with cloud-based data platforms, Databricks, PostgreSQL, and healthcare data, with a passion for delivering high-quality, trusted data solutions in a regulated environment.
Key Responsibilities
Data Engineering & Platform Development
- Design and develop scalable data pipeline solutions using Databricks (Spark) and cloud-native services
- Build and optimize ETL/ELT workflows for ingesting structured and unstructured healthcare data (claims, clinical, provider, and member data)
- Develop and maintain data models in PostgreSQL and enterprise data warehouses
- Support Lakehouse architecture leveraging Databricks, Delta Lake, and cloud storage
- Improve performance, reliability, and cost-efficiency of data platforms
Healthcare Data & Compliance
- Work with healthcare datasets, including producer/agent, broker, commission, and distribution data, ensuring proper ingestion, normalization, and optimization for analytics and reporting
- Ensure compliance with HIPAA, HITECH, and enterprise data governance policies
- Implement data security, encryption, masking, and access controls
- Maintain data lineage, auditability, and regulatory reporting readiness
Advanced Data Processing
- Build real-time and batch pipelines for analytics and operational use cases
- Develop data transformations using PySpark and SQL within Databricks
- Leverage PostgreSQL for transactional and analytical workloads where applicable
- Integrate data from APIs, third-party vendors, and internal systems
Collaboration & Stakeholder Engagement
- Partner with business stakeholders to support data-driven initiatives and member acquisition strategies
- Translate insurance distribution, agent/producer, and marketing requirements into scalable, high-quality data solutions
- Support downstream consumers, including Power BI, marketing analytics teams, and operational reporting stakeholders, by delivering curated, analytics-ready datasets
Technical Leadership
- Lead design and architecture discussions for enterprise data solutions
- Establish and enforce best practices in data engineering, testing, and CI/CD
- Contribute to enterprise data strategy and platform modernization
AI & Advanced Analytics (Databricks Genie)
- Leverage Databricks Genie (AI/BI capabilities) to enable natural language querying and democratize data access for business stakeholders
- Design and optimize semantic layers and governed datasets that power Genie-driven insights with trusted, high-quality data
- Collaborate with stakeholders to translate business questions into AI-assisted analytics workflows using Databricks
- Ensure AI outputs are accurate, explainable, and compliant with healthcare data governance and HIPAA requirements
- Leverage large language models (LLMs), including Anthropic Claude, to enhance data exploration, automate insight generation, and support conversational analytics use cases
- Integrate Genie capabilities with Delta Lake and curated data models to support near real-time insights and decision-making
- Partner with data scientists and analytics teams to enhance AI-driven use cases, including producer performance insights, marketing attribution, and member engagement analysis
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field
- 5–8+ years of experience in data engineering
- Strong programming in Python (PySpark) and advanced SQL
- Hands-on experience with:
  - Databricks (core requirement)
  - PostgreSQL
  - Distributed data processing frameworks (Apache Spark)
- Experience with cloud platforms (Azure preferred; AWS acceptable)
- Proficiency in building and maintaining ETL/ELT pipelines
- Strong understanding of data modeling and warehousing concepts
Preferred Qualifications
- Experience in healthcare or insurance industry (payer experience strongly preferred)
- Familiarity with healthcare standards (e.g., FHIR, HL7)
- Experience with:
  - Delta Lake / Lakehouse architecture
  - Orchestration tools (Airflow, Azure Data Factory)
  - Streaming (Kafka, Event Hubs)
- Knowledge of DevOps and CI/CD pipelines (Azure DevOps, GitHub Actions)
- Experience supporting machine learning pipelines
Key Skills & Competencies
- Deep understanding of data pipelines at scale
- Strong experience with Databricks ecosystem and Spark optimization
- Expertise in PostgreSQL performance tuning and schema design
- Strong attention to data quality, governance, and compliance
- Excellent communication skills, especially with non-technical stakeholders
- Ability to work in a highly regulated healthcare environment
Typical Technology Stack
- Data Platform: Databricks, Delta Lake
- Database: PostgreSQL, Snowflake (optional)
- Cloud: Azure, Google, AWS
- Languages: Python, SQL
- Orchestration: Airflow, Azure Data Factory
- Visualization: Power BI
- Version Control: Git
KPIs / Success Metrics
- Reliability and performance of Databricks pipelines
- Data quality and compliance adherence (HIPAA standards)
- Time-to-delivery for new data products
- Query performance improvements in PostgreSQL and data warehouse systems
- Stakeholder adoption and satisfaction