Job Description
About Us
We are a healthcare technology company that provides platforms and solutions to improve the management of, and access to, cost-effective pharmacy benefits. Our technology helps enterprise and partnership clients simplify their businesses and helps consumers save on prescriptions.
As a leader in SaaS technology for healthcare, we offer innovative solutions with integrated intelligence on a single enterprise platform that connects the pharmacy ecosystem. With our expertise and modern, modular platform, our partners use real-time data to transform their business performance and optimize their innovative models in the marketplace.
Position Summary
The Senior Data Engineer is responsible for building and operating the data infrastructure that powers RxSense’s analytics, AI systems, and business intelligence capabilities. You will work across Snowflake and SQL Server to design pipelines that move claims data, pricing data, and clinical information from operational systems into governed, accessible, and AI-ready formats. This role is critical to our AI-native transformation—every agentic system, pricing model, and clinical intelligence capability depends on reliable, well-governed data. You will also play a key role in solving longstanding data access and governance challenges, building the foundations that enable both humans and AI agents to work with data safely and efficiently.
Key Responsibilities
Data Pipeline Design and Operations
- Design, build, and maintain production-grade ETL/ELT pipelines that move data between SQL Server (operational), Snowflake (analytical), and downstream consumers including AI systems, reporting tools, and business intelligence platforms.
- Optimize data ingestion and transformation patterns for healthcare-scale volumes—millions of claims, pricing transactions, and member records processed daily.
- Implement data quality checks, validation rules, and monitoring that catch issues before they propagate to analytics, AI models, or regulatory reports.
- Build and maintain data models in Snowflake that support self-service analytics, enabling product, clinical, actuarial, and operations teams to answer their own questions.
- Manage pipeline scheduling, orchestration, and SLA monitoring to ensure data freshness targets are met across all business-critical data products.
Data Governance and Access
- Implement role-based access controls (RBAC) and data governance frameworks that enable squad-level and group-level data access rather than ad hoc individual permissions.
- Build and maintain data catalogs and lineage documentation that make it clear what data exists, where it comes from, what transformations have been applied, and who has access.
- Design data access patterns specifically for AI agents, ensuring agents can retrieve the data they need with appropriate authorization, audit trails, and containment boundaries.
- Ensure all data infrastructure complies with HIPAA requirements, including data de-identification for non-production environments, PHI access logging, and encryption at rest and in transit.
- Collaborate with security and IT teams to implement secrets management best practices for database credentials, API keys, and service accounts used in data pipelines.
Platform and Infrastructure
- Architect Snowflake environments for cost-effective performance, including warehouse sizing, clustering, materialized views, and query optimization strategies.
- Support the lower-environment data strategy by implementing alternatives to full production data replication, including data subsetting, synthetic data generation, and lookback-window-based approaches.
- Collaborate with DevOps and infrastructure teams on AWS-based data infrastructure, including S3 storage optimization, IAM policies for data access, and cost management across data storage tiers.
- Evaluate and implement data integration tools and frameworks that reduce pipeline development time while maintaining reliability and observability.
AI and Analytics Enablement
- Partner with the AI team to build data foundations for AI workloads, including feature stores, training data pipelines, and governed access to claims and pricing data for model development.
- Build data pipelines that support real-time and near-real-time use cases for AI-driven pricing, claims analysis, and clinical intelligence.
- Develop data products that leverage RxSense’s longitudinal claims data as a compounding competitive advantage—enabling trend analysis, formulary optimization, and cost management insights.
- Support the development of financial visibility tools that enable reporting on per-customer cost and spend, closing a critical gap in current business intelligence capabilities.
Qualifications
- 5+ years of professional data engineering experience with strong proficiency in SQL and data pipeline development for production systems.
- Deep hands-on experience with Snowflake, including data modeling, performance optimization, warehouse management, and cost governance.
- Strong experience with SQL Server, including complex query development, stored procedures, and understanding of transactional database patterns.
- Proficiency in Python for data pipeline development, automation, and integration with cloud services and APIs.
- Experience with ETL/ELT tools and frameworks (dbt, Matillion, Airflow, or equivalent) and data orchestration patterns.
- Experience with AWS cloud services relevant to data engineering (S3, Glue, Lambda, IAM, or equivalent).
- Strong understanding of data governance, data quality frameworks, and access control models—particularly in environments with sensitive or regulated data.
- Experience building data catalogs, lineage documentation, or metadata management systems.
- Excellent problem-solving skills and a track record of building data systems that are reliable, well-documented, and maintainable by teams beyond the original author.
- Bachelor’s degree in Computer Science, Data Science, Engineering, or a related field, or equivalent practical experience.
Preferred
- Experience in healthcare, pharmacy benefits, health insurance, or a related regulated industry. Understanding of claims data structures, pharmacy transactions, or PBM operations is highly valued.
- Experience with healthcare data standards (HL7, FHIR, NCPDP) and HIPAA compliance requirements for data infrastructure.
- Experience building data infrastructure that supports AI/ML workloads, including feature engineering, training data management, and model serving pipelines.
- Familiarity with data de-identification techniques and strategies for managing PHI in non-production environments.
- Experience with Redis or distributed caching systems and understanding of how cached data layers interact with analytical systems.
- Track record of migrating or modernizing legacy data architectures into cloud-native platforms.
- Experience with data observability tools (Monte Carlo, Metaplane, or equivalent) and SLA-driven pipeline monitoring.
RxSense believes that a diverse workforce is a more talented and productive workforce. As such, we are an Equal Opportunity and Affirmative Action employer. Our recruitment process is free from discriminatory hiring practices and all qualified applicants are considered for employment without regard to race, color, religion, sex, gender, sexual orientation, gender identity, ancestry, age, or national origin. Neither will qualified applicants be discriminated against on the basis of disability or protected veteran status. We believe in the strength of the collaboration, creativity and sense of community a diverse workforce brings.