Data Scientist - Clinical Data Extraction & AI Integration

Invent Health

Location: Chennai, Tamil Nadu, India

Salary: Not specified

Type: Full-time

Posted: Today (via LinkedIn)

Job Description

Data Scientist - Clinical Data Extraction & AI Integration

Experience Level: 3-6 Years

Employment Type: Full-time

About the Role

We are seeking an experienced Data Scientist to join our healthcare technology team, focusing on medical document processing and data extraction systems. You will work with large language models and NLP techniques to build robust systems that extract critical information from clinical documents, improving healthcare data workflows and patient care outcomes.

Key Responsibilities

Data Science & Analytics

  • Design and implement statistical models for medical data quality assessment
  • Develop predictive algorithms for encounter classification and validation
  • Build machine learning pipelines for document pattern recognition
  • Create data-driven insights from clinical document structures
  • Implement feature engineering for medical terminology extraction

Advanced Analytics & ML

  • Apply natural language processing (NLP) techniques to clinical text
  • Develop statistical validation frameworks for extracted medical data
  • Build anomaly detection systems for medical document processing
  • Create predictive models for discharge date estimation and encounter duration
  • Implement clustering algorithms for provider and encounter classification

AI & LLM Integration

  • Integrate and optimize Large Language Models via AWS Bedrock and API services
  • Design and refine AI prompts for clinical content extraction with high accuracy
  • Implement fallback logic and error handling for AI-powered extraction systems
  • Develop pattern matching algorithms for medical terminology
  • Create validation layers for AI-extracted medical information

Healthcare Domain Expertise

  • Work with medical document structures
  • Implement healthcare-specific validation rules
  • Handle medical terminology extraction and clinical context analysis
  • Ensure HIPAA compliance and data security best practices

Technologies & Tools

  • Languages: Python 3.8+, R, SQL, JSON
  • Data Science Stack: pandas, numpy, scipy, scikit-learn, spaCy, NLTK
  • ML Frameworks: TensorFlow, PyTorch, transformers, huggingface
  • Visualization: matplotlib, seaborn, plotly, Tableau, PowerBI
  • AI Platforms: AWS Bedrock, Anthropic Claude, OpenAI APIs
  • Cloud Services: AWS (SageMaker, S3, Lambda, Bedrock)
  • Research Tools: Jupyter notebooks, Git, Docker, MLflow
