Data Scientist

Yochana

Location

Remote

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

Data Scientist

Primary skills:

  • PDF extraction & OCR post-processing: Hands-on experience extracting structured data from PDF, HTML, and scanned sources. Proficiency with Python libraries (pdfplumber, PyMuPDF, Tesseract OCR) and the ability to build robust post-processing and validation pipelines.
  • Python proficiency: Working experience in Python for data manipulation, analysis, and pipeline development. Comfortable with core OOP concepts (classes, encapsulation, inheritance) at a functional level.
  • Generative AI: Knowledge of and experience with Generative AI (LLMs, prompting techniques, RAG / GraphRAG solutions).
  • SQL: Strong working experience in SQL for data querying, validation, and transformation.
  • Data analysis & transformation: Experience analysing, transforming, manipulating and interpreting data.
  • Collaborative code repositories: Experience with shared code repositories (Git/GitHub).

Good-to-have skills:

  • Azure Databricks / PySpark: Experience with Databricks and PySpark for high-volume distributed data processing scenarios.
  • XBRL / iXBRL: Familiarity with XBRL and iXBRL financial reporting formats is an advantage for financial data product work.
  • Agile and scrum tools: Experience working with agile and scrum tools (Azure DevOps).
  • Knowledge graphs / graph databases: Experience with Neo4j / Cypher or similar graph technologies.