Location
London Area, United Kingdom
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
About Us
HENI is an international art services business working with leading artists and estates across printmaking, marketplaces for physical artworks, NFTs, publishing, digital, video production, and art research and analysis. At the cutting edge of art and technology, HENI uses the latest tools to make art accessible to audiences worldwide.
Position Overview
We are looking for a Data Scientist with a strong academic foundation to join our Data team. You will apply your research skills to customer analytics — building models, creating dashboards, and delivering insights that support commercial teams and C-suite decision-making. You will also play a key role in broader data initiatives across the organisation, working to shape and deliver cross-team projects.
Key Responsibilities
Customer Analytics
- Design and execute customer analytics: segmentation models, retention analysis, and behavioural insights
- Create and maintain Customer Data Reports for C-suite stakeholders covering key business metrics
- Build and maintain dashboards in Apache Superset for self-serve business intelligence
- Write analytic SQL queries to support accounts and client liaison teams
- Integrate third-party data sources (e.g. HubSpot, Facebook Business) into the customer data platform
Data Engineering & Platform
- Build and maintain data pipelines
- Develop internal data applications for ad-hoc analysis and customer research
- Implement data quality checks and validation to ensure pipeline reliability
- Support data architecture decisions and contribute to broader data platform improvements
- Respond to ad-hoc data requests from across the business
- Contribute to HENI News data initiatives
Required Technical Skills
Data Processing & Analytics
- Strong Python skills — the primary language for all data work
- pandas and numpy for exploratory analysis and smaller datasets
- SQL for analytical queries and database interaction
- Strong foundation in statistical modelling and/or machine learning (e.g. scikit-learn, scipy, statsmodels)
- Experience with data visualisation libraries (matplotlib, seaborn, or similar)
- Experience working with REST APIs
Visualisation & Reporting
- Experience with BI/dashboarding tools (e.g. Superset, Looker, Metabase)
- Experience building internal data tools or apps (e.g. Streamlit, Dash)
Software Development Practices
- Git and version control workflows
- Familiarity with automated testing approaches: unit tests, integration tests, and data quality tests
- Familiarity with Infrastructure as Code, containerisation (Docker), CI/CD
- Writing clean, maintainable, well-structured code
Nice-to-Have Skills
- Experience with distributed data processing frameworks (e.g. PySpark, Spark SQL)
- Experience with cloud-based data pipeline tools (e.g. AWS Glue, Azure Data Factory, GCP Dataproc)
- Experience with container orchestration (e.g. Kubernetes, Docker Swarm, AWS ECS)
- Familiarity with cloud object storage (e.g. S3, GCS, Azure Blob) and columnar data formats (e.g. Parquet)
- Experience with CRM/marketing platform APIs (e.g. HubSpot, Salesforce or similar)
- Experience integrating LLM APIs (e.g. Gemini/Vertex AI, OpenAI/ChatGPT) to build sophisticated data products
Our Stack
- AWS (S3, RDS, Glue, ECS, EC2)
- Airbyte, Apache Airflow
- Streamlit, Apache Superset
- Delta Lake, PostgreSQL
- Docker, Kubernetes, AWS CDK
- Git
Programming Languages
- Python (primary)
- SQL (strong)
Education & Experience
- PhD in a quantitative discipline (e.g. Statistics, Computer Science, Physics, Mathematics, Engineering, or related field)
- 1-2 years of industry experience in a data science, analytics, or software role
- Ability to translate academic research skills into practical business insights
- Experience presenting data insights to non-technical stakeholders
- Eager to learn production data engineering practices and cloud tooling