Location
New York, NY
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
The Commodities Technology team builds and operates the data platform that aggregates and curates critical commodities data, including weather, supply/demand, storage, transportation and other fundamental and alternative datasets. This curated “content layer” is central to how our Portfolio Managers and researchers understand markets and construct trades.
We are seeking a Commodities Content Engineer who will focus on building robust ETL workflows and data models on top of our commodities data platform.
In this role, you will use Python and SQL to design, implement and maintain pipelines that ingest, clean, transform, and catalog commodities datasets. You will work closely with quantitative researchers, data analysts, and the broader Commodities Technology team to translate domain requirements into well‑structured, reliable data assets that can be easily discovered and reused across strategies.
This is a hands‑on engineering role with significant exposure to commodities data and the opportunity to shape how that data is represented and consumed across the firm.
Key Responsibilities
- Design and implement end‑to‑end ETL workflows in Python and SQL to ingest and transform commodities data from multiple vendors and internal sources.
- Build and maintain standardized data models, schemas, and metadata that make commodities datasets easy to understand and discover within the platform.
- Use Airflow (or similar tools) to schedule, monitor, and manage data pipelines, ensuring reliability and timely delivery.
- Implement robust validation, reconciliation, and anomaly‑detection checks to ensure data completeness, correctness, and consistency.
- Leverage AI to automate schema inference across structured and semi-structured data sources, manage schema drift, and accelerate development of scalable ingestion pipelines.
- Apply AI-driven data quality, observability, and documentation capabilities to detect anomalies, monitor data health, and generate clear lineage and technical documentation across complex data workflows.
- Leverage Git, GitHub Actions, and automated testing (PyTest) to maintain high‑quality code and repeatable deployments.
- Partner with commodities PMs, researchers, and data strategists to understand use cases and continuously refine datasets, definitions, and documentation.
Required Qualifications
- 4 years of experience in data engineering, analytics engineering, or similar roles focused on building and maintaining ETL pipelines.
- Strong skills in Python and SQL, with experience working with large datasets and complex transformations.
- Hands‑on experience with Airflow or other workflow schedulers.
- Familiarity with version control (Git), CI/CD pipelines (GitHub Actions or equivalent), and test automation (e.g., PyTest).
- Strong attention to detail, data quality, and documentation; ability to reason about edge cases and data integrity.
- Ability to work independently, communicate clearly with both technical and non‑technical stakeholders, and manage work across multiple concurrent initiatives.
Preferred Qualifications
- Knowledge of commodities markets and commodities data (e.g., weather, supply/demand, storage, freight, flows).
- Experience with data warehousing technologies (e.g., Snowflake, columnar storage formats, or analytic databases).
- Prior experience in a financial services, trading, or research‑driven environment.
- Exposure to data catalog / data governance tools and best practices.
The estimated base salary range for this position is $175,000 to $250,000, which is specific to New York and may change in the future. Millennium offers a total compensation package that includes a base salary, a discretionary performance bonus, and comprehensive benefits. When finalizing an offer, we consider an individual’s experience level and qualifications to formulate a competitive total compensation package.