Location
New York, NY
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
We provide organizations with invaluable foresight, empowering them to anticipate outcomes and proactively make the right decisions at the right time, every time.
We're a small, dedicated, mission-driven team and we intend to stay that way. We believe the best work happens when exceptionally talented people are given ownership, trust and the space to operate without bureaucratic friction. We work with urgency and intellectual honesty and expect new team members to match our velocity. We seek individuals who thrive at the frontier, who push beyond conventional limits, who bring curiosity and conviction in equal measure, and who want their work to have demonstrable impact in the world. If you're energized by the idea of a small team doing things that feel impossible, let’s build together.
ABOUT THE ROLE
As a Data Engineer, you'll build and scale the data acquisition and enrichment infrastructure that makes our simulations accurate. The Data team owns the pipelines that ingest, process, and serve the diverse real-world data sources our simulation engine depends on — from public demographic datasets to proprietary consumer behavior signals. You'll work on turning messy, heterogeneous data into clean, structured inputs that power every simulation we run.
RESPONSIBILITIES
- Design and build scalable data ingestion pipelines for diverse sources: public datasets (census, labor statistics), licensed proprietary data, and web-scraped sources
- Develop the data enrichment layer that joins location-level behavioral data with demographic profiles, workplace characteristics, and consumer behavior markers
- Build and maintain systems for processing unique data types — foot-traffic patterns, cross-shopping behavior, and trade area demographics
- Implement data quality monitoring and validation to ensure incoming data meets accuracy thresholds before it reaches the simulation engine
- Collaborate with AI/ML and research teams to identify and integrate new data sources that improve simulation fidelity
- Own the data infrastructure that serves enriched datasets to the simulation pipeline at the speed and reliability production demands
YOU MAY BE A FIT IF
- You've built production data pipelines that ingest and process data from multiple heterogeneous sources at scale
- You have experience working with geospatial data, census datasets, or similar public/proprietary data sources
- You care deeply about data quality and have built systems to detect, flag, and remediate data issues automatically
- You're comfortable with the full data lifecycle: acquisition, cleaning, transformation, storage, and serving
- You have strong SQL skills and experience with both OLTP and analytical databases
- You can work independently to scope, plan, and execute data infrastructure projects
STRONG CANDIDATES MAY ALSO
- Have experience with geospatial processing (reverse geocoding, census block mapping, trade area analysis)
- Have built ETL/ELT pipelines for alternative data (foot traffic, mobility, transaction data)
- Have worked with imputation techniques for handling missing or sparse data
- Have familiarity with demographic modeling or population statistics
LOCATION
This role is based in New York City. We are an in-person company, and during this period of hypergrowth we work in the office 6 days a week. Candidates are expected to be located within the New York City metropolitan area or be open to relocation.
BENEFITS
We take care of our people. In addition to a competitive base salary and equity participation, we offer comprehensive medical, vision, and dental coverage, visa sponsorship and relocation support, and various other benefits and perks.