Location
San Francisco, CA
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
The Role
You'll be our first dedicated data engineer, building the data foundation that every team at Comfy will rely on. We stood up the basics of a data stack recently — now we need someone to take ownership of it and turn it into something the whole company trusts.
This means designing dimensional models, establishing dbt standards, building pipeline observability, and making data self-serve for product, BizOps, growth, and engineering. You'll work across Comfy Cloud usage, open-source telemetry, Registry activity, GPU scheduling, and billing data. Our stack includes Snowflake and dbt, but we care way more about your ability to learn and ship than whether you've used these exact tools before.
You might be a good fit if
- You've owned a data warehouse end-to-end and built the modeling standards that kept it trustworthy as things scaled
- You care about data quality enough to write the tests before anyone asks you to
- You're comfortable working directly with non-technical stakeholders to define metrics, resolve ambiguity, and build models people can actually self-serve against
- You instinctively reach for automation to eliminate toil, not because it's trendy, but because life is short
- You've shipped dimensional models that survived contact with stakeholders who kept changing their minds
What you'll do
- Own the warehouse end-to-end. Design dimensional models, establish dbt conventions, and build datasets that serve analytics, product, and finance — starting with unifying Comfy Cloud events, open-source install metrics, and Registry activity into a clean model with dbt tests that prove it's correct.
- Build observability into the stack. Alerting, lineage, freshness monitoring from ingestion to consumption — so you know when an upstream schema change breaks something before the Monday metrics review, not after.
- Catch data quality problems at the source. Write the controls and tests that surface issues before they show up in a dashboard and someone has to ask why the numbers look wrong.
- Make data self-serve for every team. Work directly with BizOps, product, and engineering to understand what they're trying to answer, then build the models that let them answer it themselves — including the free-to-paid conversion funnel, a single agreed-upon definition of "active user," and an end-to-end billing model that reconciles usage events with invoices and flags discrepancies automatically.
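To make the billing-reconciliation responsibility concrete, here is a minimal sketch of the idea: aggregate metered usage per account, compare it against invoiced totals, and flag any account where the two disagree beyond a small tolerance. The function name, data shapes, and tolerance are illustrative assumptions, not Comfy's actual schema.

```python
from collections import defaultdict

def reconcile(usage_events, invoices, tolerance=0.01):
    """Return accounts whose metered usage and invoiced totals disagree.

    usage_events: iterable of (account_id, amount) tuples
    invoices:     dict mapping account_id -> invoiced total
    """
    # Sum usage events per account.
    metered = defaultdict(float)
    for account_id, amount in usage_events:
        metered[account_id] += amount

    # Compare every account that appears on either side; a missing
    # entry counts as zero, so unbilled usage and usage-free invoices
    # both surface as discrepancies.
    discrepancies = {}
    for account_id in set(metered) | set(invoices):
        diff = metered.get(account_id, 0.0) - invoices.get(account_id, 0.0)
        if abs(diff) > tolerance:
            discrepancies[account_id] = diff
    return discrepancies
```

In practice this logic would live in the warehouse as a dbt model and test rather than application code, but the shape of the check is the same: one agreed-upon join between usage and billing, with disagreements surfaced automatically instead of discovered in a dashboard.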
Nice to have
- Python for data engineering (not notebooks, actual pipeline code)
- Snowflake, BigQuery, or similar cloud warehouse
- Airflow, Dagster, or similar orchestration
- Experience with product analytics or BI tools
- Familiarity with ComfyUI or node-based workflow tools