Location: Sandy, UT
Salary: $125,000 - $155,000 per year
Type: Full-time
Posted: Today
Job Description
DATA PIPELINE ENGINEER — AMAZON DATA INFRASTRUCTURE
Reach Brands | Lehi / Salt Lake City, UT | Full-Time | Hybrid, 1–2 days per week in office
ABOUT REACH
Reach Brands is an Amazon agency managing 40+ active consumer brands. Our team runs Amazon operations across catalog health, listings, advertising, inventory, account strategy, and brand performance.
We are building a more automated, AI-assisted operating system for the business. That system depends on one thing above all else: clean, reliable, timely data.
Right now, our data infrastructure is too fragile for where we are going. We need someone who can own the pipelines, fix the weak points, and build a dependable foundation for reporting, AI agents, operational alerts, and client decision-making.
THE ROLE
- The Data Pipeline Engineer owns the data infrastructure that powers Reach’s internal operating system.
- You will be responsible for the pipelines feeding our account health system, client reporting, AI agents, executive dashboards, and operational monitoring. This includes Amazon Seller Central, Vendor Central, Amazon SP-API, Snowflake, Fivetran, Google Workspace ingestion, dbt, Airflow/Astronomer, and AI system data support.
- This is not a dashboard-building role. This is an infrastructure, reliability, and data quality role. You are the person who makes sure the data is fresh, accurate, traceable, and usable before the business depends on it.
WHAT YOU’LL OWN
Core Data Infrastructure
- Own and maintain Snowflake as the primary data warehouse
- Manage schemas, queries, access control, data models, and freshness monitoring
- Improve data organization so internal teams and AI systems can rely on trusted sources of truth
- Build pipelines that fail loudly instead of silently
- Use dbt to structure, transform, document, and maintain trusted data models
- Use Astronomer or Airflow to schedule, monitor, and maintain production data workflows
Amazon Data Pipelines
- Build and maintain Amazon data pipelines across Seller Central, Vendor Central, and Amazon SP-API
- Ingest and normalize Amazon operational data including inbound shipments, inventory, search terms, catalog/listing data, account health, brand health, sales, fees, advertising, and performance metrics
- Build reliable pipelines for account-level, brand-level, and client-level reporting across multiple Amazon accounts
- Handle Amazon data realities: rate limits, pagination, throttling, retries, delayed availability, schema changes, incomplete data, suppressed listings, stranded inventory, and mismatches between reports and source systems
- Use dbt to transform raw Amazon data into trusted operational models for reporting, alerts, executive dashboards, and AI agent workflows
- Use Astronomer or Airflow to schedule, monitor, retry, and manage Amazon data workflows
- Create freshness checks, anomaly checks, reconciliation checks, and failure alerts so Amazon pipeline issues surface before they affect reporting, AI agents, or account decisions
- Work with internal operators to confirm whether pipeline outputs match what is visible in Amazon Seller Central, Vendor Central, and related source systems
Fivetran / ETL Reliability
- Monitor, troubleshoot, and improve Fivetran connectors
- Fix current pipeline gaps, including Grow+ account health data
- Add and maintain connectors as new client and operational data sources come online
- Create checks that identify sync failures, stale tables, schema drift, and missing data
Google Workspace Ingestion
- Build reliable ingestion from Google Drive, Gmail, and Google Sheets into structured storage
- Work with OAuth, service accounts, and domain-wide delegation
- Normalize messy business data into usable tables and trusted sources
- Support workflows where documents, spreadsheets, emails, and folders become structured operational inputs
AI Agent Data Support
- Provide the data foundation for ReachIQ and internal AI agents
- Help agents access clean context, structured records, commitment data, and account information
- Build safeguards so AI systems are not acting on stale, incomplete, or incorrect data
- Support structured and unstructured data workflows used by AI systems
Data Quality & Observability
- Build monitoring and alerting so failures surface before they become business problems
- Own freshness checks, deduplication, reconciliation, source-of-truth discipline, and data validation
- Document lineage and explain where key metrics come from
- Build confidence that when a GM, executive, or AI agent sees a number, it can be trusted
WHAT SUCCESS LOOKS LIKE
First 30 Days
- Understand the current Snowflake, Fivetran, Amazon, Google Workspace, dbt, Airflow/Astronomer, and AI-agent data architecture
- Identify the most fragile pipelines and highest-risk data quality issues
- Document current data sources, owners, sync frequency, and failure points
- Fix at least one meaningful pipeline, freshness, or reconciliation issue
First 60 Days
- Implement monitoring for critical pipelines
- Improve reliability of Amazon and/or Google Workspace data ingestion
- Establish clear source-of-truth rules for key operational datasets
- Reduce manual checking by creating automated data quality checks
- Improve dbt models and Airflow/Astronomer workflows for critical data flows
First 90 Days
- Critical pipelines have freshness checks and visible failure alerts
- Grow+ / account health data pipeline gaps are resolved or clearly mapped with a remediation plan
- Snowflake structure is cleaner, better documented, and easier for internal systems to use
- Amazon data pipelines are more reliable, traceable, and easier to troubleshoot
- AI agents have more reliable access to the data and context they need
- Leadership has better visibility into what data is trusted, stale, broken, or incomplete
REQUIRED EXPERIENCE
- 2+ years maintaining production data pipelines with real uptime expectations
- Amazon marketplace data experience, including Seller Central, Vendor Central, Amazon SP-API, or Amazon reporting data
- Experience building or maintaining pipelines for Amazon data such as inventory, shipments, sales, fees, advertising, search terms, catalog/listing data, account health, or brand health
- Strong Snowflake experience
- dbt experience
- Astronomer or Airflow experience
- Experience debugging Fivetran or comparable ETL/ELT tools
- Experience with Python for data workflows, scripts, automation, or APIs
- Experience with Google Workspace APIs, including Drive, Gmail, Sheets, OAuth, service accounts, or domain-wide delegation
- Strong data quality instincts: freshness, deduplication, schema drift, reconciliation, lineage, and source-of-truth discipline
- Comfortable owning production systems where failures affect business decisions
- Able to communicate clearly with non-technical operators and executives
- Based in Utah and available in the Lehi/Salt Lake City area 1–2 days per week
PREFERRED EXPERIENCE
- Experience supporting AI agents, LLM workflows, RAG, or context retrieval systems
- Experience building pipeline monitoring, alerts, or observability tooling
- Experience in e-commerce, Amazon marketplace operations, or agency environments
- Experience with multi-account or multi-client data environments
- Comfortable working in a fast-moving, imperfect operating environment
TECH STACK
- Amazon Seller Central
- Amazon Vendor Central
- Amazon SP-API
- Snowflake
- Fivetran
- Python
- dbt
- Astronomer / Airflow
- Google Workspace APIs: Drive, Gmail, Sheets
- AI/LLM systems using structured and unstructured business context
WHAT THIS IS NOT
- This is not a BI analyst role
- This is not primarily a dashboard-building role
- This is not a research role
- This is not a “wait for tickets” engineering role
- This is not a role for someone who only wants clean, perfectly documented systems
This is a role for someone who likes fixing messy, real business infrastructure and making it dependable.
WHO WILL THRIVE IN THIS ROLE
You will thrive here if you:
- Like making fragile systems reliable
- Care deeply about whether the data is actually correct
- Can find the root cause when a number looks wrong
- Understand Amazon data well enough to know when the warehouse does not match the source system
- Are comfortable working across APIs, warehouses, pipelines, spreadsheets, and AI systems
- Prefer clear ownership over narrow job descriptions
- Can explain technical problems in plain English
- Think “it ran” is not good enough if nobody can trust the output
COMPENSATION
Base salary: $125,000 – $155,000 depending on experience. Full benefits.
HOW TO APPLY
ONLY FOR UTAH RESIDENTS
Send a short note that covers these three points:
- Tell us about an Amazon or marketplace data pipeline you have built, maintained, or fixed.
- Tell us about a production data pipeline that broke. What happened, and how did you diagnose it?
- What did you change so the issue would not happen again?
No generic cover letters. Applications without specific examples will not be reviewed.