Software Engineer, AI Platform ($180K–$250K + equity), Enterprise AI Infrastructure

CoffeeSpace

Location

San Francisco, CA

Salary

$180,000–$250,000 per year

Type

Full-time

Posted

Today

via LinkedIn

Job Description

About the job

This role is being recruited by CoffeeSpace on behalf of a VC-backed enterprise AI startup building the work intelligence layer for Fortune 500 companies.

We’re identifying a small number of exceptional AI platform, data platform, and infrastructure engineers from our network.

If there’s a strong fit, we’ll introduce you directly to the founding team.

Location:

San Francisco, CA, in-office Monday to Friday

Compensation:

$180k–$250k base + competitive equity

Employment type:

Full-time

Experience:

3–8 years

About the company

This company is building the work intelligence layer for the world’s largest enterprises.

Their platform embeds into the systems and processes large companies already rely on, captures how work actually gets done, measures productivity and process conformance, and identifies where AI and automation can have the highest impact.

This is not an AI notetaker or a thin wrapper. The company is building the data layer for the autonomous enterprise: an ontology of how work happens across large organizations, powered by massive-scale LLM inference, agent infrastructure, and enterprise-grade data systems.

The company is backed by top-tier VCs, has raised $6M, is headquartered in San Francisco, and is already working with Fortune 500 customers across healthcare, insurance, and retail.

The team is small, technical, and moving quickly, with around 18 people today.

About the role

This is an AI platform and data infrastructure role.

You’ll own the data platform that powers every AI feature the company ships.

A single enterprise customer can generate 10–30TB of LLM inference per week, running continuously 24/7 for the life of the contract. Your job is to make that infrastructure reliable, observable, cost-efficient, and scalable.

You’ll work on LLM-driven ETL systems, agent transformation infrastructure, orchestration, observability, evals, and the production data flows that turn messy work activity into structured, queryable intelligence for the product layer.

This is a high-ownership seat. Reliability, throughput, cost, and debuggability all matter. Nothing meaningful works without this platform.

What you’ll do

  • Own and evolve the data platform powering every AI feature across the company
  • Run and improve complex LLM-driven ETL pipelines handling 10–30TB of inference per customer per week
  • Build agent transformation infrastructure that turns LLM and agent outputs into structured, queryable data
  • Improve reliability, accuracy, throughput, and cost of LLM-driven jobs in production
  • Build observability, evals, and tooling so the team can debug, iterate, and measure system performance quickly
  • Partner with AI engineers to expose new platform capabilities and shape the interfaces they build on
  • Make practical tradeoffs across reliability, latency, cost, and speed in a fast-moving startup environment
  • Participate in an SRE-style on-call rotation and incident response, compensated separately on top of base salary

The ideal candidate

  • 3–8 years of experience in data platform, backend, infrastructure, or production systems engineering
  • Has owned production data platforms or ETL pipelines end-to-end, including retries, backfills, failure recovery, and operational debugging
  • Has worked with terabyte-to-petabyte scale data processing
  • Strong Python production engineering experience, ideally with FastAPI or similar frameworks
  • Experience with AWS infrastructure such as ECS, Lambda, SQS, Step Functions, RDS, and S3
  • Comfortable with infrastructure as code, such as Terraform or similar tools
  • Experience with DAG or orchestration systems such as Dagster, Airflow, Prefect, Temporal, or similar
  • Strong PostgreSQL experience, ideally including pgvector or vector-adjacent systems
  • Has operated LLM-based systems in production and understands cost, latency, reliability, and throughput tradeoffs
  • Has experience with AI-enabled ETL pipelines or agentic data transformation systems
  • Strong abstract problem-solving ability and a reliability mindset
  • Excited to work in-person five days a week in San Francisco

Bonus points

  • Experience at an early-stage startup, ideally Seed to Series C, during a growth inflection
  • Experience with distributed compute systems
  • Experience building observability, evals, or debugging infrastructure for AI systems
  • Meaningful side projects, research, open-source work, founder experience, or another signal of deep technical curiosity outside of work

This role is not a fit if

  • You are primarily an analytics or BI engineer without platform or infrastructure ownership
  • You have only worked on clean batch analytics pipelines and do not want to own production reliability
  • You are uncomfortable with on-call, incident response, or debugging live production systems
  • You rely heavily on LLMs without understanding why systems are built a certain way
  • You prefer narrowly scoped tickets over owning ambiguous infrastructure problems end-to-end
  • You are looking for a remote role

Why this role

  • You’ll own some of the most critical infrastructure at the company.
  • The platform processes unusually large volumes of LLM inference for real enterprise customers, with real operational stakes. Few AI-native startups are operating at this scale, and even fewer are doing it inside Fortune 500 workflows where reliability, privacy, and cost all matter.
  • This is a chance to build the infrastructure layer behind a new category of enterprise AI: systems that understand how work happens, where automation can help, and how autonomous agents can eventually operate inside large organizations.
  • You’ll be joining early, with strong customer pull, top-tier backing, and a technical surface area that spans LLM pipelines, data infrastructure, agent systems, orchestration, observability, and enterprise-scale reliability.

Next steps

  1. Apply via this LinkedIn job post
  2. We’ll review and reach out if there’s a strong match
  3. If aligned, we’ll introduce you directly to the founding team
  4. If this role isn’t the right fit, we may suggest and make introductions to other high-signal startup roles we’re actively recruiting for, always with your permission

A quick note on authenticity

This is a real, active role we are supporting in close partnership with the hiring team. We do not post speculative roles and work directly with teams on their actual hiring needs.
