Senior/Staff Software Engineer (Data Platform)

Actively AI

Location

San Francisco, CA

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

About Actively AI

Our thesis is that businesses of the future will be powered by agentic, human-in-the-loop machines that make every business function 10x more efficient.

Actively AI is building that superintelligent machine for Enterprise GTM organizations, focused on increasing productivity per rep. We power the day-to-day for outbound teams at dozens of companies like Samsara, Ramp, Verkada, and Ironclad.

Why does this matter?

Because revenue is the ultimate fuel for businesses.

The hundreds of millions of dollars we generate for our customers enable them to employ more people, innovate faster, and deliver more value to their customers.

In addition to top-notch customers that love our product, our team is incredibly high caliber - the co-founders are former Stanford AI researchers and the engineering team comes from Harvard, CMU, Berkeley, Brex, Scale AI, and Google. We're also backed by top investors, including Bain Capital Ventures, First Round Capital (seed investors in Uber, Square, Roblox, Clearbit), Lachy Groom, and Stanford AI faculty.

We have a very ambitious product and scaling roadmap, there’s strong market interest in what we are doing, and it’s time to put our foot on the gas. If you get excited by the thought of working really hard on these kinds of problems with a high-caliber team, then Actively AI is the right place for you.

About Actively

Actively is building superintelligence for GTM teams.

We train custom reasoning agents to think like a company’s best sales reps, and then use them to evaluate millions of datapoints about every prospect account in the company’s TAM, feeding sales reps the optimal actions to take every day to maximize revenue.

Our customers include some of the fastest-growing companies out there — Ramp, Verkada, Justworks, Ironclad, etc. — for whom we’ve driven hundreds of millions of dollars in added revenue in the past year.

We’ve raised over $22.5M in funding, including a recent Series A led by Bain Capital Ventures.

About the Role

We’re looking for a Senior/Staff Data Platform Engineer to build and scale the foundation of Actively’s data ecosystem: the pipelines, transformations, and infrastructure that power every agent, insight, and workflow across the company.

Actively's agents make decisions in real time: which accounts to prioritize, what actions to take, when to involve a human. All of that reasoning runs on data: CRM records, call transcripts, external signals, and customer-specific context pulled from dozens of sources. When that data is stale, malformed, or missing context, the agents get it wrong.

You'll build and scale the data foundation that every agent, insight, and workflow at Actively depends on, designing pipelines that handle diverse, often messy inputs and turn them into clean, structured, agent-ready representations. At scale, that means millions of accounts, each with their own data shapes, business rules, and edge cases, all needing to stay fresh and reliable.

The challenge isn't just throughput. It's building infrastructure that's opinionated enough to enforce quality and consistency, but flexible enough to adapt as new data sources, customer configurations, and agent capabilities keep evolving.

What You’ll Do

  • Own the ingestion and transformation layer.

Design and scale pipelines that pull structured and unstructured data from CRM systems, call transcripts, and external signals, normalizing and enriching it into representations agents can reason over in real time.

  • Build for operational use, not just analytics.

The data you produce doesn't power dashboards; it powers decisions. Freshness, accuracy, and low-latency access matter here in ways they don't in a typical data warehouse.

  • Keep data current as the world changes.

Architect real-time and mini-batch workflows using technologies like Pub/Sub, Kafka, or modern ETL tools to ensure data stays synchronized as customer activity happens.

  • Solve for customer-specific variation at scale.

Every customer has their own CRM configuration, field naming, and business logic. You'll build transformation systems that stay consistent and correct across all of them without becoming brittle.

  • Own reliability end to end.

Observability, lineage, schema management, alerting: you define what "trust in the data" means and make sure it holds across thousands of accounts, so agents and other teams can confidently build on top of it.

  • Work across the full stack.

You'll work across Python, SQL, dbt, BigQuery, and Snowflake, moving between layers fluidly and contributing wherever the work needs it.

Who You Are

  • Deep roots in data systems, not just data tooling. You have 5+ years designing and operating core data infrastructure, from ingestion and transformation to serving and observability, in high-growth environments where the data needed to be right, fresh, and fast.
  • Built for agents and models, not just reports. You've worked on data systems that power ML models, intelligent workflows, or real-time decisioning, and you understand the different demands these put on infrastructure compared to a typical analytics stack.
  • Fluent across the modern data stack. Proficient in Python, SQL, and dbt, with hands-on experience in BigQuery or Snowflake, and familiar with orchestration and data-movement tools like Airflow, Fivetran, or Polytomic.
  • Fluent in real-time infrastructure. You've built streaming and mini-batch pipelines using Pub/Sub, Kafka, Dataflow, or similar technologies, and understand the trade-offs between latency, throughput, and operational complexity.
  • Startup-proven or product-platform experience. You've either built a data platform from scratch at an early-stage company or worked at a data-focused product company (e.g. Segment, dbt Labs) scaling systems across many customers.
  • Self-directed and accountable for quality. You take work from design to production without being managed through it, and you hold yourself responsible for whether the data your systems produce is actually trustworthy.

Nice to Haves

  • Prior experience at a data infrastructure or platform company (e.g. Segment, Databricks, Confluent, Fivetran) or meaningful contributions to open-source data tooling.
  • Familiarity with embedding and vector pipelines: chunking strategies, index management, and keeping representations in sync with fast-changing source data.
  • Experience building data pipelines where correctness was a hard requirement, such as financial data, compliance systems, or other domains where bad data has real downstream consequences.
