See Me Please is an accessibility testing platform connecting enterprise organisations with diverse testers (people who are blind, have low vision, are deaf, are neurodivergent, or are older) to deliver authentic accessibility insights. Our customers include Westpac, the Australian Government, multiple state governments, and leading universities. We have secured equity-free government funding in recognition of our innovative product offering, which is helping us scale into international markets, and we are heading into Series A.

THE ROLE

You want to build the machine, not just use it. This is a full-stack role whose centre of gravity sits a level above typing code: you'll design and build the skills, agents, evals, and tools that let agentic coding tools do real, trustworthy engineering work on our codebase, and you'll own product features end to end while doing it.

This isn't prompt-writing. It's a genuine engineering discipline: designing agents that use real tools and verify themselves, writing evals that determine what's trustworthy, engineering the context agents reason over, and composing agents into workflows. The standards that govern how we engineer (our baseline, design system, accessibility tenets, and review discipline) live inside this machine, versioned and rated, running every day.

WHY THIS ROLE IS DIFFERENT

We're early-stage but not idea-stage. Our platform is already trusted by some of Australia's largest institutions to inform real decisions about accessibility, risk, and service design. You're shaping the systems that turn lived experience into insights enterprise customers act on.

Most teams keep their engineering standards in people's heads and write every line by hand. We're deliberately encoding how we engineer into an agentic machine and spending our own judgment on the parts that don't hand off well: defining the right work, shaping the platform, staying close to the customer, deciding what's trustworthy.

You'll collaborate with academic CS researchers on long-term problems at the intersection of accessibility, data, and applied AI. We have exceptional backing, proven customers, and a small team of seasoned technologists who've escaped big-org bureaucracy.

WHAT YOU'LL DO

Build the agentic machine: skills, agents, evals, and tools that wrap Claude Code to do real engineering work on our stack
Write evals that determine whether a skill is trustworthy, and decide what graduates from experimental to production
Turn product goals into crisply defined work the machine and the team can execute
Own features end to end, from the Go API or Aurora pipeline through to the React surface in the Insights portal
Contribute to LLM-driven analysis pipelines, video and audio processing at scale, and event-driven architectures on AWS
Collaborate with academic CS researchers on long-term problems at the intersection of accessibility, data, and AI

WHAT YOU BRING

A real interest in building agentic systems, and ideally evidence of it: skills, agents, evals, or tooling others came to rely on
The engineering instinct to direct a machine well: encoding how you think, building evals that prove it works, catching where an agent is confidently wrong
Comfortable in Go (or in another typed language and excited to go deeper in Go); solid TypeScript and React on Next.js
Strong SQL and data modelling in Postgres: schema design, access patterns, migrations
Experience with event-driven or async architectures and the failure modes they produce
Experience with video data at scale — ingestion, processing, transcription, or analysis
3–5 years of full-stack experience (we care more about evidence than tenure)
Comfort in a startup where ambiguity is the steady state and systems are far from perfect

NICE TO HAVE

Agentic systems built using Claude Code, MCP servers, or custom agent tooling
LLM/AI pipelines: retrieval, evaluation, or prompt design
AWS serverless: Lambda, EventBridge, API Gateway; infrastructure-as-code (SST, CDK)
Multi-region data sovereignty for regulated customers
Background in regulated environments: government, financial services, healthcare
Familiarity with sqlc, Atlas, or type-safe SQL tooling

ENGINEERING VALUES

You ship it, you own it
Build the machine: encode how we engineer into skills, agents, and evals; run them daily; know when to override them
Communicate openly: share context, ask questions, give feedback
Deep sense of pride and ownership, because good enough is not good enough

WHAT WE OFFER

Competitive salary based on experience
Equity participation in a growing startup
Flexible working with sensible WFH
MacBook Pro and home office setup
Genuine impact on digital accessibility at international scale
Legitimate deep tech work

IS THIS YOU?

A startup isn't for everyone, and we'd rather say that plainly than find out six months in. There's no place to hide here: no large team to absorb a slow quarter, no established playbook to follow, no senior layer to catch decisions before they land.

What there is: real problems, real customers, and the chance to turn something with genuine commercial promise into something that scales. If you're energised by that, by building in motion, owning the outcome, and seeing your work matter quickly, then we should talk. If you do your best work with clear structure, stable priorities, and a well-defined lane, this probably isn't the right moment for you, and that's okay too.

The pace we work at only holds if we're genuinely aligned, and velocity in our early-stage context needs in-person time. We need you close enough to Sydney to be in the office a few days a week: for the conversations that don't fit a ticket, the decisions that need a whiteboard, and the kind of momentum that's hard to manufacture remotely.

Agentic AI Systems Engineer
