Inference Performance Engineer

material

Location

New York, NY

Salary

Not specified

Type

Full-time

Posted

Today

via ashby

Job Description

About us

We're an inference cloud built for AI ASICs, generating tokens 5–7× faster than existing GPU infrastructure at a fraction of the price. We closed our oversubscribed seed round, with $97M in compute allocation and $200M in hardware financing underway.

We believe that when something is important enough, no obstacle is insurmountable. If you thrive on extreme ownership, outsized impact, and relentless optimism, this is the place for you.

About the role

Build the inference runtime that powers our ASIC cloud, including batching, KV cache optimization, scheduling, and APIs. You'll work with experts from top manufacturers operating at the frontier of AI hardware design.

What you'll do

  • Build and improve the inference runtime that serves our ASIC hardware

  • Own scheduling, continuous batching, KV cache optimization, and prefill/decode separation

  • Optimize tokens/sec, time to first token (TTFT), p99 latency, and cost per token

  • Collaborate with hardware and compiler teams to update kernels and operators

  • Maintain the OpenAI-compatible API surface

  • Run benchmarks and regression tests

What you'll need

  • BS in Computer Science or related field

  • 3+ years of software engineering experience in Rust, Go, or Python

  • Solid fundamentals in concurrency, memory, and tail latency

  • Familiarity with modern LLM inference: transformers, attention, KV cache, batching, speculative decoding, quantization

  • Experience with model serving: vLLM, TGI, SGLang, TensorRT-LLM, llama.cpp, or custom runtimes

What we'd like

  • CUDA, ROCm, Triton, kernel-level work, or experience with non-NVIDIA accelerators

  • Experience building and scaling an OpenAI-compatible API in production

What we offer

  • Competitive cash compensation

  • Generous stock options

  • 100% paid medical, dental, and vision insurance for employees

  • Flexible PTO

  • Paid holidays

 

Equal Employment Opportunity

We're an Equal Opportunity Employer and do not discriminate on the basis of any protected status under applicable law.
