Location
Austin, TX
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
About Us
We are a stealth-mode startup building next-generation infrastructure for the AI industry. Our mission is to make advanced language models portable, efficient, and customizable for real-world deployments. We’re building tools that allow vendors to fine-tune models easily and deploy them securely on diverse hardware.
Role
We are seeking an AI/ML Engineer (Python) to help design and implement our AI pipelines. This is not an academic research role — you will be productizing and automating existing fine-tuning techniques (LoRA/QLoRA) so vendors can train and manage their own adapters with minimal effort.
You’ll work closely with backend engineers (Node.js) who orchestrate jobs and dashboards, while you focus on the training pipelines and adapter export logic.
Responsibilities
- Implement and maintain LoRA/QLoRA fine-tuning pipelines using PyTorch, Hugging Face Transformers, and PEFT.
- Develop logic for incremental training and adapter stacking, producing clean, versioned “delta packs.”
- Automate data preprocessing (tokenization, formatting, filtering) for user-supplied datasets.
- Build training scripts/workflows that integrate with orchestration backends (Node.js, REST/gRPC, or job queues).
- Implement monitoring hooks (loss curves, checkpoints, eval metrics) to feed into dashboards.
- Collaborate with DevOps to ensure reproducible, portable training environments.
- Write tests to guarantee reproducibility and correctness of adapter outputs.
- Be present in the office occasionally for discussions and team collaboration.
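To make the core idea concrete: LoRA trains a small low-rank delta (B·A) on top of a frozen weight matrix W, and only those small factors need to be versioned and shipped. The sketch below illustrates this in plain NumPy; the `export_delta_pack` layout is a hypothetical example of the "delta pack" concept mentioned above, not this company's actual format.

```python
import json
import numpy as np

def lora_merge(W, A, B, alpha, r):
    """Apply a LoRA update: W' = W + (alpha / r) * (B @ A)."""
    return W + (alpha / r) * (B @ A)

def export_delta_pack(name, version, A, B, alpha, r):
    """Bundle trained low-rank factors with versioned metadata.
    (Hypothetical 'delta pack' layout, for illustration only.)"""
    return {
        "name": name,
        "version": version,
        "config": {"r": r, "alpha": alpha},
        "tensors": {"lora_A": A.tolist(), "lora_B": B.tolist()},
    }

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection (zero init)

# With B initialized to zero, the merged weight starts equal to W,
# so training begins exactly from the base model's behavior.
W_merged = lora_merge(W, A, B, alpha=16, r=r)
assert np.allclose(W_merged, W)

# The adapter is far smaller than the full matrix: 512 vs 4096 params here.
adapter_params = A.size + B.size

pack = export_delta_pack("demo-adapter", "0.1.0", A, B, alpha=16, r=r)
blob = json.dumps(pack)  # small enough to version and ship on its own
```

In a production pipeline the factors would come from PEFT's trained adapter weights rather than random arrays, but the economics are the same: ship kilobytes of deltas instead of gigabytes of model.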
Requirements
- Strong programming skills in Python.
- Hands-on experience with PyTorch and the Hugging Face ecosystem (Transformers, Datasets, PEFT).
- Familiarity with LoRA/QLoRA or other parameter-efficient fine-tuning methods.
- Understanding of mixed precision training (FP16/BF16) and memory optimization techniques.
- Experience building training scripts that are production-ready (reproducibility, logging, error handling).
- Comfortable working in Linux GPU environments (CUDA, ROCm).
- Ability to collaborate with backend/frontend engineers who are not ML specialists.
Nice to Have
- Experience with bitsandbytes, xformers, or flash-attention.
- Familiarity with distributed training (multi-GPU, NCCL, DeepSpeed, or Accelerate).
- Prior work in MLOps or packaging ML pipelines for deployment.
- Contributions to open-source ML libraries.
Why Join
- Build the core training product that lets vendors adapt models safely and efficiently.
- Focus on product engineering, not open-ended research.
- Collaborate with a lean, highly technical team at the intersection of AI and systems.
- Competitive compensation, equity potential, and flexible remote work.