Location
Austin, TX
Salary
Not specified
Type
Full-time
Posted
Today
Job Description
About Us
We are a stealth-mode startup building next-generation infrastructure for the AI industry. Our mission is to make advanced language models portable, efficient, and customizable for real-world deployments. We’re building tools that allow vendors to fine-tune models easily and deploy them securely on diverse hardware.
Role
We are seeking an AI/ML Engineer (Python) to help design and implement our AI pipelines. This is not an academic research role — you will be productizing and automating existing fine-tuning techniques (LoRA/QLoRA) so vendors can train and manage their own adapters with minimal effort.
You’ll work closely with backend engineers (Node.js) who orchestrate jobs and dashboards, while you focus on the training pipelines and adapter export logic.
Responsibilities
- Implement and maintain LoRA/QLoRA fine-tuning pipelines using PyTorch, Hugging Face Transformers, and PEFT.
- Develop logic for incremental training and adapter stacking, producing clean, versioned “delta packs.”
- Automate data preprocessing (tokenization, formatting, filtering) for user-supplied datasets.
- Build training scripts/workflows that integrate with orchestration backends (Node.js, REST/gRPC, or job queues).
- Implement monitoring hooks (loss curves, checkpoints, eval metrics) to feed into dashboards.
- Collaborate with DevOps to ensure reproducible, portable training environments.
- Write tests to guarantee reproducibility and correctness of adapter outputs.
- Be present in the office occasionally for discussions and team collaboration.
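To make the core idea concrete: LoRA trains a small low-rank delta (B·A) on top of a frozen weight matrix W, and only those small factors need to be versioned and shipped. The sketch below illustrates this in plain NumPy; the `export_delta_pack` layout is a hypothetical example of the "delta pack" concept mentioned above, not this company's actual format.

```python
import json
import numpy as np

def lora_merge(W, A, B, alpha, r):
    """Apply a LoRA update: W' = W + (alpha / r) * (B @ A)."""
    return W + (alpha / r) * (B @ A)

def export_delta_pack(name, version, A, B, alpha, r):
    """Bundle trained low-rank factors with versioned metadata.
    (Hypothetical 'delta pack' layout, for illustration only.)"""
    return {
        "name": name,
        "version": version,
        "config": {"r": r, "alpha": alpha},
        "tensors": {"lora_A": A.tolist(), "lora_B": B.tolist()},
    }

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection (zero init)

# With B initialized to zero, the merged weight starts equal to W,
# so training begins exactly from the base model's behavior.
W_merged = lora_merge(W, A, B, alpha=16, r=r)
assert np.allclose(W_merged, W)

# The adapter is far smaller than the full matrix: 512 vs 4096 params here.
adapter_params = A.size + B.size

pack = export_delta_pack("demo-adapter", "0.1.0", A, B, alpha=16, r=r)
blob = json.dumps(pack)  # small enough to version and ship on its own
```

In a production pipeline the factors would come from PEFT's trained adapter weights rather than random arrays, but the economics are the same: ship kilobytes of deltas instead of gigabytes of model.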
Requirements
- Strong programming skills in Python.
- Hands-on experience with PyTorch and the Hugging Face ecosystem (Transformers, Datasets, PEFT).
- Familiarity with LoRA/QLoRA or other parameter-efficient fine-tuning methods.
- Understanding of mixed precision training (FP16/BF16) and memory optimization techniques.
- Experience building training scripts that are production-ready (reproducibility, logging, error handling).
- Comfortable working in Linux GPU environments (CUDA, ROCm).
- Ability to collaborate with backend/frontend engineers who are not ML specialists.
Nice to Have
- Experience with bitsandbytes, xformers, or flash-attention.
- Familiarity with distributed training (multi-GPU, NCCL, DeepSpeed, or Accelerate).
- Prior work in MLOps or packaging ML pipelines for deployment.
- Contributions to open-source ML libraries.
Why Join
- Build the core training product that lets vendors adapt models safely and efficiently.
- Focus on product engineering, not open-ended research.
- Collaborate with a lean, highly technical team at the intersection of AI and systems.
- Competitive compensation, equity potential, and flexible remote work.