LLM / GenAI Engineer

Evlo AI

Location

Los Angeles, CA

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

About The Role

The role focuses on building, optimizing, and scaling production-grade Generative AI systems, moving beyond basic API wrappers to construct robust RAG pipelines, multi-agent orchestration, and fine-tuning workflows. The engineer will collaborate closely with product and data platform teams to integrate advanced language models into core enterprise workflows.

This position requires deep technical knowledge of LLM mechanics, vector search optimization, and systematic evaluation. The team prioritizes building deterministic, reliable, and low-latency AI features that deliver measurable business value under strict production SLAs.

Key Responsibilities

  • Design and optimize advanced Retrieval-Augmented Generation (RAG) pipelines utilizing hybrid search, query rewriting, and reranking models
  • Develop and deploy autonomous agentic workflows and multi-step reasoning systems using LangChain, LangGraph, or custom orchestration frameworks
  • Fine-tune open-source models (such as Llama or Mistral) using PEFT techniques like LoRA and QLoRA on domain-specific datasets
  • Build and scale low-latency vector database architectures with Pinecone, Qdrant, or pgvector, ensuring efficient indexing and partitioning
  • Implement systematic LLM evaluation and observability frameworks using tools like Arize Phoenix, LangSmith, or Ragas to monitor drift, bias, and accuracy
  • Optimize model inference pipelines for latency and cost using quantization (AWQ, GPTQ) and serving frameworks like vLLM or TGI

What We Are Looking For

  • 3-6 years of software engineering experience, with at least 1.5 years of hands-on experience deploying LLMs and generative systems to production
  • Strong software development skills in Python, including experience with asynchronous programming, FastAPI, and robust unit/integration testing
  • Proven experience with vector databases and semantic search optimization techniques at scale
  • Solid understanding of ML fundamentals, transformer architectures, tokenization, and embedding models
  • BS or MS in Computer Science, Data Science, or a related highly quantitative field
  • Bonus: Experience with model optimization (TensorRT-LLM), custom pre-training, or contributing to open-source GenAI frameworks
