Machine Learning Engineer

Scale.jobs

Location

New York, NY

Salary

Not specified

Type

Full-time

Posted

Today

via LinkedIn

Job Description

About The Role

The role involves architecting and scaling machine learning systems that power core product features, focusing on the bridge between experimental research and high-availability production environments. The engineer will collaborate with cross-functional teams to integrate predictive models into real-time application workflows, ensuring high throughput and low latency for global users.

The team focuses on solving complex problems in natural language processing and predictive analytics, utilizing modern LLM orchestration and MLOps practices. This position is critical for driving the technical roadmap of model deployment, monitoring infrastructure, and continuous integration of new machine learning capabilities.

Key Responsibilities

  • Build and maintain end-to-end ML pipelines using Python, PyTorch, and orchestration tools like Airflow or Prefect to automate training and deployment workflows
  • Develop and optimize Retrieval-Augmented Generation (RAG) systems and agentic workflows using frameworks like LangChain or LlamaIndex for enterprise-grade applications
  • Implement robust monitoring and observability for models in production, tracking metrics for data drift, concept drift, and inference latency using tools like Prometheus or Weights & Biases
  • Optimize model inference performance through quantization, pruning, or the use of specialized engines like TensorRT and ONNX Runtime
  • Containerize machine learning services using Docker and Kubernetes to ensure scalable and reproducible deployments across cloud environments (AWS or GCP)
  • Collaborate on data strategy and feature engineering, building scalable data transformation modules with PySpark or SQL to support high-dimensional model inputs

What We Are Looking For

  • 3–6 years of professional experience in machine learning engineering or software engineering with a focus on data-intensive applications
  • Proven track record of deploying and scaling at least two machine learning models in a production environment serving significant traffic
  • Advanced proficiency in Python and strong familiarity with deep learning frameworks such as PyTorch or JAX
  • Hands-on experience with vector databases like Pinecone, Weaviate, or Milvus and their integration into semantic search systems
  • BS, MS, or PhD in Computer Science, Data Science, Mathematics, or a related quantitative technical field
  • Bonus: Experience with parameter-efficient fine-tuning (PEFT), distributed training techniques (DeepSpeed/FSDP), or contributing to open-source ML libraries
