We are looking for a goal-oriented and driven AI/ML Engineer with strong experience in building, deploying, and scaling AI/ML applications. The ideal candidate will have hands-on experience with generative AI, agentic AI systems, RAG applications, LLM platforms, APIs, cloud deployment, and production-ready AI architectures.

Key Responsibilities

AI/ML Model Development

Develop, train, fine-tune, and optimise machine learning, generative AI, and neural network models to meet specific business and functional requirements.

Generative AI and Agentic AI Development

Design and build generative AI applications, agentic AI workflows, and multi-agent architectures using modern AI frameworks and orchestration tools.

RAG and GraphRAG Applications

Build Retrieval-Augmented Generation (RAG) applications, including GraphRAG solutions that use knowledge graphs, Neo4j, Astra DB, vector databases, and related retrieval technologies.

LLM Application Development

Work with both open-source and closed-source large language models to build scalable AI applications, including model routing, prompt engineering, evaluation, and optimisation.

Voice-Based AI Implementation

Design and implement voice-based AI solutions, including speech-to-text, text-to-speech, conversational AI, and voice-enabled intelligent assistants.

API Development and Integration

Create robust API endpoints using frameworks such as FastAPI to give external systems and applications reliable access to AI models.

AI Platform Development

Architect and develop a user-friendly AI platform where multiple AI models can be accessed, managed, and utilised through API calls.

System Design and Scalable Architecture

Contribute to the design of scalable, reliable AI systems, including queue-based processing, asynchronous workflows, distributed services, caching mechanisms, and production-grade backend architecture.

LLM Performance and Caching Optimisation

Optimise LLM performance and scalability using caching mechanisms such as KV caching, response caching, and prompt caching, together with efficient model-serving strategies.

Observability and Monitoring

Implement observability, logging, tracing, monitoring, and evaluation workflows using tools such as Langfuse and related platforms to track system performance, reliability, cost, and user interactions.

Cloud Deployment and Infrastructure

Deploy AI/ML applications across different cloud providers and server environments, ensuring scalability, reliability, security, and performance.

Continuous Improvement

Continuously monitor, update, and improve models, APIs, workflows, and platforms based on user feedback, system performance, and evolving AI technologies.

Skills and Qualifications

Minimum 3 years of experience in building AI/ML software and production-ready AI applications.
Strong expertise in machine learning, neural networks, deep learning, and generative AI applications.
Proficiency in Python and its AI/ML ecosystem, including TensorFlow, PyTorch, NumPy, LangChain, LangGraph, FastAPI, and related tools.
Experience with agentic AI, multi-agent architecture, RAG, GraphRAG, and LLM-based application development.
Hands-on experience with Langfuse, LiteLLM, observability tools, tracing, model monitoring, and AI evaluation workflows.
Experience working with queues, asynchronous processing, caching mechanisms, scalable system design, and backend architecture.
Strong understanding of knowledge graphs, vector databases, Neo4j, Astra DB, and graph-based retrieval systems.
Experience with both open-source and closed-source LLMs.
Experience deploying AI applications across different cloud providers and server environments.
Good understanding of software engineering best practices, including clean code, testing, documentation, CI/CD, version control, and maintainable system design.
Excellent problem-solving abilities with strong attention to detail.
Strong communication skills and the ability to collaborate effectively in a team-oriented environment.

Bonus / Preferred Experience

Experience implementing voice-based AI applications, including conversational AI, speech-to-text, text-to-speech, and voice assistant technologies.
Experience scaling LLM applications using caching mechanisms such as KV caching, prompt caching, and response caching, together with efficient inference strategies.
Experience working across multiple cloud providers.
Experience integrating both open-source and closed-source LLMs into production applications.
Experience with advanced LLM operations, including model routing, cost optimisation, monitoring, and performance tuning.

Educational Requirements

Bachelor’s or Master’s degree in Computer Science, Engineering, Artificial Intelligence, Data Science, or a related field, with a focus on AI/ML.

Working Hours

Candidates must be available to work core UK business hours, 9:00–17:30 GMT/BST, Monday–Friday.

