Cloud Platform Engineer

ebp Global

Location: Remote

Salary: Not specified

Type: Full-time

Posted: Today

via LinkedIn

Job Description

Cloud Platform Engineer (m/f)

📍 US | 🕒 Full-Time | Remote

Company Description

ebp Global is a high-performing boutique consultancy firm best known for delivering tailored, impactful solutions to our clients’ most complex problems, from conceptualisation to implementation. Our expertise covers a wide range of value chain activities, from strategy, organisational design and operating models, through operations and business process optimisation, to information flows and analytics. Thanks to our hands-on approach and deep knowledge, we are proud to count some of the world’s most well-known companies, across a wide variety of industries, as long-term client partners.

We are uniquely global: we not only operate on a global scale but also work in a global way, with one another and with our clients. Our team is made up of experts with operational, industry-related experience, who bring a true understanding of our clients’ problems and a passion to solve them.

See https://ebp-global.com/ for further details about our company.

Job Overview

We are seeking a highly skilled and experienced Cloud Platform Engineer with expertise in Azure and AWS to join our dynamic IT team.

The ideal candidate will be responsible for designing, implementing, and managing our cloud architecture and infrastructure, ensuring the highest levels of availability, performance, and security. Overall, you’ll strive for efficiency by aligning cloud systems with business goals.

You will work closely with colleagues to gather requirements and translate them into solutions. You will also contribute to the delivery of robust, supportable and sustainable infrastructure solutions, in accordance with agreed organisational standards, that ensure services are resilient, scalable and future-proof.

You are a self-starter with an inquisitive nature who looks beyond the obvious to explore why things are the way they are. Critical and conceptual thinking and problem-solving skills are essential, alongside a passion for networking.

Job Responsibilities

  • Design and Architecture:
  • Design and implement scalable and secure network architectures in both Azure and AWS environments.
  • Develop comprehensive architectural blueprints and documentation for cloud infrastructure.
  • Plan and execute cloud migration strategies, including hybrid cloud solutions.
  • Design infrastructure for AI/ML workloads including GPU/TPU compute clusters, high-throughput storage, and low-latency networking between nodes
  • Architect MLOps pipelines integrating model training, versioning, and deployment workflows on cloud platforms (e.g., Azure ML, AWS SageMaker)
  • Infrastructure Setup and Management:
  • Deploy and manage virtual networks, subnets, route tables, and network gateways.
  • Implement and manage VPN connections, Direct Connect (AWS), and ExpressRoute (Azure).
  • Configure and manage load balancers, firewalls, and security groups.
  • Oversee DNS setup and management within cloud environments.
  • Deploy and manage AI-specific services such as AWS SageMaker, Azure Machine Learning, and GPU-enabled VM fleets
  • Set up and manage vector databases (e.g., Pinecone, Weaviate, pgvector on RDS) and object storage optimized for large model artifacts
  • Configure container orchestration (Kubernetes/EKS/AKS) for scalable model serving and inference endpoints
  • Deploy and manage API hosting environments including containerized REST APIs using Docker and Kubernetes (EKS/AKS)
  • Configure and manage API Gateways (AWS API Gateway, Azure API Management) for routing, throttling, and versioning
  • Security and Compliance:
  • Implement and maintain robust security protocols to safeguard cloud infrastructure.
  • Conduct regular security audits and compliance checks.
  • Ensure cloud infrastructure adheres to industry standards and regulatory requirements.
  • Implement data governance and access controls for sensitive training datasets and model artifacts
  • Ensure compliance with AI-specific regulations and responsible AI frameworks (e.g., EU AI Act considerations)
  • Performance Optimization:
  • Monitor network performance and implement tuning measures to optimize throughput and latency.
  • Troubleshoot and resolve network-related issues promptly.
  • Conduct capacity planning and scaling to accommodate growing workloads.
  • Optimize inference latency and throughput for deployed models using techniques like auto-scaling endpoints, spot instances, and caching layers
  • Monitor GPU utilization, model drift, and endpoint health using tools like CloudWatch, Azure Monitor, or Prometheus
  • Automation and Scripting:
  • Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or ARM templates.
  • Automate deployment, configuration, and management tasks using scripting languages such as Python, PowerShell, or Bash.
  • Build and maintain CI/CD pipelines for model deployment using tools like MLflow, Kubeflow, or Azure DevOps
  • Automate model retraining triggers, A/B deployment rollouts, and blue/green model switches
  • Deploy Python-based REST APIs using frameworks such as FastAPI or Flask
  • Build CI/CD pipelines for automated testing, containerization, and deployment of Python APIs to cloud environments
  • AI/ML Platform Support:
  • Support LLM and generative AI deployments, including API gateway configuration for model services like Azure OpenAI or Amazon Bedrock
  • Manage prompt caching layers, rate limiting, and cost monitoring for AI API consumption
  • Collaborate with data science and AI teams to translate model requirements into scalable cloud infrastructure
  • Collaboration and Support:
  • Work closely with development, operations, and security teams to ensure seamless integration and operation of cloud services.
  • Provide technical guidance and support to junior network engineers and other team members.
  • Participate in on-call rotation for after-hours support as needed.
  • API Development & Management:
  • Design, deploy, and manage RESTful APIs built in Python (FastAPI, Flask, or Django REST Framework)
  • Manage full API lifecycle — versioning, documentation (Swagger/OpenAPI), deprecation, and rollout strategies
  • Implement API security best practices including OAuth2, API key management, rate limiting, and JWT authentication
  • Monitor API performance, uptime, and error rates using tools like CloudWatch, Azure Monitor, or Datadog
  • Manage API monetization or access tiers where applicable, using gateway-level policies

Key Skills for a Cloud Platform Engineer

  • Minimum of 5 years of experience in network engineering, with at least 3 years focused on cloud environments.
  • Proven experience designing and managing network infrastructure in both Azure and AWS.

Education:

  • Bachelor's degree in Computer Science, Information Technology, or a related field. Relevant certifications and experience may be considered in lieu of a degree.

Certifications (Preferred):

  • AWS Certified Solutions Architect – Professional or AWS Certified Advanced Networking – Specialty.
  • AWS Certified Machine Learning – Specialty
  • Microsoft Certified: Azure AI Engineer Associate
  • Relevant network certifications (e.g., Cisco CCNA/CCNP).

Technical Skills:

  • Proficiency with Python REST API development and deployment (FastAPI, Flask)
  • Hands-on experience with AWS API Gateway or Azure API Management (APIM)
  • Familiarity with OpenAPI/Swagger specifications and API documentation practices
  • Understanding of API security standards — OAuth2, JWT, mTLS, API key rotation
  • Experience with containerizing APIs using Docker and deploying via Kubernetes or serverless functions (Lambda, Azure Functions)

Soft Skills:

  • Accuracy and attention to detail
  • Problem-solving aptitude is essential
  • Excellent communication and presentation skills
  • Ability to learn and upgrade technical skills in the fast-paced data analysis field
  • Ability to understand and visualize the multidimensionality of business facts/measures
  • Ability to work in a dynamic, agile environment within a geographically distributed team

Why ebp Global?

  • Boutique, high-expertise consulting firm
  • Remote, flexible working environment
  • Global team
  • Direct exposure to senior industry experts
  • Visible impact on company growth

Please apply by sending your CV (in English) to [email protected]

Applicants must reside and have the right to work in the USA.

Only short-listed candidates will be contacted.

Personal data collected will be used for recruitment purposes only.
