Location
San Mateo, CA
Salary
Not specified
Type
Full-time
Posted
Today
via LinkedIn
Job Description
The Role
The company is seeking a skilled DevOps Engineer to build and maintain robust infrastructure supporting our AI-powered dental platform. You will automate deployments, optimize cloud resources for ML workloads, and ensure seamless scalability for high-volume dental imaging and analytics. This role is pivotal in enabling engineering velocity while upholding security and reliability in a regulated healthtech environment.
Responsibilities
- Design, implement, and manage CI/CD pipelines using tools like Jenkins, GitHub Actions, or ArgoCD for rapid, reliable AI model and application deployments.
- Provision and optimize infrastructure as code (Terraform, Pulumi) across AWS/GCP, including Kubernetes clusters, VPCs, and GPU instances for ML training/inference.
- Monitor and troubleshoot production systems with Prometheus, Grafana, and the ELK stack, resolving issues proactively and maintaining 99.99% availability.
- Enhance security postures through automated scans, compliance checks (HIPAA), and zero-trust practices for sensitive dental data.
- Collaborate with engineering teams to automate data pipelines, cost management, and disaster recovery strategies.
- Drive infrastructure improvements, capacity planning, and on-call rotations to support 24/7 global operations.
- Document processes, contribute to runbooks, and evangelize DevOps best practices across the organization.
Who you are
- You are passionate about automation, treating infrastructure as disposable code to enable developer productivity.
- You excel in fast-paced settings, diagnosing complex issues across distributed systems with calm precision.
- You communicate clearly, bridging gaps between developers, data scientists, and leadership on infrastructure needs.
- You prioritize reliability and security, especially in health-regulated domains, while optimizing for velocity.
- You continuously learn emerging tools and apply them to solve real bottlenecks in AI/ML workflows.
- You take ownership end-to-end, from design to post-mortems, fostering a culture of blameless improvement.
Qualifications
- 5+ years in DevOps, SRE, or platform engineering roles, with hands-on experience in cloud-native environments.
- Expertise in IaC (Terraform/CloudFormation), container orchestration (Kubernetes/EKS/GKE), and scripting (Bash/Python/Go).
- Proficiency in CI/CD tools, monitoring stacks, and networking; experience with MLOps tooling (e.g., Kubeflow, SageMaker) preferred.
- Strong Linux/Unix fundamentals, with knowledge of databases, queues, and load balancers in production.
- Track record supporting high-scale, data-intensive apps; healthtech or AI experience a plus.
- Familiarity with security tools (Vault, Falco) and compliance frameworks like HIPAA and SOC 2.