TL;DR
Tech Lead AWS DevOps Engineer (AI): Modernizing and scaling a global digital health SaaS platform by evolving its cloud, ML, and Kubernetes infrastructure, with an emphasis on reliability, automation, cost efficiency, and operational excellence. The role focuses on hands-on work with a modern AWS stack, ML platforms, and CI/CD tooling, and on technical mentorship of mid-level DevOps engineers.
Location: Hybrid, Plovdiv, Bulgaria
Company
hirify.global, a leading software and digital engineering services company, creates durable technical solutions for digital transformation at scale, focusing on specialized software businesses and technology consulting.
What you will do
- Manage and enhance CI/CD pipelines for application and ML deployment workflows.
- Lead operations for multi-tenant SaaS workloads on AWS, ensuring scalability, high availability, and cost efficiency.
- Build, maintain, and automate infrastructure for production, data, and AI/ML workloads using IaC (AWS CDK or Terraform).
- Own incident response, postmortems, and operational runbooks to improve system reliability.
- Monitor and optimize cloud usage, including GPU and compute clusters, for cost control and performance.
- Develop monitoring, alerting, observability, and security best practices, collaborating across engineering and data science teams.
Requirements
- 7+ years of experience in DevOps/CloudOps/SRE.
- Expertise in Infrastructure as Code (AWS CDK) and advanced CI/CD pipelines (GitHub Actions, ArgoCD).
- Hands-on experience with Amazon EKS and deep operational knowledge of AWS services (Fargate, EC2, S3, RDS, Lambda, IAM, CloudWatch, CloudTrail).
- Strong troubleshooting skills across infrastructure, pipelines, and observability (Splunk, Grafana, OpenTelemetry).
- Understanding of FinOps principles, cost monitoring, and right-sizing in AWS, plus familiarity with GPU workloads.
- Excellent spoken and written English, along with strong collaboration and mentoring skills.
Nice to have
- Subject-matter expertise in AWS, EKS, or Infrastructure as Code.
- Working experience with AI/ML platforms such as AWS SageMaker, Kubeflow, or MLflow.
- Knowledge of MongoDB operations and performance optimization.
- Familiarity with Projen, Helm charts, and Kubernetes manifests.
- Awareness of regulatory compliance frameworks (SOC 2, ISO 27001, NIST, HIPAA).
Culture & Benefits
- Opportunity to work on impactful, global projects for recognizable brands.
- Commitment to diversity, equity, and inclusion, fostering an inspiring workplace.
- Culture of openness, trust, and empowerment through the latest technology.