TL;DR
Senior Cloud and DevOps Engineer (AI/ML): Designing, optimizing, and securing scalable cloud platforms for AI model training and deployment, with an emphasis on operational efficiency, cost control, and high availability in AWS environments. The role focuses on managing cloud infrastructure with Terraform and Python, developing CI/CD pipelines, and ensuring platform performance and reliability.
Location: Must be based in Barcelona, Spain, with a flexible work model and remote work possibilities.
Company
hirify.global is a Digital Accelerator, part of EPAM Systems, fostering a multicultural, startup-minded culture that promotes innovation and continuous learning.
What you will do
- Manage cloud infrastructure and optimize costs, particularly in AWS environments using Terraform and Python.
- Design, develop, and maintain CI/CD pipelines and infrastructure for AI model training and deployment.
- Ensure platform scalability and efficient resource utilization.
- Conduct technical research and propose optimized architectural solutions.
- Troubleshoot incidents and guarantee smooth, zero-downtime deployments.
- Collaborate closely with AI Ops and technical teams to ensure platform performance, stability, and reliability.
Requirements
- Minimum of 7 years of professional experience in similar roles (Cloud Engineer, DevOps Engineer, MLOps Engineer, or related positions).
- Advanced hands-on experience with AWS (infrastructure management, networking, security, and cost optimization).
- Strong expertise in Terraform for Infrastructure as Code (IaC).
- Solid experience using Python for automation and scripting.
- Proven experience designing and maintaining CI/CD pipelines in production environments.
- Experience managing infrastructure for machine learning model training and deployment.
- Advanced English level, both spoken and written.
- Demonstrated experience ensuring scalability, high availability, and resilience in cloud architectures.
Nice to have
- Experience with microservices-based architectures and containerization (Docker, Kubernetes).
- Knowledge of observability practices (monitoring, logging, alerting).
- Experience with cost governance and optimization strategies in complex cloud environments.
- Familiarity with MLOps practices and the end-to-end lifecycle of ML models in production.
- AWS or DevOps/MLOps-related certifications.
Culture & Benefits
- Permanent contract with a competitive salary.
- Flexible work model and remote work possibilities.
- Personalized career plan and continuous training (certifications, English, etc.).
- Participation in stable projects with high technical impact.
- Flexible working hours with a strong focus on work-life balance.
- Social benefits tailored to your needs.