HPC SME (HPC/Linux)
Мэтч & Сопровод
Для мэтча с этой вакансией нужен Plus
Описание вакансии
TL;DR
HPC SME (HPC/Linux): Review, validate, and architect HPC solutions through POCs and benchmarking, then oversee implementation, integration, testing, and lifecycle management. Focus on diagnosing critical infrastructure and performance issues, managing HPC/Linux clusters and job schedulers, and leading technical delivery with end-to-end problem ownership.
Location: Hybrid (expectation to work on average 2 days per week from an HPE office)
Company
provides edge-to-cloud IT services and solutions for connecting, protecting, analyzing, and acting on data and applications.
What you will do
- Review and validate HPC solutions and environments using POCs and benchmarking.
- Architect and design HPC solutions tailored to customer needs.
- Oversee solution implementation, integration, and testing; diagnose and correct issues during rollout.
- Maintain HPC environment lifecycle management and ensure integrity after outages to minimize downtime.
- Provide training, documentation, and ongoing support; triage and resolve infrastructure-related tickets.
- Lead team operations and deliverables with technical expertise, including technical sessions and case reviews.
Requirements
- 8–12 years of experience with Linux distributions such as SLES, RHEL, and Ubuntu/Debian.
- 5–8 years of experience managing HPC/Linux clusters with strong understanding of cluster architecture.
- Skilled in installing and configuring applications on Linux.
- Experience installing, administering, and maintaining hardware, system software, networking, accounts, and security on VMware configurations.
- Experience with HPC clusters and GPUs, including job schedulers such as SLURM or PBS.
- Experience with monitoring and resource usage tracking (e.g., Grafana/Nagios/OpsRamp) and scripting/automation (Shell/Python, Ansible).
Culture & Benefits
- Hybrid work model with an expectation of working from an HPE office about 2 days per week.
- Comprehensive health and wellbeing benefits.
- Personal and professional development programs, including growth as a knowledge expert or across divisions.
- Inclusive culture with flexibility to manage work and personal needs.
Hiring process
- Review of experience and technical fit for HPC/Linux cluster operations and solution delivery.
- Assessment through interviews and technical discussions focused on critical infrastructure problem ownership.
Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →