Sr. HPC Systems Engineer

160 000 - 220 000$

Формат работы

onsite

Тип работы

fulltime

Грейд

senior

Английский

Страна

Вакансия из Hirify RU Global, списка компаний с восточно-европейскими корнями
Для мэтча и отклика нужен Plus

Мэтч & Сопровод

Для мэтча с этой вакансией нужен Plus

Описание вакансии

Текст:

TL;DR

Sr. HPC Systems Engineer: Administering and managing HPC clusters, storage systems, and high-speed networks to support hirify.global engineering applications with an accent on Linux-based compute cluster installation and integration. Focus on troubleshooting complex systems, automating reoccurring tasks, and ensuring reliability in a fast-paced environment.

Location: Hawthorne, CA

Salary: $160,000.00–$220,000.00/per year

Company

hirify.global develops advanced technologies for space exploration with the ultimate goal of enabling human life on Mars.

What you will do

Administer and manage HPC clusters, storage systems, and high-speed networks.
Provide application support to hirify.global employees across engineering disciplines.
Install and integrate Linux-based compute clusters.
Write instructional documentation and convey technical ideas in non-technical terms.

Requirements

5+ years of hands-on experience with client and server hardware/software, management tools, enterprise networking, virtualization, and security technologies.
Bachelor's degree in computer science, engineering, math, or scientific discipline with 5+ years of systems engineering experience; OR 7+ years of professional experience building software in lieu of a degree.
Experience with Linux.
Must be a U.S. citizen or national, U.S. lawful permanent resident (aka green card holder), Refugee under 8 U.S.C. § 1157, or Asylee under 8 U.S.C. § 1158 due to ITAR requirements.

Nice to have

5+ years of professional experience building, deploying, and troubleshooting Linux systems.
Experience with scripting languages (Bash, Python) to automate tasks.
Experience building, deploying, and troubleshooting HPC clusters.
Familiarity with cluster resource managers (Slurm, PBS, LSF).
Experience with monitoring and alerting technologies (Prometheus, Grafana, Nagios).
Familiarity with scientific and engineering computing (CFD, FEA).
Familiarity with ML frameworks (PyTorch, Tensorflow), GPU usage, and Cuda.
Experience with containers (Docker, Podman, Singularity) and automated configuration management (Puppet, Ansible).

Culture & Benefits

Comprehensive medical, vision, and dental coverage.
Access to a 401(k) retirement plan.
Eligibility for long-term incentives (company stock, stock options) and potential discretionary bonuses.
Paid parental leave, 3 weeks of paid vacation, and 10+ paid holidays per year.
Work in a fast-paced, challenging environment with mission-critical and sensitive systems.

Будьте осторожны: если работодатель просит войти в их систему, используя iCloud/Google, прислать код/пароль, запустить код/ПО, не делайте этого - это мошенники. Обязательно жмите "Пожаловаться" или пишите в поддержку. Подробнее в гайде →

Текст вакансии взят без изменений

Источник - загрузка...