Staff Engineer, Engineering Compute Infrastructure and Grid Operations (Semiconductor)

$128,000 - $189,370

Work format

on-site

Employment type

full-time

Grade

lead

Job description

TL;DR

Staff Engineer, Engineering Compute Infrastructure and Grid Operations (Semiconductor): Design and operate large-scale compute infrastructure for chip design and verification, with an emphasis on grid job management and distributed storage. The role focuses on improving job reliability, diagnosing system failures, and optimizing I/O performance in high-throughput environments.

Location: Westborough, MA; Austin, TX; or Santa Clara, CA. Candidates must be eligible to access export-controlled information under U.S. export control laws (EAR).

Salary: $128,000 – $189,370 per annum

Company

hirify.global provides essential semiconductor solutions for data infrastructure across enterprise, cloud, AI, and carrier architectures.

What you will do

Own and evolve grid job management infrastructure for large regressions and high-volume batch workloads.
Debug and resolve grid job failures, including scheduling issues, hung jobs, and resource starvation.
Improve job reliability by implementing watchdogs, retries, heartbeats, and failure detection.
Manage shared engineering storage systems, resolving issues related to I/O performance, file contention, and permissions.
Design and deploy monitoring, logging, and metrics to proactively detect infrastructure problems.
Act as a technical bridge between engineering users, tools teams, and central IT, translating user requirements into infrastructure improvements.

Requirements

Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience.
8+ years of experience in compute infrastructure, grid operations, or large-scale engineering environments.
Strong experience with grid or batch schedulers such as LSF, UGE, Slurm, or PBS.
Deep Linux systems knowledge, including process management and resource monitoring.
Experience with shared storage systems including NFS and enterprise filers.
Strong scripting skills in Python, shell, or similar languages.

Nice to have

Experience supporting EDA or engineering compute workloads.
Familiarity with job controller or wrapper-based execution architectures.
Experience operating environments with thousands of concurrent batch jobs.
Exposure to cloud or hybrid compute environments.

Culture & Benefits

Comprehensive benefits covering financial well-being, family support, and mental/physical health.
Employee stock purchase plan with a 2-year lookback.
Robust mental health resources and family support programs.
Recognition and service awards to celebrate milestones and contributions.
