Machine Learning Operations Engineer (AI)
Job description
TL;DR
Machine Learning Operations Engineer (AI): Own and scale production inference systems for a conversation intelligence platform, with an emphasis on high availability, reliability, and efficiency. The role focuses on building systems that handle production traffic growth, designing monitoring for ML workloads, and optimizing GPU-based infrastructure.
Location: Hybrid in Somerville, MA
Salary: $150,000 – $200,000
Company
A leader in conversational voice intelligence, building a platform to detect harm and prevent fraud through voice understanding.
What you will do
- Deploy, monitor, and maintain production machine learning inference systems.
- Manage fleets of inference machines to ensure system health and performance.
- Design monitoring, alerting, and incident response systems for ML workloads.
- Build systems and processes for scaling inference infrastructure under variable load.
- Collaborate on infrastructure-as-code for production deployments.
- Support and contribute to GPU-based training and inference infrastructure.
Requirements
- Experience deploying and maintaining production software systems.
- Strong proficiency with AWS, Python, and Linux.
- Experience building monitoring and alerting systems for production environments.
- Exposure to PyTorch or similar ML frameworks.
- Experience working with GPU-based applications and basic GPU tooling.
- Must be based in Somerville, MA, or able to work hybrid there.
Nice to have
- Experience with ML model serving systems or dedicated model servers.
- Experience monitoring GPU performance for inference workloads.
- Experience optimizing machine learning model inference.
- Familiarity with audio or multimedia data (codecs, streaming, real-time systems).
- Experience with Terraform or CloudFormation.
Culture & Benefits
- Competitive salary and equity.
- Full health, dental, and vision coverage, including HSA and FSA.
- Flexible PTO and a work-from-anywhere policy for up to 8 weeks per year.
- Weekly team lunches and a deeply inclusive, human-centered culture.
- Leadership and technical learning sessions with career development support.