20 hours ago
Software Engineer (AI)
Job description
TL;DR
Software Engineer (AI): Architect and operate the core systems that power AI at Slack, with an emphasis on reliability, security, and self-service capabilities. The role focuses on solving complex scalability and reliability challenges at the intersection of distributed systems, GPU infrastructure, and modern ML stacks.
Location: Seattle, Austin, Atlanta, Bellevue (USA)
Company
Slack AI's mission is to transform how people work by making Slack an AI-powered operating system.
What you will do
- Design, build, and operate systems to train, serve, and deploy machine learning models at scale.
- Evolve GPU-backed inference infrastructure to support high-throughput, latency-sensitive workloads, including large-scale model serving.
- Architect and optimize distributed training and data processing systems using platforms such as Ray, Airflow, Spark, or similar technologies.
- Build and maintain Kubernetes-based platforms and orchestration layers using tools such as KubeRay, vLLM, and internally developed services.
- Develop robust monitoring, observability, and alerting for production ML workloads to ensure operational excellence.
Requirements
- Significant professional experience in software engineering with a strong focus on infrastructure, backend systems, platform engineering, or MLOps.
- Deep experience building and operating distributed systems, including expert-level knowledge of Kubernetes and container-based platforms.
- Hands-on experience with modern ML infrastructure and serving stacks such as Ray, KubeRay, vLLM, or similar training and inference orchestration frameworks.
- Experience working with GPU infrastructure, including performance optimization and operational management at scale.
- Experience building and operating cloud-native systems on public cloud platforms such as AWS, GCP, or Azure, including infrastructure as code.
- Excellent written communication and the ability to thrive in an asynchronous, globally distributed infrastructure team.
Culture & Benefits
- Work at the intersection of distributed systems, GPU infrastructure, and modern ML stacks.
- Solve complex scalability and reliability challenges.
- Play a critical part in shaping the long-term technical foundations of Slack’s AI capabilities.