Company hidden
1 month ago

Research Engineer, Voice (AI)

$225,000 – $325,000
Work format
on-site
Employment type
full-time
Grade
middle
English
B2
Country
US
Listing from Hirify Global, a list of international tech companies

Job description

TL;DR

Research Engineer, Voice (AI): Develop and optimize neural models for speech synthesis, speech recognition, and audio generation to advance the spoken intelligence of the Pi AI agent, with an emphasis on real-time spoken dialogue and neural audio codecs. The role focuses on building production-grade training pipelines, exploring diffusion-based synthesis, and integrating multimodal foundation models into a conversational stack.

Location: Palo Alto, California, United States (must be based in the Bay Area)

Salary: $225,000 – $325,000

Company

The company (name hidden in this listing) is a Public Benefit Corporation creating Pi, an emotionally intelligent AI designed to help people navigate decisions, emotions, and challenges.

What you will do

  • Research, develop, and optimize neural models for text-to-speech, automatic speech recognition, audio generation, and spoken dialogue systems.
  • Build and maintain production-grade training and inference pipelines with a focus on latency, naturalness, and scalability.
  • Run end-to-end experiments including data curation, architecture design, training, and ablation studies.
  • Collaborate with ML engineers and product teams to integrate voice models into Pi’s real-time conversational stack.
  • Apply advances in neural audio codecs, diffusion-based synthesis, and multimodal foundation models.
  • Develop robust evaluation frameworks using perceptual metrics and automated benchmarks.

Requirements

  • 2-5 years of research or engineering experience in audio, speech, or multimodal ML (graduate work included).
  • Strong proficiency in PyTorch and experience training large-scale neural models on GPU/accelerator clusters.
  • Deep understanding of audio fundamentals: spectrograms, mel features, vocoders, and signal processing.
  • Proven ability to move research from prototype to production with efficient, CUDA-aware training loops.
  • Familiarity with generative architectures such as diffusion models and autoregressive codecs.
  • Bachelor’s degree in CS, Electrical Engineering, Linguistics, or related field; MS or PhD strongly preferred.
  • Must be based in or able to live in the Bay Area.

Culture & Benefits

  • Comprehensive medical, dental, and vision insurance options.
  • 401(k) matching program.
  • Unlimited paid time off.
  • Parental leave and flexibility for all parents and caregivers.
  • Meaningful equity component to share in the company's long-term success.
  • Support for country-specific visa needs for international employees living in the Bay Area.