Speech ML Engineer

ElevenLabs

New York, NY | Remote | $200,000 - $320,000 | 4 weeks ago

full-time | senior | custom

About the Role

ElevenLabs is looking for a Senior Speech ML Engineer to work on our text-to-speech and voice cloning models. You will push the boundaries of speech synthesis quality, developing models that produce natural, expressive, and emotionally nuanced speech. This role involves working on novel architectures for speech generation, prosody modeling, and multi-speaker adaptation. You will work with large-scale speech data and train models that serve millions of API requests daily. The ideal candidate has deep expertise in speech synthesis, audio signal processing, and generative models.

Requirements

- 5+ years of experience in speech/audio ML
- Deep knowledge of TTS architectures (Tacotron, VITS, etc.)
- Strong PyTorch and CUDA optimization skills
- Experience with voice cloning and speaker adaptation
- Understanding of audio signal processing
- PhD in Speech Processing or related field preferred

Required Skills

Python, PyTorch, CUDA, Transformers

About ElevenLabs

The most realistic AI voice platform. Text-to-speech, voice cloning, and audio AI.
