San Francisco, CA | Hybrid | $220,000 - $380,000 | Posted 1 month ago
full-time, senior, llama, mistral, custom
About the Role
Push the boundaries of large-scale model training. Build infrastructure that enables training of frontier AI models.
You'll work on:
- Developing distributed training frameworks
- Optimizing GPU utilization and communication
- Implementing novel training techniques
- Supporting open source model releases
Requirements
- PhD or MS in CS/ML or equivalent experience
- Expert-level PyTorch and CUDA knowledge
- Experience with multi-GPU/multi-node training
- Publications in ML systems a plus
Required Skills
Python, PyTorch, CUDA, Distributed Training
About Together AI
Open source AI cloud for training and inference at scale.