San Francisco, CA · Hybrid · $160,000 - $260,000 · 1 month ago
full-time · mid · llama · mistral · custom
About the Role
Together AI is hiring an AI Infrastructure Engineer to build our open source AI cloud platform. You will work on model serving, distributed inference, and training infrastructure supporting some of the largest open source models.
This role involves optimizing inference throughput, reducing latency, and building reliable systems that handle millions of API requests. You will work with vLLM, TensorRT, and custom inference engines.
The ideal candidate has experience building high-performance ML serving systems and is passionate about open source AI.
Requirements
- 3+ years of ML infrastructure experience
- Experience with model serving (vLLM, TensorRT, Triton)
- Strong Python and C++ skills
- Experience with Kubernetes and container orchestration
- Understanding of GPU optimization
- Familiarity with open source LLMs
Required Skills
Python · C++ · CUDA · Kubernetes · Docker
About Together AI
Open source AI cloud for training and inference at scale.