San Francisco, CAHybrid$250,000 - $400,0001 months ago
full-timeseniorcustom
About the Role
Build the infrastructure that powers Grok and xAI's research. Work on training clusters, inference systems, and developer tools.
Responsibilities:
- Design and operate large-scale GPU clusters
- Build efficient training and inference pipelines
- Optimize distributed computing systems
- Create tools for researchers and engineers
Requirements
- 5+ years systems engineering experience
- Experience with large-scale distributed systems
- Strong Python and C++ skills
- Knowledge of GPU programming and optimization
Required Skills
PythonC++CUDAKubernetesDistributed Training
About xAI
Elon Musk's AI company building Grok and advancing AI understanding.