Build production AI systems at Microsoft AI. Work on the infrastructure that powers our products and serves millions of users.
What you'll do:
- Design and implement scalable infrastructure systems
- Optimize performance and reliability
- Collaborate with research teams to deploy new capabilities
- Build tools and frameworks for internal teams
- Participate in on-call rotations and incident response
Requirements
- 10+ years of experience in software engineering or ML
- Strong programming skills in Python and PyTorch, with experience in distributed training
- Experience with machine learning frameworks and tools
- Track record of delivering complex projects
- Experience mentoring and leading teams
- Strong communication and collaboration skills
- Publications in top venues preferred
- Industry recognition and thought leadership
- Experience setting technical strategy
Required Skills
Python, PyTorch, Distributed Training
About Microsoft AI
Enterprise AI with Azure OpenAI, Copilot, and AI infrastructure.