<div class="content-intro"><h2><strong>About Us:</strong></h2> <p data-start="107" data-end="729">At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovation through projects like our own function calling and multimodal models. Fireworks is a Series C company valued at $4 billion and backed by top investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. We’re an ambitious, collaborative team of builders, founded by veterans of Meta PyTorch and Google Vertex AI.</p></div><h2><strong>The Role:</strong> </h2> <p>As a Training Infrastructure Engineer, you'll design, build, and optimize the infrastructure that powers our large-scale model training operations. Your work will be essential to developing high-performance AI training infrastructure. 
You'll collaborate with AI researchers and engineers to create robust training pipelines, optimize distributed training workloads, and ensure reliable model development.</p> <h2><strong>Key Responsibilities:</strong></h2> <ul> <li>Design and implement scalable infrastructure for large-scale model training workloads</li> <li>Develop and maintain distributed training pipelines for LLMs and multimodal models</li> <li>Optimize training performance across multiple GPUs, nodes, and data centers</li> <li>Implement monitoring, logging, and debugging tools for training operations</li> <li>Architect and maintain data storage solutions for large-scale training datasets</li> <li>Automate infrastructure provisioning, scaling, and orchestration for model training</li> <li>Collaborate with researchers to implement and optimize training methodologies</li> <li>Analyze and improve efficiency, scalability, and cost-effectiveness of training systems</li> <li>Troubleshoot complex performance issues in distributed training environments</li> </ul> <h2><strong>Minimum Qualifications:</strong></h2> <ul> <li>Bachelor's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience</li> <li>3+ years of experience with distributed systems and ML infrastructure</li> <li>Experience with PyTorch</li> <li>Proficiency with cloud platforms (AWS, GCP, or Azure)</li> <li>Experience with containerization and orchestration (Docker, Kubernetes)</li> <li>Knowledge of distributed training techniques (data parallelism, model parallelism, FSDP)</li> </ul> <h2><strong>Preferred Qualifications:</strong></h2> <ul> <li>Master's or PhD in Computer Science or related field</li> <li>Experience training large language models or multimodal AI systems</li> <li>Experience with ML workflow orchestration tools</li> <li>Background in optimizing high-performance distributed computing systems</li> <li>Familiarity with ML DevOps practices</li> <li>Contributions to open-source ML 
infrastructure or related projects</li> </ul><div class="content-pay-transparency"><div class="pay-input"><div class="description"><p>Total compensation for this role also includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package. Base salary is determined by a range of factors including individual qualifications, experience, skills, interview performance, market data, and work location. The listed salary range is intended as a guideline and may be adjusted.</p></div><div class="title">Base Pay Range (Plus Equity)</div><div class="pay-range"><span>$175,000</span><span class="divider">—</span><span>$220,000 USD</span></div></div></div><div class="content-conclusion"><h2><strong>Why Fireworks AI?</strong></h2> <ul> <li>Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.</li> <li>Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.</li> <li>Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.</li> <li>Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.</li> </ul> <p><em>Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.</em></p></div>