LLMHire

ML Systems Engineer

Fireworks AI
San Francisco, CA | Hybrid | $190,000 - $300,000 | Posted 1 month ago
Tags: full-time, senior, llama, mistral, stable-diffusion

About the Role

Build the fastest AI inference platform. Optimize serving of LLMs and diffusion models at scale.

Responsibilities:

- Optimize model serving latency and throughput
- Implement custom CUDA kernels
- Build batching and scheduling systems
- Support multiple model architectures

Requirements

- 3+ years ML systems experience
- Strong C++/CUDA programming
- Experience with model quantization
- Knowledge of transformer architectures

Required Skills

Python, C++, CUDA, PyTorch

About Fireworks AI

Fast and affordable AI inference platform for production workloads.

