San Francisco, CA · Hybrid · $190,000 - $300,000 · 1 month ago
full-time · senior · llama · mistral · stable-diffusion
About the Role
Help build the fastest AI inference platform. You will optimize the serving of LLMs and diffusion models at scale.
Responsibilities:
- Optimize model serving latency and throughput
- Implement custom CUDA kernels
- Build batching and scheduling systems
- Support multiple model architectures
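To give a flavor of the batching and scheduling work above, here is a minimal sketch in Python. The names (`Request`, `form_batches`) and the size/token-budget heuristic are illustrative assumptions, not Fireworks' actual scheduler:

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical request shape: prompt text plus a token budget.
    prompt: str
    max_tokens: int


def form_batches(queue, max_batch_size=8, max_batch_tokens=2048):
    """Greedily group queued requests into batches, capping both
    the batch size and the total token budget -- a common serving
    heuristic to keep GPU memory and latency bounded."""
    batches = []
    current, budget = [], 0
    while queue:
        req = queue[0]
        # Flush the current batch if adding this request would
        # exceed either cap.
        if current and (len(current) >= max_batch_size
                        or budget + req.max_tokens > max_batch_tokens):
            batches.append(current)
            current, budget = [], 0
        current.append(queue.popleft())
        budget += req.max_tokens
    if current:
        batches.append(current)
    return batches
```

Production systems (e.g. continuous batching in vLLM-style servers) re-form batches every decoding step rather than once per queue drain; this sketch only shows the static grouping idea.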
Requirements
- 3+ years ML systems experience
- Strong C++/CUDA programming
- Experience with model quantization
- Knowledge of transformer architectures
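As a flavor of the quantization requirement, the sketch below shows symmetric int8 weight quantization in plain Python. It is illustrative only (function names are assumptions); real inference stacks typically use per-channel scales and calibrated activation ranges:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto [-127, 127]
    using a single scale derived from the max absolute value."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]
```

The round-trip error per weight is bounded by half a quantization step (`scale / 2`), which is why outlier weights, by inflating the scale, degrade accuracy and motivate per-channel or group-wise schemes.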
Required Skills
Python, C++, CUDA, PyTorch
About Fireworks AI
Fast and affordable AI inference platform for production workloads.