Senior Production Engineer (Reliability)

CoreWeave | GPU Cloud company
Livingston | $182,000 - $242,000 | Senior
Software Engineering

About the role

Senior Production Engineer to own and improve critical reliability tooling for CoreWeave's AI cloud.

  • Production Engineering ensures CoreWeave’s cloud delivers world-class reliability, performance, and operational excellence.
  • We are hiring a Senior Production Engineer to take direct, hands-on ownership of critical tooling that drives reliability and delivery success.

Key Responsibilities

  • Take hands-on ownership of critical systems and frameworks, driving their architecture, implementation, and long-term evolution.
  • Lead end-to-end delivery of engineering projects that improve availability, scalability, operational automation, and failure recovery.
  • Build and maintain observability, alerting, automated remediation, and resilience testing for the systems you support.
  • Participate in incident response as a subject-matter expert; drive deep root-cause investigations and implement lasting fixes.
  • Ship production code regularly in Python, Go, or similar languages, and participate in on-call rotations.

Requirements

  • 7+ years of engineering experience building and operating distributed systems or cloud platforms.
  • Demonstrated ability to debug complex production issues end-to-end, across services, infrastructure layers, and automation.
  • Strong programming or scripting ability (Python, Go, or similar), with experience shipping and operating production services and tools.

Tech stack

Python, Go, Kubernetes, CI/CD, Docker, Linux, Git, REST API, gRPC
