Staff Software Engineer, Inference Runtime
Anthropic
Generative AI, company
San Francisco, United States
$405,000 - $485,000 USD
Lead
Software Engineering
About the role
Technical lead for Inference Runtime, owning the shared, accelerator-agnostic core of the inference serving stack.
Anthropic's Inference organization serves Claude to millions of users and enterprise customers with the speed, reliability, and efficiency that frontier AI demands.
We are looking for a Staff Engineer to be a technical lead for Inference Runtime, owning the shared, accelerator-agnostic core of our inference serving stack.
Key Responsibilities
- Set technical direction for the team, owning the architecture and roadmap for the shared runtime of the inference serving stack.
- Own and evolve the accelerator-agnostic runtime itself: its interfaces, internal boundaries, and build structure.
- Drive efficient accelerator usage (utilization, scheduling, memory management) across GPU, TPU, and Trainium.
- Build the runtime's validation surface around partitioned builds, change-scoped testing, and canary/shadow/rollback.
- Act as a technical counterpart to Anthropic's central Infrastructure org on compilers, build systems, and toolchains.
Tech stack
Rust, Python