Software Engineer, Safeguards Evals

Anthropic
Generative AI, company
San Francisco, United States
$320,000 - $485,000 USD
Senior
Data & AI

About the role

Builds evaluation infrastructure for AI safety systems, measuring agent performance and driving improvements.

  • This role builds the evaluation infrastructure that answers questions about the effectiveness of Anthropic's AI safety systems.
  • You'll sit at the intersection of applied ML research and engineering, designing experiments to measure how well an investigative agent performs across harm areas.

Key Responsibilities

  • Build and own the evaluation harness for an agentic investigation system.
  • Construct high-quality eval datasets representing real-world misuse.
  • Measure agent performance end-to-end and drive improvements.

Requirements

  • Proficiency in Python and comfort working across the stack.
  • Experience building and maintaining data pipelines.
  • Experience working with LLMs and a working understanding of their capabilities and failure modes.

Tech stack

Python, LLMs, Anthropic API
