Senior Software Engineer in Test (AI Agentic Systems)

Collective Health · Lehi, Utah, US

Job description

At Collective Health, we’re transforming how employers and their people engage with their health benefits by seamlessly integrating cutting-edge technology, compassionate service, and world-class user experience design.

This is not a traditional QA role. You will be the quality owner for an LLM-based multi-agent pipeline that autonomously adjudicates health insurance claims for self-funded plan sponsors. You are building a Three-Tier Evaluation Framework to ensure our Gemini-powered agents reason correctly, call tools accurately, and produce DOL-ready outcomes. You will work at the intersection of Vertex AI, healthcare compliance, and high-scale data engineering.

Your work directly determines whether claims are paid correctly and whether the company can withstand a Department of Labor (DOL) or state DOI audit. The stakes are real, the domain is hard, and the problems are genuinely novel.

What you'll do:

- Outcome Evaluation (The "What")
  - Golden Set Governance: Build and maintain a versioned library of "Grounding Data" results by working with senior claims examiners to define "Ground Truth."
  - Model-as-a-Judge Automation: Design automated "LLM-grading-LLM" workflows using custom r...