JobMesh

Data Scientist - AI Evaluation

Wizard · US

Job description

About Wizard

Wizard is the top-performing AI Shopping Agent, delivering the best products from across the web with unmatched accuracy, quality, and trust.

The Role:

We’re looking for a Data Scientist to own how we measure, understand and improve the accuracy of our AI agent. This role sits at the intersection of data science, machine learning and product and is focused on evaluation, experimentation and insight generation. You won’t be building models but you will make sure they work in real world scenarios. You will build the systems to measure what good looks like and partner closely with ML, AI Engineering and Product to continuously improve the agent’s performance.

What You’ll Do:

- Define and evolve accuracy metrics across the full shopping experience (retrieval, ranking, recommendations and outcomes)
- Design and run experiments to measure improvements and regressions
- Build and maintain evaluation datasets, benchmarks and scoring frameworks
- Translate ambiguous product questions into clear, measurable hypotheses and analysis
- Partner with ML Engineers to validate model changes and guide iteration
- Identify failure modes and edge cases and drive improvements through data...