Research Engineer Intern, Evaluations
TensorStax · San Francisco, California, US
Job description
Research Engineer Intern, Evaluations & Benchmarks
Location: San Francisco (Hybrid)

About TensorStax:
TensorStax is building fully autonomous AI systems to manage and optimize mission-critical data infrastructure. Our research integrates reinforcement learning and language models to enhance reasoning over large-scale data lakes and warehouses, detect failures in pipelines, and autonomously construct and optimize data workflows with high precision.

We are looking for a Research Engineer Intern to design evaluation frameworks and benchmarks that assess the autonomy, adaptability, and reliability of AI agents in data engineering environments. This role is ideal for candidates passionate about AI evaluations, language model benchmarking, and autonomous data systems.

What You’ll Do:
- Develop evaluation environments to test AI agents' ability to reason, plan, and act autonomously within mission-critical data pipelines.
- Design benchmarks to assess model capabilities in failure detection, pipeline optimization, and agentic decision-making in data workflows.
- Implement automated assessment frameworks for language model-based agents operating over data lakes and warehouses.
- Work with s...