Data Engineer - Dallas, TX
Photon · US
Job description
We are seeking a Data Engineer to build and scale the data infrastructure powering our Agentic AI products. You will be responsible for the "Ingestion-to-Insight" pipeline that allows autonomous agents to access, search, and reason over vast amounts of proprietary and public data. Your role is critical: you will design the RAG (Retrieval-Augmented Generation) architectures and data pipelines that ensure our agents have the right context at the right time to make accurate decisions.

Key Responsibilities:
- AI-Ready Data Pipelines: Design and implement scalable ETL/ELT pipelines that process both structured (SQL, logs) and unstructured (PDFs, emails, docs) data specifically for LLM consumption.
- Vector Database Management: Architect and optimize Vector Databases (e.g., Pinecone, Weaviate, Milvus, or Qdrant) to ensure high-speed, relevant similarity searches for agentic retrieval.
- Chunking & Embedding Strategies: Collaborate with AI Engineers to optimize data chunking strategies and embedding models to improve the "recall" and "precision" of the agent's knowledge retrieval.
- Data Quality for AI: Develop automated "Data Cleaning" workflows to remove noise, PII (Personally Identifia...