Senior Scientist, Synthetic Data and Privacy
NVIDIA · Santa Clara, California, US
Job description
NVIDIA is at the forefront of the AI revolution, and our research is shaping the future of large language models. We are looking for a Senior Scientist to join our team and help advance our capabilities in synthetic dataset generation and privacy-preserving AI. You will contribute to open-source libraries within the NVIDIA NeMo ecosystem that enable high-quality synthetic data generation at scale while ensuring data privacy. This role combines hands-on software engineering with research in privacy-enhancing methods, and you will collaborate with research, engineering, and product teams as well as external labs.
What you'll be doing:
- Build and implement advanced pipelines for generating synthetic datasets using innovative LLM-based methodologies and automated quality evaluation frameworks.
- Research and implement privacy-preserving techniques such as differentially private training (DP-SGD), detection and replacement of sensitive information with NER models, and membership inference protection.
- Design and maintain open-source software libraries and SDKs with clean APIs and developer-facing documentation, applying robust software design patterns.
- Drive software excellence through modern developmen...