Sr. Data Engineer-Databricks SME (Remote)
Tier One Technologies · Raleigh, North Carolina, US
Job description
Overview:
- Tier One Technologies is seeking a Data Engineer to support our US Government client with data ingestion, data deduplication, and data tagging for the migration of a large-scale data environment into Databricks.
- This remote contract-to-hire position will originate in Raleigh, NC.
- SELECTED CANDIDATES WITHOUT REQUIRED CLEARANCE WILL BE SUBJECT TO A FEDERAL GOVERNMENT BACKGROUND INVESTIGATION TO RECEIVE IT.

Responsibilities:
- Design, develop, and maintain scalable data ingestion pipelines to onboard structured, semi-structured, and unstructured data from batch and streaming sources (e.g., APIs, databases, flat files, message queues) into the Azure/Databricks environment.
- Implement de-duplication strategies across large-scale datasets using deterministic and probabilistic matching techniques to ensure data integrity and reduce redundancy within the Data Lake.
- Develop and enforce data tagging frameworks to classify, label, and annotate datasets with appropriate metadata (e.g., sensitivity, source, domain, lineage) to support data governance, discoverability, and compliance requirements.
- Assist with operationalizing deployments and support of Cloud services for ETL...