JobMesh

Big Data Lead

Hexaware · US

Job description

Responsibilities:
• Experience with big data processing and distributed computing systems like Spark.
• Implement ETL pipelines and data transformation processes.
• Ensure data quality and integrity in all data processing workflows.
• Troubleshoot and resolve issues related to PySpark applications and workflows.
• Understand the source, dependencies, and data flow of converted PySpark code.
• Strong programming skills in Python and SQL.
• Experience with big data technologies like Hadoop, Hive, and Kafka.
• Understanding of data warehousing concepts and relational databases like SQL.
• Demonstrate and document code lineage.
• Integrate PySpark code with frameworks such as Ingestion Framework, DataLens, etc.
• Ensure compliance with data security, privacy regulations, and organizational standards.
• Knowledge of CI/CD pipelines and DevOps practices.
• Strong problem-solving and analytical skills.
• Excellent communication and leadership abilities.

Qualifications:
• 4+ years of experience in big data development with the Hadoop, Hive, and Spark frameworks.
• Experience in SAS is good to have.
• Strong Python, PySpark development, and SQL knowledge.
• Certification in big data or cloud technologies is...