Data Engineer Consultant - Spark/Python
Job Description
Required Skills:
• Excellent knowledge of Apache Spark and strong Python programming experience
• Deep technical understanding of distributed computing and broad awareness of the differences between Spark versions
• Strong knowledge of UNIX operating system concepts and shell scripting
• Hands-on experience using Spark & Python
• Deep experience developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations
• External certification (foundational or advanced) in one of the cloud platforms: AWS, GCP, Azure, Snowflake, or Databricks
• Experience deploying and operationalizing code; knowledge of scheduling tools such as Airflow, Control-M, etc. is preferred
• Experience creating visualizations in Tableau, Power BI, Qlik, Looker, or another reporting tool
• Good knowledge of Hadoop, Hive, and Cloudera/Hortonworks Data Platform
• Exposure to Jenkins or an equivalent CI/CD tool and Git repositories
• Experience handling CDC (change data capture) operations on large volumes of data
• Understanding of and hands-on delivery experience with the Agile model
• Experience in Spark-related performance tuning
• Well versed in design documents such as HLD (high-level design), TDD (technical design document), etc.
• Well versed in historical data loads and overall framework concepts
• Experience participating in different kinds of testing, such as Unit Testing, System Testing, User Acceptance Testing, etc.
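As a concrete illustration of the CDC and data-merge work listed above, here is a minimal, hypothetical sketch in plain Python: it applies insert/update/delete change records to a target snapshot keyed by `id`. The record layout and operation codes (`I`/`U`/`D`) are assumptions for illustration only; a production pipeline would express the same logic as a PySpark join or a Delta Lake MERGE.

```python
# Minimal CDC (change data capture) upsert sketch.
# Assumptions (illustrative only): each change record carries an "op" field
# ("I" = insert, "U" = update, "D" = delete) and a primary key "id".

def apply_cdc(target, changes):
    """Apply a batch of change records to a target snapshot keyed by id."""
    snapshot = {row["id"]: row for row in target}
    for change in changes:
        op, key = change["op"], change["id"]
        if op in ("I", "U"):
            # Upsert: drop the op marker and keep the latest version of the row.
            snapshot[key] = {k: v for k, v in change.items() if k != "op"}
        elif op == "D":
            # Delete is a no-op if the key is already absent.
            snapshot.pop(key, None)
    return list(snapshot.values())

target = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
changes = [
    {"op": "U", "id": 1, "name": "alicia"},  # update existing row
    {"op": "I", "id": 3, "name": "carol"},   # insert new row
    {"op": "D", "id": 2},                    # delete existing row
]
result = apply_cdc(target, changes)
```

At scale, the same pattern is typically implemented by windowing change records by key and timestamp, keeping only the latest change per key, and merging it into the target table.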
Preferred Skills
• Exposure to PySpark, Cloudera/Hortonworks, Hadoop, and Hive
• Exposure to AWS S3/EC2 and Apache Airflow
(ref:hirist.tech)