Description
Role/ Job Title: Tech Lead
Place of Work: Mumbai
Roles & Responsibilities:
• Minimum 7 years of Data Engineering experience, including 5 years in a large-scale Data Lake ecosystem.
• Clean, prepare and optimise data at scale for ingestion and consumption.
• Drive the implementation of new data management projects and the restructuring of the current data architecture.
• Work with business stakeholders to identify and document high impact business problems and potential solutions.
• First-hand experience with the complete software development life cycle including requirement analysis, design, development, deployment, and support.
• Advanced understanding of Data Lake/Lakehouse architecture and hands-on exposure to Databricks.
• Work on the end-to-end data lifecycle across the Data Ingestion, Data Transformation, and Data Consumption layers. Well versed in APIs and their usage.
• A suitable candidate will also be proficient in Scala, Spark, Spark Streaming, AWS, and EMR.
• The candidate should have machine learning experience and experience with big data infrastructure, including MapReduce, Hive, HDFS, YARN, HBase, Oozie, etc.
• The candidate will additionally demonstrate substantial experience and deep knowledge of data mining techniques, and of relational and non-relational databases.
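To illustrate the end-to-end data lifecycle described above (ingestion, transformation, consumption), here is a minimal plain-Python sketch. It is illustrative only: at the scale this role targets, the same logic would run as Scala/Spark jobs on EMR, and all names here (`Event`, `ingest`, `transform`) are hypothetical, not a real API.

```python
from dataclasses import dataclass

@dataclass
class Event:
    user: str
    amount: float

def ingest(lines):
    """Ingestion layer: parse raw CSV lines, dropping malformed records."""
    events = []
    for line in lines:
        parts = line.split(",")
        if len(parts) == 2:
            try:
                events.append(Event(parts[0].strip(), float(parts[1])))
            except ValueError:
                pass  # skip records with a non-numeric amount
    return events

def transform(events):
    """Transformation layer: aggregate spend per user (the clean/prepare/optimise step)."""
    totals = {}
    for e in events:
        totals[e.user] = totals.get(e.user, 0.0) + e.amount
    return totals  # the consumption layer would read this aggregated view
```

In a real Lakehouse pipeline, each stage would read from and write to governed storage layers (e.g. raw, curated, and serving zones) rather than in-memory lists.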
Secondary Responsibilities:
• Implement complex automated workflows and routines using workflow scheduling tools.
• Build continuous integration, test-driven development, and production deployment frameworks.
• Excellent oral and written communication skills.
• Learn and use internally available analytic technologies.
• Build models at scale using vast amounts of structured and unstructured heterogeneous types of data.
• Identify key performance indicators and establish strategies for delivering on them in analysis solutions.
• Use an educational background in data engineering and perform data mining analysis.
• Work with BI analysts/engineers to create prototypes, implementing traditional classifiers and predictive and regression analyses.
• Engage in the delivery and presentation of solutions.
• Participate in data storage architecture design discussions.
• Apply machine learning and/or statistical techniques to time series classification and telemetry anomaly detection problems.
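To illustrate the last responsibility, here is a minimal z-score sketch for flagging telemetry anomalies. It is a toy statistical baseline, not the team's actual method: production work would apply proper time-series models at Spark scale, and the function name and threshold below are illustrative assumptions.

```python
import math

def detect_anomalies(series, threshold=3.0):
    """Return indices of points more than `threshold` standard deviations
    from the series mean (a simple z-score anomaly test)."""
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series) / len(series)
    std = math.sqrt(variance)
    if std == 0:
        return []  # a constant series has no outliers
    return [i for i, x in enumerate(series) if abs(x - mean) / std > threshold]
```

For example, `detect_anomalies([1.0, 1.1, 0.9, 1.0, 10.0, 1.0], threshold=2.0)` flags only the spike at index 4.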
Key Success Metrics:
• Ensure timely deliverables.
• Provide spot data fixes.
• Lead the technical aspects of projects.
• Error-free deliverables.