Description
Responsibilities:
• Design and develop ETL processes in AWS Glue to migrate data from S3 stored as ORC, Parquet, and text files.
• Perform data extraction, aggregation, and consolidation.
• Create external and managed tables with partitions in Redshift using the Glue Data Catalog.
• Create user-defined functions (UDFs) in Redshift using SQL and Python.
• Create S3 buckets, manage bucket policies, and use S3 and Glacier for storage and backup on AWS.
• Manage user roles, permissions, and access to resources with AWS IAM.
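The Redshift/Python UDF responsibility above can be illustrated with a small scalar function of the kind Redshift wraps in CREATE FUNCTION. This is a hedged sketch: the function name and the normalization rule are assumptions for illustration, not part of the posting.

```python
# Illustrative body of a Redshift scalar Python UDF.
# In Redshift it would be registered roughly like:
#   CREATE FUNCTION f_normalize_phone(VARCHAR) RETURNS VARCHAR IMMUTABLE
#   AS $$ ...function body... $$ LANGUAGE plpythonu;
# The name and the "last 10 digits" rule are assumptions for this sketch.

def f_normalize_phone(phone):
    """Strip non-digit characters; keep the last 10 digits if available."""
    if phone is None:
        return None  # Redshift passes SQL NULL through as Python None
    digits = "".join(ch for ch in phone if ch.isdigit())
    return digits[-10:] if len(digits) >= 10 else digits
```

The same function is usable unchanged as a plain Python helper in a Glue job, which keeps transformation logic consistent between SQL and ETL code.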
Must Have:
• 3+ years of working experience on the AWS platform using data services
• Big Data ecosystems (must have working experience): S3, Redshift, Glue, and at least one ingestion service such as DMS, AppFlow, or Data Transfer/DataSync
• Hands-on experience using Step Functions with Lambda
• Scripting languages: Python
• Understanding of CloudWatch, SNS, and EventBridge
• Excellent analytical and problem-solving skills.
• Experience working in Agile/Scrum environments
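The "Step Functions with Lambda" requirement in practice means writing Lambda handlers that exchange JSON state between Task states. Below is a minimal sketch; the handler name and the event shape (`{"records": [...]}`) are assumptions for illustration only.

```python
# Minimal Lambda handler that a Step Functions state machine could
# invoke as a Task state. Step Functions passes the state input as
# `event` and feeds the returned dict to the next state as its input.
# The event shape ({"records": [...]}) is an assumption for this sketch.

def lambda_handler(event, context):
    records = event.get("records", [])
    # Return only JSON-serializable data: a Task state's result
    # becomes the input of the next state in the workflow.
    return {
        "count": len(records),
        "status": "EMPTY" if not records else "PROCESSED",
    }
```

In the state machine definition, a Task state would reference this function's ARN in its `Resource` field, and a Choice state could then branch on `$.status`.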
Good to Have:
• Exposure to Big Data ecosystems (Lake Formation, Database Migration Service, AppFlow, AWS EMR, DynamoDB)
• Scripting languages: PySpark
• Exposure to Alembic
• Exposure to CI/CD pipelines (GitLab)
• Preferably AWS CodeCommit, CodeBuild, and CodeDeploy