Seeking an experienced AWS Data Engineer with strong expertise in Amazon S3 and enterprise data pipelines. The role involves building scalable ETL pipelines, managing data in AWS environments, and supporting big data processing with Spark and Hadoop.
Roles and Responsibilities
- Design, develop, and maintain ETL/ELT data pipelines for large-scale data processing.
- Use Amazon S3 for data storage, retrieval, and management within data workflows.
- Develop data ingestion and transformation pipelines using PySpark and SQL.
- Write Shell/Python scripts to automate data movement between S3 and other platforms.
- Implement S3 features such as versioning, lifecycle policies, access control, and encryption.
- Validate HiveQL queries, HDFS structures, and Spark-based processing in Hadoop environments.
- Monitor and schedule batch workflows using AutoSys or similar tools.
- Collaborate with cross-functional teams and maintain technical documentation.
Required Skills
- AWS Data Engineering
- Amazon S3
- PySpark / Spark
- SQL
- Shell Scripting
- AutoSys
Preferred Skills
- Hadoop, Hive, HDFS
- Oracle
- Cloudera Platform
- Banking or Payments domain experience
- Knowledge of Data Warehousing, ETL/ELT, Data Quality