Essential Functions
- Drive world-class design and development of analytical data pipelines.
- Provide thought leadership and propose industry-standard approaches and implementations.
- Adhere to processes that ensure data pulled from various sources meets quality standards, is curated and enhanced for analytical use, and provides a "single source of truth."
- Work with counterparts from Tech to build frameworks that integrate data pipelines and machine learning models, making them easy for data scientists to use on priority use cases.
- Support the Enterprise Data and Analytics team's focus on "last mile" transformations of select data required for use cases.
- Maintain database structures and standard definitions for business users across Macy's.
- Work with data architects to build the foundational extract/load/transform (ELT) processes, regularly review the architecture, and recommend effectiveness improvements.
- Collaborate with Technology to future-proof data and analytics software, tools, and code to reduce risk and support pipeline owners.
- Work with Legal and Privacy teams to adhere to data privacy and security standards.
- Work with the Data Architect to implement data models, standards, and quality rules.
- Work with the Data Science team to understand data formatting and sourcing needs so they can build out use cases as efficiently as possible.
- Keep data separated and secured using masking and encryption (see the sketch after this list).
- Explain requirements to the offshore team and design interfaces, determining the most efficient and cost-effective approach to meet business needs.
- Coordinate daily between onsite and offshore teams.
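As a concrete illustration of the masking point above, here is a minimal Python sketch of deterministic PII pseudonymization with a keyed hash. The field names and the masking key are hypothetical; in practice the key would live in a secret manager, and the exact technique (keyed hashing, tokenization, or format-preserving encryption) depends on the use case.

```python
import hashlib
import hmac

# Hypothetical masking key for illustration only; in practice this would
# come from a secret manager, never from source code.
MASKING_KEY = b"replace-with-secret-from-secret-manager"

def mask_pii(value: str) -> str:
    """Deterministically pseudonymize a PII value with HMAC-SHA256,
    so joins on the masked column still line up across tables."""
    return hmac.new(MASKING_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask_record(record: dict, pii_fields: set) -> dict:
    """Return a copy of the record with the named PII fields masked."""
    return {
        key: mask_pii(value) if key in pii_fields and value is not None else value
        for key, value in record.items()
    }

# The field names below are illustrative, not a real schema.
row = {"customer_id": "C123", "email": "jane@example.com", "order_total": 42.50}
print(mask_record(row, pii_fields={"email"}))
```

Keying the hash (rather than hashing alone) means the mapping cannot be reversed by anyone who lacks the key, while equal inputs still produce equal outputs for analytical joins.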
Qualifications and Competencies
- At least 8 years of overall experience building ETL/ELT, data warehousing, and big data solutions.
- At least 5 years of experience in building data models and data pipelines to process different types of large datasets.
- At least 3 years of experience with Python, Spark, Hive, Hadoop, Kinesis, and Kafka.
- Proven expertise in relational and dimensional data modeling.
- Understanding of PII standards, processes, and security protocols.
- Experience building data warehouses on cloud platforms such as AWS or GCP, preferably with Google BigQuery as the cloud data warehouse.
- In-depth knowledge of big data solutions and the Hadoop ecosystem.
- Strong SQL knowledge, with the ability to translate complex scenarios into queries.
- Strong programming experience in Java or Python.
- Experience with the Google Cloud Platform, especially BigQuery and Dataflow (see the query sketch after this list).
- Experience with the Google Cloud SDK and API scripting.
- Experience with Hadoop (Hive, MapReduce, Spark).
- Experience in onsite-offshore coordination.
- Experience in test-driven development.
- Experience in Agile processes and DevOps methodologies.
- Experience with NoSQL databases.
- Experience with data modeling and mapping.
- Experience in the retail domain is an added advantage.
- Google Cloud Professional Data Engineer Certification is an advantage.
- Programming languages: Java / Python
- Google Cloud: BigQuery, Pub/Sub, Dataflow, Composer DAGs, Cloud Storage (see the DAG sketch after this list)
- CI/CD: GitHub, Jenkins
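To illustrate the BigQuery expectation, here is a minimal sketch of a parameterized query using the google-cloud-bigquery Python client. The project, dataset, table, and column names are placeholders, not real Macy's systems.

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # placeholder project

# Parameterized query: BigQuery substitutes @ds safely, avoiding string
# concatenation. Dataset, table, and column names are placeholders.
sql = """
    SELECT store_id, SUM(order_total) AS revenue
    FROM `example-project.analytics.sales_raw`
    WHERE order_date = @ds
    GROUP BY store_id
    ORDER BY revenue DESC
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("ds", "DATE", "2024-01-01")]
)

# Iterating the job waits for completion and streams the result rows.
for row in client.query(sql, job_config=job_config):
    print(row.store_id, row.revenue)
```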
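And since the stack calls out Composer DAGs feeding BigQuery, below is a minimal sketch of what such a pipeline might look like: an Airflow DAG (Airflow 2.x with the Google provider package) that loads newline-delimited JSON from Cloud Storage into a BigQuery table. Bucket, project, dataset, and table names are again placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

# Bucket, project, dataset, and table names are placeholders.
with DAG(
    dag_id="daily_sales_load",
    schedule_interval="0 6 * * *",   # once a day at 06:00
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    load_sales = GCSToBigQueryOperator(
        task_id="load_sales_to_bq",
        bucket="example-landing-bucket",
        source_objects=["sales/{{ ds }}/*.json"],  # templated with the run date
        source_format="NEWLINE_DELIMITED_JSON",
        destination_project_dataset_table="example-project.analytics.sales_raw",
        write_disposition="WRITE_APPEND",
        autodetect=True,  # let BigQuery infer the schema for this sketch
    )
```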