Key Responsibilities
Design, develop, and implement data ingestion, transformation, and curation processes using Big Data tools such as Spark (Scala/Python/Java), Hive, HDFS, Sqoop, Kafka, Kerberos, Impala, and CDP 7.x.
Build and maintain high-performance, scalable, and reliable ETL/ELT pipelines for batch and real-time data processing.
Ingest large volumes of structured and unstructured data from diverse sources to support analytics and data warehouse needs.
Design and optimize data warehouses and data marts for both on-prem and cloud analytics consumption.
Work closely with business and functional analysts to understand data requirements and translate them into scalable technical solutions.
Collect, store, process, and analyze large datasets, ensuring performance optimization and data quality.
Develop reusable frameworks to streamline development and reduce time-to-delivery.
Ensure code follows best practices, with performance tuning and maintainability in mind.
Collaborate with global teams to drive project delivery, suggest improvements, and implement best practices.
Stay up to date with emerging cloud data technologies and continuously enhance technical skills.
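As a minimal illustration of the batch ETL/ELT work described in the responsibilities above, the sketch below walks a record set through extract, transform, and load stages in plain Python. All source names, field names, and values are hypothetical, and the "load" step is a stand-in aggregation rather than a real warehouse write:

```python
import json

# Hypothetical raw feed: newline-delimited JSON events (illustrative only).
RAW_EVENTS = """
{"customer_id": "C001", "amount": "125.50", "region": "EMEA"}
{"customer_id": "C002", "amount": "not_a_number", "region": "APAC"}
{"customer_id": "C003", "amount": "89.99", "region": "EMEA"}
""".strip()

def extract(raw: str) -> list[dict]:
    """Parse newline-delimited JSON into records."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]

def transform(records: list[dict]) -> list[dict]:
    """Keep only records with a parsable amount; normalise types."""
    clean = []
    for rec in records:
        try:
            rec = {**rec, "amount": float(rec["amount"])}
        except ValueError:
            # A production pipeline would route bad rows to a quarantine table.
            continue
        clean.append(rec)
    return clean

def load(records: list[dict]) -> dict[str, float]:
    """Aggregate amount per region (stand-in for a warehouse write)."""
    totals: dict[str, float] = {}
    for rec in records:
        totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]
    return totals

totals = load(transform(extract(RAW_EVENTS)))
print(totals)
```

In Spark or PySpark the same shape holds, with the transform expressed over distributed DataFrames instead of in-memory lists.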
Required Technical Skills
Strong proficiency in Python, SQL, and PySpark for data manipulation and pipeline development.
Extensive experience with Informatica IDMC and IBM DataStage.
Proficient in Airflow and Autosys.
Hands-on expertise in AWS (Glue, S3, Lambda, Redshift, etc.) and good understanding of Azure integrations.
Experience working with Salesforce, Snowflake, and AWS data integrations.
Ability to design and maintain ER diagrams using SAP PowerDesigner.
Exposure to tools such as Kafka, Hive, Impala, Sqoop, and HDFS.
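The Python-plus-SQL combination listed above can be sketched with the standard-library sqlite3 module. The table, columns, and values are hypothetical; the query shape (group-and-aggregate) is the kind of data-mart work the role describes:

```python
import sqlite3

# In-memory database standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 125.50), ("APAC", 60.00), ("EMEA", 89.99)],
)

# Aggregate per region, ordered by total descending.
rows = conn.execute(
    "SELECT region, ROUND(SUM(amount), 2) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
print(rows)
conn.close()
```

The same query translates directly to Hive, Impala, Redshift, or Snowflake SQL with only dialect-level changes.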
Preferred Qualifications
Strong analytical and problem-solving skills with the ability to work on unstructured datasets.
Experience building data ingestion frameworks for both real-time and batch processing.
Familiarity with DevOps, CI/CD pipelines, and version control (Git).
Knowledge of data governance, metadata management, and data quality frameworks.
Excellent communication and collaboration skills to work in a globally distributed team.
Education: Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.