We are seeking a highly experienced Big Data Lead to design, develop, and manage scalable data processing solutions using the Hadoop ecosystem and modern data engineering technologies. The ideal candidate will have strong hands-on expertise in Apache Spark, Scala, Python, Hadoop, and cloud platforms (AWS or GCP). This role requires technical leadership in building robust data pipelines, supporting large-scale analytics, and mentoring data engineering teams.
Key Responsibilities
- Lead the design and development of scalable big data platforms and data pipelines.
- Develop and optimize batch and real-time data processing solutions using Apache Spark.
- Implement distributed data processing applications using Spark RDD APIs.
- Build and maintain ETL workflows for processing large volumes of structured and unstructured data.
- Utilize Hadoop ecosystem tools such as Hive, Pig, and Oozie.
- Implement streaming data pipelines using technologies like Kafka.
- Collaborate with data scientists, analysts, and engineering teams to support analytics and business intelligence requirements.
- Ensure performance tuning, scalability, and reliability of big data systems.
- Provide technical guidance and mentorship to junior data engineers.
- Work with cloud platforms (AWS or GCP) to deploy and manage data infrastructure.
- Maintain and support Linux-based environments for data processing systems.
Required Skills
- 10+ years of experience in Big Data, Data Engineering, or related roles.
- Strong hands-on experience with Hadoop ecosystem technologies, including Hive, Pig, and Oozie.
- Extensive experience with Apache Spark and strong knowledge of Spark RDD APIs.
- Proficiency in Scala and Python programming.
- Experience with Core Java.
- Strong knowledge of SQL and scripting languages such as Bash or Python.
- Hands-on experience with Kafka or similar streaming frameworks.
- Strong experience working in Linux environments.
Data & Analytics Knowledge
- Strong understanding of Data Warehousing concepts.
- Experience with Data Modeling techniques.
- Familiarity with data visualization tools such as Tableau and analytics languages such as R.
Cloud Experience
- Hands-on experience with AWS or GCP cloud platforms.
- Experience building data engineering workflows and big data solutions in cloud environments.
Preferred Qualifications
- Experience designing real-time streaming architectures.
- Strong experience in performance tuning and optimization for Spark and Hadoop.
- Prior experience leading data engineering teams or projects.