We are looking for a skilled Spark/Scala Developer with proven big data experience to join our team. The ideal candidate will have a strong background in big data technologies such as Spark, solid Scala skills, and experience with data processing pipelines and SQL querying. This role involves developing and maintaining data processing solutions, with a focus on advanced SQL, Scala, and integration with GCP data lakes.
Key Responsibilities:
- Develop and optimize big data solutions in Spark and Scala, drawing on prior implementation or data science experience.
- Write and execute advanced SQL queries against Parquet files and Hive tables (a brief illustrative sketch follows this list).
- Work with GCP data lakes to integrate and process data effectively.
- Collaborate in an Agile environment, contributing to iterative development and continuous improvement.
- Build and maintain data processing pipelines for production "hands-off" batch systems, including ETL and analytics pipelines.
- Apply a strong coding background in Java, Python, or Scala, and use testing libraries such as ScalaTest to ensure code quality. Understand and implement various test types, including unit and integration tests.
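
To give a concrete sense of this work, below is a minimal sketch of a Spark/Scala batch job that reads Parquet from a GCP data lake, runs a Spark SQL aggregation, and writes the result back. The bucket, paths, table, and column names (gs://example-data-lake/..., orders, amount) are illustrative assumptions, not details of an actual system.

```scala
// Minimal sketch of the kind of batch job described above.
// Bucket paths, table names, and columns are hypothetical placeholders.
import org.apache.spark.sql.SparkSession

object DailyOrdersJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-orders-job")
      .enableHiveSupport() // lets the job query Hive tables alongside raw files
      .getOrCreate()

    // Read raw Parquet files from a GCP data lake (hypothetical bucket/path).
    val orders = spark.read.parquet("gs://example-data-lake/raw/orders/")

    // Expose the data to Spark SQL and run an aggregation query.
    orders.createOrReplaceTempView("orders")
    val dailyTotals = spark.sql(
      """SELECT order_date, SUM(amount) AS total_amount
        |FROM orders
        |GROUP BY order_date""".stripMargin)

    // Write results back to the lake as Parquet, partitioned by date.
    dailyTotals.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("gs://example-data-lake/curated/daily_order_totals/")

    spark.stop()
  }
}
```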
Requirements:
- Proven big data experience, with expertise in Spark and Scala.
- Knowledge of Scala and Hive, including experience with the Parquet file format.
- Advanced SQL skills, including the ability to write complex queries.
- Experience with GCP data lake environments.
- Familiarity with Agile methodologies.
- Expert knowledge of at least one big data technology.
- Strong coding skills in Java, Python, or Scala.
- Experience with testing libraries (e.g., ScalaTest) and an understanding of different test types (unit vs. integration tests); see the sketch after this list.
- Experience in building data processing pipelines, both traditional ETL and analytics pipelines.
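
As a small illustration of the testing expectations above, here is a ScalaTest unit-test sketch for a hypothetical pure helper, normalizeCountry. It is an assumed example of the style of test in scope, not a prescribed implementation; integration tests would additionally exercise a Spark pipeline end to end.

```scala
// Illustrative ScalaTest unit test for a small transformation function.
// `normalizeCountry` is a hypothetical helper used only for this example.
import org.scalatest.funsuite.AnyFunSuite

object CountryCodes {
  // Pure function: the kind of logic best covered by fast unit tests.
  def normalizeCountry(raw: String): String =
    raw.trim.toUpperCase match {
      case "UK"  => "GB"
      case other => other
    }
}

class CountryCodesSpec extends AnyFunSuite {
  test("normalizeCountry trims whitespace and upper-cases the code") {
    assert(CountryCodes.normalizeCountry(" gb ") === "GB")
  }

  test("normalizeCountry maps the legacy UK code to GB") {
    assert(CountryCodes.normalizeCountry("uk") === "GB")
  }
}
```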