PySpark / Java Developer

2T Consulting

Blue Bell, PA

Posted On: May 01, 2026

Job Overview

Job Type

Contract - W2, Contract - Independent

Experience

5 - 10 Years

Salary

Depends on Experience

Work Arrangement

On-Site

Travel Requirement

0%

Required Skills

  • Apache Spark
  • Java
  • Microsoft SQL Server
  • ETL
  • Hadoop
  • Hive
Job Description
Roles and Responsibilities
  • Design, develop, and maintain scalable ETL pipelines for large-scale structured and unstructured data.
  • Build and optimize data processing applications using PySpark and Java.
  • Work extensively with relational databases and big data platforms for data extraction, transformation, and loading.
  • Analyze and resolve performance bottlenecks in high-volume SQL procedures and big data processing jobs.
  • Develop efficient data movement and transformation workflows across distributed systems.
  • Collaborate with cross-functional teams to understand end-to-end data flow and business requirements.
  • Support production systems, troubleshoot issues, and ensure data pipeline reliability.
Required Skills & Experience
  • 5+ years of experience with Microsoft SQL Server and relational database development for data extraction applications.
  • Strong understanding of ETL concepts, database technologies, and large-scale data processing.
  • Proven experience tuning SQL query performance, with a solid understanding of indexing strategies.
  • 2+ years of experience working with big data technologies including:
    • Hadoop
    • Spark / PySpark
    • Hive
    • Impala
    • Python
  • 2+ years of hands-on experience with the Cloudera Hadoop Ecosystem, including:
    • HDFS
    • Hive
    • Impala
    • Spark
    • Kafka
    • Hue
    • Oozie
    • YARN
    • Sqoop
  • Experience in processing large volumes of structured and unstructured data using Spark.
  • Strong understanding of end-to-end (E2E) data pipeline architecture and application workflows.
Preferred Skills
  • Domain experience in healthcare claims data or healthcare analytics.
  • Experience with distributed data processing and optimization in production environments.
  • Strong troubleshooting and analytical skills in complex data ecosystems.

Job ID: 2C321128


Posted By

Shayne

Sr. Recruiter