We are looking for a skilled QA Tester with a strong background in testing data pipelines, ETL processes, and analytics solutions built on modern platforms such as Databricks and Snowflake. You will be responsible for validating curated datasets and business KPIs across the Databricks Medallion Architecture (Bronze, Silver, Gold layers).
Key Responsibilities
- Design and execute test plans for Databricks Silver and Gold layers.
- Validate data transformations, aggregations, and business rules.
- Develop and maintain automated data quality checks using Pytest or similar frameworks.
- Perform reconciliation testing across Bronze → Silver → Gold layers.
- Validate incremental loads, schema evolution, and CDC (Change Data Capture) processes.
- Conduct performance testing on pipelines, partitioning strategies, and queries.
- Verify that data governance and security policies (row- and column-level access controls, Unity Catalog permissions) are enforced.
- Document test cases, test results, and defects in line with Agile/Scrum practices.
- Collaborate with data engineers, business analysts, and product owners to define testing criteria.
- Contribute to CI/CD pipelines by integrating automated tests (Azure DevOps, GitHub Actions, Jenkins).
Required Skills
- SQL (Databricks SQL) expertise for data validation, reconciliation, and exploratory testing.
- Hands-on experience with:
  - ETL testing
  - Power BI or Tableau
  - Pytest or similar test automation frameworks
- Deep understanding of Databricks Medallion Architecture.
- Experience with schema evolution, incremental data testing, and CDC validations.
- Familiarity with CI/CD practices and tools: Azure DevOps, GitHub Actions, or Jenkins.
- Strong analytical and debugging skills for large-scale datasets.
- Agile/Scrum experience and ability to work in sprint-based delivery cycles.
Nice-to-Have Skills
- Experience with Snowflake testing or similar modern cloud data platforms.
- Exposure to Unity Catalog and data governance principles.
- Python scripting experience for custom data quality validations.
- Knowledge of data lineage and metadata management tools.