Kafka Operations Administrator

2T Consulting

Seattle, WA

Posted On: Feb 02, 2026

Job Overview

Job Type

Full-time

Experience

7 - 10 Years

Salary

Depends on Experience

Work Arrangement

On-Site

Travel Requirement

0%

Required Skills

  • Kafka Operations
  • Grafana
  • Prometheus
  • Splunk
  • Linux

Job Description

We are seeking a highly skilled Kafka Operations Administrator to manage and maintain production-grade Apache Kafka clusters. The ideal candidate will have deep experience in Kafka operations, monitoring, automation, and production support within enterprise environments. The role includes 24x7 on-call duty, incident management, performance tuning, and responsibility for high availability and disaster recovery.

Roles and Responsibilities
  • Deploy, configure, and manage Kafka clusters and related services to meet SLA requirements
  • Participate in 24x7 on-call rotation, responding to incidents, alerts, and escalations
  • Triage, diagnose, and remediate production incidents; coordinate with stakeholders, developers, and infrastructure teams
  • Implement automation for provisioning, scaling, backups, and disaster recovery
  • Maintain monitoring, alerting thresholds, and dashboards to track Kafka ecosystem health
  • Harden Kafka deployments by configuring TLS, ACLs, RBAC, and encryption, and by remediating vulnerabilities
  • Perform routine maintenance, including rolling restarts and Kafka ecosystem upgrades (controllers, brokers, Kafka Connect, and Schema Registry)
  • Create and maintain runbooks, automation scripts, and post-incident reports
  • Optimize performance and resource utilization through benchmarking and tuning
  • Support Kafka Connect and Schema Registry services; troubleshoot connector issues
  • Contribute to CI/CD pipeline improvements for infrastructure and deployment automation

Required Technical / Functional Skills
  • Production-grade Apache Kafka operations experience, including managing, maintaining, and upgrading Kafka clusters
  • Strong experience with high availability, disaster recovery, failover, and overall reliability
  • Proficient in monitoring and observability tools, including:
    • Grafana (dashboards)
    • Prometheus
    • Splunk
    • JMX metrics
  • Automation and orchestration expertise using:
    • Terraform
    • Ansible
    • Helm
    • Kubernetes (EKS/AKS/GKE)
  • Strong Linux system administration, including troubleshooting and scripting for infrastructure management
  • Production support experience following ITIL processes
  • Experience in 24x7 on-call rotations, incident documentation, and postmortems
  • Experience with JVM tuning, GC analysis, and network/disk I/O diagnostics
  • Strong understanding of TCP/IP, routing, switching, and firewall configurations relevant to Kafka operations

Required Skills
  • Deep Kafka performance tuning and capacity planning experience
  • Knowledge of message delivery semantics and guarantees (at-least-once, exactly-once)
  • Cloud-native security/compliance experience (IAM, VPC, KMS, Security Groups)
  • Relevant certifications: Confluent Certified Administrator, AWS/Azure/GCP
  • Experience with Apache Kafka in KRaft mode
  • Containerization and orchestration experience (Docker, Kubernetes)
  • Experience with CI/CD pipelines and Git-based workflows
  • Experience building custom Kafka Connect libraries and knowledge of serialization formats (Avro, JSON)
  • Strong understanding of networking across on-prem and cloud environments
  • Knowledge of best practices for topic management and streaming security (TLS, ACLs, RBAC, encryption)
  • Kafka ecosystem tooling experience (Kafka Connect, Schema Registry)

Qualifications
  • Bachelor’s degree in Computer Science, Engineering, or related field (preferred)
  • 7+ years of experience in Kafka operations or platform engineering
  • Proven experience in production support and infrastructure automation

Job ID: 2C320377


Posted By

Shayne

Sr. Recruiter

