OMS Platform Reliability Lead

2T Consulting

Berkeley Heights, NJ

Posted On: Jun 25, 2026

Job Overview

Job Type

Contract - Independent, Contract - Corp-to-Corp

Experience

8 - 20 Years

Salary

Depends on Experience

Work Arrangement

On-Site

Travel Requirement

0%

Required Skills

  • Platform Engineering
  • SRE
  • Order Management Systems
  • GraphQL
Job Description
Required Qualifications
  • Bachelor’s degree in Computer Science, Software Engineering, or a related technical field.
  • 5+ years of experience in OMS Technical Operations, Platform Engineering, Site Reliability Engineering (SRE), or Production Support within high-volume, event-driven SaaS environments.
  • Strong experience supporting Order Management Systems (OMS) and distributed integrations.
  • Advanced knowledge of GraphQL queries, mutations, aliases, fragments, and variables.
  • Strong understanding of REST APIs, JSON, and event-driven architectures (Pub/Sub, Kafka, Event Grid, or similar).
  • Hands-on experience with observability and monitoring tools such as Splunk, Datadog, ELK Stack, or New Relic.
  • Strong experience with Git-based version control and deployment processes.
  • Proficiency in reading, debugging, and troubleshooting Java-based applications and custom extensions.
  • Strong understanding of ITIL processes with an SRE mindset focused on automation and operational excellence.
  • Excellent analytical, troubleshooting, and communication skills.
Roles & Responsibilities

Platform Reliability & Automation

  • Design and implement automation solutions to improve OMS platform reliability and reduce manual intervention.
  • Develop automated order remediation and recovery mechanisms for synchronization failures across integrated systems.
  • Build tools and utilities using platform SDKs, APIs, and scripting to support operational efficiency.
  • Drive self-healing capabilities and proactive platform monitoring.

Monitoring & Observability

  • Develop and maintain dashboards to monitor API performance, GraphQL query execution, system health, and integration success rates.
  • Implement and optimize alerting strategies to proactively identify stuck orders, inventory discrepancies, and integration failures.
  • Analyze system metrics and trends to improve platform stability and performance.

Incident Management & Root Cause Analysis

  • Serve as the technical escalation point for complex production incidents and platform issues.
  • Perform deep-dive troubleshooting across application logs, integrations, APIs, and event-driven workflows.
  • Lead root cause analysis efforts and implement long-term corrective actions.
  • Document technical resolutions, workarounds, and operational best practices.

Performance & Platform Optimization

  • Analyze API response times, integration bottlenecks, and application performance issues.
  • Collaborate with engineering teams to recommend and implement platform improvements.
  • Support scalability, reliability, and operational readiness initiatives.

Stakeholder & Vendor Collaboration

  • Act as the technical liaison between business, architecture, engineering, and operations teams.
  • Collaborate with platform vendors and internal teams on upgrades, enhancements, and release planning.
  • Mentor support and operations teams on troubleshooting, GraphQL optimization, and technical best practices.

Release & Change Management

  • Review and validate platform configurations, integrations, and deployments during release cycles.
  • Support change management processes and operational readiness activities.
  • Manage source control and branching strategies for operational fixes and configuration updates.
Preferred Qualifications
  • Experience with Fluent Commerce OMS, including GraphQL APIs, Webhooks, and business rules.
  • Experience supporting eCommerce, order fulfillment, or retail technology platforms.
  • Familiarity with CI/CD pipelines and DevOps practices.
  • Experience working in cloud-based SaaS environments.

Job ID: 2C321609


Posted By

Shayne

Sr. Recruiter