Site Reliability Engineer (SRE) – ML Platform

Neshent Tech

Austin, TX / Sunnyvale, CA

Posted On: Aug 26, 2025

Job Overview

Job Type

Full-time

Experience

6 - 12 Years

Salary

Depends on Experience

Work Arrangement

On-Site

Travel Requirement

0%

Required Skills

  • MLOps
  • Kubernetes
  • Python
  • Linux
Job Description

Roles and Responsibilities
  • Build and maintain continuous deployment pipelines using GitHub Actions, Flux, and Kustomize.
  • Design and implement scalable cloud-based MLOps solutions on AWS.
  • Containerize and deploy data science and machine learning models using Docker, vLLM, and Kubernetes.
  • Collaborate effectively with data scientists, data engineers, and solution architects; document processes and system designs.
  • Develop and deploy scalable tools and services for training and inference of machine learning models.
  • Apply knowledge of machine learning models, including large language models (LLMs), in production environments.

 

Qualifications
  • 6+ years of experience in MLOps or related roles, with strong expertise in Kubernetes, Python, MongoDB, and AWS.
  • Proficiency in Linux system administration.
  • Solid understanding of Apache Solr.
  • Hands-on experience with containerization and orchestration using Docker and Kubernetes in cloud environments.
  • Experience building and maintaining MLOps pipelines using frameworks such as Kubeflow, MLflow, DataRobot, or Airflow.
  • Familiarity with workflow orchestration tools such as Argo, Airflow, or Kubeflow Pipelines.
  • Experience in developing custom cloud integrations using APIs.
  • Knowledge of machine learning methodologies, best practices, and model lifecycle management.
  • Proven ability to develop and maintain machine learning systems using open-source tools.
  • Understanding of the tools and workflows used by data scientists, with experience in test automation and CI/CD practices.
  • Strong software testing, benchmarking, and continuous integration skills.
  • Ability to translate business requirements into scalable technical solutions.

Job ID: NT250269


Posted By

Abhishek

Resource Manager

