Site Reliability Engineer (SRE) – ML Platform
Neshent Tech
Austin, TX/Sunnyvale, CA
Posted On: Aug 26, 2025
Job Overview
Salary
Depends on Experience
Required Skills
- MLOps
- Kubernetes
- Python
- Linux
Job Description
Roles and Responsibilities
- Build and maintain continuous deployment pipelines using GitHub Actions, Flux, and Kustomize.
- Design and implement scalable cloud-based MLOps solutions on AWS.
- Containerize and deploy data science and machine learning models using Docker, vLLM, and Kubernetes.
- Collaborate effectively with data scientists, data engineers, and solution architects; document processes and system designs.
- Develop and deploy scalable tools and services for training and inference of machine learning models.
- Apply knowledge of machine learning models, including large language models (LLMs), in production environments.
Qualifications
- 6+ years of experience in MLOps or related roles, with strong expertise in Kubernetes, Python, MongoDB, and AWS.
- Proficiency in Linux system administration.
- Solid understanding of Apache Solr.
- Hands-on experience with containerization and orchestration using Docker and Kubernetes in cloud environments.
- Experience building and maintaining MLOps pipelines using frameworks like Kubeflow, MLflow, DataRobot, or Airflow.
- Familiarity with workflow orchestration tools such as Argo, Airflow, or Kubeflow Pipelines.
- Experience in developing custom cloud integrations using APIs.
- Knowledge of machine learning methodologies, best practices, and model lifecycle management.
- Proven ability to develop and maintain machine learning systems using open-source tools.
- Understanding of the tools and workflows used by data scientists, with experience in test automation and CI/CD practices.
- Strong software testing, benchmarking, and continuous integration skills.
- Ability to translate business requirements into scalable technical solutions.
Job ID: NT250269
Experience
6 - 12 Years

Work Arrangement
On-Site
