Machine Learning Operations Engineer - AWS (with LLM Focus)

PB Consulting

Hollywood, FL

Posted On: Jun 28, 2024

Job Overview

Job Type

Full-time

Experience

8 - 12 Years

Salary

$110,000 - $140,000 Per Year

Work Arrangement

Remote

Travel Requirement

0%

Required Skills

  • AWS
  • LLM
  • REST API
  • Flask
  • GenAI
  • CI/CD

Job Description

Roles and Responsibilities
  • Develop strategies for prompt engineering and management to enhance LLM outputs and ensure consistency and safety.
  • Continuously optimize LLMOps processes and infrastructure for cost-efficiency while maintaining high performance and reliability.
  • Apply techniques like quantization, distillation, and pruning to optimize LLMs for efficient inference on AWS infrastructure.
  • Design and implement MLOps infrastructure on AWS tailored for LLMs, leveraging services such as SageMaker, EC2 (with GPU instances), S3, ECS/EKS, and Lambda.
  • Build and manage CI/CD pipelines specifically for LLM deployment, addressing unique challenges like model size, inference optimization, and versioning.
  • Work closely with data scientists, researchers, and software engineers to integrate LLMs into production systems effectively.

Required Experience/Skills
  • Strong proficiency in AWS services relevant to MLOps and LLMs, including SageMaker, EC2 (with GPU instances), S3, ECS/EKS, Lambda, and API Gateway.
  • Deep understanding of LLM architectures (e.g., Transformers), training techniques, and inference optimization strategies.
  • Proficiency in Python and experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation), REST API frameworks (e.g., Flask, FastAPI), and LLM libraries (e.g., Hugging Face Transformers).
  • Familiarity with monitoring and logging tools for LLMs, such as Prometheus, Grafana, and CloudWatch.
  • Experience with Docker and container orchestration (e.g., Kubernetes, ECS) for LLM deployment.

Job ID: PC240262


Posted By

Naincy Chauhan

Sr. Manager