AI Operations (AI Ops) Engineer

Long Finch Technologies

Fremont, CA

Posted On: Sep 03, 2025

Job Overview

Job Type

Contract - Corp-to-Corp, Contract - Independent, Contract - W2

Experience

11 - 20 Years

Salary

Depends on Experience

Work Arrangement

Hybrid

Travel Requirement

0%

Required Skills

  • Artificial Intelligence (AI)
  • AIOps
  • Machine Learning
  • MLOps
  • Python
  • LLM

Job Description

We are looking for an AI Ops Engineer with a strong background in Python, API development, Large Language Model (LLM) concepts, ML Ops, Azure Cloud, and AI operations. The ideal candidate has 8-10 years of experience working on advanced AI/ML systems, cloud infrastructure, and API integrations, with a focus on operationalizing AI models and maintaining robust systems for AI-driven applications. This role requires a combination of technical expertise in cloud computing, machine learning, and software engineering. You will collaborate with IT operations and business teams to support business user issues, requests, production support, and deployments; advocate best practices; and recommend technical solutions that improve application usability and system performance.

Responsibilities:

Technical Operations: Review, implement, and support enterprise-level AI platforms and services to drive IT operational excellence. Ensure that new use cases are onboarded smoothly and operationalized.

Optimization: Analyze business processes to identify areas for automation, and work with business stakeholders and IT teams to determine requirements and design software bots that reduce operational toil.

AI Ops & Model Deployment: Lead the operationalization and deployment of AI/ML models into production environments, ensuring they are highly available, scalable, and performant. Implement and monitor Continuous Integration (CI) and Continuous Deployment (CD) pipelines.

Python Development: Design and develop Python-based solutions for automating and managing the lifecycle of AI/ML models, including data ingestion, model training, and real-time prediction workflows.

API Integration: Build and maintain robust APIs for model serving and integration with other systems. Ensure seamless communication between models, data pipelines, and consumer applications.

LLM Concepts and Implementation: Apply knowledge of Large Language Models (LLMs) to develop AI-driven applications and services, ensuring models are optimized and performing efficiently in production.

ML Ops: Implement and maintain Machine Learning Operations (ML Ops) practices for version control, monitoring, logging, and debugging of AI/ML models in production. Support model retraining, versioning, and A/B testing.

Cloud Infrastructure: Leverage Azure Cloud services for hosting and scaling AI applications, ensuring security, compliance, and performance. Implement infrastructure as code (IaC) using tools like Azure DevOps.

Collaboration: Work closely with backend engineers, data engineers/developers, infrastructure engineers, operational SMEs, and business stakeholders to tackle evolving challenges in AI/ML and ensure AI solutions meet business requirements and performance benchmarks.

Monitoring & Optimization: Continuously monitor the performance of deployed AI models and optimize them for efficiency, cost-effectiveness, and accuracy. Implement alerting and logging mechanisms using scripts or an observability solution.

Documentation & Best Practices: Document AI Ops processes, use cases, tools, and workflows. Establish and enforce best practices for managing AI models in production environments.


Job ID: LF250028


Posted By

Shubham Singh

