Infrastructure Operations Lead, Cloud and AI/GenAI Enablement

Tampa

Saturday, 30 May 2026

Lead and provide direction for our Managed Service Provider (MSP) in Operations for the Azure, GCP and AWS Cloud environments. This leader will also explore and prototype AI-driven solutions to automate incident response, predict system failures, summarize complex telemetry data, and develop intelligent copilots to support Operations teams.

Drive moderate to complex process improvements through optimization, enhancements and implementation of new operational features and functions around Cloud compliance, metrics/reporting and cost optimization.
Provide senior-level expertise on decisions and priorities regarding the enterprise's overall Cloud Operations strategy, consumption and optimization opportunities, including an understanding of cost controls and the various cost optimization techniques.
Identify, drive and assist in the implementation of opportunities to standardize Cloud environments.
Provide Cloud governance, process and technical advisory support to business units and projects by working cross-functionally, and provide recommendations that support business needs.
Participate as required in Incident Management as a Level 2/3 escalation point.
Participate in and develop client relationships with Operations, business partners, Managed Service Providers and Cloud Providers.
Work with cross-functional teams to support the engineering and implementation of new Cloud applications or solutions, define the related risks and onboard new capabilities.
Communicate at all levels within the organization and influence strategic direction.
Work with minimal supervision, making decisions based on priorities, schedules and an understanding of business initiatives.

Lead research and evaluation of cutting-edge AI and GenAI tools applicable to Infrastructure Operations (e.g., LLMs, vector databases, predictive analytics).
Design and prototype AI-driven systems for automated incident detection, anomaly classification and infrastructure forecasting/resiliency, leading to lower MTTR and less manual overhead in mission-critical environments.
Develop and lead the strategic roadmap for AI adoption in Infrastructure Operations.
Collaborate with Infrastructure and Cloud Operations teams to pilot and integrate AI/GenAI features into critical workflows.
Modernize observability and alerting using AI/ML models for proactive monitoring and self-healing actions.
Lead R&D of GenAI solutions for predictive alerting, incident triage and infrastructure automation.
Build AI copilots and natural language tools for infrastructure operations teams.
Integrate LLMs into observability platforms for real-time RCA and log summarization.
Pilot and productionize GenAI-based assistants, bots and copilots to support ticket triage, knowledge management and resolution workflows.
Identify automation opportunities and implement AI-enhanced runbooks, workflows and self-healing mechanisms.
Contribute to a strategic roadmap for GenAI maturity within Infrastructure & Operations, including tools, governance and organizational readiness.
Partner with internal data science and clinical innovation teams to create proofs of concept, build ML/GenAI pipelines, and integrate with existing toolchains (e.g., ServiceNow, Splunk, Terraform).
Autonomous log summarization, RCA generation and playbook suggestions.
Natural language interfaces for querying system health or telemetry.
Act as a GenAI ambassador, helping Infrastructure Operations teams upskill in AI-augmented technologies and use cases.

Use your skills to make an impact

Required Qualifications

Bachelor's in Computer Science, Artificial Intelligence, Healthcare Informatics, or a related field
10 years in infrastructure operations or engineering, with at least 3 years of hands-on involvement in AI/ML or GenAI R&D
Deep understanding of large language models (LLMs), vector databases, retrieval-augmented generation (RAG), and model orchestration (e.g., LangChain, Haystack)
Experience integrating AI/GenAI capabilities with infrastructure automation tools (Terraform, Ansible, Python, Bash)
Familiarity with healthcare systems and compliance frameworks (HIPAA, HITRUST)
Proficiency with observability and telemetry platforms (e.g., Splunk, Dynatrace, SolarWinds) and AI-driven monitoring
Strong problem-solving and experimentation mindset, with the ability to move from concept to pilot rapidly
Experience with Continuous Integration and Deployment pipelines, e.g., Azure DevOps, Jenkins, Git, GitHub
Hands-on scripting experience using one of the following: Terraform, CloudFormation, PowerShell, Azure CLI, Python, JSON, Perl or Bash

Preferred:

Master's degree
Azure, AWS, GCP, ITIL and/or SRE certifications
Experience with GenAI platforms (e.g., Azure OpenAI, Google Vertex AI)
Experience deploying or evaluating open-source LLMs or fine-tuning models for infrastructure use cases
