Senior DevOps Engineer
Job Description
We are strengthening GPU-capable orchestration on Kubernetes and Linux, and need a Senior DevOps Engineer to standardize automation and scheduling performance. You will administer Kubernetes with Volcano, manage quotas and isolation, and automate operations using Python and Bash to support advanced AI and research work. Send your application to get started.

**Responsibilities**

* Provision, configure, and support GPU-enabled Kubernetes clusters and standalone Linux compute environments to keep scheduling and performance at peak
* Operate Volcano job scheduling, handling queue setup, Pod execution, GPU allocation, and namespace quota enforcement
* Own Kubernetes administration end-to-end, including namespaces, RBAC, resource quotas, and workload isolation strategies
* Automate job submission, resource provisioning, and reporting through Python and Shell scripting maintained over time
* Coordinate with orchestration, optimization, and observability teams to enhance scheduling efficiency, capacity utilization, and researcher workflows
* Observe infrastructure health and resource consumption, and share data for optimization and reporting requirements
* Drive continuous improvements to infrastructure, tooling, and automation workflows to boost performance, scalability, and usability
* Support operational processes that ensure researchers have an efficient experience across diverse AI and computational workloads

**Requirements**

* 3+ years of DevOps or infrastructure engineering experience in large, complex environments
* Expert proficiency administering Kubernetes, including namespaces, Pod scheduling/distribution, PVC, NFS, and resource quota management
* Hands-on background with Volcano scheduler for GPU jobs, including queue setup and workload prioritization with Kubernetes integration
* Track record of managing GPU cluster environments both in Kubernetes and on standalone Linux compute nodes
* Advanced capability with Python for infrastructure automation and solid UNIX Shell scripting such as Bash
* Strong Linux system administration skills with troubleshooting, performance tuning, and configuration management experience
* Solid understanding of infrastructure automation and orchestration concepts and the tools used to implement them
* Fluent English communication skills (spoken and written) to support direct client collaboration

**Nice to have**

* Helm knowledge for packaging and managing Kubernetes applications
* Experience with monitoring and observability stacks, especially Prometheus, Grafana and Loki
* Familiarity with Infrastructure as Code, including Terraform
* Exposure to multi-cloud Kubernetes environments such as Amazon EKS and Google GKE
* Understanding of Azure Networking, including VPN, ExpressRoute and network security
* Experience using AI-assisted coding tools like GitHub Copilot, ChatGPT and Claude
* Knowledge of hybrid (cloud and on-premises) scheduling and resource optimization approaches
Job originally posted on: Indeed