Chief Site Reliability Engineer

EPAM Systems
Manager
On-site
Published on November 13, 2025

Job Description

Become a key member of our Enterprise Technology team as a **Chief Site Reliability Engineer**, overseeing critical infrastructure and enterprise applications. You will leverage your expertise in Site Reliability Engineering, CI/CD, cloud platforms, Kubernetes, and security to build resilient and scalable systems. If you are driven to lead innovation and maintain high availability, we invite you to join us.

**Responsibilities**

* Oversee and enhance enterprise application infrastructure through advanced DevOps strategies
* Design and manage CI/CD pipelines to facilitate efficient and dependable software delivery
* Administer and upgrade Kubernetes clusters ensuring scalability and robust security
* Create and maintain automation tools and scripts primarily in Python
* Direct cloud infrastructure operations on Amazon Web Services and Microsoft Azure with emphasis on security and identity management
* Collaborate with development teams to refine infrastructure as code practices using Terraform
* Monitor system performance and implement proactive reliability measures
* Coordinate operational requests and maintenance activities effectively
* Diagnose and resolve complex infrastructure and deployment challenges
* Ensure adherence to security standards and company policies across all systems
* Document infrastructure setups and standard operating procedures comprehensively
* Lead disaster recovery and business continuity initiatives
* Continuously assess emerging technologies to enhance system reliability and efficiency

**Requirements**

* Extensive experience of at least 7 years in Site Reliability Engineering or equivalent DevOps roles
* Advanced proficiency in Python programming language
* Comprehensive experience with Amazon Web Services and Microsoft Azure including API usage, authentication, and serverless solutions
* Deep understanding of cloud networking, Kubernetes cluster management, security, IAM, and configuration automation
* Strong knowledge of CI/CD workflows, source control systems, containerization, and infrastructure as code with Terraform
* Proven expertise in enabling and improving IaaS environments
* Demonstrated success in managing enterprise-scale software development and deployments
* Thorough understanding of automation techniques related to CI/CD and IaaS
* Exceptional analytical and complex problem-solving abilities
* Effective management of operational requests and maintenance processes
* Strong communication skills with English proficiency at B2+ level

**We offer**

* International projects with top brands
* Work with global teams of highly skilled, diverse peers
* Healthcare benefits
* Employee financial programs
* Paid time off and sick leave
* Upskilling, reskilling and certification courses
* Unlimited access to the LinkedIn Learning library and 22,000+ courses
* Global career opportunities
* Volunteer and community involvement opportunities
* EPAM Employee Groups
* Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

Job originally posted on: LinkedIn
