Lead Site Reliability Engineer

EPAM Systems
Lead
Remote 🌐
Published on November 5, 2025

Job Description

Become a key member of our team as a **Lead Site Reliability Engineer**, specializing in managing and enhancing SAP cloud infrastructure with a focus on security, reliability, and sovereign cloud migration. You will work closely with development and SRE teams to automate infrastructure management and ensure rapid incident handling. Apply now to contribute to a critical project with significant impact.

**Responsibilities**

* Monitor infrastructure ticket queues
* Troubleshoot issues within infrastructure
* Coordinate communication and resolution efforts
* Handle incidents and deliver solutions or suggest alternatives
* Document incident closure details
* Conduct root cause analyses and aid in developing permanent fixes
* Support development teams in deploying solutions to cloud infrastructure
* Help development teams diagnose application flow issues tied to infrastructure
* Plan scheduled tasks including notifying stakeholders
* Record chronological actions and resolution summaries
* Manage communications with stakeholders and internally, including calls
* Track infrastructure KPIs and create root cause analysis reports
* Ensure root cause analyses and action plans prevent incident recurrence
* Achieve all infrastructure SLAs and KPIs related to uptime, availability, and security

**Requirements**

* Extensive experience in Site Reliability Engineering and DevOps with over 5 years in similar roles
* Advanced expertise in Amazon Web Services cloud infrastructure
* Proficiency in scripting languages such as PowerShell and Python
* Hands-on experience with containerization including Docker and Kubernetes
* Knowledge of infrastructure as code tools like Terraform
* Strong understanding of networking fundamentals and network security
* Experience with Windows Server administration and IT security standards
* Proven skills in provisioning and operating cloud infrastructure
* Excellent troubleshooting and incident management abilities
* Strong communication skills for cross-team coordination
* Collaborative approach coupled with excellent organizational skills
* Bachelor's degree in computer science or related discipline
* English proficiency at B2+ level in both written and spoken forms

**Nice to have**

* Familiarity with Microsoft Azure and Google Cloud Platform
* Knowledge of SAP environments and language technology tools
* Ability in German or other additional languages
* Experience with security compliance and data residency requirements
* Certifications in AWS or Kubernetes administration

Job originally posted on: Indeed
