Middle SRE / Observability Engineer
Job Description
We are strengthening our platform team with a **Middle Site Observability Engineer** to keep Kubernetes production services stable for AI research on Azure Stack. You will enhance observability, handle business-hours operational support, and work closely with engineering and research partners to improve reliability and processes. Apply now.

**Responsibilities**

* Develop, operate, and enhance observability capabilities, including dashboards and visualizations in Grafana or similar tools
* Establish and maintain metrics, SLIs, SLOs, and alerting approaches for production platforms
* Deliver business-hours operational support for Kubernetes-based environments through troubleshooting, log analysis, and metrics-driven investigations
* Assist with production operations for SQL-based systems by diagnosing issues and supporting performance investigations
* Investigate incidents and system behavior to identify root causes, participate in post-incident reviews, and propose improvements to monitoring and reliability practices
* Partner with engineering, platform, and research teams to raise observability standards, refine operational processes, and increase system reliability
* Create and maintain documentation, share knowledge across the team, and drive ongoing improvement activities

**Requirements**

* Hands-on experience of 2+ years in Site Reliability Engineering, DevOps or Production Support for live production systems
* Practical knowledge of observability and monitoring stacks such as Grafana, Prometheus, Elastic Stack, or Datadog
* Solid understanding of Linux systems with strong troubleshooting abilities and log analysis skills
* Background supporting Kubernetes-based production environments
* Working experience with SQL production support, including query troubleshooting and basic performance analysis
* Proficiency in automation scripting using Python, Bash, or similar languages
* Ability to assess incidents, determine root causes, and contribute to continuous improvement efforts
* Effective communication skills and comfort collaborating with distributed, cross-functional teams
* English proficiency at an intermediate to advanced level (B1–C1)
Job originally posted on: Indeed