Site Reliability Engineer (SRE)

Bydrec, Inc.
Lead
Remote 🌐
Published on October 31, 2025

Job Description

**Excellent opportunity to work REMOTELY with a U.S.-based company. Candidates living in Mexico, Central, or South America are welcome to apply.**

**About The Company**

Bydrec, Inc. is a California-based company that connects top Tech talent from Latin America with U.S. companies looking to expand their development teams. Learn more at bydrec.com.

Our client is a dynamic company that requires a proactive and self-assured engineer to help define and lead this project. The ideal candidate must be able to ensure their platform remains fast, resilient, and scalable, especially during high-traffic live events. This is a unique opportunity to contribute to the future of reliability at a company where uptime and user experience are paramount.

**What You’ll Do**

* Optimize Performance: Continuously monitor and analyze system performance, identify bottlenecks, and implement solutions to improve efficiency and scalability across our cloud-native infrastructure.
* Monitoring & Alerting: Design and manage robust observability systems using Prometheus, Grafana, ELK stack, and APM tools to ensure real-time visibility into platform health.
* Incident Management: Lead incident response efforts, perform root cause analysis, and drive post-mortem processes to prevent recurrence and improve system resilience.
* Cloud Infrastructure: Architect and maintain infrastructure across Azure and GCP, ensuring high availability, security, and cost-effectiveness.
* Automation & Tooling: Build and maintain automation scripts and playbooks using Python and Ansible to reduce manual effort and improve deployment consistency.
* Container Orchestration: Manage Kubernetes clusters to support dynamic scaling and seamless deployment of microservices.
* CI/CD & GitOps: Collaborate with development teams to enhance GitLab pipelines and promote GitOps practices for reliable and repeatable deployments.
* Cross-Team Collaboration: Work closely with Engineering, Development, and Technical Operations to align reliability goals with product and business objectives.

**Technical Requirements**

* 5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles within a SaaS or cloud-native environment.
* Strong expertise in:
  * Azure and GCP cloud platforms
  * Kubernetes and container orchestration
  * Monitoring tools: Prometheus, Grafana, ELK stack, APM solutions
  * Automation: Python, Ansible
  * CI/CD: GitLab
* Proven success in performance tuning, incident response, and system scalability.
* Excellent communication and collaboration skills across technical and non-technical teams.
* Initiative, confidence, and a builder’s mindset: ready to shape a nascent function and drive impact from day one.
* Sense of urgency during critical incidents, as the work focuses on maintaining high availability.
* Advanced level of English

**Must Have Skills**

* Experience using APM (Application Performance Monitoring) tools, also referred to as Observability platforms.
* Skill in leveraging logs for monitoring, alerting, and forensics.
* Expertise working with modern cloud-native environments, with experience in both on-premise and cloud infrastructure (due to the ongoing migration).

Job originally posted on: LinkedIn
