Site Reliability Engineer (Middle) ID38916

AgileEngine
Senior
On-site
Published on October 23, 2025

Job Description

**Important:** *After confirming your application on this platform, you’ll receive an email with the next step: completing your application on our internal site, LaunchPod. Keep an eye on your inbox and don’t miss this step; without it, the process can’t move forward.*

**What you will do**

* Shift: Monday – Thursday, 8AM – 7PM PST (11AM – 10PM EST), with rotating on-call;
* On-call shifts: every 6 weeks, one week as primary responder and the following week as secondary;
* Manage alerts daily, check systems, and escalate issues as needed;
* Be part of a team that provides 24×7 on-call support for critical SaaS events;
* Be available in case of emergencies when team members are not available or need help;
* Document issues and remediation steps;
* Proactively create appropriate monitors in the EKS/K8s ecosystem;
* Deploy to EKS/K8s clusters using Terraform and Helm;
* Learn and maintain existing infrastructure running under Docker Swarm;
* Improve existing infrastructure health by implementing checks and scripts to correct known issues;
* Maintain and develop deployment code;
* Automate manual tasks;
* Implement and integrate new technologies in our cloud infrastructure;
* Collaborate with other teams and departments to provide the highest level of support and assistance;
* Apply a real customer focus when planning deployments and updates, keeping the customer at the forefront and considering the impact on them before making changes;
* Work closely on solutions with the Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to our customers;
* Perform RCA and take the necessary corrective actions to prevent issues from recurring;
* Create and assign alert-related actions to the appropriate team after the investigation;
* Handle support requests for environment-specific actions;
* Identify and provide automation requirements to improve RCA.

**Must haves**

* **2+ years** of professional experience;
* **Experience working with Datadog**;
* Hands-on experience as an AWS Cloud Engineer;
* Working knowledge of EKS/Terraform/Helm;
* Working experience with Docker and Docker Swarm;
* Good understanding of AWS IAM roles and policies;
* Experience logging and monitoring AWS resources using CloudWatch Logs;
* Experience working in a Linux environment;
* Proficiency in Bash and/or Python scripting;
* A strong understanding of web technologies such as REST APIs;
* Working experience with monitoring solutions such as Grafana and Prometheus;
* Excellent oral and written communication skills;
* Customer-facing communication skills to effectively explain issues and RCAs to customers;
* Experience in product/application support for SaaS-based products;
* Understanding of APIs, databases, systems architecture, and design;
* Experience designing, implementing, and operating in a DevSecOps environment;
* Excellent communication skills, both written and verbal;
* Ability to work independently as well as within a collaborative environment;
* A technical aptitude with the desire to learn new and evolving technologies;
* Upper-Intermediate English level.

**Nice to haves**

* Experience with GCP or Azure;
* Certifications: AWS Certified DevOps Engineer – Professional or AWS Certified Advanced Networking – Specialty.

**About us**

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards. If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!

**Perks and benefits**

* **Professional growth:** Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
* **Competitive compensation:** We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
* **A selection of exciting projects:** Join projects that involve modern solution development and top-tier clients, including Fortune 500 enterprises and leading product brands.
* **Flextime:** Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you the happiest and most productive.

Job Type: Full-time

Originally posted on: Indeed
