DevOps Engineer (Senior/Lead) ID43756
Job Description
**Important:** *After confirming your application on this platform, you’ll receive an email with the next step: completing your application on our internal site, LaunchPod. So keep an eye on your inbox and don’t miss this step; without it, the process can’t move forward.*

**About the role**

As a **Lead DevOps Engineer**, you’ll own and evolve cloud infrastructure that powers AI-driven products, ensuring scalability, reliability, and security across distributed systems. You’ll lead modernization efforts in AWS, streamline CI/CD pipelines, and champion automation and observability best practices. This role combines hands-on technical leadership with strategic impact, offering the opportunity to shape infrastructure standards and drive innovation in a fast-paced, collaborative environment.

**What you will do**

* Lead cross-cutting infrastructure projects not tied to app/platform changes, such as domain migrations for pipelines and customer-facing sites, networking redesigns to avoid IP exhaustion, and medium-sized automation initiatives;
* Design and evolve AWS networking and environments to support more dev/test sites, future SaaS infrastructure, and sandboxes where dev teams can experiment safely;
* Define and implement a disaster recovery strategy and secondary region or DR zone to improve resilience and recovery time;
* Implement cost, reliability, observability, and monitoring improvements across services, using metrics and logs to guide optimization;
* Design, maintain, and evolve AWS-based infrastructure, including ECS, RDS/Aurora, Lambda, S3, CloudWatch, CDK, VPCs, subnets, security groups, Route 53, and load balancers;
* Upgrade AWS Aurora Postgres clusters to the latest supported versions, ensuring high availability, data integrity, and minimal downtime;
* Own and improve CI/CD pipelines using GitHub Actions for production deployments, covering containerized services and Lambda-based workloads;
* Manage infrastructure as code using AWS CDK (TypeScript) and other IaC practices to drive automation, consistency, and repeatability;
* Consolidate and optimize shared tooling, utility scripts, and reusable components across multiple repositories;
* Collaborate with engineering and leadership to define the infrastructure roadmap, influence architecture decisions, and promote DevOps culture and best practices.

**Must haves**

* **6+ years of experience** in DevOps / Site Reliability Engineering, including ownership of multi-quarter infrastructure projects or leadership roles;
* Strong expertise with AWS services such as **ECS**, **RDS/Aurora**, **Lambda**, **S3**, **CloudWatch**, **CDK**, and core networking (VPC design, routing, subnets, security groups, NAT, DNS/Route 53, load balancers);
* Proficient with **Docker**, **GitHub Actions**, and modern CI/CD patterns for cloud-native applications;
* Deep knowledge of **Postgres** administration, including upgrades, backups, and performance tuning;
* Strong scripting and automation skills with **TypeScript**, **Python**, or **Bash**;
* Proven ability to architect scalable, secure, and reliable cloud environments, including DR strategies and cost-optimization practices;
* Experience improving observability (metrics, logs, traces, alerting) and using it to guide reliability and cost improvements;
* Excellent communication and collaboration skills, with a track record of working closely with engineers and stakeholders to execute infra roadmaps;
* Self-driven, practical, and detail-oriented, comfortable making decisions, documenting trade-offs, and delivering high-quality results with limited supervision;
* Upper-intermediate English level.

**Nice to haves**

* Familiarity with AI/ML workflows or cloud-based AI services;
* Experience with AWS Bedrock or similar generative AI platforms;
* Exposure to Cursor or other modern AI-enhanced developer tools;
* Understanding of security and scaling best practices for distributed environments;
* Experience with monitoring and observability tools (Datadog, Prometheus, CloudWatch, etc.).

**About us**

AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.

If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!

**Perks and benefits**

* **Professional growth:** Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
* **Competitive compensation:** We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
* **A selection of exciting projects:** Join projects that build modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands.
* **Flextime:** Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.

Job Type: Full-time
Job originally posted on: Indeed