What you'll do
Regnology, a global leader in RegTech and SupTech solutions, is seeking an experienced Infrastructure Operations Manager to lead and manage a hybrid team specializing in the operation of our critical infrastructure. This strategic role ensures the high availability, performance, and security of our platform, spanning both our private datacenters and public cloud environments (GCP, AWS). The ideal candidate is a proven leader with significant experience in team management and operational transformation.
Key Duties and Responsibilities
1. Team Leadership and Management
2. Hybrid Operations Management
3. Strategy and Continuous Improvement
1. Team Leadership and Management
- Team Development: Lead, mentor, and develop a team of SRE and technical operations engineers responsible for 24/7 infrastructure operations.
- Performance and Objectives: Define team objectives (OKRs/KPIs), conduct performance reviews, and manage skills development plans.
- SRE Culture: Foster a culture of automation, toil reduction (eliminating manual, repetitive tasks), and continuous improvement, aligned with Site Reliability Engineering (SRE) principles.
2. Hybrid Operations Management
- Daily Operations: Oversee the operational maintenance of infrastructure (networking, storage, operating systems, virtualization) within both private datacenters and public cloud environments (primarily GCP and AWS).
- Availability and Performance: Ensure compliance with defined SLAs (Service Level Agreements) and SLOs (Service Level Objectives) for all production services.
- Incident Management: Lead critical incident response and coordinate post-mortem reviews to identify root causes and improve resilience.
3. Strategy and Continuous Improvement
- Transformation Management: Drive the migration of operations from private datacenters to public cloud where and when relevant.
- Automation: Drive the adoption of Infrastructure as Code (IaC) (Terraform, Ansible) to automate deployment, configuration management, and operational tasks.
- Security and Compliance: Work closely with the security team to ensure regulatory compliance (DORA, ISO, etc.) and the application of security best practices across both environments (on-premises and cloud).
- Cloud Cost Optimization: Actively monitor and optimize public cloud spending (FinOps) without compromising performance or security.