Title: Senior Cloud SRE
The Senior Cloud Engineer is responsible for the operations and engineering required to provide a high level of user satisfaction in support of Scotiabank’s strategic Public Cloud services:
-
As part of the Public Cloud Operations team, deliver exceptional support, innovation, and service availability of Public Cloud Services for our global clients.
-
Develop and deliver innovative solutions to continuously measure and improve the team’s speed, quality, and effectiveness without compromising the bank’s security and controls.
-
The incumbent leads and develops solutions to meet or exceed operational objectives, availability targets, key performance indicators (KPIs), and service commitments.
Public Cloud Operations and SRE is a specialized technology operations function that provides Day 2 support for the bank’s strategic Public Cloud infrastructure and services, application development/project support and delivery, and operational readiness services including design, consultation, and reporting. The current Public Cloud platforms include Microsoft Azure Cloud and the Google Cloud Platform (GCP).
The ideal candidate is passionate about analyzing data and providing a multidimensional view of service reliability in a complex and demanding environment. We are looking for someone who has a keen interest in incident analysis, application performance trends, application monitoring, and finding root causes of incidents, and who has a strong customer-focused mindset and will thrive in a fast-moving environment. As a member of this highly talented and interactive team, you will have the opportunity to grow and learn from experts in this technology space.
Is this role right for you? In this role, you will:
Operational Excellence
-
Manage IT service management (ITSM) incidents, problems, changes, and service requests for the team to ensure Public Cloud infrastructure and delivery pipelines are available and performing within operational standards.
-
Drive root cause analysis and problem resolution where required to prevent repeat issues and/or improve key performance indicators for the team.
-
Develop and deliver procedures and best practices to prevent unplanned outages.
-
Improve proactive monitoring and remediation to reduce customer impact, MTTR, and unplanned outages.
-
Interact and collaborate with service vendors, application teams, and other operations and engineering technology partners.
Influence a team of specialized IT professionals
-
Collaborate with and influence partners and teammates from multiple technology backgrounds, such as infrastructure management, systems administration, middleware systems, application development, networking, and database technology.
-
Review and provide operational sign-off on project deliverables and documentation (including Operational Readiness).
-
Demonstrate strategic thinking, relationship building, influencing, conflict resolution, talent development and coaching, and executive communication.
Site Reliability Engineering (SRE)
-
Develop and achieve system availability commitments.
-
Apply SRE methodology to all processes, tools, and technologies managed by Public Cloud Operations.
-
Establish procedures and policies that ensure problems are properly documented and effectively resolved.
-
Identify, document, and drive automation opportunities to improve productivity, observability, and SRE/SRO metrics.
Managing Risk
-
Ensure regulatory requirements, security controls, and compliance procedures are met where applicable (e.g., OSFI, SOX, AML).
-
Actively manage internal and external cloud audits and deliver on all assigned audit action items.
-
Identify and report on risks, controls, and findings to operate within the bank’s risk framework.
-
Negotiate IT project requirements (e.g., deadlines, budgets, resources).
Development and Innovation
-
Own operations and product roadmaps; develop strategies for improving automation, observability, non-functional requirements testing, and SRE/availability capabilities, and for fostering an engineering mindset.
-
Prepare business cases for adopting new technologies or processes, ensure that existing products and services are used to their fullest, and manage delivery where necessary for the implementation of new hardware and/or software tools.
Do you have the skills that will enable you to succeed in this role? We'd love to work with you if you have:
-
You are an Information Technology professional with broad experience in development, operations, project management, and service delivery. You have a relevant degree and/or proven IT experience.
-
Minimum of 5 years of experience supporting technology in an operational role.
-
Minimum of 5 years of experience working with enterprise delivery methodologies such as ITIL, Agile, and Waterfall.
-
You have expert knowledge of incident and problem management methodologies in a production environment, and experience using platforms such as ServiceNow to manage incidents and problems.
-
Ability to analyze and present data using a variety of business analysis and business intelligence tools and techniques to create executive dashboards and reports (e.g., Power BI, Excel, PowerPoint).
-
Experience with operational monitoring and performance management tools such as Dynatrace, Aternity, ARMS, Tivoli, or similar technologies.
-
Experience with Splunk, Google Cloud Logging, or other software for searching, monitoring, and examining machine-generated Big Data.
-
Knowledge and understanding of Google’s SRE best practices for Service Level Objectives (SLOs).
-
A professional designation in ITIL, Scrum, and/or Six Sigma would be an asset.
-
Excellent verbal and written communication skills, as well as strong problem-solving skills coupled with the ability to collaborate with development teams and business partners.
-
Spanish language skills would be an asset.
-
Normal office working conditions. You may be required to work extended hours and/or handle on-call duties and escalations.
What's in it for you?
-
Diversity, Equity, Inclusion & Allyship - We strive to create an inclusive culture where every employee is empowered to reach their fullest potential, is respected for who they are, and is embraced through bias-free practices and inclusive values across Scotiabank. We embrace diversity and provide opportunities for all employees to learn, grow & participate through our various Employee Resource Groups (ERGs) that span diverse gender identities, ethnicity, race, age, ability & veterans.
-
Accessibility and Workplace Accommodations - We value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. Scotiabank continues to locate, remove and prevent barriers so that we can build a diverse and inclusive environment while meeting accessibility requirements.
-
Upskilling through online courses, cross-functional development opportunities, and tuition assistance.
-
Competitive Rewards program including bonus, flexible vacation, personal and sick days, and benefits that start on day one.
-
Community Engagement - No matter where you choose to work from, we offer opportunities for community engagement & belonging with our various programs such as hackathons, contests, cooking with friends, Humans of Digital and much more!
Work arrangements: Hybrid