

Title:  Site Reliability Engineer, Scotia Digital (Vancouver Hub)




Requisition ID: 152012

Join a purpose driven winning team, committed to results, in an inclusive and high-performing culture.


The Digital Engineering Operations SRE team comprises Site Reliability Engineers and Software Developers who improve the availability, scalability, performance, and reliability of Scotia Digital production services. The team proactively looks for ways to improve application monitoring, address production issues, and investigate and assist with customer inquiries.


 Is this role right for you?


Are you passionate about improving automation and ensuring the resiliency of technology? Do you get your energy from delivering technology solutions as part of a team? We are currently seeking an experienced Site Reliability Engineer who is curious and derives insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to investigate and help resolve recurring and major issues and improve the performance of our supported applications. This role requires participation in a 24/7 on-call rotation.


  • You will run the production environment by monitoring availability and taking a holistic view of system health.
  • You will improve the reliability, quality, and time-to-market of our suite of software solutions.
  • Measure and optimize system performance to push our capabilities forward, get ahead of customer needs, and innovate to improve continually.
  • You will provide primary operational support and engineering for multiple large distributed software applications.
  • Participate in defining SLIs, SLOs and SLAs for Enterprise Systems.
  • Gather and analyze metrics from both applications and infrastructure to assist in performance tuning and fault finding.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Participate in system design consulting, release management, and capacity planning.
  • Create sustainable systems and services through automation and process improvements.
  • Balance feature development speed and reliability with well-defined service level objectives.
  • Monitor the health of multiple applications and discover opportunities to optimize in a continuously growing, large, complex hybrid environment.
  • Lead on-call problem escalation and outage recovery efforts, including not only code fixes in the presentation and integration layers but also infrastructure-level investigation and support where necessary.
  • Lead post-incident technical retrospectives to discover and implement remediation actions.
  • You will be part of a 24/7 on-call rotation and support multiple applications and occasional weekend releases.
  • You will perform troubleshooting, deploy systems or execute maintenance tasks as necessary to meet the specified SLOs.


 Do you have the skills that will enable you to succeed in this role?


  • Be self-motivated, autonomous and a team player in a fast-paced environment. 
  • Proficiency with fundamental front-end languages such as HTML, CSS and JavaScript.
  • 3-5 years of experience developing and supporting complex, large-scale customer-facing platforms.
  • Strong working knowledge of multiple programming languages and tools (Java, NodeJS, Python, etc.).
  • Experience working with database technology such as Sybase, Oracle, and MongoDB.
  • Experience working with scalable containerized systems in the public cloud (Azure and GKE/GCP).
  • Strong working experience with incident management and setting up monitoring alerts.
  • Have a proficient understanding of code versioning tools, such as Bitbucket/Git.
  • Experience in building public and internal REST APIs.
  • Working knowledge of HTTP and IP networking concepts.
  • Hands-on experience integrating Splunk, Dynatrace, Stackdriver, ThousandEyes, or equivalents to build a highly automated production monitoring and support model.
  • Proven ability to translate ideas into technical and business realities and map technology to business problems.
  • Experience with on-call rotational support.
  • Experience with cloud services and platforms.
  • Experience with Continuous Integration tools.
  • Superior verbal and written communication skills with the ability to influence decision-making with stakeholders.
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
  • Excellent problem-solving skills.
  • Flexible approach to work and the ability to adapt to change.
  • Prior production support or SRE experience.
  • Proficient with the MS suite.


What's in it for you?


  • We have an inclusive and collaborative working environment that encourages creativity and curiosity and celebrates success!
  • Dress codes don't apply here; being comfortable does.
  • We provide you with the tools and technology needed to create meaningful customer experiences.
  • Onsite cafeteria for when you work onsite.
  • We offer a competitive total rewards package that includes a base salary, a performance bonus, company matching programs (on pension & profit sharing), generous vacation, personal & sick days, personal development funding, maternity leave top-up, parental leave, and more.
  • Access to thousands of online and in-person courses so you can hone your current skills or learn new ones.


Working Arrangement: Remote / Hybrid


*Some of our perks & onsite offerings will be offline as we continue to monitor federal and provincial regulations around COVID-19.




Location(s):  Canada : British Columbia : Vancouver || Canada : Ontario : Toronto 

Scotiabank is a leading bank in the Americas. Guided by our purpose: "for every future", we help our customers, their families and their communities achieve success through a broad range of advice, products and services, including personal and commercial banking, wealth management and private banking, corporate and investment banking, and capital markets.  

At Scotiabank, we value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. If you require accommodation (including, but not limited to, an accessible interview site, alternate format documents, ASL Interpreter, or Assistive Technology) during the recruitment and selection process, please let our Recruitment team know. If you require technical assistance, please click here. Candidates must apply directly online to be considered for this role. We thank all applicants for their interest in a career at Scotiabank; however, only those candidates who are selected for an interview will be contacted.
