Title: Principal Platform Engineer
Requisition ID: 265477
Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture.
The Principal Platform Engineer will play a critical role within the Enterprise Data & AI Technology organization - one of Scotiabank’s most significant enterprise-wide strategic initiatives. This organization drives data-enabled decision making, AI innovation, and technology modernization across the Bank.
The Principal Platform Engineer will be responsible for defining and owning the technical strategy, architecture, and operational excellence for the Data & AI platform(s) in alignment with the Bank’s Data & AI strategy. This role sets platform standards and guardrails, leads reliability and security improvements, and drives automation and enablement at scale. The Principal Platform Engineer partners across IAM, Network, Cloud Ops, Security, Data Governance, and client delivery teams to influence enterprise roadmaps, manage risk, and deliver new capabilities and modernization initiatives.
Is this role right for you? In this role, you will:
- Platform Strategy and Roadmap: Define the platform technical strategy and multi‑quarter roadmap for Azure & Databricks, aligning to enterprise architecture, security, and data governance standards. Identify capability gaps, prioritize investments, and drive adoption across delivery teams.
- Architecture Ownership and Standards: Own end‑to‑end architecture for Azure identity and access (RBAC, PIM, workload identities), and Databricks governance (Unity Catalog, workspace configuration, cluster policies). Establish reference architectures, design patterns, and reusable blueprints; ensure designs are compliant, resilient, and cost‑effective.
- Operating Model and Governance: Define and evolve the platform operating model (intake, onboarding, support tiers, change management, controls evidence), including SLAs/SLOs and runbooks. Drive consistency across environments and delivery streams.
- Reliability Engineering (SRE): Establish error‑budget aware practices, incident severity models, and resilience engineering (autoscaling, retry/backoff strategies, capacity planning). Lead post‑incident reviews, ensure corrective actions are delivered, and continuously reduce toil and MTTR.
- Observability and Monitoring: Design, build, and standardize observability across Databricks and Azure using Azure Monitor and Log Analytics. Deliver actionable dashboards and alerting for cluster/job health, audit events, performance, and cost insights; enable proactive detection and capacity management.
- Infrastructure as Code and Automation: Design and develop reusable Terraform modules for Azure and Databricks (clusters, SQL warehouses, Unity Catalog objects), enabling consistent, scalable, and automated deployments via Terraform Cloud/Enterprise and CI/CD. Set IaC standards, review practices, and policy-as-code controls.
- Release, Change, and Risk Management: Own the Infrastructure & Platform release and change management approach, including approvals, change windows, automated validations, and rollbacks. Partner with Risk, Security, and Compliance to ensure auditability and control adherence.
- Performance, Troubleshooting, and Cost Optimization: Lead complex troubleshooting across Databricks jobs, clusters, SQL warehouses, and Azure dependencies. Drive performance tuning, capacity planning, and cost optimization (tagging/chargeback, cluster policies, autoscaling, right‑sizing) in partnership with Finance/Cloud Ops.
- Stakeholder Leadership and Enablement: Build strong relationships with platform users and delivery teams. Communicate platform direction, constraints, and best practices; influence cross‑functional stakeholders (Platform, Security, Cloud Ops, Networking, Data Governance) to align priorities and accelerate adoption of standards.
- Security and Secrets Management: Establish secure patterns for secret management using Azure Key Vault and HashiCorp Vault; integrate with Databricks secret scopes and workload identities. Enforce least‑privilege access, credential rotation, and secure-by-default platform configurations.
- Vendor Partnership and Technology Evolution: Partner with Microsoft and Databricks to plan upgrades, troubleshoot complex issues, evaluate new capabilities, and influence product/enterprise roadmaps while maintaining control compliance.
- Technical Leadership and Mentorship: Provide hands‑on technical leadership across squads; mentor engineers on architecture, IaC, CI/CD, incident response, and operational excellence. Raise engineering standards through design reviews, documentation, and continuous improvement.
Do you have the skills that will enable you to succeed in this role? We'd love to work with you if you have:
- Around 10 years of progressive IT experience in large, regulated enterprises operating across multiple geographies.
- 7+ years of hands‑on experience with Microsoft Azure, including architecture and deep expertise in networking, security, identity, storage, compute, and PaaS services.
- 5+ years of hands‑on Databricks on Azure experience (workspaces, jobs/workflows, clusters/SQL warehouses, Unity Catalog governance), including platform standards and guardrails.
- 7+ years using Infrastructure as Code (Terraform modules, Terraform Cloud/Enterprise; working knowledge of ARM/Bicep a plus), including establishing IaC standards and reusable blueprints.
- 7+ years with CI/CD and GitOps practices (Azure DevOps, GitHub Actions), including automated testing, security scanning, policy gates, and release/change controls.
- Strong development and automation skills (Python required; Bash/PowerShell; Go optional) used to build platform tooling, self‑service enablement, and operational automation.
- Proven experience designing secure, enterprise-grade Azure network and identity architectures (VNets, Private Endpoints, NSGs, UDRs, Azure Firewall, RBAC/PIM, workload identity) using zero‑trust principles.
- Deep understanding of data platforms and integration patterns: Azure SQL, Cosmos DB, Databricks Lakehouse (Delta Lake, SQL Warehouses), ADLS Gen2, Event Hubs, and enterprise data governance controls.
- Demonstrated ownership of SRE and incident management practices: SLOs/error budgets, on‑call readiness, major incident leadership, post‑incident reviews, and reliability improvements delivered through measurable outcomes.
- Experience establishing observability standards (Azure Monitor, Log Analytics, dashboards, alerting) and driving performance/cost optimization at scale.
- Strong stakeholder management and cross‑functional leadership skills; able to influence enterprise roadmaps, align priorities, and communicate tradeoffs to technical and non‑technical audiences.
- Bachelor’s degree in Computer Science, Engineering, Mathematics, Management or a related field (or equivalent practical experience).
What's in it for you?
- Diversity, Equity, Inclusion & Allyship - We strive to create an inclusive culture where every employee is empowered to reach their fullest potential, respected for who they are, and embraced through bias-free practices and inclusive values across Scotiabank. We embrace diversity and provide opportunities for all employees to learn, grow & participate through our various Employee Resource Groups (ERGs) that span diverse gender identities, ethnicity, race, age, ability & veterans.
- Accessibility and Workplace Accommodations - We value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. Scotiabank continues to locate, remove and prevent barriers so that we can build a diverse and inclusive environment while meeting accessibility requirements.
- Upskilling through online courses, cross-functional development opportunities, and tuition assistance.
- Competitive Rewards program including bonus, flexible vacation, personal and sick days, and benefits that start on day one.
- Dynamic Ecosystem - Free tea & coffee, universal washrooms, and lots of space for team collaboration.
- Community Engagement - No matter where you choose to work from, we offer opportunities for community engagement & belonging with our various programs.
Location(s): Canada : Ontario : Toronto
Scotiabank is a leading bank in the Americas. Guided by our purpose: "for every future", we help our customers, their families and their communities achieve success through a broad range of advice, products and services, including personal and commercial banking, wealth management and private banking, corporate and investment banking, and capital markets.
At Scotiabank, we value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. If you require accommodation (including, but not limited to, an accessible interview site, alternate format documents, ASL Interpreter, or Assistive Technology) during the recruitment and selection process, please let our Recruitment team know. Candidates must apply directly online to be considered for this role. We thank all applicants for their interest in a career at Scotiabank; however, only those candidates who are selected for an interview will be contacted.
Job Segment:
Change Management, Computer Science, Information Technology, Investment Banking, Risk Management, Management, Finance, Technology