Discover Technata Job board

Find your next tech job in Kanata North, Canada’s largest technology park. Then explore endless international opportunities and dream about where your career will take you. With the country’s largest density of technology companies, ranging from promising startups to leading global giants, Kanata North is the place to be if you are serious about a career in tech.

AI Evaluation Specialist (CS-2)

National Research Council of Canada

Software Engineering, Data Science

CAD 86,503-108,068 / year

Posted on May 23, 2026

Priority may be given to the following designated employment equity groups: women, Indigenous Peoples* (First Nations, Inuit and Métis), persons with disabilities and racialized persons*.

* The Employment Equity Act, which is under review, uses the terminology Aboriginal peoples and visible minorities.

Candidates are asked to self-declare when applying to this hiring process.

City: Ottawa

Organizational Unit: Digital Technologies

Classification: CS-2

Tenure: Continuing

Language Requirements: English

Work arrangements:

Due to the nature of the work and operational requirements, this position may be eligible for a limited hybrid work arrangement (combination of working onsite and telework).

At the NRC, we recognize that Indigenous candidates may have important connections to their communities and you may be eligible for an exception to this work arrangement. Alternative work arrangements may also be considered to accommodate candidates as required. To learn more about these options, please contact the NRC Hiring team using the contact information below.

Discover the possible

Anything is possible at the NRC, named in 2025 one of Canada’s Top Employers for Young People, a National Capital Region Top Employer and one of Forbes Canada’s Best Employers!

As Canada’s largest research and innovation organization, our world-renowned research pushes the boundaries of science and engineering to make the impossible, possible. Every day we explore new ideas through innovative research and help companies discover possibilities that impact Canada’s future and the world.

At the NRC, you’ll also discover new possibilities. Our supportive workplace fosters a culture of creativity, welcoming fresh perspectives and innovation at all levels. We value teamwork. You’ll collaborate across multiple fields and with the brightest minds to find creative solutions. Most importantly, you’ll discover what’s possible within you as you grow, make valuable contributions and progress in your professional journey. From ground-breaking discoveries to a life-changing career, discover your possible at the NRC.

The role

Are you passionate about safe and responsible AI? Do you want to be part of a pioneering team shaping how AI systems are evaluated, measured, and deployed safely? Do you want your work to have a real impact for Canada and Canadians? If so, we want to meet you!

We are looking for a CS-2 AI Evaluation Specialist. Come join our dynamic team at the National Research Council (NRC) as we embark on ground-breaking projects related to AI safety and security. Be part of a pioneering team that will establish a new AI Safety Lab for conducting system evaluations and translating technical findings into practical solutions and recommendations for AI practitioners and policy makers.

Under the guidance of the AI Safety Lab Lead, you will play a critical role in building, maintaining and using a new infrastructure for AI safety evaluation. You will also set up, run, and monitor advanced agentic evaluation at scale on frontier AI models, while contributing to cutting-edge AI safety research conducted by the NRC in collaboration with national and international partners.

Key Responsibilities include:

Build, maintain, and use a state-of-the-art model evaluation infrastructure combining on-premises and cloud-based clusters.
Support AI safety researchers leveraging the evaluation infrastructure.
Set up, run and monitor large-scale evaluations involving multiple AI models, agents and benchmarks.
Steward and document protocols for model evaluation and experiment tracking.
Contribute to developing and maintaining customized evaluation benchmarks.
Contribute to the analysis and reporting of model evaluation results.
Stay current with the latest advancements in model evaluation tools, approaches, and packages.

Screening criteria

Applicants must demonstrate within the content of their application that they meet the following screening criteria in order to be given further consideration as candidates:

Education

University degree in computer science, computer engineering, or related field.

Education Assets

A specialization in data analytics, machine learning, or Artificial Intelligence (AI) may be considered an asset.
A Master’s degree in a field related to the position may be considered an asset.

Equivalency

A college diploma with significant experience directly related to the duties of this position may be considered as an educational equivalent.

For information on certificates and diplomas issued abroad, please see Degree equivalency

Experience

You must meet the essential experience criteria and at least two of the Specialised Technical Experience criteria to be considered in this process.

Essential

Experience developing Python-based tooling and automation for ML/AI workflows in a research or production environment.
Experience with the programmatic use and orchestration of large language models (both closed-source and open-source).
Experience with containerization technologies (such as Docker, Podman, or Apptainer).
Experience with high-performance computing (HPC) clusters and job orchestration (e.g., SLURM).
Experience documenting experimental AI results through technical reports, internal documentation, or scientific publications.

Specialised Technical Experience

Experience setting up, running, or contributing to evaluations or benchmarking of AI systems.
Experience with red-teaming workflows, adversarial testing, or stress-testing of AI systems (including agentic systems).
Experience with AI/ML experiment tracking tools (e.g., MLflow, Weights & Biases).
Experience developing or curating AI benchmark datasets and associated documentation (e.g., through HuggingFace).

Condition of employment

Reliability Status

For a Reliability Status, verification of background information over a period of 5 years is required.

Language requirements

English

Information on language requirements and self-assessment tests

Assessment criteria

Candidates will be assessed on the basis of the following criteria:

Technical competencies

You must meet all of the essential competencies and at least one of the Specialised Technical Competency criteria to move forward in this process.

Essential Technical Competencies

Ability to design and implement reproducible AI workflows using Python.
Ability to make sound technical decisions when using different models, accounting for factors such as API constraints, compute budgets, reproducibility, and result validity.
Ability to diagnose and resolve infrastructure issues in HPC or cloud-based evaluation environments, including resource contention, job failures, and performance bottlenecks.
Knowledge of AI safety evaluation frameworks (e.g., Inspect AI, Moonshot) and their application to frontier model assessment.
Knowledge of complex data storage and querying tools for handling large volumes of evaluation data.
Knowledge of software engineering best practices, including version control, CI/CD pipelines, and automated testing, and of related modern software development tools (e.g., Git, GitLab/GitHub).

Specialised Technical Competencies

Knowledge of ML experiment tracking tools (e.g., MLflow, Weights & Biases) and their integration into reproducible research workflows.
Knowledge of agentic AI frameworks and protocols.
Familiarity with AI safety concepts, risk taxonomies, or governance frameworks (e.g., TBS guidelines, NIST AI Risk Management Framework, model evaluation practices).

Behavioural competencies

Research Technician/Technologist - Client focus (Level 2)
Research Technician/Technologist - Communication (Level 2)
Research Technician/Technologist - Results orientation (Level 2)
Research Technician/Technologist - Self-knowing and self-development (Level 2)
Research Technician/Technologist - Teamwork (Level 2)
Management services - Conceptual and analytical ability (Level 2)

Competency Profile(s)

For this position, the NRC will evaluate candidates using the following competency profile(s): Management Services; Research Technician/Technologist

View all competency profiles

Compensation

From $86,503 to $108,068 per annum.

NRC employees enjoy a wide range of competitive benefits including a robust pension plan, comprehensive health and dental coverage, disability and life insurance, office closure at the end of December, and additional supports to enhance your well-being throughout your career and beyond.

Notes

In 2025, the NRC was chosen as one of Canada’s Top Employers for Young People, a National Capital Region Top Employer and one of Forbes Canada’s Best Employers.
Relocation assistance will be determined in accordance with the NRC's directives.
A pre-qualified list may be established for similar positions for a one-year period.
Preference will be given to Canadian Citizens and Permanent Residents of Canada. Please include citizenship information in your application.
The incumbent must adhere to safe workplace practices at all times.
We thank all those who apply; however, only those selected for further consideration will be contacted.

Please direct your questions, with the requisition number (25418), to:

E-mail: NRC.NRCHiring-EmbaucheCNRC.CNRC@nrc-cnrc.gc.ca

Telephone: 343-990-6649

Closing Date: 2 June 2026 - 23:59 Eastern Time

For more information on career tools and other resources, check out Career tools and resources

*If you are currently a term or continuing employee at NRC, please apply through the SuccessFactors Careers module from your NRC computer.
