Assessing AI capabilities with education tests

Mila Staneva; Abel Baret; Ángel Aso-Mollar; Joseph Blass; Salvador Carrión Ponz; Vincent Conitzer; Ulises Cortes; Pradeep Dasigi; Angel de Paula; Carlos Galindo; Janice Gobert; Jordi Gonzàlez; Fredrik Heintz; Jim Hendler; Daniel Hendrycks; Lawrence Hunter; Juan Izquierdo-Domenech; Maria Juarez; Aina Juraco Frias; Aviv Keren; Rik Koncel-Kedziorski; David Leake; Bao Sheng Loe; Fernando Martinez-Plumed; Aqueasha Martin-Hammond; Cynthia Matuszek; Antoni Mestre Gascón; Jose Andres Moreno; Constantine Nakos; Taylor Olson; Carolyn Rose; Areg Mikael Sarvazyan; Brian Scassellati; Wout Schellaert; Claes Strannegård; Neset Tan; Tadahiro Taniguchi; Karina Vold; Michael Wooldridge

doi:10.1787/bbdeb1e0-en

Book chapter

Open access

AI and the Future of Skills, Volume 2, pp.40-64

Educational Research and Innovation, OECD Publishing

11/16/2023

Files and links (1)

https://doi.org/10.1787/bbdeb1e0-en

Published (Version of record). See pages 40-64. Open Access.

Abstract

This chapter introduces three exploratory studies that assessed the capabilities of artificial intelligence (AI) through standardised education tests designed for humans. The first two studies, conducted in 2016 and 2021/22, asked experts to evaluate AI’s performance on the literacy and numeracy tests of the OECD’s Survey of Adult Skills (PIAAC). The third study collected expert judgements of whether AI can solve science questions from the OECD’s Programme for International Student Assessment (PISA). The studies aimed to refine the assessment framework for eliciting expert knowledge on AI using established educational assessments. They explored different test formats, response methodologies and rating instructions, along with two distinct assessment approaches. A “behavioural approach” used in the PIAAC studies emphasised smaller expert groups engaging in discussions, and a “mathematical approach” adopted in the PISA study relied more heavily on quantitative data from a larger expert pool. This chapter presents the results of the studies and discusses the advantages and disadvantages of their methodological approaches.

Details

Title: Assessing AI capabilities with education tests
Creators: Mila Staneva - Organisation de Coopération et de Développement Economiques
Abel Baret - Organisation de Coopération et de Développement Economiques
Ángel Aso-Mollar - Universitat Politècnica de València
Joseph Blass
Salvador Carrión Ponz
Vincent Conitzer - Carnegie Mellon University
Ulises Cortes - Universitat Politècnica de Catalunya
Pradeep Dasigi - Allen Institute
Angel de Paula - Universitat Politècnica de València
Carlos Galindo - Universitat Politècnica de València
Janice Gobert - Rutgers University
Jordi Gonzàlez - Universitat Autònoma de Barcelona
Fredrik Heintz - Linköping University
Jim Hendler
Daniel Hendrycks
Lawrence Hunter - University of Colorado Anschutz Medical Campus
Juan Izquierdo-Domenech - Universitat Politècnica de València
Maria Juarez
Aina Juraco Frias
Aviv Keren
Rik Koncel-Kedziorski
David Leake - Indiana University
Bao Sheng Loe - University of Cambridge
Fernando Martinez-Plumed - Universitat Politècnica de València
Aqueasha Martin-Hammond - Indiana University
Cynthia Matuszek - University of Maryland, College Park
Antoni Mestre Gascón
Jose Andres Moreno - Universitat Politècnica de València
Constantine Nakos
Taylor Olson
Carolyn Rose - Carnegie Mellon University
Areg Mikael Sarvazyan - Universitat Politècnica de València
Brian Scassellati - Yale University
Wout Schellaert - Universitat Politècnica de València
Claes Strannegård - Chalmers University of Technology
Neset Tan - University of Auckland
Tadahiro Taniguchi - Panasonic (Poland)
Karina Vold - University of Toronto
Michael Wooldridge - University of Oxford
Resource Type: Book chapter
Publication Details: AI and the Future of Skills, Volume 2, pp.40-64
Series: Educational Research and Innovation
DOI: 10.1787/bbdeb1e0-en
eISSN: 2076-9679
ISSN: 2076-9660
Publisher: OECD Publishing; Paris
Number of pages: 25
Language: English
Date published: 11/16/2023
Academic Unit: Computer Science
Record Identifier: 9984958640502771
