A methodological synthesis and meta-analysis of judgment tasks in second language research

Luke D Plonsky, Emma Marsden, Dustin Crowther, Patti Spinner

Research output: Contribution to journal › Article

Abstract

Judgment tasks (JTs, often called acceptability or grammaticality judgment tasks) are found extensively throughout the history of second language (L2) research. Data from such instruments have been used to investigate a range of hypotheses and phenomena, from generativist theories to instructional effectiveness. Though popular and convenient, JTs have engendered considerable controversy, with concerns often centered on their construct validity in terms of the type of representations they elicit, such as implicit or explicit knowledge. A number of studies have also examined the impact of JT conditions such as timed vs. untimed, oral vs. written. This article presents a synthesis of the use of JTs and a meta-analysis of the effects of task conditions on learner performance. Following a comprehensive search, 385 JTs were found in 302 individual studies. Each report was coded for features related to study design as well as methodological, procedural, and psychometric properties of the JTs. These data were synthesized in order to understand how this type of instrument has been implemented and reported. In addition to observing a steady increase in the use of JTs over the last four decades, we also found many of the features of JTs, when reported, varied substantially across studies. In terms of the impact of JT design, whereas modality was not found to have a strong or stable effect on learner performance (median d = .14; interquartile range = 1.04), scores on untimed JTs tended to be substantially higher than when timed (d = 1.35; interquartile range = 1.74). In examining these features and their links to findings, this article builds on a growing body of methodological syntheses of L2 research instrumentation and makes a number of empirically grounded recommendations for future studies involving JTs.
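
The summary statistics quoted in the abstract (a median standardized mean difference d with its interquartile range) can be illustrated with a minimal sketch. It assumes the common pooled-standard-deviation form of Cohen's d, which the abstract does not confirm as the authors' exact formula. The function names and all input values are hypothetical and are not data from the study.

```python
import statistics


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference using a pooled standard deviation.

    Assumption: this is the textbook pooled-SD formula, not necessarily
    the exact computation used in the article.
    """
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / pooled_var ** 0.5


def median_and_iqr(effect_sizes):
    """Return (median, interquartile range) for a list of effect sizes."""
    q1, q2, q3 = statistics.quantiles(effect_sizes, n=4, method="inclusive")
    return q2, q3 - q1


# Hypothetical example: untimed vs. timed scores in one made-up study.
d_single = cohens_d(mean_a=12.0, mean_b=10.0, sd_a=2.0, sd_b=2.0, n_a=20, n_b=20)

# Hypothetical set of per-study effect sizes, summarized across studies.
d_values = [0.0, 0.5, 1.0, 1.5, 2.0]
d_median, d_iqr = median_and_iqr(d_values)
print(d_single, d_median, d_iqr)
```

Reporting the median and interquartile range, rather than a single mean, shows how widely the effect varies from study to study. In the abstract, an interquartile range of 1.04 around a median of .14 is what supports the wording "not ... strong or stable".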

Original language: English (US)
Journal: Second Language Research
DOI: 10.1177/0267658319828413
State: Published - Jan 1 2019

Keywords

  • instrumentation
  • judgment task
  • L2 knowledge
  • meta-analysis
  • methodological synthesis

ASJC Scopus subject areas

  • Education
  • Linguistics and Language

Cite this

A methodological synthesis and meta-analysis of judgment tasks in second language research. / Plonsky, Luke D; Marsden, Emma; Crowther, Dustin; Spinner, Patti.

In: Second Language Research, 01.01.2019.

@article{78692d1f858b4f0887c85c17a8090313,
title = "A methodological synthesis and meta-analysis of judgment tasks in second language research",
abstract = "Judgment tasks (JTs, often called acceptability or grammaticality judgment tasks) are found extensively throughout the history of second language (L2) research. Data from such instruments have been used to investigate a range of hypotheses and phenomena, from generativist theories to instructional effectiveness. Though popular and convenient, JTs have engendered considerable controversy, with concerns often centered on their construct validity in terms of the type of representations they elicit, such as implicit or explicit knowledge. A number of studies have also examined the impact of JT conditions such as timed vs. untimed, oral vs. written. This article presents a synthesis of the use of JTs and a meta-analysis of the effects of task conditions on learner performance. Following a comprehensive search, 385 JTs were found in 302 individual studies. Each report was coded for features related to study design as well as methodological, procedural, and psychometric properties of the JTs. These data were synthesized in order to understand how this type of instrument has been implemented and reported. In addition to observing a steady increase in the use of JTs over the last four decades, we also found many of the features of JTs, when reported, varied substantially across studies. In terms of the impact of JT design, whereas modality was not found to have a strong or stable effect on learner performance (median d = .14; interquartile range = 1.04), scores on untimed JTs tended to be substantially higher than when timed (d = 1.35; interquartile range = 1.74). In examining these features and their links to findings, this article builds on a growing body of methodological syntheses of L2 research instrumentation and makes a number of empirically grounded recommendations for future studies involving JTs.",
keywords = "instrumentation, judgment task, L2 knowledge, meta-analysis, methodological synthesis",
author = "Plonsky, Luke D and Marsden, Emma and Crowther, Dustin and Spinner, Patti",
year = "2019",
month = "1",
day = "1",
doi = "10.1177/0267658319828413",
language = "English (US)",
journal = "Second Language Research",
issn = "0267-6583",
publisher = "SAGE Publications Ltd",

}

TY  - JOUR
T1  - A methodological synthesis and meta-analysis of judgment tasks in second language research
AU  - Plonsky, Luke D
AU  - Marsden, Emma
AU  - Crowther, Dustin
AU  - Spinner, Patti
PY  - 2019/1/1
Y1  - 2019/1/1
AB  - Judgment tasks (JTs, often called acceptability or grammaticality judgment tasks) are found extensively throughout the history of second language (L2) research. Data from such instruments have been used to investigate a range of hypotheses and phenomena, from generativist theories to instructional effectiveness. Though popular and convenient, JTs have engendered considerable controversy, with concerns often centered on their construct validity in terms of the type of representations they elicit, such as implicit or explicit knowledge. A number of studies have also examined the impact of JT conditions such as timed vs. untimed, oral vs. written. This article presents a synthesis of the use of JTs and a meta-analysis of the effects of task conditions on learner performance. Following a comprehensive search, 385 JTs were found in 302 individual studies. Each report was coded for features related to study design as well as methodological, procedural, and psychometric properties of the JTs. These data were synthesized in order to understand how this type of instrument has been implemented and reported. In addition to observing a steady increase in the use of JTs over the last four decades, we also found many of the features of JTs, when reported, varied substantially across studies. In terms of the impact of JT design, whereas modality was not found to have a strong or stable effect on learner performance (median d = .14; interquartile range = 1.04), scores on untimed JTs tended to be substantially higher than when timed (d = 1.35; interquartile range = 1.74). In examining these features and their links to findings, this article builds on a growing body of methodological syntheses of L2 research instrumentation and makes a number of empirically grounded recommendations for future studies involving JTs.
KW  - instrumentation
KW  - judgment task
KW  - L2 knowledge
KW  - meta-analysis
KW  - methodological synthesis
UR  - http://www.scopus.com/inward/record.url?scp=85062726663&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85062726663&partnerID=8YFLogxK
U2  - 10.1177/0267658319828413
DO  - 10.1177/0267658319828413
M3  - Article
JO  - Second Language Research
JF  - Second Language Research
SN  - 0267-6583
ER  -