The roles of suprasegmental features in predicting English oral proficiency with an automated system

Okim Kang, David Johnson

Research output: Contribution to journal › Article

Abstract

Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically score the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups (prominence, filled pause, speech rate, and intonation) to calculate the proficiency scores. For monologue test responses from 120 non-native speakers of English, drawn from the Cambridge English Language Assessment (CELA), the Pearson correlation between the computer’s calculated proficiency levels and the official CELA proficiency levels was 0.718. The current findings provide empirical evidence that prominence and intonation are salient features in the computer model’s prediction of proficiency.

Original language: English (US)
Pages (from-to): 1-19
Number of pages: 19
Journal: Language Assessment Quarterly
DOI: 10.1080/15434303.2018.1451531
State: Accepted/In press - Mar 23 2018

Fingerprint

English language
official language
learning
evidence
Proficiency
Suprasegmental Features
Group
Language Assessment
Intonation

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

The roles of suprasegmental features in predicting English oral proficiency with an automated system. / Kang, Okim; Johnson, David.

In: Language Assessment Quarterly, 23.03.2018, p. 1-19.

@article{a17561ec39e34390ba2e573fc8ccf164,
title = "The roles of suprasegmental features in predicting English oral proficiency with an automated system",
abstract = "Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically scores the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups (prominence, filled pause, speech rate, and intonation) to calculate the proficiency scores. In test responses from 120 non-native speakers of English monologues from the Cambridge English Language Assessment (CELA), the Pearson’s correlation between the computer’s calculated proficiency levels and the official CELA proficiency levels was 0.718. The current findings provide empirical evidence that prominence and intonation are salient features in the computer model’s prediction of proficiency.",
author = "Okim Kang and David Johnson",
year = "2018",
month = "3",
day = "23",
doi = "10.1080/15434303.2018.1451531",
language = "English (US)",
pages = "1--19",
journal = "Language Assessment Quarterly",
issn = "1543-4303",
publisher = "Routledge",

}

TY - JOUR
T1 - The roles of suprasegmental features in predicting English oral proficiency with an automated system
AU - Kang, Okim
AU - Johnson, David
PY - 2018/3/23
Y1 - 2018/3/23

N2 - Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically scores the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups (prominence, filled pause, speech rate, and intonation) to calculate the proficiency scores. In test responses from 120 non-native speakers of English monologues from the Cambridge English Language Assessment (CELA), the Pearson’s correlation between the computer’s calculated proficiency levels and the official CELA proficiency levels was 0.718. The current findings provide empirical evidence that prominence and intonation are salient features in the computer model’s prediction of proficiency.

AB - Suprasegmental features have received growing attention in the field of oral assessment. In this article we describe a set of computer algorithms that automatically scores the oral proficiency of non-native speakers using unconstrained English speech. The algorithms employ machine learning and 11 suprasegmental measures divided into four groups (prominence, filled pause, speech rate, and intonation) to calculate the proficiency scores. In test responses from 120 non-native speakers of English monologues from the Cambridge English Language Assessment (CELA), the Pearson’s correlation between the computer’s calculated proficiency levels and the official CELA proficiency levels was 0.718. The current findings provide empirical evidence that prominence and intonation are salient features in the computer model’s prediction of proficiency.

UR - http://www.scopus.com/inward/record.url?scp=85044313318&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85044313318&partnerID=8YFLogxK
U2 - 10.1080/15434303.2018.1451531
DO - 10.1080/15434303.2018.1451531
M3 - Article
SP - 1
EP - 19
JO - Language Assessment Quarterly
JF - Language Assessment Quarterly
SN - 1543-4303
ER -