Automatic detection of Brazil’s prosodic tone unit

David O. Johnson, Okim Kang

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

This research focuses on the automatic detection of the tone unit, one of the fundamental elements of Brazil’s prosody model. We compared two ways of delimiting tone units: using silent pause duration alone, and using relative pitch resets and slow pace (or post-boundary lengthening) along with silent pause duration. The corpus used for the comparison consists of academic lectures given by 18 highly proficient speakers in six varieties of English, which represent the inner (American and British), outer (Indian and South African), and expanding (Chinese and Spanish) concentric circles of Kachru’s World Englishes. Performance was measured by computing Pearson’s correlation between the numbers of tone units in a trained linguist’s transcription of the corpus and the numbers automatically detected by the computer. The computer detected the tone units from phone sequences identified in the audio files by a large vocabulary spontaneous speech recognition (LVCSR) program. We found that adding relative pitch resets and slow pace to silent pause duration in the computer algorithm improved this correlation from 0.935 to 0.959.
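The evaluation described in the abstract can be illustrated with a minimal sketch. It assumes a pause-only baseline in which every sufficiently long silence closes a tone unit, and it compares automatic counts with reference counts using Pearson's correlation. The function names, the "sil" label, the 0.2 s threshold, and the phone tuples are hypothetical choices made for illustration; they are not taken from the paper, and the pitch-reset and slow-pace cues are not modeled here.

```python
from math import sqrt


def count_tone_units(phones, min_pause=0.2):
    """Count tone units, treating each long silence as a boundary.

    `phones` is a list of (label, start_sec, end_sec) tuples. The "sil"
    label and the default threshold are illustrative assumptions.
    """
    units = 0
    in_unit = False
    for label, start, end in phones:
        is_boundary = label == "sil" and (end - start) >= min_pause
        if is_boundary:
            # A long pause closes the current tone unit.
            in_unit = False
        elif label != "sil" and not in_unit:
            # The first speech phone after a boundary opens a new unit.
            units += 1
            in_unit = True
        # Short silences fall through and leave the unit open.
    return units


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical phone sequence: one short and one long silence.
example_phones = [
    ("dh", 0.0, 0.1),
    ("ax", 0.1, 0.2),
    ("sil", 0.2, 0.25),  # too short to count as a boundary
    ("k", 0.25, 0.4),
    ("sil", 0.4, 0.9),   # long pause: tone unit boundary
    ("t", 0.9, 1.0),
]
```

In use, `count_tone_units` would be applied to each speaker's recognized phone sequence, and `pearson_r` would then be computed over the per-speaker automatic counts and the corresponding counts from the linguist's transcription.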

Original language: English (US)
Pages (from-to): 287-291
Number of pages: 5
Journal: Unknown Journal
Volume: 2016-January
State: Published - 2016

Keywords

  • Automatic speech recognition (ASR)
  • Brazil’s prosody model
  • Large vocabulary spontaneous speech recognition (LVCSR)
  • Tone unit
  • World Englishes

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Automatic detection of Brazil’s prosodic tone unit. / Johnson, David O.; Kang, Okim.

In: Unknown Journal, Vol. 2016-January, 2016, p. 287-291.

Johnson, DO & Kang, O 2016, 'Automatic detection of Brazil’s prosodic tone unit', Unknown Journal, vol. 2016-January, pp. 287-291.
@article{2aebfd58ae164285adda41352e33c4c3,
title = "Automatic detection of Brazil’s prosodic tone unit",
abstract = "This research is focused on the automatic detection of one of the fundamental elements of Brazil’s prosody model, the tone unit. We compared the performance of using silent pause duration alone to delimit tone units and using relative pitch resets and slow pace (or post-boundary lengthening) along with silent pause duration to delimit them. The corpus used for the comparison is composed of 18 highly proficient speakers giving academic lectures in six varieties of English which are representative of the inner (American and British), outer (Indian and South African), and expanding (Chinese and Spanish) concentric circles of Kachru’s World Englishes. The performance was compared by computing Pearson’s correlation between the numbers of tone units in a trained linguist’s transcription of the corpus and the numbers automatically detected by the computer. The computer detected the tone units from phone sequences identified in the audio files by a large vocabulary spontaneous speech recognition (LVCSR) program. We found including relative pitch resets and slow pace along with silent pause duration in the computer algorithm improved the correlation between the numbers of tone units in the linguist’s transcription of the corpus and the numbers automatically detected by the computer from 0.935 to 0.959.",
keywords = "Automatic speech recognition (ASR), Brazil’s prosody model, Large vocabulary spontaneous speech recognition (LVCSR), Tone unit, World Englishes",
author = "Johnson, David O. and Kang, Okim",
year = "2016",
language = "English (US)",
volume = "2016-January",
pages = "287--291",
journal = "Unknown Journal",

}

TY - JOUR

T1 - Automatic detection of Brazil’s prosodic tone unit

AU - Johnson, David O.

AU - Kang, Okim

PY - 2016

Y1 - 2016

AB - This research is focused on the automatic detection of one of the fundamental elements of Brazil’s prosody model, the tone unit. We compared the performance of using silent pause duration alone to delimit tone units and using relative pitch resets and slow pace (or post-boundary lengthening) along with silent pause duration to delimit them. The corpus used for the comparison is composed of 18 highly proficient speakers giving academic lectures in six varieties of English which are representative of the inner (American and British), outer (Indian and South African), and expanding (Chinese and Spanish) concentric circles of Kachru’s World Englishes. The performance was compared by computing Pearson’s correlation between the numbers of tone units in a trained linguist’s transcription of the corpus and the numbers automatically detected by the computer. The computer detected the tone units from phone sequences identified in the audio files by a large vocabulary spontaneous speech recognition (LVCSR) program. We found including relative pitch resets and slow pace along with silent pause duration in the computer algorithm improved the correlation between the numbers of tone units in the linguist’s transcription of the corpus and the numbers automatically detected by the computer from 0.935 to 0.959.

KW - Automatic speech recognition (ASR)

KW - Brazil’s prosody model

KW - Large vocabulary spontaneous speech recognition (LVCSR)

KW - Tone unit

KW - World Englishes

UR - http://www.scopus.com/inward/record.url?scp=84983003911&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84983003911&partnerID=8YFLogxK

M3 - Article

VL - 2016-January

SP - 287

EP - 291

JO - Unknown Journal

JF - Unknown Journal

ER -