Incorporating text dispersion into keyword analyses

Jesse Egbert, Doug Biber

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Keyword analysis has become an indispensable tool for discourse analysts, being applied to identify the words that are especially characteristic of the texts in a target discourse domain. But, surprisingly, the statistical computation of keyness makes no reference to those texts. Rather, once a corpus has been constructed, it is treated as a homogeneous whole for the computation of keyness. As a result, the keywords in such lists are relatively frequent in the corpus, but they are often not widely dispersed across the texts of that corpus and are thus not truly representative of the target discourse domain. The purpose of this study is to propose a new method for keyword analysis - text dispersion keyness - that is based on text dispersion, rather than corpus frequency. We compare the effectiveness of this measure to four other methods for computing keyness, carrying out a series of case studies to identify the keywords that are typical of online travel blogs. A variety of quantitative and qualitative analyses are carried out to compare these methods based on their content-generalisability and content-distinctiveness, demonstrating that text dispersion keyness is a superior measure for generating keyword lists.
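The contrast the abstract draws between corpus frequency and text dispersion can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the keyness statistic is taken to be a log-likelihood ratio (G2) over a 2x2 table of texts that do or do not contain a word, tokenisation is plain whitespace splitting, and the function names (`text_counts`, `g2`, `text_dispersion_keyness`) are hypothetical. It is not the authors' implementation.

```python
import math
from collections import Counter


def text_counts(texts):
    """For each word, count the number of texts containing it at least once."""
    counts = Counter()
    for text in texts:
        # set() ensures repeated uses inside one text count only once
        counts.update(set(text.lower().split()))
    return counts


def g2(a, b, n_target, n_ref):
    """Log-likelihood ratio for a 2x2 table (assumed statistic).

    a, b: number of target / reference texts containing the word.
    n_target, n_ref: total number of texts in each corpus (both > 0).
    """
    total = n_target + n_ref
    with_word = a + b
    without_word = total - with_word
    observed = [a, n_target - a, b, n_ref - b]
    expected = [
        n_target * with_word / total,
        n_target * without_word / total,
        n_ref * with_word / total,
        n_ref * without_word / total,
    ]
    # Cells with an observed count of zero contribute nothing to the sum
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def text_dispersion_keyness(target_texts, reference_texts):
    """Rank words found in a larger share of target texts than reference texts."""
    target = text_counts(target_texts)
    reference = text_counts(reference_texts)
    n_target, n_ref = len(target_texts), len(reference_texts)
    scores = {}
    for word, a in target.items():
        b = reference.get(word, 0)
        if a / n_target > b / n_ref:
            scores[word] = g2(a, b, n_target, n_ref)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only
blogs = ["we visited the hotel", "the hotel was nice", "our trip to the beach hotel"]
news = ["the market fell", "the court ruled", "the hotel hotel hotel hotel hotel hotel"]
ranked = text_dispersion_keyness(blogs, news)
```

In this toy data, "hotel" occurs six times in a single reference text but counts as only one text there, so its score reflects how widely it is spread across texts rather than its raw frequency.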

Original language: English (US)
Pages (from-to): 77-104
Number of pages: 28
Journal: Corpora
Volume: 14
Issue number: 1
State: Published - 2018

Keywords

  • Analysis
  • Distinctiveness
  • Generalisability
  • Lexical dispersion
  • Word importance

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
