Do all roads lead to Rome? Modeling register variation with factor analysis and discriminant analysis

Jesse Egbert, Douglas E Biber

Research output: Contribution to journal (Review article)

5 Citations (Scopus)

Abstract

Previous theoretical and empirical research on register variation has argued that linguistic co-occurrence patterns have a highly systematic relationship to register differences, because they both share the same functional underpinnings. The goal of this study is to test this claim through a comparison of two statistical techniques that have been used to describe register variation: factor analysis (as used in Multi-Dimensional analysis, MDA) and canonical discriminant analysis (CDA). MDA and CDA have different statistical bases and thus give priority to different analytical considerations: linguistic co-occurrence in the case of MDA and the prediction of register differences in the case of CDA. Thus, there is no statistical reason to expect that the two techniques, if applied to the same corpus, will produce similar results. We hypothesize that although MDA and CDA approach register variation from opposite sides, they will produce similar results because both types of statistical patterns are motivated by underlying discourse functions. The present paper tests this claim through a case-study analysis of variation among web registers, applying MDA and CDA to analyze register variation in the same corpus of texts.

Original language: English (US)
Pages (from-to): 233-273
Number of pages: 41
Journal: Corpus Linguistics and Linguistic Theory
Volume: 14
Issue number: 2
DOI: 10.1515/cllt-2016-0016
State: Published - Sep 25 2018

Keywords

  • discriminant analysis
  • factor analysis
  • linguistic co-occurrence
  • multi-dimensional analysis
  • register variation
  • text classification
  • web registers

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Do all roads lead to Rome? Modeling register variation with factor analysis and discriminant analysis. / Egbert, Jesse; Biber, Douglas E.

In: Corpus Linguistics and Linguistic Theory, Vol. 14, No. 2, 25.09.2018, p. 233-273.

@article{7057c19a60384c4c98a50792bce83ebb,
title = "Do all roads lead to Rome? Modeling register variation with factor analysis and discriminant analysis",
abstract = "Previous theoretical and empirical research on register variation has argued that linguistic co-occurrence patterns have a highly systematic relationship to register differences, because they both share the same functional underpinnings. The goal of this study is to test this claim through a comparison of two statistical techniques that have been used to describe register variation: factor analysis (as used in Multi-Dimensional analysis, MDA) and canonical discriminant analysis (CDA). MDA and CDA have different statistical bases and thus give priority to different analytical considerations: linguistic co-occurrence in the case of MDA and the prediction of register differences in the case of CDA. Thus, there is no statistical reason to expect that the two techniques, if applied to the same corpus, will produce similar results. We hypothesize that although MDA and CDA approach register variation from opposite sides, they will produce similar results because both types of statistical patterns are motivated by underlying discourse functions. The present paper tests this claim through a case-study analysis of variation among web registers, applying MDA and CDA to analyze register variation in the same corpus of texts.",
keywords = "discriminant analysis, factor analysis, linguistic co-occurrence, multi-dimensional analysis, register variation, text classification, web registers",
author = "Egbert, Jesse and Biber, Douglas E",
year = "2018",
month = "9",
day = "25",
doi = "10.1515/cllt-2016-0016",
language = "English (US)",
volume = "14",
pages = "233--273",
journal = "Corpus Linguistics and Linguistic Theory",
issn = "1613-7027",
publisher = "Walter de Gruyter GmbH \& Co. KG",
number = "2",

}

TY  - JOUR
T1  - Do all roads lead to Rome?
T2  - Modeling register variation with factor analysis and discriminant analysis
AU  - Egbert, Jesse
AU  - Biber, Douglas E
PY  - 2018/9/25
Y1  - 2018/9/25

AB - Previous theoretical and empirical research on register variation has argued that linguistic co-occurrence patterns have a highly systematic relationship to register differences, because they both share the same functional underpinnings. The goal of this study is to test this claim through a comparison of two statistical techniques that have been used to describe register variation: factor analysis (as used in Multi-Dimensional analysis, MDA) and canonical discriminant analysis (CDA). MDA and CDA have different statistical bases and thus give priority to different analytical considerations: linguistic co-occurrence in the case of MDA and the prediction of register differences in the case of CDA. Thus, there is no statistical reason to expect that the two techniques, if applied to the same corpus, will produce similar results. We hypothesize that although MDA and CDA approach register variation from opposite sides, they will produce similar results because both types of statistical patterns are motivated by underlying discourse functions. The present paper tests this claim through a case-study analysis of variation among web registers, applying MDA and CDA to analyze register variation in the same corpus of texts.

KW  - discriminant analysis
KW  - factor analysis
KW  - linguistic co-occurrence
KW  - multi-dimensional analysis
KW  - register variation
KW  - text classification
KW  - web registers
UR  - http://www.scopus.com/inward/record.url?scp=85040552442&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85040552442&partnerID=8YFLogxK
U2  - 10.1515/cllt-2016-0016
DO  - 10.1515/cllt-2016-0016
M3  - Review article
AN  - SCOPUS:85040552442
VL  - 14
SP  - 233
EP  - 273
JO  - Corpus Linguistics and Linguistic Theory
JF  - Corpus Linguistics and Linguistic Theory
SN  - 1613-7027
IS  - 2
ER  -