A corpus-driven approach to formulaic language in English

Multi-word patterns in speech and writing

Research output: Contribution to journal (Article)

135 Citations (Scopus)

Abstract

The present study utilizes a corpus-driven approach to identify the most common multi-word patterns in conversation and academic writing, and to investigate the differing pattern types in the two registers. The paper first surveys the methodological characteristics of corpus-driven research and then contrasts the linguistic characteristics of two types of multi-word sequences: 'multi-word lexical collocations' (combinations of content words) versus 'multi-word formulaic sequences' (incorporating both function words and content words). Building on this background, the primary focus of the paper is an empirical investigation of the 'patterns' represented by multi-word formulaic sequences. It turns out that the multi-word patterns typical of speech are fundamentally different from those typical of academic writing: patterns in conversation tend to be fixed sequences (including both function words and content words). In contrast, most patterns in academic writing are formulaic frames consisting of invariable function words with an intervening variable slot that is filled by content words.

Original language: English (US)
Pages (from-to): 275-311
Number of pages: 37
Journal: International Journal of Corpus Linguistics
Volume: 14
Issue number: 3
DOI: 10.1075/ijcl.14.3.08bib
State: Published - 2009

Keywords

  • Conversation versus academic writing
  • Corpus-driven research
  • Lexical bundles
  • Lexical patterns
  • Multi-word formulaic sequences

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

@article{d1277c7a7fb948d38114aba414cfd7f3,
title = "A corpus-driven approach to formulaic language in {E}nglish: Multi-word patterns in speech and writing",
abstract = "The present study utilizes a corpus-driven approach to identify the most common multi-word patterns in conversation and academic writing, and to investigate the differing pattern types in the two registers. The paper first surveys the methodological characteristics of corpus-driven research and then contrasts the linguistic characteristics of two types of multi-word sequences: 'multi-word lexical collocations' (combinations of content words) versus 'multi-word formulaic sequences' (incorporating both function words and content words). Building on this background, the primary focus of the paper is an empirical investigation of the 'patterns' represented by multi-word formulaic sequences. It turns out that the multi-word patterns typical of speech are fundamentally different from those typical of academic writing: patterns in conversation tend to be fixed sequences (including both function words and content words). In contrast, most patterns in academic writing are formulaic frames consisting of invariable function words with an intervening variable slot that is filled by content words.",
keywords = "Conversation versus academic writing, Corpus-driven research, Lexical bundles, Lexical patterns, Multi-word formulaic sequences",
author = "Biber, {Douglas E}",
year = "2009",
doi = "10.1075/ijcl.14.3.08bib",
language = "English (US)",
volume = "14",
pages = "275--311",
journal = "International Journal of Corpus Linguistics",
issn = "1384-6655",
publisher = "John Benjamins Publishing Company",
number = "3",
}

TY - JOUR

T1 - A corpus-driven approach to formulaic language in English

T2 - Multi-word patterns in speech and writing

AU - Biber, Douglas E

PY - 2009

Y1 - 2009

AB - The present study utilizes a corpus-driven approach to identify the most common multi-word patterns in conversation and academic writing, and to investigate the differing pattern types in the two registers. The paper first surveys the methodological characteristics of corpus-driven research and then contrasts the linguistic characteristics of two types of multi-word sequences: 'multi-word lexical collocations' (combinations of content words) versus 'multi-word formulaic sequences' (incorporating both function words and content words). Building on this background, the primary focus of the paper is an empirical investigation of the 'patterns' represented by multi-word formulaic sequences. It turns out that the multi-word patterns typical of speech are fundamentally different from those typical of academic writing: patterns in conversation tend to be fixed sequences (including both function words and content words). In contrast, most patterns in academic writing are formulaic frames consisting of invariable function words with an intervening variable slot that is filled by content words.

KW - Conversation versus academic writing

KW - Corpus-driven research

KW - Lexical bundles

KW - Lexical patterns

KW - Multi-word formulaic sequences

UR - http://www.scopus.com/inward/record.url?scp=70349799309&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349799309&partnerID=8YFLogxK

U2 - 10.1075/ijcl.14.3.08bib

DO - 10.1075/ijcl.14.3.08bib

M3 - Article

VL - 14

SP - 275

EP - 311

JO - International Journal of Corpus Linguistics

JF - International Journal of Corpus Linguistics

SN - 1384-6655

IS - 3

ER -