A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing

Research output: Contribution to journal › Article

142 Scopus citations


The present study utilizes a corpus-driven approach to identify the most common multi-word patterns in conversation and academic writing, and to investigate the differing pattern types in the two registers. The paper first surveys the methodological characteristics of corpus-driven research and then contrasts the linguistic characteristics of two types of multi-word sequences: 'multi-word lexical collocations' (combinations of content words) versus 'multi-word formulaic sequences' (incorporating both function words and content words). Building on this background, the primary focus of the paper is an empirical investigation of the 'patterns' represented by multi-word formulaic sequences. It turns out that the multi-word patterns typical of speech are fundamentally different from those typical of academic writing: patterns in conversation tend to be fixed sequences (including both function words and content words). In contrast, most patterns in academic writing are formulaic frames consisting of invariable function words with an intervening variable slot that is filled by content words.
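The distinction the abstract draws between fixed sequences and formulaic frames can be illustrated with a minimal sketch. It is not the study's actual procedure: the function names, the four-word window, the slot position, and the toy text are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n=4):
    """Yield every contiguous n-word sequence in a list of tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def fixed_sequences(tokens, n=4):
    """Count fully fixed n-word sequences (every word is invariable)."""
    return Counter(ngrams(tokens, n))


def frames(tokens, n=4, slot=2):
    """Group n-grams into frames by replacing one position with '*'.

    Returns a mapping from each frame to a Counter of the words
    that fill its variable slot.
    """
    fillers = defaultdict(Counter)
    for gram in ngrams(tokens, n):
        frame = gram[:slot] + ("*",) + gram[slot + 1:]
        fillers[frame][gram[slot]] += 1
    return fillers


# Toy, invented text (hypothetical; not data from the study).
toy = (
    "in the case of the study in the context of the study "
    "in the absence of the study"
).split()

sequence_counts = fixed_sequences(toy)
frame_fillers = frames(toy)

# Words that fill the slot of the frame "in the * of" in the toy text.
print(sorted(frame_fillers[("in", "the", "*", "of")]))
```

In this toy text each fully fixed sequence such as "in the case of" occurs only once, whereas the frame "in the * of" recurs with a different content word in its slot each time. That mirrors the kind of pattern the abstract reports as typical of academic writing.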

Original language: English (US)
Pages (from-to): 275-311
Number of pages: 37
Journal: International Journal of Corpus Linguistics
Issue number: 3
State: Published - 2009



Keywords

  • Conversation versus academic writing
  • Corpus-driven research
  • Lexical bundles
  • Lexical patterns
  • Multi-word formulaic sequences

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
