Data-based Grammar


Chapter 1: Introduction: A discourse perspective on grammar

In the first 2 paragraphs of the book on page 2, the authors tell us about their basic approach to grammar. These 2 paragraphs are from their hearts as well as their minds. They really care about the topic and really care that we understand what they are doing in this book. This is not casual stuff. It's not just because a book has to have an introduction. So, let me read those 2 paragraphs to you and then make a list of the important words and phrases that we're going to see a lot this semester.

1.1 Introduction

Every time we write or speak, we are faced with a large array of choices: not only choices of what to say but of how to say it. The vocabulary and grammar that we use to communicate are influenced by a number of factors, such as the reason for the communication, the setting, the people we are addressing, and whether we are speaking or writing. Taken together, these choices give rise to systematic patterns of choice in the use of English grammar.

Traditionally, such patterns have not been included as part of grammar. Most grammars have focused on structure, describing the form and (sometimes) meaning of grammatical constructions out of context. They have not described how forms and meanings are actually used in spoken and written discourse. But for someone learning about the English language for the purposes of communication, it is the real use of the language that is important. It is not enough to study just the grammatical forms, structures, and classes. These tell us what choices are available in the grammar, but we also need to understand how these choices are used to create discourse in different situations.

Their Corpus

There's no end to the making of corpora. Many different individuals and groups have created collections of language to use for different purposes. The information on pages 7-9 is given in the context of this "making of corpora." The authors of our book want us to understand just how careful and thoughtful they have been in building the corpus that's used in this book. Here are the points they want us to understand:

1. The corpus is big: 40,000,000 words. Now, that's not the biggest, because a few corpora have hundreds of millions of words. But still, it's a good, generous size to show patterns of use in vocabulary and grammar.

2. The corpus is fairly well balanced. It contains British and American English. It includes conversation produced by women as well as by men. The academic text includes selections from books as well as from journal articles. The newspaper text includes samples of different types of news writing.

Building a corpus is a huge task, demanding time and labor by a large staff and thus demanding substantial funding. This particular corpus was funded by Pearson Education through their Longman division. So, the corpus was built for the use of the authors in developing the Longman grammars. In addition, the authors continue to use the corpus for other publications.

No corpus is the whole of English. No matter how many words it contains, a corpus will not include everything. But a really big corpus with carefully chosen materials will show the core uses and the most frequent vocabulary and grammar patterns.

Reading & Understanding Their Data

They tell us repeatedly that we have to learn to read the data in the Figures and Tables. At the end of Chapter 1, they talk about "visible frequency." The "visible" part means that instead of just giving a prose passage to describe the information, they put the data into a graph or table to try to help us "see" the patterns by seeing the numbers. The "frequency" part is really important to notice: Studies of "language in use" and "patterns of choice" look at how many times a word or phrase or grammar type occurs in a corpus. The basic data is frequency data. They ask us to please look ahead to Figure 2.1. So, let's do that. Here's the figure. It's also on page 23 in the book.

What frequency information are they making visible?

Questions to answer about Figure 2.1

1. What's the purpose of the figure? What information is being presented?

2. What "lexical word classes" are the focus?

3. What does the scale down the left represent? What are those numbers?

4. What registers are used? We know to expect conversation, fiction, news, and academic writing.

5. What does the data tell us about differences among the 4 registers?

Click here to see my answers after you've figured out yours!
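
If you'd like to see where frequency numbers like these come from, here's a minimal sketch in Python. It counts how many times one word occurs in a sample and then scales the count to "per million words," so that samples of different sizes can be compared fairly. Please note the assumptions: the two mini-samples are invented by me and are not from the Longman corpus, the function name per_million is just a label I made up, and the per-million scale is a common convention that I'm assuming here, so check the label on the figure itself.

```python
from collections import Counter
import re

def per_million(text, word):
    """Return how often `word` occurs per million words of `text`."""
    # Very rough tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    count = Counter(tokens)[word.lower()]
    # Scale the raw count so samples of different sizes are comparable.
    return count / len(tokens) * 1_000_000

# Invented mini-samples standing in for two registers (not real corpus data).
samples = {
    "conversation": "well I know you know it but I think I know it too",
    "academic": "the results of the analysis indicate that the effect is small",
}

for register, text in samples.items():
    print(register, round(per_million(text, "know")))
```

A real study would use millions of words per register, not one sentence, but the idea is the same: count, then normalize, then compare across registers.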

Basic Words and Phrases

Term: Information

systematic patterns of choice: Corpus linguists are always looking for patterns and systems and systematic use and systematic choices. Not one-time uses. Not things we do every once in a while. But the big patterns that are characteristic of how English is used. The grammar and the vocabulary that often occur together.

English in use: The contrast is between (1) what a language is capable of doing and (2) what we really do. A grammar can describe the potential of a language by describing every form: for example, all the verb tenses in a description of verb tenses. Or a grammar can describe how the verb tenses are used in particular kinds of communication. "Language in Use" ties grammar to sociolinguistics.

This kind of grammar starts with data about how English is used. The data is made up of samples of written and/or spoken English. That "language in use" is analyzed for patterns. Then, the grammar is explained in terms of those patterns.

register: A variety of a language defined in terms of use and functions. The Longman Grammar focuses on 4 registers: conversation, fiction, newspaper writing, and academic writing. Other linguists use the term "genre" for the same meaning.

discourse: Communication. Samples of a language. Language in use.

context: Settings, such as books, stories, classrooms, or your dining room.
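
To make "words that often occur together" a little more concrete, here's a small sketch in Python that counts pairs of neighboring words in a sample and lists the most frequent ones. This is only an illustration of the general idea, not the method the authors used. The sample sentence is invented, and the function name top_pairs is hypothetical.

```python
from collections import Counter
import re

def top_pairs(text, n=3):
    """Return the n most frequent pairs of neighboring words in `text`."""
    # Very rough tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Pair each word with the word that follows it.
    # Simplification: pairs are also counted across sentence boundaries.
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(n)

# Invented sample (not real corpus data).
sample = "I don't know. I don't think so. You know I don't mind."
print(top_pairs(sample))
```

A pair that shows up again and again, not just once, is the kind of systematic pattern that corpus linguists look for.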


And always remember that you can email me when you have questions.  My address is