The Building Blocks of Language

1  Overview

This chapter will introduce the basic tools of linguistic analysis. We use language in order to communicate with others all the time, without thinking about it. However, the fact that using language is such a natural thing does not mean that it is a simple and straightforward matter. Think of the skills required in playing the violin. A good player makes it look very easy, but in fact it is a very complex activity. So is language. It is only because each of us has been using language in one way or another since right after we were born that we fall under the impression that language use is simple and uncomplicated.

Traditionally, linguists describe a language starting with the smallest units (the sounds) and move up building larger and larger units (words, sentences, paragraphs, complete texts). We will follow this organization here as well.

2  The Sounds of Language

At the simplest of levels, language is made of sounds. In other words, we communicate using sounds that somehow carry meanings. How this is accomplished is fairly complex. We will start by looking at how we produce these sounds. Phonetics deals with the sounds of a language, in a physical sense. Sound is the motion of air in waves. When we speak we move the air inside and around the mouth, and the waves of the movement spread in the air. Sounds have four components:

  1. quality or timbre (the frequency of the vibration of the sound waves, what makes an ``a'' different than an ``o'')
  2. volume (how loud the sound is)
  3. length (how long the sound lasts)
  4. pitch or tone (high or low)

We will study phonetics using an articulatory view, which looks at how sounds are produced by our body organs, as opposed to an acoustic approach, which describes the way the sound waves actually look. By articulatory description, we mean the movements made by the phonatory organs to produce the sound waves.

2.1  The Phonetic Alphabet

The International Phonetic Alphabet or IPA (see figures) is used by linguists to represent the sounds in the languages of the world. The English sound system is listed below. Do not worry about the articulatory descriptions; they will be explained in what follows!

Please note that in this online version of chapter 2 some symbols for the sounds may be incorrect or missing because of encoding problems with HTML. Refer to the IPA chart for the correct symbols.

Figure 1: The International Phonetic Alphabet: English consonants

Symbol      Articulation                      Examples
[p]         bilabial voiceless stop           pit, sip, apple
[b]         bilabial voiced stop              bit, sob, about
[t]         alveolar voiceless stop           tap, sot, about
[d]         alveolar voiced stop              dip, sod, adult
[k]         velar voiceless stop              car, tack, acorn
[g]         velar voiced stop                 go, log, agog
[ʔ]         glottal voiceless stop            button, mutton
[f]         labiodental voiceless fricative   fluff, rough, ruffian
[v]         labiodental voiced fricative      vest, love, lover
[θ]         interdental voiceless fricative   thin, death, ether
[ð]         interdental voiced fricative      then, rhythm
[s]         alveolar voiceless fricative      snake, bass, decent
[z]         alveolar voiced fricative         zoo, roses
[š] [ʃ]     palatal voiceless fricative       shell, rush, ashes
[ž] [ʒ]     palatal voiced fricative          jejune, rouge, closure
[h]         glottal voiceless fricative       have, hill, house
[č] [tʃ]    palatal voiceless affricate       child, reach, hatchet
[ǰ] [dʒ]    palatal voiced affricate          judge, ridge
[m]         bilabial voiced nasal             man, mom, lamp
[n]         alveolar voiced nasal             nasty, run, ant
[ŋ]         velar voiced nasal                hanger, ringing
[l]         alveolar voiced (lateral) liquid  love, hill, plate
[r]         alveolar voiced liquid            ring, floor, crow
[w]         bilabial voiced glide             wood, awash
[y]         palatal voiced glide              young, canyon

Figure 2: The International Phonetic Alphabet: English vowels

Symbol   Articulation       Examples
[i]      front high tense   beat, feet
[ɪ]      front high lax     bit, pit
[e]      front mid tense    day, pay
[ɛ]      front mid lax      pet, net
[æ]      front low lax      cat, mat
[u]      back high tense    loot, you
[ʊ]      back high lax      put, foot
[ə]      central mid lax    but, a, the (unstressed)
[o]      back mid tense     row, low, baugh
[ɔ]      back mid lax       cop, not
[a]      back low lax       pasta, father

The sound called ``schwa'' [ə] is the sound that is produced when the muscles in the mouth are relaxed and the tongue is in a central position. When a vowel is not stressed in English it becomes a schwa; thus, in ``about'' the [a] at the beginning is pronounced [ə] as in the, but, among.

Note that your pronunciation of the vowel sounds of English may be very different from the examples chosen here, and even from that of your instructor. For example, some speakers pronounce ``cop" [kap], while others pronounce ``father" [f] or [f]. Obviously, these speakers would not agree on the pronunciations chosen above. Choose some examples that work for you among those presented by your instructor.

Diphthongs   The following three sounds are diphthongs, i.e., a vowel followed by either [y] or [w] (a glide). Diphthongs can be considered as one vowel. This way of marking glides is used in the American tradition. An alternative notation, used by the IPA, is to mark diphthongs as [oɪ], [aɪ], etc.

[oy] boy, toy
[ay] buy, bye
[aw] bow, now

Traditionally in linguistics, spoken language has been considered the language and writing a derived, secondary way to represent (imperfectly, at that) spoken language. Many sophisticated and fully developed cultures never invented writing (on writing see ch. ). Spelling may have very little to do with the pronunciation of a word: for example, ``through" is pronounced [θru] and ``laugh" is pronounced [laf], although they both end with ``gh." Note that sounds transcribed in the phonetic alphabet are always surrounded by square brackets. This is a convention followed by linguists all over the world.

2.2  How Are the Sounds Produced in the Mouth?

2.2.1  Articulatory Phonetics

Eleven organs are involved in phonation, i.e., the production of sounds for speaking; however, note that phonation also refers to the activity of the larynx (see below), a potentially confusing ambiguity! These organs are:

diaphragm        lungs
trachea          larynx
velum            uvula
nasal cavity     tongue
roof of mouth    teeth
lips

Note that the larynx is also known as the glottis and that the velum is the soft part that opens and closes the nasal cavity. None of these organs originated as speech organs. For example, the function of teeth is that of chewing our food; however, they are also used in speaking. Since the time that our ancestors started speaking, we have developed several speaking-specific muscles.

Figure 3: The vocal tract

There are four processes by which we produce sound:

  1. airstream
  2. phonation
  3. nasalization
  4. articulation.

Airstream   The airstream involves the lungs, diaphragm, and trachea. Most speech sounds are produced with the airstream; this is the case for all the sounds of English. Clicks are an example of sounds that are not produced with the airstream. To produce a click, a temporary chamber of resonance is created in the mouth, and then the occlusion is released. Examples are ``tsk! tsk!'' or the sound of kissing. Clicks are frequent in African and Australian languages.

Phonation   Phonation analyzes what the larynx is doing:

2.2.2  Nasalization

The nasal cavity may or may not be involved in the phonation process. This is determined by the opening or closing of the velum (a.k.a. soft palate). If the velum is closed, no air flows through the nose; if the velum is open (lowered), then air flows through the nose, producing nasalization.

If we have an open velum we produce the three nasalized sounds in English [m, n, ŋ]. If we have a closed velum we produce non-nasalized stops: [p/b, t/d, k/g]. All nasals are voiced in English. Notice that nasals are continuants.

Other languages (e.g., French, Polish, Portuguese) have nasalized vowels [ã, ẽ, õ], marked by a superimposed tilde (~).

Articulation   Articulation is the modification of the sounds produced by the other three processes. It deals with the way the tongue, lips, and teeth are positioned.

There are three types of articulations in the vocal tract:

Each sound can fit only into one type of articulation, i.e., if [p] is a stop it is not at the same time a fricative.

Where do these occlusions (or obstructions) take place? Note that these places of articulation work for both stops and fricatives.

  1. Bilabial: putting your lips together

  2. Labio-dental: putting your lower lip and upper teeth together

  3. Tongue

    1. Tip
      1. interdental: placing tip of tongue between teeth
      2. dental: placing tip of tongue against upper teeth
      3. alveolar: placing tongue slightly higher than the dental region (the gums)

    2. Body: the body of the tongue articulates against the palate

      1. alveopalatal
      2. palatal: hard palate
      3. velar: soft palate
      4. uvular: back of tongue against uvula

    3. Pharyngeal: creating an occlusion using the back of the tongue and the upper pharynx

  4. Glottal: closure or narrowing of the space between the vocal folds
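
The descriptions in Figure 1 combine a place of articulation with voicing and a manner of articulation. As a rough sketch of how such descriptions can be used to pick out classes of sounds, the following Python fragment stores a few rows of the figure and looks them up by feature. The names `Consonant`, `CONSONANTS`, and `select` are our own illustrative choices, and only part of the English inventory is listed.

```python
from typing import NamedTuple, Optional


class Consonant(NamedTuple):
    place: str     # where the occlusion is made
    voiced: bool   # whether the vocal folds vibrate
    manner: str    # stop, fricative, nasal, ...


# A few rows copied from Figure 1; the dictionary layout is only illustrative.
CONSONANTS = {
    "p": Consonant("bilabial", False, "stop"),
    "b": Consonant("bilabial", True, "stop"),
    "t": Consonant("alveolar", False, "stop"),
    "d": Consonant("alveolar", True, "stop"),
    "k": Consonant("velar", False, "stop"),
    "g": Consonant("velar", True, "stop"),
    "f": Consonant("labiodental", False, "fricative"),
    "v": Consonant("labiodental", True, "fricative"),
    "s": Consonant("alveolar", False, "fricative"),
    "z": Consonant("alveolar", True, "fricative"),
    "m": Consonant("bilabial", True, "nasal"),
    "n": Consonant("alveolar", True, "nasal"),
}


def select(place: Optional[str] = None,
           voiced: Optional[bool] = None,
           manner: Optional[str] = None) -> list:
    """Return the symbols whose description matches every feature given."""
    return [
        symbol
        for symbol, c in CONSONANTS.items()
        if (place is None or c.place == place)
        and (voiced is None or c.voiced == voiced)
        and (manner is None or c.manner == manner)
    ]


print(select(manner="stop", voiced=True))  # the voiced stops
print(select(place="bilabial"))            # sounds made with both lips
```

Leaving a feature unspecified widens the class: asking only for `place="bilabial"` returns a stop pair and a nasal, which share a place but not a manner.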

2.2.3  Glides, Affricates, and Liquids

We now turn to three small classes of sounds that are slightly more complex.

Glides   Glides are articulated as [i] and [u] but with the tongue higher. In English, glides occur after ``long'' vowels, e.g., [gluw] glue.

Affricates   Affricates are articulated like stops but the release of the occlusion is slow, not sudden. An affricate begins as a stop, but is released as a fricative, e.g., [čərč] = church.

Figure 4: Main places of articulation

Liquids   Liquids are also produced by a partial occlusion of the vocal tract, but the occlusion, unlike with stops or fricatives, is neither complete nor narrow enough to cause friction of the airflow. [l] is called a lateral liquid because it is produced by laying the tongue flat in the mouth, with the tip in the alveolar region and the sides (hence, lateral) of the tongue raised. The airflow escapes from both sides of the tongue. [r] is a very complex sound; there are several varieties of [r]. In the United States, most [r] sounds are pronounced with the tongue in the alveolar region and with the tip of the tongue bent backwards (retroflex). Other pronunciations of [r] include the uvular [r] of French. All English liquids are voiced.

2.3  Sounds and Meaning

Consider the fact that different people pronounce the ``same" sound differently; for example, a man and a woman will pronounce the word dog differently: a man usually has a deeper, lower voice, whereas a woman usually has a higher pitched voice (the differences depend on the size of their phonatory organs). Often a very short sample of language (``It's me'') is sufficient for people to recognize a speaker, which tells us that the differences are noticeable. And yet the same speaker will pronounce the same word differently if stressed or sad. How can we say then that the sound at the beginning of the word dog is the same sound at the end of road?

The answer to this question lies in the concept of phoneme. People have in their minds a mental image of the sound [d]. This mental image of a sound is called a phoneme. A useful analogy is the score of a musical piece: different musicians will all play a given concerto differently and yet they all played that particular piece of music. Exactly in the same way, no two productions of the phoneme /d/ will be exactly alike, and yet we recognize that they are examples of /d/. Note that phonemes are written between slashes, to distinguish them from sounds.

Take, for example, the sound [p]. Do you recall the articulatory description of [p]? It is produced by blocking the flow of air from the lungs by closing the lips. When the airflow is released (by opening the lips again) and the vocal folds are still, we articulate a [p] sound. When it is pronounced at the beginning of a word and followed by a vowel (such as in ``pit'') it is actually followed by a puff of air. This is called an aspirated [pʰ]. When a [p] sound occurs at the end of a word (such as in ``tip'') it is articulated without releasing the airflow at all (this is called an unreleased [p]). In other words, [p] at the beginning of a word sounds different than a [p] in the middle of a word or at the end. And yet you probably never realized that you pronounced these sounds differently.

All the ways that a given phoneme is articulated are called allophones. So, the unreleased [p] and the aspirated one are both allophones of the phoneme /p/.

Your brain recognizes phonemes by matching the allophones it actually hears (sounds) to mental images of sounds (phonemes) and then compensating for the differences.
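
The relation between one phoneme and its several allophones can be sketched as a small rule that picks an allophone from the position of the sound in the word. The fragment below is only an illustration under simplifying assumptions: it works on spelling rather than on a phonetic transcription, it knows only the two environments described above, and the function name and the label ``plain'' for every other position are our own.

```python
VOWELS = set("aeiou")


def allophone_of_p(word: str, i: int) -> str:
    """Pick a label for the allophone of /p/ found at position i of a spelled word."""
    if word[i] != "p":
        raise ValueError("position i does not hold a p")
    # Word-initial and followed by a vowel: released with a puff of air.
    if i == 0 and len(word) > 1 and word[1] in VOWELS:
        return "aspirated"
    # Word-final: the airflow is not released at all.
    if i == len(word) - 1:
        return "unreleased"
    # Any other position (an assumption of this sketch).
    return "plain"


print(allophone_of_p("pit", 0))
print(allophone_of_p("tip", 2))
```

Whatever label the rule returns, the sound still counts as an instance of the same phoneme /p/; only the environment differs.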

However, we can look at phonemes from a different point of view. Thus there are two received definitions of phoneme:

  1. the mental image of a sound
  2. the smallest distinctive unit in language

We have examined the first definition. Let us focus now on the second one. The ``smallest" part of the definition is not problematic. When you hear a speaker talk you hear a long chain of different sounds one after the other. If you start cutting the chain in pieces at some point you will have to stop, because there's nothing left to cut: that is the smallest unit. For example, [ðə dɔg] (the dog) can be segmented as [ðə] and [dɔg]; these can be further segmented as [ð], [ə] and [d], [ɔ], [g], respectively. After this, there's nothing left to segment. We have reached the smallest units of language.

Let us turn to the ``distinctive" part of the definition. By distinctive we mean that the unit identified through segmentation must convey a difference in meaning. Note that we did not say that the units must have meaning, but that they must cause a difference in meaning. The two are quite different claims. Clearly, [g] has no meaning. However, [dɔg] (dog) and [dɔk] (dock, or doc, short for doctor) differ in meaning. Thus the presence of a [g] or a [k] at the end of the string [dɔ] is meaningful, in the sense that it conveys a difference in meaning.

Two words such as [dɔk] and [dɔg] are a minimal pair. A minimal pair is two words of different meaning that differ in only one phoneme, e.g., /bit/ and /pit/. Note that the definition of minimal pair is a twofold one:

  1. two words with different meanings
  2. and which differ in one sound.

So, pit and bit are a minimal pair, because the two words differ in meaning and in one sound. How about [pit] and [pʰit]? Here, while the two words differ in one sound, they do not differ in meaning in English, which makes them a challenge for nonnative speakers (see ). Hence, these are not a minimal pair (again, in English). In other languages, aspirated [pʰ] and unreleased [p] may well cause a difference in meaning so that [pit] and [pʰit] would be two different words, a minimal pair. Minimal pairs are very important, because they allow us to identify the phonemes of a language. If we find a minimal pair in the words of a language, we know that the two phones involved in the minimal pair are the allophones of two different phonemes. So, in the example above, since dock and dog are a minimal pair in English, it follows that /g/ and /k/ are phonemes in English. On minimal pairs in second language learning, see .
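
The twofold definition of a minimal pair can be turned into a small check. In the sketch below each word is a list of sounds, so that an aspirated p (written "ph" here, an ASCII stand-in of our own) counts as a single sound. Whether two words differ in meaning cannot be computed from the sounds; it has to be supplied by someone who knows the language, which is why it appears as an explicit argument.

```python
def differ_in_one_sound(a: list, b: list) -> bool:
    """True if the transcriptions have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def is_minimal_pair(a: list, b: list, same_meaning: bool) -> bool:
    """Apply both halves of the definition: different meaning, one different sound."""
    return differ_in_one_sound(a, b) and not same_meaning


pit = ["p", "i", "t"]
bit = ["b", "i", "t"]
aspirated_pit = ["ph", "i", "t"]

print(is_minimal_pair(pit, bit, same_meaning=False))           # two English words
print(is_minimal_pair(pit, aspirated_pit, same_meaning=True))  # one English word
```

The second call fails the test only because of the meaning argument; for a language in which the two forms are different words, the same sounds would qualify as a minimal pair.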

3  Words and Their Parts

If we combine phonemes we get a larger unit, called a morpheme. For example, if we put together the phonemes /d/, /ɔ/, and /g/, we get the morpheme /dɔg/ spelled dog. Note that morphemes are written between slashes, like phonemes.

Recall the definition of phonemes as the smallest distinctive units in language. An important aspect of their nature is that they have no meaning: a sound has no meaning in and of itself. On the contrary, morphemes have meanings. Cat means something, for example.

3.1  Morphemes and Words

So a morpheme can be /kæt/ ``cat'' but also /put ɔf/ ``put off," and /-z/ as in ``roses" (i.e., it marks the plurality of the roses, meaning there's more than one rose). We notice immediately that morphemes may or may not coincide with what we think of as words.

The term ``word" is not a technical term of linguistics. Usually by word we mean any sequence of letters divided by blank spaces. Thus in the sentence Mary has two dogs we would say that there are four words. However, there are a number of problems: to begin with, ``dogs" is two morphemes (/dɔg/ + /-z/); on the other hand, as we saw above, there are ``words" that consist of more than one ``word" (put off, do away with, fly off the handle): these are called phrasal verbs and idioms (see ch. ). Thus, in order to avoid these problems, linguists have decided to use the word morpheme to indicate any unit of meaning that cannot be broken down any further (think of them as ``semantic atoms") and to use the word lexeme to indicate any entry in the lexicon (the vocabulary) of a speaker/language. Lexemes may be plurimorphemic (i.e., have more than one morpheme). The following section deals with the classification of morphemes.

3.2  Free vs. Bound Morphemes

Free morphemes can appear alone. Take for example the morpheme /cat/ (for simplicity we will not use the IPA transcription of morphemes, unless significant); we can find this morpheme used independently in speech. Consider now the following words:

cat -s
carriage -s
despot -s
criminal -s
elephant -s

/cat/, /carriage/, /despot/, /criminal/, and /elephant/ are all free morphemes. They can occur alone in discourse. This is not true for the /-s/ morpheme. This morpheme marks PLURALITY. Note that semantic features will appear in capital letters. PLURALITY means that there is more than one cat, dog, and so on. However the PLURALITY morpheme cannot show up alone in speech or writing; thus it is called a bound morpheme. Free morphemes are often called root morphemes or stems. Bound morphemes are called affixes because they need to attach to another morpheme.

3.3  Affixes

Affixes are classified according to their position. If they come before the root they are called prefixes; if they occur after the root they are called suffixes. If they attach in the middle of a word, they are called infixes. The following chart gives you some examples:

prefixes          infixes                  suffixes
un-believable     a-whole-nother           proud-ly
in-credible       un-f**king-believable    love-s
super-ordinate                             walk-ed

Infixes are very rare in English but extremely common in other languages. Tagalog, spoken in the Philippines, uses -um- as an infix to indicate the infinitive of the verb: kuha = take; k-um-uha = to take
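
The three positions can be sketched as three string operations. The prefix and suffix functions simply attach to one edge of the root. For the infix we assume, on the strength of the single example kuha, that -um- goes in front of the first vowel of the root; this placement rule, the function names, and the use of ordinary spelling are all simplifications of our own.

```python
VOWELS = "aeiou"


def add_prefix(affix: str, root: str) -> str:
    """Attach the affix before the root."""
    return affix + root


def add_suffix(root: str, affix: str) -> str:
    """Attach the affix after the root."""
    return root + affix


def infix_um(root: str) -> str:
    """Insert -um- before the first vowel of the root (assumed placement rule)."""
    for i, ch in enumerate(root):
        if ch in VOWELS:
            return root[:i] + "um" + root[i:]
    raise ValueError("root has no vowel to attach to")


print(add_prefix("un", "believable"))
print(add_suffix("proud", "ly"))
print(infix_um("kuha"))
```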

3.4  Inflectional vs. Derivational

Some morphemes can be used to create new words from old ones; they are called derivational morphemes. For example, the name for the person who performs an action (agent) is often formed with the ``agentive" derivational morpheme -er, as in

to buy → buyer
to sell → seller

On the other hand, inflectional morphemes simply mark such grammatical categories as PLURALITY, TENSE (past, continuous present), comparatives (tall-er), superlatives (tall-est), and THIRD PERSON SINGULAR (walk-s).

The principal differences between derivational and inflectional morphemes are:

Inflectional morphemes carry meaning, just like roots. Read the beginning of Jabberwocky.

Jabberwocky by Lewis Carroll

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

``Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!

Just like Alice, you probably cannot figure out what the poem is about. However, take a word like slithy; we can tell that it is an adjective by its -y ending (as well as its position close to a noun) and so this means that whatever slith may be, the ``tove'' has that quality (naturally, that leaves an open question, namely what is a tove?).

3.5  Where Do New Words Come From?

The English language has roughly 500,000 words, but new words are being invented every day to match the ever changing needs of the speakers, such as the new words required by the rising importance of computers in our lives. Derivational morphemes are only one way to get new words from old ones. The following are some of the ways that speakers can create new words in English.

Derivation
New words can be created by using derivational morphemes. For example, after we invented the fax machine, we needed a verb to describe the action of faxing, hence ``to fax."

Compounding
Another technique is that of putting two old words together to make a new one, e.g., railway, department store.

Clipping
New words can be constructed by shortening a longer word, e.g., (tele)phone, prof(essor), auto(mobile).

Acronyms
A rich source of new words is the practice of using the initial letters of a set of words, e.g., NAFTA, NASA, NFL, PTA, REM. Acronyms are different from abbreviations in that acronyms use initial letters of words or of parts of words. Abbreviations shorten the word, as in amt for amount or pres for president.

Blends
New words can also be created by the blending of two existing words, e.g., motel (motor+hotel), brunch (breakfast+lunch).

Backformation
New words are unconsciously created by speakers when they no longer analyze a word in its constituent morphemes and instead break it down according to the way it ``looks." From the word inflammable came flammable, when people perceived in- as the NEGATION morpheme. From swindler came swindle (when the -er suffix was perceived as the agentive, i.e., the person doing the action). From burglar came burgle. This phenomenon is also known as reanalysis.

Other examples:

alcohol-ic → alco-holic → worka-holic, choco-holic
hamburg-er → ham-burger → cheese-burger, veggie-burger
entertain-ment → enter-tainment → info-tainment, edu-tainment
Marathon → mar-athon → walk-athon, phon-athon

Invention
When speakers want, they can always invent new words from scratch (e.g., googol, meaning a very large number). This is often done in advertising. Examples: Kodak, Xerox, Kleenex. Often words start out as proper nouns (e.g., Kodak, Kleenex) and end up being used as common nouns (a xerox, a kleenex).

Borrowing
Languages in contact borrow words from each other. It may be that one language does not have a word for a new product or concept. Thus, when coffee became popular in Europe the Arabic word kawatin was used in various forms: coffee, cafe, Kaffee, and so on. Japan in turn borrowed kohi from the European languages. Borrowing is a two-way street; English gave French weekend and took bon vivant. Japanese took game from English and gave it sushi. Indeed, Japanese is full of English loan words:

Japanese                  English
keiki                     cake
steiki                    steak
boifurendo/garufurendo    boyfriend/girlfriend
basubaru                  baseball

You will notice that a language typically changes the phonology of the borrowed word. Japanese does not allow syllables to end with a consonant (except in certain cases n). Therefore, borrowed words need to change to fit Japanese phonology and an [i] gets added to the ends of cake and steak and a [u] finds its way into the middle of baseball. This happens with any borrowing, not just those from English; arubeito was taken from German arbeiter to mean not just any worker but a part-time or temporary worker. ``Regular'' workers are sarariman from ``salary men'' and ``OL'' (office ladies).

Calque
A calque is a special kind of loan. In a calque, parts of words are translated. A good example is Fernsprecher, German for telephone. Telephone is taken from the Greek tele (distant) and phone (voice); Fernsprecher means distant speaker. Other examples are big man in Nigerian English (see ) for an important person, which comes from Yoruba enia nia (big man); bush meat is Nigerian English for game, as opposed to poultry. It comes from the Akan phrase ha nam (bush meat).

3.6  Idioms and Phraseology

Some units of language span more than one ``word'' and thus cause all sorts of problems to lexicographers (dictionary writers). For example, kick the bucket means die but consists of three ``words,'' none of which has much to do with death. It turns out that idioms, i.e., multi-word units whose meaning is not the sum of their parts, are quite frequent, especially if we count as idioms most phrasal verbs. Phrasal verbs are verbs such as put off, do away with, get on, start up, and deal with, which are composed of a verb and a preposition or an adverb. Often their meaning is idiomatic, i.e., it cannot be derived from the sum of their parts. See, for instance, put up with, as in I won't put up with your excuses anymore!, where the meaning of tolerate cannot be derived from put, up, and with. Other units that are often treated as idioms are stock phrases (How are you?) and proverbs (A rolling stone gathers no moss). A related topic is collocations, i.e., the way that two or more words are associated, such as bread and butter, get away with murder, salt and pepper, and rock and roll. On collocations, see also .

4  The Way Sentences Are Put Together

When we put together morphemes of the right kind, we can form sentences.

4.1  The Double Articulation of Language

The first articulation of language is the breakdown of a sentence into morphemes; the second articulation is the breakdown of morphemes into phonemes.

The reason the two articulations of language are so important is that they allow us to express an infinite number of thoughts, in an infinite number of different sentences, with a finite and in fact quite small number of sounds. There are roughly forty-three different sounds in English. If we had to have a different sound for each idea we would need many thousands more sounds. That would make it very difficult to remember which sound goes with which idea. So using sounds, which are not associated with any given meaning, to make morphemes, which are given a meaning, is a pretty clever way of getting around that problem. But even that would not be sufficient for the purposes of communication. Consider the fact that every day we may produce sentences that we have never produced before. If we only had the second articulation of language, we would be stuck with however many morphemes we would have produced. By putting together morphemes to form sentences (first articulation), we acquire the power to produce an infinite number of sentences, capable of accommodating any needs for communication that might arise, now or at any time in the future. The double articulation of language is also called duality (see ).
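
A little arithmetic shows why a small stock of meaningless sounds goes such a long way. The sketch below counts how many different sequences of one to n sounds can be built from forty-three sounds. It is an upper bound only: it assumes that any sound may follow any other, which no real language allows, and the function name is our own.

```python
SOUNDS = 43  # the rough figure for English given above


def possible_sequences(max_length: int) -> int:
    """Count the sound sequences of length 1 to max_length over SOUNDS symbols."""
    return sum(SOUNDS ** k for k in range(1, max_length + 1))


for n in range(1, 6):
    print(n, possible_sequences(n))
```

Already with sequences of at most three sounds there are 81,399 candidates, far more than the forty-three meanings that would be available if each sound had to carry a meaning by itself.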

4.2  Syntax

Putting together morphemes to form sentences is the area covered by syntax. Syntax is based on the idea of grammaticality. A sentence is said to be grammatical if the speakers of the language agree that it is a sentence that they would produce under the appropriate circumstances. Thus

The book is on the table

will be accepted as grammatical by all speakers of English, while

*Table the on is book

will not. Accordingly, the ungrammatical sentence is marked with an asterisk (*), which shows that the sentence is unacceptable. The goal of syntax is to describe all the grammatical sentences of English, or any other language, and show why the ungrammatical sentences aren't acceptable. By doing so, syntax eventually hopes to explain how people's capacity to use language works. Incidentally, the grammar of a language is the entire description of the language (from sounds to sentences, including meaning) not just its syntax, as some believe. Try not to confuse the grammar of a language and the grammaticality of sentences; grammaticality is only part of the overall grammar. If you feel you need a little help remembering basic English grammar, we have supplied Appendix A to help you.

You will probably recall sentence diagramming from high school. Syntax provides linguists with a similar, but more effective, tool. Consider Bob eats broccoli. This sentence can be broken down into [Bob] and [eats broccoli]. These are called the sentence's immediate constituents. [eats broccoli] can be further broken down into [eats] and [broccoli]. In conclusion, [Bob], [eats], and [broccoli] are the constituents of our sentence. Consider now that the sentence could have been Bob eats his lunch. Here the immediate constituent analysis gives [Bob] and [eats his lunch], which is further analyzable as [eats] and [his lunch]. The latter can be finally analyzed as [his] and [lunch].
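
The step-by-step breakdown of Bob eats his lunch can be sketched with nested groupings, where each grouping holds the immediate constituents of the unit above it. The representation (plain strings for words, tuples for larger constituents) and the function names are our own illustrative choices.

```python
# A constituent is either a word (a string) or a tuple of smaller constituents.
SENTENCE = ("Bob", ("eats", ("his", "lunch")))


def words(node) -> list:
    """Flatten a constituent into the list of words it contains."""
    if isinstance(node, str):
        return [node]
    return [w for child in node for w in words(child)]


def bracket(node) -> str:
    """Render a constituent in the square-bracket notation used in the text."""
    return "[" + " ".join(words(node)) + "]"


def breakdown(node, depth=0):
    """Yield (depth, bracketed constituent) pairs, from the sentence down to the words."""
    yield depth, bracket(node)
    if not isinstance(node, str):
        for child in node:
            yield from breakdown(child, depth + 1)


for depth, text in breakdown(SENTENCE):
    print("  " * depth + text)
```

Each level of indentation in the printed result corresponds to one further cut: the two items directly under the sentence are its immediate constituents, and the innermost items are the units at which the analysis stops.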

Note how [his lunch] and [broccoli] have the same function in the sentence (the direct object). Thus we say that since [broccoli] is a noun, [his lunch] is a noun phrase (NP). Along the same line, we can find that since eats is a verb, [eats broccoli] is a verb phrase (VP). There are other types of phrases, the most significant being the prepositional phrase (PP), which requires a preposition, such as on, in, and by. Incidentally, the part of speech without which a given ``phrase'' could not exist is called the ``head'' of it. So the noun is the head of an NP.

Recapitulating: we have introduced the immediate constituents of a sentence, which are the parts into which a sentence can be broken down. Its constituents are the parts into which it ultimately is analyzable (i.e., the morphemes). We further introduced the concept of phrase, which is any constituent, either immediate or not, that is not a clause. A clause is a full sentence that has a subject and a verb. Somewhat confusingly, phrases may consist of only one constituent, as will be seen below.

Table 1 gives a list of what can appear in a phrase structure grammar, with examples.

Table 1: Lexical and Phrasal Categories

Major Lexical Categories
  Noun (N)                     John, rock, table, idea
  Verb (V)                     run, kiss, kill, speak
  Adjective (Adj)              old, beautiful, large
  Adverb (Adv)                 quickly, madly, yesterday

Minor Lexical Categories
  Determiner (Det)             the, a/an, this, those
  Auxiliary Verb (Aux)         do, be, have, can, may, must
  Preposition (P)              in, on, up, near, at, by
  Pronoun (Pro)                he, she, it, him, her, they
  Conjunction (C)              and, or, but, however

Phrasal Categories
  Noun Phrase (NP)             the young man, books, John
  Verb Phrase (VP)             runs, opened the door
  Prepositional Phrase (PP)    in the dark, in open contest

It is traditional to represent the structure of the sentence with a tree diagram that shows with branching lines the process of breaking down the sentence illustrated above. A simple example follows:

4.2.1  Types of NPs in English

Let us look at some other examples that will show the various configurations of an NP in English. An NP may be very simple or have a fairly complex structure:

Notice the increasingly more complex structure, going from a simple noun, to an article plus noun, then an article plus adjective, and so on.

Finally, notice that an NP may be modified by a PP, as in the example the woman at work. Note that modification takes the form of attaching under the NP node. This will become quite significant when we discuss PP attachment.

Let us add that VPs can be similarly complex, except nothing ever comes before the verb (we will not deal with auxiliaries in this text, but even those do not come out of the VP node, but rather out of S).

4.2.2  Configurational Definitions

Tree diagramming offers a clear and simple solution to problems that had plagued grammar for centuries. Take the issue of how to define the ``subject" and/or ``direct object" of a sentence. You are probably familiar with such ``nondefinitions" as ``the person or thing doing the action" which can be defeated by a few well-chosen examples (just to name one, in It rains who or what is doing the raining?).

Generative grammar (see ) notes that when labeling sentence structures, there is no need to label subjects and direct objects. The ``subject" of a sentence is a nontechnical name for the first noun phrase located immediately below ``S" in a tree diagram. The direct object is simply the noun phrase located immediately below the verb phrase (NP, VP). This is what is called a configurational definition. It only relies on positions within the tree.
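
A configurational definition can be sketched as a lookup that consults nothing but positions in the tree. Below, a tree for Bob eats broccoli is written as nested tuples whose first element is the category label; this encoding, the category labels on the leaves, and the function names are assumptions of the sketch.

```python
# Each node is (label, child, child, ...); a leaf is (label, word).
TREE = ("S",
        ("NP", ("N", "Bob")),
        ("VP", ("V", "eats"), ("NP", ("N", "broccoli"))))


def label(node) -> str:
    return node[0]


def children(node) -> list:
    """Return the nodes immediately below this one (words are not nodes)."""
    return [c for c in node[1:] if isinstance(c, tuple)]


def words(node) -> list:
    """Collect the words under a node, left to right."""
    if len(node) == 2 and isinstance(node[1], str):
        return [node[1]]
    return [w for c in children(node) for w in words(c)]


def first_child(node, wanted: str):
    """Return the first node immediately below `node` that carries the label `wanted`."""
    for child in children(node):
        if label(child) == wanted:
            return child
    return None


def subject(s):
    """The first NP immediately below S."""
    return first_child(s, "NP")


def direct_object(s):
    """The NP immediately below the VP."""
    vp = first_child(s, "VP")
    return first_child(vp, "NP") if vp is not None else None


print(words(subject(TREE)))
print(words(direct_object(TREE)))
```

Neither function asks who is doing what to whom; both answers follow from the shape of the tree alone.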

4.2.3  Syntactic Ambiguity

Sentences can often be ambiguous. Tree diagrams can exemplify this fact. Note that phrases can be attached to more than one spot. This can lead to differences in meaning.

Consider this joke: A woman walks in a store and says to the clerk: ``I'd like to try the red dress in the window." And the clerk says: ``But, Ma'am, we have dressing rooms for that." The woman means that the dress is in the window, while the clerk understands that the trying on will take place in the window. Here's how this ambiguity is represented in a tree diagram (simplifying the sentence a little); first the clerk's interpretation:

And now the woman's intended interpretation:

Note the triangles in the trees, which indicate that the structures of the NP and of the PP are trivial and are drawn in abbreviated form to save time.

There are many other kinds of syntactic ambiguities. Another example of ambiguity: ``Flying planes can be dangerous." You could interpret this sentence in more than one way. For instance, you could take it as planes that are flying can be dangerous or actually piloting planes is dangerous.
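The two readings of the red dress joke can also be written as bracketed structures. The sketch below is only an illustration under stated assumptions: the nested-tuple notation, the simplified sentence ``I try the dress in the window,'' and the helper names words and parent_of are ours. It shows that the two trees contain exactly the same words and differ only in the node under which the PP is attached.

```python
# Clerk's reading: the PP hangs from the VP (the trying happens in the window).
clerk = ("S", ("NP", "I"),
              ("VP", ("V", "try"),
                     ("NP", ("Art", "the"), ("N", "dress")),
                     ("PP", "in the window")))

# Woman's reading: the PP hangs from the object NP (the dress is in the window).
woman = ("S", ("NP", "I"),
              ("VP", ("V", "try"),
                     ("NP", ("Art", "the"), ("N", "dress"),
                            ("PP", "in the window"))))

def words(node):
    """Collect the words of a (sub)tree, left to right."""
    if isinstance(node, str):
        return node.split()
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out

def parent_of(node, category, parent=None):
    """Return the label of the node directly above the first `category` node."""
    if isinstance(node, str):
        return None
    if node[0] == category:
        return parent
    for child in node[1:]:
        found = parent_of(child, category, node[0])
        if found is not None:
            return found
    return None

print(words(clerk) == words(woman))  # True: same string of words
print(parent_of(clerk, "PP"))        # VP
print(parent_of(woman, "PP"))        # NP
```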

4.2.4  Generative Grammar

This kind of grammar, associated with the groundbreaking work of Noam Chomsky in the mid 1950s, is called a generative grammar. Why generative? Chomsky wanted to design a model (a grammar) that would produce all the sentences that a native speaker would agree are grammatical and none of the sentences that a native speaker would reject as ungrammatical. The main idea is that the sentences of a language are generated. In mathematics, which inspired Chomsky, generation means that given a symbol and a rule one can produce new (sequences of) symbols, based on the rule. For example, a mathematical system that generates even numbers can be defined as:

n_{even} = n \cdot 2

This system, known as a function, is extremely simple. If you plug in any whole number (n) and multiply it by 2, the result is an even number. A similarly simple formula generates odd numbers:

n_{odd} = (n \cdot 2) + 1

If you take an even number and add one, you get an odd number.
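The two formulas can be tried out directly. This is a minimal sketch; the function names even and odd are our own labels for the two systems.

```python
def even(n):
    """Generate an even number from any whole number n."""
    return n * 2

def odd(n):
    """Generate an odd number from any whole number n."""
    return (n * 2) + 1

print([even(n) for n in range(5)])  # [0, 2, 4, 6, 8]
print([odd(n) for n in range(5)])   # [1, 3, 5, 7, 9]
```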

Notice also that, in their misleading simplicity, the two systems defined above are quite powerful. In fact, they can both generate an infinite set of even and odd numbers. Any formal system will have rules and ``objects'' to which the rules apply; these formal systems are also known as ``grammars.''

What do the rules used to generate sentences of a language look like? They are called rewriting rules. Rewriting rules are all of the same form. You take element A and rewrite it as B, or in symbols: (A → B). A can only consist of one symbol (item) but B can be any number of items. This is very significant, as the discussion of transformations (see below) will show.
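Applying a rewriting rule amounts to replacing one symbol in a sequence with a sequence of symbols. The sketch below assumes a list-of-symbols representation; the function name rewrite and the placeholder symbols A, B, C, d, and e are hypothetical and stand for no particular grammar.

```python
def rewrite(sequence, left, right):
    """Replace the leftmost occurrence of the single symbol `left`
    with the symbols listed in `right`."""
    i = sequence.index(left)  # raises ValueError if `left` does not occur
    return sequence[:i] + right + sequence[i + 1:]

step0 = ["A"]
step1 = rewrite(step0, "A", ["B", "C"])  # rule A -> B C
step2 = rewrite(step1, "B", ["d", "e"])  # rule B -> d e

print(step1)  # ['B', 'C']
print(step2)  # ['d', 'e', 'C']
```

The left side of each rule is always a single symbol, while the right side may contain any number of items, just as stated above.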

However, we still have not touched upon the specifics of a grammar capable of generating sentences of English. The rewriting rules that generate sentences are called phrase structure rules.

Phrase Structure Rules  

Phrase structure rules are simple instructions for building larger constituents from smaller ones. They also give us information on the order in which their components appear and their grammatical categories.

For example, the following rule will immediately look familiar:

S → NP + VP

It can be paraphrased as follows:

A sentence may be rewritten as ``noun phrase'' plus ``verb phrase''

or more formally as

To build a constituent of the category S, take a constituent of the category NP and combine it with another constituent of the category VP, in that order.

Thus all the rule says is that a sentence is made up of an NP and a VP, in that order. Tree diagrams can also be used to represent the structure of this sentence. In fact, phrase structure grammars (PSGs) are equivalent to tree diagrams. The only difference is that in tree diagrams, you are unable to tell which ``branch'' was generated first. But for our purposes, they are the same.

A PSG for a Subset of English  

The following simple phrase structure grammar generates all the sentences used as examples or in the exercise questions in this book.

S → NP + VP
NP → (Art) + (Adj)^n + N + (PP)^n
VP → V + (NP) + (PP)^n
PP → Prep + NP
Art → the, a, an
N → girl, house, book ...
V → eat, run, laugh ...
Prep → up, above, with ...
Adj → big, small, green ...
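To see what it means for such a grammar to generate sentences, the rules can be handed to a short program. This is only a sketch under stated assumptions: we read a parenthesized item as optional and a parenthesized item marked n as repeatable zero or more times; the cap of two repetitions, the depth limit, and the names RULES, LEXICON, generate, and words are arbitrary choices of ours; and the lexicon contains only the words actually listed in the rules.

```python
import random

LEXICON = {
    "Art": ["the", "a", "an"],
    "N": ["girl", "house", "book"],
    "V": ["eat", "run", "laugh"],
    "Prep": ["up", "above", "with"],
    "Adj": ["big", "small", "green"],
}

# Each right-hand side lists (category, kind) pairs: "one" is obligatory,
# "opt" is optional, "rep" may be repeated zero or more times (assumed reading).
RULES = {
    "S": [("NP", "one"), ("VP", "one")],
    "NP": [("Art", "opt"), ("Adj", "rep"), ("N", "one"), ("PP", "rep")],
    "VP": [("V", "one"), ("NP", "opt"), ("PP", "rep")],
    "PP": [("Prep", "one"), ("NP", "one")],
}

def generate(category, rng, depth=0, max_depth=3):
    """Build a random tree for `category` by applying the phrase structure rules."""
    if category in LEXICON:
        return (category, rng.choice(LEXICON[category]))
    node = [category]
    for child, kind in RULES[category]:
        if kind == "one":
            count = 1
        elif depth >= max_depth:
            count = 0                  # skip optional material deep in the tree
        elif kind == "opt":
            count = rng.randint(0, 1)
        else:
            count = rng.randint(0, 2)  # arbitrary cap on repetitions
        for _ in range(count):
            node.append(generate(child, rng, depth + 1, max_depth))
    return tuple(node)

def words(node):
    """Read the words off the bottom of a tree, left to right."""
    if isinstance(node[1], str):
        return [node[1]]
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out

rng = random.Random(0)
for _ in range(3):
    print(" ".join(words(generate("S", rng))))
```

Running the sketch also shows how much this small grammar leaves out: nothing in the rules prevents sequences such as an girl or the girl eat, since the rules say nothing about agreement or about the choice between a and an.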

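Note the following conventions: parentheses mark an optional element, and a superscript n marks an element that may be repeated any number of times.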

5  Types of Sentences

We traditionally distinguish different types of verbs and corresponding types of sentences. In English, verbs must always have a subject, but some verbs do not take a direct object; these are called intransitive verbs.

Mary laughs.
Mary sleeps.

Verbs that have a direct object are called transitive.

Mary won the race.
Mary kissed John.

Some verbs have two objects, a direct and an indirect object (see ). These are called ditransitive verbs.

Mary gave John a book.
Mary sent Ann a letter.

Other types of sentences do not have an NP as their direct object; rather, an adjective is their complement. A complement is anything that follows the verb. Usually this happens with copular verbs (such as to be, to seem, to look):

Mary is tired.
Mary seems happy.

Naturally, we can add all sorts of adverbs or adverbial clauses in various positions to modify the sentence or some of its parts. So, for example, we can have:

Mary won the race yesterday.
Luckily, Mary won the race.
Mary easily won the race.
Mary won the race without much effort.

Finally, transformations can be used to move parts of a sentence into different positions, for emphatic purposes. For example,

It's in the garden that John lost his glasses.
This wreck of a car you want me to drive?
A good student he is not.

In a now somewhat old-fashioned terminology, sentences before transformations have been applied to them are in deep structure and, after the transformations have been applied, in surface structure. Deep and surface structure were originally used to capture the fact that two sentences such as

Mary loves John.
John is loved by Mary.

are clearly related. The idea was that the two sentences share the same deep structure but undergo different transformations, resulting in different surface structures. Thus, the first sentence is in the active voice, while the second is in the passive voice. Voice is a technical term indicating that the verb is either active or passive. An active sentence to which no transformation has been applied is called a kernel sentence.

Syntacticians, or people who study syntax, used to believe that there were many more transformations, even in simple sentences. Today we tend to believe that there are few, if any at all. For example, take subordination. A clause is called subordinate if it is ``inside'' another clause. By inside, we mean that a higher order sentence has as one of its components another sentence (the subordinate clause). For example, a sentence may have its direct object replaced by a subordinate clause, which will appear under the comp node (comp stands for complementizer, i.e., a word that introduces a complement/subordinate clause; see in the English Grammar Appendix):

Mary believes that John is the culprit.

In this sentence, Mary believes is the main clause while that John is the culprit is the subordinate clause.

Another way to join sentences, besides subordination (a.k.a. hypotaxis), is coordination (a.k.a. parataxis). In coordination, unlike subordination, the two sentences, or phrases, are on the same level. Consider the following example:

Mary left and John went to bed.

Here the two sentences Mary left and John went to bed are on the same level (i.e., neither is a subordinate of the other). This can be seen very clearly from the phrase structure tree.

The joining together of sentences (through coordination or subordination) was handled by transformations in the early versions of the theory. Nowadays, syntacticians think that transformations are mostly unnecessary, although they still use the concepts of deep and surface structure.

The following are some further examples of transformations.

Besides the way yes/no questions are formed in English, there is another way of asking questions, i.e., the so-called WH-questions. It consists in taking the complement phrase (in the example below, the direct object NP) and moving it to the beginning of the sentence, in the empty node under the ``comp'' node (marked by a circle).

Note that we are ignoring the presence of the auxiliary (``Who does Mary know?") for the sake of simplicity.

5.1  Recursion and Embedded Sentences

Recursion can also occur in sentences. Syntactic recursion is the occurrence of noun phrases or sentences within the same kind of constituent. For example,

PP → Prep + NP; NP → Art + N + PP; PP → Prep + NP ...

in which each PP is rewritten as ``Prep + NP'' and each NP as ``Art + N + PP,'' both of which are perfectly acceptable rewritings for their categories.
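The back-and-forth between the NP rule and the PP rule can be imitated with a function that calls itself. The following is a minimal sketch; the function names nested_np and words, the fixed article the, the fixed preposition in, and the sample nouns are all assumptions made for illustration.

```python
def nested_np(nouns):
    """Apply NP -> Art + N + PP and PP -> Prep + NP in turn, one NP per noun."""
    head, rest = nouns[0], nouns[1:]
    if not rest:
        # Last noun: a plain NP with no PP, which ends the recursion.
        return ("NP", ("Art", "the"), ("N", head))
    return ("NP", ("Art", "the"), ("N", head),
            ("PP", ("Prep", "in"), nested_np(rest)))

def words(node):
    """Read the words off the bottom of a tree, left to right."""
    if isinstance(node[1], str):
        return [node[1]]
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out

print(" ".join(words(nested_np(["key", "drawer", "house"]))))
# the key in the drawer in the house
```

Each additional noun adds one more NP inside a PP inside an NP, and nothing in the rules themselves sets a limit on how many times this can be repeated.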

You can also have embedded sentences inside your original sentences. There is no limit on how many recursions of phrases or embedded sentences a sentence may contain, and therefore you can in principle generate sentences of unlimited length, since a recursive sentence can have another recursive sentence within itself, and so on.

In the following example the embedded sentence is in the box.

5.2  Syntax, Universal Grammar, and the Chomskian Program

Syntax, and indeed most of linguistics, has been influenced deeply by the work of Noam Chomsky, one of the geniuses of the twentieth century. Even those who disagree with him have to recognize that his influence has been momentous. Chomsky is largely responsible for the central role of syntax in most of theoretical linguistics work in the last forty years. Chomsky's grammatical theory has a very significant and controversial psychological and even biological underpinning.

Note that to this point we have presented Chomsky's standard theory. The theory has been heavily revised several times (see the box below); however, discussion of the more recent notations and developments would be far too complex for a beginners' text. See the Further Readings section for references.

Chomsky claims that the rules of grammar (syntax, morphology, and phonology) are governed by principles that are universal, in the sense that all languages of the world obey them. For example, no language would negate or deny the sentence John is tall by reversing the order of the morphemes of the sentence to form tall is John (meaning that John is not tall). Chomsky says that that is no coincidence, because the principles that govern grammar are genetically programmed in human beings. These principles are called universal grammar (UG). In other words, Chomsky and his followers claim that part of our genetic makeup tells us what counts as a possible grammatical/linguistic rule, just like our genes tell us what counts as a possible leg or eye color.

Each individual specification of the nature of universal grammar is called a parameter. Parameters are very abstract, but pro-drop (see Glossary) can be used as an example: in Italian the following sentence is perfectly grammatical:

Piove.

while in English

*Rains.

is clearly ungrammatical. Spanish, Chinese, Latin, and many other languages behave like Italian in this respect, while French, German, and other languages behave like English. What we have here is a parameter that says, roughly, that in a given language one either can or cannot have an empty subject position (cf. the grammatical It rains). Parameters can be thought of as yes/no switches. A parameter allows either one thing or the other (see 5.2).

6  Beyond the Sentence

Sentences do not occur in isolation (except in grammar books!). Sentences may occur in paragraphs or as part of a conversation where two or more speakers talk to each other. The disciplines of linguistics that look at these units larger than the individual sentence are called text linguistics and discourse analysis; also, contrastive rhetoric is interested in paragraph structure (see ).

6.1  Coherence and Cohesion

This area owes a lot to the work of M.A.K. Halliday and R. Hasan. Generally, we distinguish between textual cohesion, which happens at the level of the surface of the text, and coherence, which happens at the level of the meaning of the text.

6.1.1  Cohesion

Cohesion, generally speaking, is the property of the surface structure of the text to ``hold together.'' Consider the following example:

The boy came in the room. He was wearing a red coat.

In this example the NP the boy is referred to by the pronoun he. Technically, we say that the pronoun is an anaphoric item, i.e., a linguistic item that refers to another (part of a) text, and what it refers to is called its antecedent. So, the relationship between any anaphoric item and its antecedent is a cohesive relationship.

Pronouns are not the only type of cohesive device found in texts. Articles are cohesive too, as well as some adverbials (e.g., on the one hand ... on the other hand; however) and conjunctions (and, but). Other cohesive devices include lists, parallelisms, explicit markers such as chapter and section titles, and tables of contents. NPs can be cohesive too:

Napoleon was a great general. The winner of Marengo was proud of his reputation. Bonaparte was also known for his habit of keeping his hand on his stomach. It is believed that the emperor's digestion problems are the reason for his well-known pose.

In this paragraph, the NPs Napoleon, the winner of Marengo, Bonaparte, and the emperor are all coreferential and are therefore cohesive.

6.1.2  Coherence

Coherence is the overall meaning of a text. You may think of it as its ``point,'' or ``main idea,'' or as the part of its meaning that makes it all fit together. Coherence happens at the semantic level. As such, it may, but does not have to be, explicitly expressed in the text itself. Besides by cohesion, coherence may be established by any of the following means:

The presence in a text of cohesive devices does not guarantee that it is also coherent, although usually coherence and cohesion go together. To show that cohesion is neither a sufficient nor a necessary component of coherence, consider the two following paragraphs:

John likes to swim. Mary is fond of skydiving. Ann is a pro golfer. What athletic children I have.

John likes to swim. It is a very good sport, from an exercising point of view. Exercise is a good way to lose weight. Weight loss is the number one reason for dieting.

In the first paragraph there are no cohesive ties (unless you count the fact that John, Mary, and Ann are potentially children's names), and yet coherence is easily achieved by invoking the frame for family, which tells us that one may have three children.

In the second paragraph there are cohesive ties between each sentence and the following one, yet the paragraph fails to be coherent because there is no unifying theme, no one thought that is expressed by the text.

6.2  Conversation Analysis

Conversations occur when two or more people talk together and are coherent (if everything goes well, obviously!). Coherence in dialogue is achieved by all the means discussed above (especially the cooperative principle). There are, however, also conversation-specific tools, such as adjacency pairs.

6.2.1  Adjacency Pairs

An adjacency pair is the succession of two linked turns, by different speakers, which make sense only taken together. The following are some examples of adjacency pairs:

Note that a speaker may choose not to complete the adjacency pair immediately but instead delay it by introducing another adjacency pair.

A: When are you going on vacation? [question]
B: Why do you want to know? [question]
A: I have to write a report for the boss. [answer]
B: Ah. The second week of July. [answer]

In this example, we see speaker B refusing to complete the adjacency pair and instead opening another adjacency pair. Note that when the second adjacency pair (turns 2 and 3) is complete, B does eventually complete (in the last turn) the adjacency pair opened in the first turn.
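The way the inner adjacency pair is closed before the outer one can be modeled with a stack: every question is put on hold, and every answer closes the most recently opened question. The sketch below is an illustration only; the data layout and the function name pair_up are our own, and real conversations would of course need many more kinds of turns than ``question'' and ``answer.''

```python
turns = [
    ("A", "question", "When are you going on vacation?"),
    ("B", "question", "Why do you want to know?"),
    ("A", "answer", "I have to write a report for the boss."),
    ("B", "answer", "Ah. The second week of July."),
]

def pair_up(turns):
    """Match each answer with the most recent question that is still open.
    Turns are numbered from 1, as in the discussion above."""
    open_questions = []  # stack of turn numbers
    pairs = []
    for number, (speaker, kind, text) in enumerate(turns, start=1):
        if kind == "question":
            open_questions.append(number)
        elif kind == "answer" and open_questions:
            pairs.append((open_questions.pop(), number))
    return pairs

print(pair_up(turns))  # [(2, 3), (1, 4)]
```

The result shows the pair formed by turns 2 and 3 nested inside the pair formed by turns 1 and 4.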

However, conversations are quite different from written texts. To name the most obvious differences, they are spoken, as opposed to written, and they have more than one speaker, as opposed to written texts, which usually have one author.

6.2.2  Turn Taking

One of the central issues in the analysis of conversation is how to regulate turn taking, i.e., who is to speak. In general, people tend to avoid overlapping turns, because it is fairly complex to follow what someone is saying while someone else is speaking too. People have developed strategies to ensure that speakers who have the floor, or are speaking, will not be interrupted. In the White, middle-class, Anglo-Saxon culture, the convention is that whoever is speaking is entitled to keep the floor until he/she arrives at a transition relevance place (TRP) (for example, the end of a sentence, or a pause in speech). Then, unless he/she signals with appropriate means that he/she is not done speaking (for example, by making a hesitating sound), the floor is up for grabs. The speaker also has the option of selecting the next speaker, for example, by asking a question.

This is not to say that speakers never interrupt. However, interruptions are generally viewed as ``rude'' (not following the conventions of proper behavior) and disruptive. Not all interruptions are rude or disruptive, though. Speakers may interrupt to agree or to express interest in what the speaker currently holding the floor is saying. This kind of overlapping turn is called back channel, here exemplified by B's turns.

A: I went to the store...
B: Uh-huh.
A:... and bought milk...
B: Right.
A: ... because they were out of cream.

7  Exercises

7.1  Words to Know

phonetic alphabet           phoneme                 first articulation
morpheme                    free morpheme           bound morpheme
root                        stem                    affix
prefix                      infix                   suffix
derivational morpheme       inflectional morpheme   derivation
compounding                 clipping                acronyms
blends                      backformation           borrowing
calque                      invention               second articulation
syntax                      grammaticality          constituent
phrase                      tree diagram            universal grammar
parameter                   kernel sentence         allophone
collocation                 phrasal verb            idiom
lexeme                      phonetics               diphthong
articulatory phonetics      airstream               voiced
voiceless                   velum                   nasalization
glides                      affricates              liquids
aspiration                  unreleased              minimal pair
configurational definition  syntactic ambiguity     generative grammar
rewriting rule              phrase structure rule   intransitive
transitive                  ditransitive            deep structure
surface structure           active voice            passive voice
subordination               coordination            main clause
recursion                   embedding               coherence
cohesion                    contrastive rhetoric    anaphor
antecedent                  conversation            adjacency pair
overlap                     TRP                     back channel

7.2  Review

  1. Why is the International Phonetic Alphabet (IPA) useful?

  2. Transcribe the following words in IPA. It may be helpful to say the words out loud before you do so.

    plush bet bleak do
    chop but nit bough
    church boat jump new
    down bought boy fruit
    fluff through sing plain
    cough knight child poach
    beep push judge flinch
    bit chirp flash beat
    heap hurt bait hip
    dew hat heat hit

  3. Break these words into morphemes:

    1. indecision, cheaters, broadcasting, conferences, childishness

    2. Unconstitutional laws are unusually common lately

    3. The reemergence of nationalisms is worrisome.

  4. Look at the list of ways new words are made. Can you think of other examples for each process?

  5. Try your hand at diagramming sentences:

    1. The pretty boy likes the smart girl.

    2. Mary loves pizza with anchovies.

    3. John ate the pizza with his hands.

  6. Compare the trees of the last two sentences. Is the place where you attached the PP the same? Why? See if you can explain your choice.

  7. Draw the syntactic trees for the following kernel sentences:

    1. The tall woman complained about the noise.

    2. John loves the warm feeling of an open fire.

    3. The book is on the table with the white tablecloth.

    4. A hardworking student passes a tricky exam without any trouble.

    5. The woman with the red dress closed the door with a kick.

    6. I ate a slice of pizza with my friends from college.

    7. The man with red hair ate a slice of pizza with his fingers.

    8. The man with the red umbrella in his hands laughs.

  8. Draw the syntactic trees for the following sentences. Note that all of the following have undergone a transformation, so you need first to find out what transformation has applied, apply it backwards to find the kernel sentence, and then draw the tree diagram.

    1. The old table was repaired by a good craftsman.

    2. Mary was given a flower by John.

    3. John was hit on the head by Mary.

    4. Is the book on the table?

    5. The book was put on the table by Mary.

    6. Clinton was elected by the American people.

    7. Is the car with the flat tire in the garage?

    8. The lazy students were flunked by the righteous professor.

    9. Is Mary in the blue car with John?

    10. The game was canceled by the tall umpire.

    11. Mary gave John a piece of cake with many candles.

7.3  Research Projects

  1. Make a list of words you think were invented in your lifetime. Use a dictionary to check your intuitions.

  2. Transcribe some English words into IPA. To check your answers, use a dictionary with IPA pronunciation. English as a second language dictionaries like the Longman Dictionary of Contemporary English use IPA. If you know a language other than English, you might want to try transcribing some words you know.

  3. Tape a five-minute piece of conversation and identify adjacency pairs and turns. For an example of transcription conventions, see Brown and Yule (1983, x-xi).

  4. Select a newspaper article and try to identify all markers of cohesion. Use colored highlighters to mark different types (e.g., pronouns, repetitions of words, synonyms, conjunctions, etc.)

8  Further Readings

More detailed, but nonetheless introductory, treatments of phonetics, phonology, morphology, and syntax can be found in the relevant chapters of the general introductions to linguistics listed in the previous chapter. The treatment of phenomena above the sentential level is spotty, at best. A good introductory text in that area is Stubbs (1983).

For a more advanced look at the various subfields, the following sources may be helpful. A complete description of the IPA can be found in Pullum and Ladusaw (1986). Clark and Yallop (1990, 1995) is an in-depth look at phonetics and phonology. Matthews (1991) is a good introduction to morphology. Syntax is a difficult area to keep up with, given the dizzying pace of change in the theories. A recognized excellent all-purpose introduction to mainstream syntax is Radford (1997). Other options are the relevant chapters in Napoli (1996) and Culicover (1997). Introductions to the variety of theories of syntax are Sells (1985) and Horrocks (1987). The ``ideology'' of the Chomskian program is spelled out in the very readable Pinker (1994) and in Cook (1996). Lyons (1977) is an excellent introduction to Chomsky's standard theory. A book on Chomsky's life and politics is Barsky (1997). On the double articulation of language, see Martinet (1966) translated by E. Palmer.

On cohesion and coherence, the fundamental reference is still Halliday and Hasan (1976). On discourse analysis, the relevant chapter in Levinson (1983) is a good synthesis. Brown and Yule (1983) is also excellent. Schiffrin (1994) provides a broad survey.

The Nigerian English examples come from Ahulu (1998).

Appendix

A Chart of the English Consonants

             Bilabial   Labiodental   Interdental   Alveolar   Palatal    Velar      Glottal
Stops        [p] [b]                                [t] [d]               [k] [g]    [ ]
Fricatives              [f] [v]       [ ] [ ]       [s] [z]    [ ] [ ]               [h]
Affricates                                                     [ ] [ ]
Nasals       [m]                                    [n]                   [ ]
Liquids                                             [l] [r]
Glides       [w]                                               [y]

A Chart of the English Vowels


Picture Omitted


Footnotes:

1 Note the ``dash'' (-), which means that the morpheme cannot appear by itself, i.e., it is a bound morpheme. Thus, one should not confuse it with the free morpheme hood as in ``the hood of the car.''

