University of California, Berkeley
Department of Music

The 1999 Ernest Bloch Lectures

Lecture 1. Music and Mind:

Foundations of Cognitive Musicology

David Huron

The Origins of Cognitive Musicology

In an introductory lecture such as this, I suppose a good place to start is to address three questions: What is cognitive musicology? How did the field arise? What does it hope to achieve? Let me begin with a thumbnail history of the origins of cognitive musicology, and then draw on this background to identify what I think are the defining features of the field. Of course, practitioners in a field are rarely the best historians, so I approach the idea of tracing the origins of cognitive musicology with trepidation. At the same time, I believe that reviewing some of the history can prove informative in understanding how and why the field has developed as it has.

Cognitive musicology has its origins in two intellectual currents. The first is the so-called "cognitive revolution" and the second is what might be called "music psychology." The cognitive revolution is a broad movement that has transformed psychology over the past three decades. Many music scholars with an interest in psychology have simply been swept along the path of the cognitive revolution. At the same time, cognitive musicology can also be viewed as an offshoot of a century-old research tradition of music psychology -- a field whose predominantly German origins recommend using the designation Psychologie der Musik. However, cognitive musicology arose, at least in part, in response to specific criticism of the practice of music psychology. Please indulge me while I attempt to trace these two converging histories of scholarship.

From Music Psychology to Cognitive Musicology

There are many interesting questions one can ask about music. Why are some people more musical than others? Is musical "intelligence" independent of general intelligence? How does music give pleasure? Why do people disagree about musical likes and dislikes? Are musical preferences related to personality? Why do our musical preferences sometimes change over time? Does everyone "hear" music the same way? With training, how might we listen differently? Are there certain life experiences (such as ecstasy or grief) that contribute to a person's understanding of music? Is music somehow similar to speech or language? What makes something sound "musical?" Why do some melodies get stuck in your head? Why don't all melodies get stuck in your head? Why do people willingly listen to music that makes them sad? Can music somehow corrupt or enhance moral behavior? Can a person listen to too much music? Can we hear/understand the music of another culture in the same way as people from that culture do? Why do cultures or styles change? Does a music tell us something about the people who make it? Can one musical culture ever be regarded as superior to another culture? What is the relationship between music and the other arts? Are there limits to what music could be?

Most of these questions are essentially psychological in nature. For the non-professional, these look like good questions -- the sort of questions that would animate music scholars. Yet professionals know that most music scholarship flits around the periphery of such questions. Unfortunately, despite a history of research going back at least 150 years, music psychology never really captured the imaginations of music scholars and so failed to become a core discipline within 20th-century musicology. There are reasons for this. Some 50 years ago, Paul Farnsworth gave a lecture on this very campus outlining what he considered the main shortcomings of music psychology. His talk was entitled "Sacred Cows in the Psychology of Music." Although I disagree with some points raised by Farnsworth, a half century later I find myself extending and refining Farnsworth's criticisms of the continuing field of music psychology. There are, I believe, at least four problems that have haunted music psychology.

  1. First, throughout its history, music psychology has tended to focus on the individual, and on individual responses to music. Music psychologists often pay little attention to social and cultural context. Although early sociologists like Max Weber wrote extensively about music, later social psychologists failed to continue the tradition.[1]
  2. Secondly, although psychology is a broad discipline, music psychology has tended to focus exclusively on low-level issues of sensation and perception. While many significant discoveries have been made, these discoveries have held little pertinence to musical experience. To this day, most books on the psychology of music typically include lengthy discussions of acoustics and psychophysics without showing how these matters might relate to the quality of musical experience.
  3. Thirdly, when music psychology has addressed more musically interesting questions -- such as (say) the perceptibility of serial transformations -- the resulting research has tended to emphasize the limitations of music listening. Again and again, music psychologists have been the bearers of bad news. All of this nay-saying might have been offset if music psychologists had shown a comparable interest in discussing what music might be. That is, the discipline has lacked a creative or imaginative component; in general, it has not spawned research ventures that point to new and unexplored musical terrain. Until very recently, a composer reading works in the psychology of music would find little inspiration.
  4. Finally, the field of music psychology has tended to be dominated by researchers with conservative musical tastes. Well-known researchers like Carl Seashore have shown little interest in contemporary music, and many mid-twentieth century music psychologists were privately or openly hostile towards the new music. Practicing musicians have been largely justified in suspecting music psychologists of pursuing a conservative musical agenda. It should be noted that the discipline itself has attracted scholars who are suspicious of the new music, and who think that psychological research can be used to buttress their arguments that contemporary music is somehow "unnatural."

To be fair to my colleagues and predecessors, one needs to include some rejoinders to these four criticisms.

  1. First, in carrying out any research program, one must narrow the field of inquiry if anything is to be accomplished. The topic on which one focuses often arises from convenience. (If a music theorist chooses to analyse a particular work, it does not necessarily follow that the theorist thinks other works unworthy of study.) Music psychologists focused on individual responses rather than broader social and cultural issues primarily because it is easier to study individuals than groups.
  2. Secondly, the emphasis on low-level aspects of sensation and perception has proved, in retrospect, to be justified. Far from being musically irrelevant, the past decade of research has shown that low-level phenomena, such as the mechanics of the basilar membrane, have had far more impact on musical organization than was formerly suspected.
  3. Thirdly, regarding the nay-saying character of much psychology of music research, history has largely vindicated the nay-sayers. For example, research on the perceptibility of serial transformations has been under way since the 1950s. Careful, sophisticated experimental work has been carried out by scholars such as Bruner, Francès, Gibson, Lannoy, Largent, Millar, Pedersen, Thrall, and others. Yet, to my knowledge, not a single one of these scholars has had his or her work cited by any set theorist. Many music theorists continue to write as though questions of perceptibility remain unaddressed and open. Some theorists wrongly assume that research has only addressed the listening of non-musicians or non-experts. (Gibson, for example, studied members of the Society for Music Theory.) Set theorists have been delinquent in ignoring this research. Music theorists in general have been delinquent when assuming that the human capacity for auditory experience is unbounded.
  4. Finally, regarding the conservative musical tastes of music psychologists, it must be noted that the vast majority of music psychologists received their academic training in psychology, not in music. Music psychologists were no more conservative in their tastes than the general population. Many psychologists were notably supportive of new music (e.g. Francès). The more pertinent question is why more music scholars didn't make the effort to learn how to do psychological research. Fifty years ago, Farnsworth complained that few musicians were competent psychologists. That's just as true today as it was in 1948. If music psychology seems to favor a psychological perspective, that is largely because music scholars have generally failed to get involved. In fact, speaking now as a musicologist, I believe that musicology owes a collective debt of gratitude to the innumerable psychologists whose extraordinary efforts laid the groundwork for the discipline.

The Cognitive Revolution

Let's now turn to the second historical current contributing to cognitive musicology, the cognitive revolution.

The term "cognition" has many connotations. For the non-specialist, cognition is more or less synonymous with thought or thinking. Psychologists have used the term to designate various forms of knowing, and in some cases, psychologists have regarded cognition as equivalent to "the functioning of the mind."[2]

The rise of cognitive psychology is often traced to Ulric Neisser's book of that name, published in 1967. However, the origins of cognitive approaches to psychology can be seen in several earlier strands of research in psychology that led to increasing disgruntlement with behaviorism.

For most of the early part of the twentieth century, psychology, especially American psychology, was dominated by the behaviorist approach associated with J.B. Watson and (later) B.F. Skinner. Watson argued against positing mental states that were unnecessary for explaining a behavior. For example, the fact that an animal approaches a food dish does not mean that the animal has a desire or a conscious intent to eat. There is no way for an observer to "see" such a presumed conscious intent or desire.

To be fair, Watson's severe approach to psychological reasoning was a deliberate reaction against more informal psychological discourse whose theories appeared to be impossible to test. Watson and Skinner's behaviorism was simply an application of Occam's razor in the domain of mental processing. According to Skinner, we shouldn't posit sophisticated mental states when a simpler explanation can account for the experimental data equally well. This belief accounted for Watson's well-known (and notorious) disdain for appeals to consciousness as an unseen epiphenomenon, even in humans. Skinner, by contrast, never shared Watson's view regarding consciousness. Nevertheless, Watson and Skinner had much in common with the logical positivist, A.J. Ayer, and so it is not unreasonable to characterize behaviorism as "positivistic."

In our simplified story, the end of behaviorism's popularity can be loosely attributed to three events. First, experimental research itself implied the existence of higher-level mental processing that appeared to be essential in many tasks, especially those tasks that resembled natural problem-solving activities. Some psychologists, such as Broadbent, noted in their experiments that human subjects weren't simply responding to stimuli; they were anticipating and interpreting events, and different subjects appeared to be motivated by different goals. Increasing numbers of psychologists became interested in studying memory, attention, pattern recognition, concept formation, categorization, reasoning, and language. Behavioral methods seemed well suited to studies of sensation and perception, but behaviorism proved less useful in investigating more complex mental functions.

A second contributing factor was the advent of computer science and artificial intelligence. Computer programs were the very epitome of invisible information processors. In computers, the relationship between inputs and outputs depends critically on the nature of such invisible programs. Clearly, complex and multifaceted information processing functions can exist without anyone (apart from the programmer) knowing about their existence. If computer programs can be invisible yet real, then it is more plausible that analogous unseen mental functions can exist for humans and other animals.

Finally, a third influence was a general unhappiness with the reductionistic and simple mechanistic view of mental life that was implied by Skinner's work.

In contrast to behaviorism, the new cognitive psychology could be characterized by three dispositions. First, there was a willingness among cognitive psychologists to entertain explanations of mental processes and mental states that could not be behaviorally observed. In effect, some intellectual space was made for plausible invisible mental functions -- the sort of functions that might provide motivations, such as initiating actions, rather than simply responding to a stimulus. Second, there was a consensus that a useful way to study the operation of the mind is to decipher and describe underlying mental representations. That is, cognitive psychologists became interested in how skills, perceptions, knowledge, beliefs and motivations might be mentally coded, stored and retrieved. Third, cognitive psychologists placed special emphasis on the processes of thought instead of its content. [3]

In the early years, cognitive psychology tended to eschew psychophysics, sensation, and neural aspects of mental behavior. However, in recent decades, cognitive psychologists have shown a renewed interest in the mechanisms of mental life. Where formerly cognitive psychologists were interested in discussing mental life and mental functions apart from mechanisms, in recent years, cognitive psychology has connected once again to those perceptual and biopsychology researchers who remained tied to behaviorist methods. This integrative tendency is reflected, for example, in the burgeoning field of cognitive neuroscience.

In retrospect, cognitive psychology has prevailed over behaviorism, primarily because behaviorism fell prey to what is now referred to as the positivist fallacy. If a phenomenon results in no observable behavior, a researcher may be tempted to wrongly conclude that no mental activity has taken place. In short, the positivist fallacy arises when absence of evidence is mistaken for evidence of absence. We will return to the positivist fallacy in my third lecture on methodology, where we will see that this fallacy has plagued not only scientific research, but humanities scholarship as well.

What is Cognitive Musicology?

At this juncture, we might offer a preliminary definition of cognitive musicology. Cognitive musicology is an area of musicology that studies musical "habits of mind." It is a field that has been inspired by the cognitive revolution and informed by past lessons and mistakes in the psychology of music. In contrast to the behaviorists, cognitive musicologists do not presume that there is a simple relationship between stimulus and response. Musical stimuli and the phenomenal experiences they evoke typically have sophisticated, complex, and mostly unobserved mental functions interposed between them. Cognitive musicologists are primarily interested in processes rather than content. We accept that listeners, performers, composers, improvisers, dancers and others have specialized knowledge, beliefs, motivations, skills and strategies. We tend to focus on mental representations for music, but we don't regard these representations as disembodied abstractions: musically pertinent representations are concretely expressed in human biology and often exist as socially distributed codes as well. In investigating the musical mind, it is not the task of the cognitive musicologist simply to document limitations to musical experience, but also to point to the unexplored cognitive terrain -- regions of musical possibilities that have not yet been visited by creative artists.

In summary, music cognition is an approach to the study of music that places the mind in the central position. To study music is to study the musical mind.

Mental Representations of Music

As I've just noted, a major preoccupation for cognitive musicologists is the study of mental representations for music. Music-lovers will have no difficulty believing that most of what is musically valuable is unobservable -- at least not observable with the unaided or untutored eye. Experienced performers, for example, know all too well that there is hardly any difference in facial expression between those members of an audience who are in rapture, and those who would rather be somewhere else. However, the presumption that cognitive processes are difficult to observe is open to abuse. As the behaviorists rightly fear, one might claim that all sorts of spurious processes exist. Whenever possible, the cognitive musicologist needs to demonstrate that a presumed music-related mental representation does, in fact, exist. Let me illustrate some mental representations by invoking some specific examples.

EXAMPLE 1: Musical Memory

As quickly as you can, I want you to answer the following question, yes or no:

Does the word "but" occur in the lyrics to the song Row, Row, Row Your Boat?

[This example doesn't work if the reader doesn't actually try the task.]

If you are familiar with the song, you probably solved this problem by scanning the lyrics from the beginning of the song. More precisely, you probably mentally generated a speedy rendition of the work until you encountered the word "but" in the phrase "life is but a dream" and then you stopped searching. There are at least three conclusions we can draw from this little task:

  1. We are able to access mental representations for music. In this case, I had you focus on the lyrics, but the same can be done for melody alone.
  2. We can access music-related representations in the total absence of sound.
  3. We can manipulate these mental representations in certain ways (such as speeding up the rendition beyond what would be musically acceptable). But we cannot manipulate these mental representations in any way we wish. For example, you might have been able to answer my question much more quickly if you had random access to all of the words of the lyrics. Similarly, it would have been faster if you could start at the end of the lyrics and work your way backward. Either of these two strategies would have generated a faster answer to my question, but as far as we know, people are unable to do this. It is as though the mental representation for Row, Row, Row Your Boat is a linear recording that we must play from the beginning (or from a handful of possible starting points). Once again, my third point here is that we can access and manipulate musical representations only in certain ways.
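The "linear recording" idea in the third point can be sketched as a toy model. This is only an illustration, not a claim about how memory is actually implemented: the entry points (the start of each phrase) and the function names `serial_scan` and `random_access` are assumptions introduced here.

```python
# Toy model of the "linear recording" idea: lyrics can only be read forward
# from a small set of entry points, never indexed at random.
LYRICS = ("row row row your boat gently down the stream "
          "merrily merrily merrily merrily life is but a dream").split()

# Hypothetical entry points (an assumption): the first word of each phrase.
ENTRY_POINTS = [0, 5, 9, 13]


def serial_scan(target, start=0):
    """Return how many words are 'played' from start before target is reached."""
    for steps, word in enumerate(LYRICS[start:], start=1):
        if word == target:
            return steps
    return None  # target does not occur from this starting point onward


def random_access(target):
    """Idealized lookup that listeners do not seem to have: one step if present."""
    return 1 if target in set(LYRICS) else None


from_start = serial_scan("but")                             # scan from the first word
from_last_phrase = serial_scan("but", start=ENTRY_POINTS[-1])  # scan from "life is ..."
```

In this sketch a scan from the beginning plays most of the song before reaching "but", whereas a scan from the last phrase or an idealized random lookup would be far shorter, which is the contrast the task is meant to expose.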

EXAMPLE 2: Perceptual Schemas

Let's consider now a second example that requires a little more musical sophistication. Sing any tone to yourself. Now I'd like you to hear this pitch as a tonic pitch (or `doh') in a scale that begins on that pitch. In fact, if you are like most people, you already would have been hearing this pitch as a tonic even before imagining the scale.

Let's now have you hear this same pitch differently. Once again, sing the pitch, only this time I want you to hear this pitch as the dominant scale degree (or `so'). Now, for those of you who are able, try hearing the same pitch as the leading-tone (`ti'). Now hear it as the mediant pitch (`mi'). Notice how much longer it takes to hear the pitch as `mi' compared with `doh'.

Figure 1 shows response-time data for five music students. Each musician heard a randomly selected tone, and was asked by a computer to hear the tone as a particular scale degree. We then measured how long it took our listeners before they responded that they were hearing the tone in the specified way. In order to be certain they weren't fibbing, we then played a cadence and asked them to indicate whether or not the cadence corresponded with the imagined key. The data in Figure 1 plot the results only for correct responses.

Figure 1

Fig. 1: Median response times for scale degree orientations. Black bars indicate the median response time for imagining a tone as the specified scale degree (left scale in seconds). Grey bars indicate the frequency of occurrence for various traditional folksongs beginning with the specified scale degree (right scale in bits).

You can see that hearing a tone as the tonic takes the least amount of time. Hearing the tone as the dominant is the next fastest. Perhaps surprisingly, hearing the tone as a subdominant (`fah') takes the longest time to imagine.
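The kind of summary plotted in Figure 1 (median response time per scale degree, correct responses only) could be computed along the following lines. This is a minimal sketch: the trial values are invented purely for illustration and are not the data from the study, and the function name `median_correct_rt` is hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical trials, NOT the lecture's data:
# (requested scale degree, response time in seconds, cadence check correct?)
TRIALS = [
    ("doh", 1.0, True), ("doh", 1.5, True), ("doh", 0.5, False),
    ("so", 2.0, True), ("so", 2.5, True), ("so", 3.0, True),
    ("fah", 4.0, True), ("fah", 9.0, False), ("fah", 5.0, True),
]


def median_correct_rt(trials):
    """Median response time per scale degree, using correct trials only."""
    by_degree = defaultdict(list)
    for degree, seconds, correct in trials:
        if correct:  # discard trials where the cadence check failed
            by_degree[degree].append(seconds)
    return {degree: median(times) for degree, times in by_degree.items()}


medians = median_correct_rt(TRIALS)
```

Using the median rather than the mean keeps a single very slow trial from dominating the summary for a scale degree.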

We know from other research in psychology that response times (how long it takes to do something) tell us something about how much mental effort is involved in the task. (A classic illustration of this is Roger Shepard's famous work on mental rotation.)

Response times tell us something about the complexity of the mental representation. For an isolated pitch, the least mental effort is required to hear that note as a tonic. In fact, we know that people who don't have perfect pitch unconsciously presume that an isolated pitch is a tonic. It requires considerably more effort to hear that note as a non-scale tone.

There are, again, several conclusions we can draw from this brief illustration:

  1. There is a difference between hearing and hearing as. Any person with normal hearing can hear a tone, but not everyone can hear the tone as (say) "fah."
  2. Hearing as is a natural tendency when hearing tones. The existing cognitive research suggests that listeners automatically and unconsciously make assumptions about the scale context (or what musicians call "tonal function") of a pitch.
  3. Some hearing as's are easier to hear than others. For example, it is easier to hear an isolated tone as a tonic than to hear it as a mediant pitch. Once again, these tendencies reflect different aspects of mental representations. Reaction time provides a useful indication of the complexity of mental processing.
  4. Hearing as is obviously related to one's cultural background. The vocabulary of scale degrees is passively learned from the cultural milieu. For most of the people in this room, it is simply impossible to hear a tone as the pitch hwang in a traditional Korean scale. Most of us haven't been exposed to the pertinent music.
  5. Although I haven't presented any detailed evidence, another conclusion we can offer is that listeners are different. Of course people in different cultures are exposed to different musics -- and so they differ. But even within a single culture, differences of exposure are evident. An obvious example occurs for absolute or perfect pitch. Some people will be able to represent a sound by an absolute pitch name (e.g., G#). But there are many other more subtle differences as well. The experimental evidence shows that not everyone listens in the same way, or has the same phenomenal experience.

EXAMPLE 3: Rhetorical Listening

Let's consider now an even more sophisticated example of a music-related mental representation: in this case, another form of hearing as. From the early Middle Ages until recent times, it has been common for musical commentators to relate music to rhetoric. Theorists like Heinrich Koch have suggested that musical materials can manifest different "tones of voice" or rhetorical character. In particular, Koch noted that the different formal sections in musical works can be characterized by such rhetorical differences. Using contemporary terminology, we can distinguish types of passages such as the following:

Closing material. A closing passage conveys a feeling of impending finality. Such passages suggest that the work is ending, or that the end of the work may be expected shortly.

Expository material. Expository passages present the basic musical ideas of a work, such as the principal melodies or themes.

Developmental material. Developmental passages convey musical ideas that have been varied, broken up, or rearranged in some manner.

Transitional material. Transitional passages act as links or bridges between other passages. They provide an interlude or prepare for something new.

We might well ask whether listeners are capable of hearing passages according to these rhetorical categories. To this end, Mei Yen Ch'ng, Kim Rasmussen, Sarah Stockwell and I recruited forty-three listeners. We assembled a number of brief passages (lasting 20 seconds each) taken from recordings of string quartets by Haydn and Mozart. The sample passages were randomly selected from sections that had already been analytically identified as the introduction, exposition, or development in a sonata-allegro movement. Transitional passages were randomly extracted from appropriate points in the exposition.

The listeners fell into three groups: music majors who had taken a course whose curriculum stressed the identification of music-rhetorical devices in symphonic works, a second group of matched music majors who hadn't taken such a course, and a third group of non-musician university students who claimed to have little or no formal musical background.

We found that listeners were able to identify all rhetorical categories significantly better than chance. As you might expect, "closing" passages were most easily identified, even though these passages never included a final chord or cadence. "Transitional" passages proved to be the most difficult to identify. We were surprised to find that all three groups of listeners were equally adept; the musicians were not better than the non-musicians. In fact, the raw scores for the non-musicians were slightly better than for the musicians, mostly because musicians showed a slight reluctance to classify passages as "transitional."

What does this mean? First, it suggests that listeners are indeed broadly capable of hearing brief musical excerpts in terms of rhetorical categories traditionally distinguished by music scholars. These rhetorical categories are psychologically salient; they make sense to people, they aren't merely formal abstract concepts. Moreover, this way of listening appears to be equally accessible to musicians and non-musicians. In the course of our experiment, we were pleasantly struck by how unfazed our non-musicians were. They didn't receive any feedback, and we didn't give them any practice trials. Without ever having taken a music course, they seemed perfectly happy to classify passages as transitional, or developmental, or whatever. Most importantly, note that the test passages were presented in isolation, entirely removed from their musical contexts. There is something about (say) a development passage that sounds "developmental" even when the rest of the piece is unknown. Finally, since none of the passages used in this experiment straddled boundaries between formal sections, the results also imply that it isn't necessary to recognize sectional boundaries in order to follow the formal outline for a sonata-allegro work.

Cognition and Conscious Thought

We have just looked at three examples illustrating mental representations for music, namely memory for musical lyrics, perceptual schemas for hearing scale degrees, and hearing musical passages in terms of rhetorical categories.

It isn't often we get asked whether the word "but" occurs in the lyrics of some song, or to hear a particular pitch as some specified scale degree. It would be useful to know, not just what people are capable of doing, but also what they commonly or typically do. In particular, since the word "cognition" implies some sort of "cogitation" or conscious "thinking," we might ask: what do people typically think about when they listen to music? Unfortunately, this isn't easy to answer.

In 1994, I made a preliminary effort to try to answer this question. I was teaching two sections of the same course in music theory. Each class consisted of roughly 30 students. In the first class I distributed a questionnaire which remained face down on their desks while they listened to two minutes of music. The music was a segment from a Mozart symphony, selected at random. After the music ended, the students turned over their questionnaires. The questionnaire began as follows:

"You have just listened to two minutes of music. The purpose of this questionnaire is to have you report on what you were thinking about during this time. Please answer the questions honestly. The questionnaire is intended to be anonymous, so do not write your name on this paper."

Students were asked a series of questions; they were asked to estimate the proportion of time they spent on certain types of activities. The most commonly reported activity was "thinking about things I have to do today." Students were encouraged to provide written elaborations on the reverse side of the questionnaire.

I repeated the same informal experiment with the second section of the same music course. This time, I played the same recording, but with the amplifier turned off. That is to say, the entire class sat in silence for two minutes. (Incidentally, that's a long time for a group of people to sit in silence.) After the two minutes had elapsed, this second group of students were similarly asked to answer a questionnaire.

"You have just sat in silence for two minutes. The purpose of this questionnaire is to have you report on what you were thinking about during this time. ..."

As you might expect, these students reported a wealth of daydreaming scenarios.

I then compared the responses of the two groups of students. As expected, the group that listened to the Mozart symphonic passage reported significantly more music-related thoughts. But the size of this difference was tiny. On average, the group exposed to the music reported that less than 5 percent of their thoughts were related to music, while the non-exposure group reported that only 1 percent of their thoughts were related to music. This means that, over the 120 seconds of music, the group that listened to the music spent on average about 6 seconds thinking about the music. In effect, the typical student's thinking went something like this:

"This sounds like Mozart, maybe Haydn but probably Mozart. A symphonic work, no solo instrument so not a concerto. Um, what should I do after school tonight? ..."

Six seconds of music-related thought, and then they were gone for the next 114 seconds. And this occurred in a music theory class, where a music professor had handed out a questionnaire that could well have been a surprise quiz.
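The arithmetic behind the six-second figure can be spelled out as follows. This is a rough sketch that assumes, as the discussion above does, that the reported proportion of thoughts maps directly onto a proportion of the two minutes; the constant and function names are hypothetical.

```python
LISTENING_SECONDS = 120      # two minutes, as described in the lecture
MUSIC_THOUGHT_PERCENT = 5    # upper bound reported by the group that heard music
SILENCE_THOUGHT_PERCENT = 1  # reported by the group that sat in silence


def seconds_of_thought(percent, total_seconds=LISTENING_SECONDS):
    """Convert a reported percentage of thoughts into seconds of the session."""
    return total_seconds * percent / 100


music_seconds = seconds_of_thought(MUSIC_THOUGHT_PERCENT)      # music-related time
elsewhere_seconds = LISTENING_SECONDS - music_seconds          # remaining time
silence_seconds = seconds_of_thought(SILENCE_THOUGHT_PERCENT)  # comparison group
```

Because the music group reported less than 5 percent, the six seconds obtained here is best read as an upper bound.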

There are a number of methodological problems with experiments such as this that rely on introspection, especially when we try to assess unguided mental activity. But this informal experiment is nevertheless suggestive. It implies that the predominant conscious mental activity engaged in while listening to music is daydreaming.

Since research has established that listening to music entails a host of mental representations (see, for example, Krumhansl, 1990), the corollary of listener-daydreaming is that most music-related mental representations must be unconscious phenomena. Although most people in industrialized countries are exposed to lots of music, it appears that they don't think many music-related thoughts while listening.

Listening Strategies

Of course not all listening is unconscious or pre-verbal. Listeners may approach a listening experience with different strategies or different mental habits at different times. Elsewhere I have written about listening styles and listening strategies and have described some 20-odd common approaches to music listening. Let me give you the flavor of these by describing just one listening style, which I call fault listening. It is a listening mode that has a strong conscious component.

For several years I lived in the United Kingdom, and while there I was a perennial listener to the BBC's classical music network known as Radio 3. Unlike radio broadcasting in North America, European classical programming relies much less on commercial recordings. At the time that I lived in Britain, the majority of classical radio programming entailed live or delayed-live broadcasts.

As a listener accustomed to hearing virtually flawless commercial recordings, I vividly recall the shock of hearing performers make mistakes on the radio. What I found remarkable was how the occurrence of a single mistake would utterly transform my listening. Having heard one mistake, I was "all ears" -- vigilant to identify further errors or lapses of musical judgment.

Fault listening might be defined as follows: it is a listening mode that arises when the listener is mentally keeping a ledger of faults or problems. A high-fidelity buff may note problems in sound reproduction. A conservatory teacher may note mistakes in execution, problems of intonation, ensemble balance, phrasing, etc. A composer is apt to identify what might be considered lapses of skill or instances of poor musical judgment.

Fault listening tends to be adopted as a strategy under three circumstances: (1) where an obvious fault has occurred, the listener switches from a previous (often passive) listening mode and becomes vigilant for the occurrence of more faults; (2) where the role of the listener is necessarily critical, as in teachers, conductors, or music critics; or (3) where the listener has some prior reason to mistrust the skill or integrity of the composer, performer, conductor, audio system, etc.

There are many other listening styles and strategies we could discuss, but we don't have time. This single example should suffice to establish my point. Even as individual listeners, we have a palette of different ways to approach the listening experience. In some cases we can switch strategies in the middle of a musical work. As individuals, we undoubtedly have preferred ways of listening; some arise from enculturated habits, some from professional training, and others from personal disposition or mental habit.

Investigating Musical Thought

Let's pause for a moment and take stock. As we have seen, cognitive musicology is predominantly the study of musical thought and mental representations. We've seen three examples in memory for musical lyrics, schemas for hearing scale degrees, and hearing musical passages in terms of rhetorical categories. We've also encountered evidence suggesting that most music-related mental phenomena are unconscious in nature. But we've also seen an example of a more conscious listening style in strategies such as "fault listening."

All of these examples have related to listening, and all have relied on introspective accounts of our mental experiences. In the time remaining, I'd like to broaden our discussion and address five more extended examples that are intended to highlight several contrasts. The examples include both socio-cultural phenomena and neurological phenomena; they address historical, performance, compositional, and listening issues; the repertories span archaic to contemporary popular music, and include cultures from five continents.

1. Musical Notation: Deciphering an Ugaritic Song

How do we gain access to the minds of people and cultures long past? We have no direct access to their thoughts, but that's also true of people sitting right next to us. We can glimpse mental activities by examining whatever externalized evidence is available. In some cases, the available evidence can be very small. Consider the oldest known musical notation, shown in Figure 2.

Figure 2: Ugarit music tablet.

In 1929, the French archaeologist Claude-Frédéric-Armand Schaeffer began a series of excavations at Ras Shamra on the Mediterranean coast of Syria. Schaeffer uncovered hundreds of clay tablets bearing testimony to the ancient city of Ugarit, a site that was home to a succession of cultures from the 6th to the 1st millennium BC. The document reproduced in Figure 2 comes from the most prosperous age in Ugarit's history and is dated between 1450 BC and 1200 BC.

The text uses cuneiform writing organized from left to right. The language is Hurrian, a language that has largely been deciphered. However, this particular tablet (and several others like it) has so far resisted complete decipherment. Laroche (19XX) observed [pp. 462f., 484] that the section above the double line forms a coherent text that contains several repetitions resembling refrains found in musical lyrics or poetry. Below the double line is a combination of words and numbers. Hans Güterbock (1970) noted that the words are Hurrian equivalents to [Sumerian??] musical terms that had already been deciphered. Specifically, the terms indicate the names of the intervals formed by strings of a 9-stringed harp or lyre. In the Ugarit tablet, each interval term is followed by a single number (refer to Figure 3).

Figure 3: Ugarit transcription (text).

There are at least six modern attempts to transcribe this work into contemporary Western notation. The most difficult challenge has been interpreting the meaning of the numbers following each interval term. Do these numbers represent the number of repetitions of the intervals, or the number of upward scale tones from the lower to the upper string of the interval, or the number of downward scale tones from the upper to the lower string of the interval?

Figure 4: Two interpretations of the Ugarit tablet.

Figure 4 shows excerpts from two different transcriptions, one by XXX and the other by Anne Draffkorn Kilmer. It's hard to imagine more contrasting decipherments.

Now I'm not at all an expert in Ugarit, nor am I a historical musicologist. However, what we know from music cognition may be of some help in deciphering the music. Consider, for example, the finding by Vos and Troost (1989) that most large intervals in melodies ascend in pitch. That is, intervals such as perfect fifths and major sixths are significantly more likely to rise than fall.

Figure 5 illustrates this phenomenon for a number of repertoires I've examined, including songs from the following cultures: Arabic, Austrian, Belgian, Czech, Dutch, English, French, German, Italian, Yugoslavian, Russian, Spanish, Chinese, Korean, Japanese, Hassidic, Ojibway, Tahitian, Pondo, Venda, Xhosa, and Zulu. In addition, I've examined American popular songs, Schubert Lieder, and Gregorian chant. In all of these repertories, there is a significant tendency for large pitch intervals to ascend rather than descend. We don't yet know the reason for this phenomenon; however, it might be related to pitch "declination" in speech.

Figure 5

Fig. 5: Proportion of ascending/descending intervals for 22 cultures. In general, most small intervals tend to descend whereas most large intervals tend to ascend. Cultures included: Arabic, Austrian, Belgian, Chinese, Czech, Dutch, English, French, German, Gregorian chant, Italian, Korean, Japanese, Hassidic, Pondo, Russian, Spanish, Tahitian, Venda, Xhosa, Yugoslavian, Zulu. N.B. In all cultures, intervals roughly 11 semitones in size tend to be rare, hence the corresponding plotted values have a low reliability.

Such a pattern in no way proves that large leaps are more likely to ascend than descend in Ugarit music. But a predominance of descending large leaps would certainly be unusual given our knowledge of other musical cultures.
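For readers who want to try this kind of tally on their own material, here is a minimal sketch of how the proportion of ascending intervals might be counted for each interval size. It is only an illustration, not the procedure used for Figure 5: the function name, the representation of melodies as lists of MIDI note numbers, and the two toy melodies are all assumptions made for the example.

```python
from collections import defaultdict


def ascending_proportions(melodies):
    """Map each interval size (in semitones) to the share of its
    occurrences that ascend. Repeated notes are ignored because a
    unison has no direction."""
    ascending = defaultdict(int)
    total = defaultdict(int)
    for pitches in melodies:
        # Walk through successive note pairs in one melody.
        for first, second in zip(pitches, pitches[1:]):
            step = second - first
            if step == 0:
                continue
            size = abs(step)
            total[size] += 1
            if step > 0:
                ascending[size] += 1
    return {size: ascending[size] / total[size] for size in sorted(total)}


# Invented toy melodies (MIDI note numbers), not data from any repertoire:
# each opens with a rising fifth and then steps downward.
toy_melodies = [
    [60, 67, 65, 64, 62, 60],
    [62, 69, 67, 65, 64, 62],
]

proportions = ascending_proportions(toy_melodies)
for size, share in proportions.items():
    print(f"{size:2d} semitones: {share:.0%} ascending")
```

With a real corpus, each proposed transcription of the tablet could be passed through the same tally and compared against the cross-cultural tendency, bearing in mind that rare interval sizes give unreliable proportions.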

Unfortunately, time doesn't permit a complete enumeration of the discoveries about melodic organization that might be pertinent to deciphering the Ugaritic tablets. Suffice it to say that there are at least a dozen features of melodic organization that have been established through systematic study, and these principles could provide independent evidence in support of some proposed transcriptions at the expense of others.[4]

2. Transcultural and Historical Listening: The Case of Melodic Accent

A question that has long preoccupied ethnomusicologists is the extent to which we can hear the music of another culture in the same manner as culturally-experienced listeners. In fact, this question is a central issue in historical musicology as well. Even if we were to hear period-authentic sound recordings, we might well ask whether the modern listener experiences the music in a manner similar to past listeners.

In order to consider this question, we need to distinguish many possible aspects of musical experience. A modern listener might hear the pitches the same way as a past listener, but not hear the connotations of the timbres in the same way. A modern listener might apprehend the musical program or context, yet fail to hear the radical betrayals of harmonic expectations. In other words, we need to ask to what extent a modern listener can have an experience similar to a past listener for each of several aspects of musical experience.

For illustrative purposes, let's focus on one aspect of musical experience: whether modern and past listeners experience accent (or stress) in a similar way. Over the centuries, music theorists have proposed a number of factors which are thought to contribute to stress or accent in music. For example, accents are presumed to arise through increased loudness ("dynamic accent") and through increased duration ("agogic accent"). One of the most contentious forms of accent has been the notion of pitch-related accent, or melodic accent. Some theorists have suggested that higher pitches are more accented than lower pitches (you'll find this view, for example, in Benward and White). Other theorists (such as Parncutt) have argued the reverse: that low pitches are perceived as more accented. Yet other theorists have proposed that both extremes of high and low pitch are more salient than mid-register pitches. Other theorists, for example Graybill, have claimed that it is the size of the interval that's important: large intervals are more accented than small intervals. Some (such as Rothgeb) have suggested that it is only ascending intervals that are important. Other theorists, notably Joel Lester, have argued that it is not pitch height or interval size that's important, but rather changes of melodic contour -- that is, pivot points in a melody.

For modern listeners these different notions of melodic accent have been tested experimentally by Woodrow and by Squire. Unfortunately, the perceptual evidence indicates that modern listeners do not experience any of these forms of presumed melodic accent. Of course it is possible that listeners in different historical periods heard melodic accent differently. Without any knowledge of these modern experiments, the theorist William Caplin was surprisingly prescient when some years ago he questioned whether any of these ideas of melodic accent hold merit. In 1982, the Dutch researcher Joseph Thomassen carried out two sets of perceptual experiments and formulated what is now regarded as the best model of melodic accent (for modern listeners); unfortunately, it's a model that's too complicated to describe succinctly, so I'll skip the details here.

In 1996, Matthew Royal and I published the results of a series of studies testing eight different notions of melodic accent. Instead of approaching the problem by carrying out further perceptual experiments, we decided to study a large sample of notated music to measure which concept was most consistent with how composers actually compose. We studied three contrasting repertoires of music containing a total of two hundred works. Although the works spanned a considerable historical period, in all three repertoires, we found that Thomassen's model was significantly superior to all the other proposed notions of melodic accent of which we are aware.

What's important from a historical point of view is that one of the repertoires we tested was a sample of Gregorian chant. Now in most music, different types of accent tend to coincide -- accent types tend to be synchronized. That is, notes which have longer durations tend to be given greater dynamic accents, and both of these tend to occur in stronger metric positions. In addition, when the music has some sort of text or lyrics, the accents tend to coincide with syllable onsets rather than with a sustained syllable, or melisma. This tendency to synchronize accent types is illustrated in Figure 6 where agogic (duration), metric, dynamic, melodic (contour), and syllable onset are all coordinated.

Figure 6

Figure 6: Synchrony of accent types. Agogic (duration) accent, metric accent, dynamic accent, melodic (contour) accent, and syllable onset are all coordinated.

Matthew Royal and I found that the tendency for accent types to be synchronized also holds true for melodic accent. By and large, melodic accents tend to occur in strong metric positions, are associated with longer duration notes, receive more dynamic stress, and tend to coincide with syllable onsets rather than with sustained syllables. The exceptions to this generalization occur for syncopated and hemiola passages where one or two accent types are systematically offset from the others.

Royal and I were surprised to discover a notable exception in the case of Gregorian chant. As in the other repertoires, in the chant literature, there are marked correlations between the occurrence of melodic accents (as defined by Thomassen's model) and whether or not the moment is syllabic or melismatic. However, the correlations are negative rather than positive. Pitches that are deemed to convey a melodic accent are much more likely to occur on a melisma than at a syllable onset. Let me try to illustrate this using Happy Birthday. In Figure 7, I've miscoordinated the syllable placement with respect to metric position and agogic accent:

Figure 7.

Fig. 7: Happy Birthday re-texted in order to reduce the correlation between syllable onsets and strong metric positions.

In chant, the miscoordination is between syllable placement and melodic accent. The miscoordination is utterly systematic. Of the 60 randomly selected chants we studied, only a single chant did not display this methodical miscoordinated relationship between melodic accent and text. In the first instance, this suggested that the musicians who created or subsequently modified these works were purposely trying to avoid highly stressed or inflected moments in the music.

Some musicologists (a small minority) have suggested that chant might have been originally sung in a rhythmic fashion (and that modern arrhythmic performance of chant is somehow an aberration). However, the statistical correlations do not at all support this view.
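The argument here turns on the sign of a correlation between melodic accent and syllable onsets. As a rough illustration only, the sketch below computes a Pearson correlation between a per-note accent strength and a 0/1 flag marking whether a syllable begins on that note. The accent values are invented stand-ins (Thomassen's model is not reproduced here), and the function and variable names are assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented accent strengths for six successive notes (hypothetical values).
melodic_accent = [0.9, 0.1, 0.8, 0.2, 0.7, 0.1]

# 1 = a syllable begins on this note, 0 = the note continues a melisma.
onsets_aligned = [1, 0, 1, 0, 1, 0]  # syllables fall on accented notes
onsets_offset = [0, 1, 0, 1, 0, 1]   # syllables avoid accented notes

r_aligned = pearson(melodic_accent, onsets_aligned)
r_offset = pearson(melodic_accent, onsets_offset)
print(f"aligned text setting: r = {r_aligned:+.2f}")
print(f"offset text setting:  r = {r_offset:+.2f}")
```

A positive value corresponds to the synchronized pattern found in most repertoires, and a negative value to the systematic miscoordination described for chant.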

Incidentally, the single exception in our sample of chant was A Solis Ortus Cardine, the text of which is given in Figure 8. The syllable stresses as published in the Liber Usualis are also shown, as well as a simple representation of the stress pattern. One can clearly hear the iambic tetrameter rhythm here; the poetic text is highly rhythmic:

Figure 8: Text for A Solis [Liber Usualis, p. 400; #12].

A solis ortus cardine        A so/- lis or/- tus car/- di- ne          .>.>.>..
ad usque terrae limitem,     ad us- que ter/- rae li/- mi- tem,        .>.>.>..
Christum canamus principem,  Chri/- stum ca- na/- mus prin/- ci- pem,  >..>.>..
natum Maria Virgine.         na/- tum Ma- ri/- a Vir/- gi- ne.         >..>.>..
Beatus auctor saeculi        Be- a/- tus au/- ctor sae/- cu- li        .>.>.>..
servile corpus induit:       ser- vi/- le cor/- pus in/- du- it:       .>.>.>..
ut carne carnem liberans,    ut car/- ne car/- nem li/- be- rans,      .>.>.>..
ne perderet quos condidit.   ne per/- de- ret quos con/- di- dit.      .>...>..
Castae parentis viscera      Ca/- stae pa- ren/- tis vis/- ce- ra      >..>.>..
caelestis intrat gratia:     cae/- le/- stis in/- trat gra/- ti- a:    >>.>.>..
venter puellae bajulat       ven/- ter pu- el/- lae ba/- ju- lat       >..>.>..
secreta, quae non noverat.   se- cre/- ta, quae non no/- ve- rat.      .>...>..
Domus pudici pectoris        Do/- mus pu- di- ci pe/- cto- ris         >....>..
templum repente fit Dei:     tem/- plum re- pen/- te fit De/- i:       >..>..>.
intacta nesciens virum,      in- ta/- cta ne/- sci- ens vi/- rum,      .>.>..>.
concepit alvo filium.        con- ce/- pit al/- vo fi/- li- um.        .>.>.>..

Now I'm not a chant scholar, so I know nothing about the origin of this work. But even if we didn't know that the text is rhythmic, the synchronization between the syllable placement and what we know of perceived melodic accent (for modern listeners) suggests that it is indeed likely that this particular work was sung rhythmically and that it differs significantly from the other chants we studied.

When Royal and I did this work, we were also struck by something else. Joseph Thomassen's model of melodic accent was formulated from tests using Dutch listeners in the early 1980s. In carrying out our statistical analyses we found that the relationship was significant at less than one chance in a million. That is, there is less than one chance in a million that a handful of modern Dutch listeners sitting in a laboratory listening to sequences of sine tones would respond in a way that corresponds to the text setting of music created roughly a thousand years ago. Moreover, this robust correlation was found only for Thomassen's model of melodic accent. Other conventional views of accent (such as the highest pitches, the largest intervals, etc.) did not show such correlations -- and let me remind you that the existing perceptual research is consistent only with Thomassen's model.

The inescapable conclusion is that, whatever melodic accent is, it doesn't seem to have changed much over the past millennium. Modern listeners may not hear Gregorian chant the same way that Medieval listeners did, but we appear to hear the melodic accents in a similar way.

Where historical musicologists might infer rhythmic performance based on source studies, recension, and other standard techniques, it seems that cognitive musicology might well be able to provide independent corroborating evidence of a particular interpretation of the music of the past. The research also might assist scholars in distinguishing sub-repertoires that are often mixed together in the sources we have available for study. As Katherine Bergeron has shown, collections of such works can have unusual and sometimes bizarre origins.

3. Performance and Idiomaticism

A common mistake is to regard cognitive representations of music as arising solely from the perception of music. However, there are many cognitive aspects of music that have nothing to do with perception. Good examples of non-perceptual phenomena that are reflected in musical organization can be found in performance idiomaticism. Since music is often performed using musical instruments, the mechanics of the instruments themselves often influence how the music is structured.

Some of these performance aspects are relatively easy to identify. A trivial example occurs when a musical work is composed to lie within the pitch range of some particular instrument. Another obvious example is evident in the contrast between wind instruments and non-wind instruments. When composing for French horn, for example, the composer must accommodate the performer's need to breathe by providing periodic rests. A work composed for 'cello is often impossible to perform on (say) the bassoon, because the bassoonist is constantly trying to find a place to breathe.

Other idiomatic aspects of performance are less directly observable, though still evident. Ethnomusicologists have frequently observed that instrumental idioms appear to have marked impacts on the character of music-making in different cultures (e.g., Yung, 1980; Baily, 1985; Kippen & Bell, 1989). Similarly, jazz musicians have often stressed the importance of idiomatic instrumental techniques in improvisation (e.g., Sudnow, 1978, 1979).

The most distinctive instrumental idioms are those gestures that are unique to a given instrument. For example, a well-known solo trumpet passage at the end of Leroy Anderson's Sleigh Ride imitates the sound of a neighing horse. This effect is almost impossible for any other instrument to produce, and so the relative ease with which it can be done on the trumpet means that it is justifiable to characterize the gesture as "idiomatic to the trumpet."

More subtle instrumental idioms are evident in a study of works for trumpet carried out by myself and Jonathon Berec in 1993. Berec and I began by collecting detailed performance data from two performers, one professional and one amateur. The measurements included many of the mechanical aspects of performance, including fingering, tonguing, embouchure, and breathing techniques. For example, the trumpet performers were asked to tongue notes as rapidly as possible in different registers and at different dynamic levels. Measurements were taken of how long the performers could sustain tones, and how quickly they could inhale. In addition, measurements were made of the speed of loss of muscle tone in the embouchure for sustained playing. Data was also collected on the difficulty associated with pitch movements within registers. In the case of fingering difficulty, the trumpet players themselves estimated the degree of difficulty for all possible transitions between two successive finger/valve combinations. The following table shows the average degree of difficulty for each of the possible finger/valve transitions, as judged by our two performers. Rows and columns represent antecedent and consequent finger/valve positions. For example, on a scale of difficulty ranging from zero to ten, the transition from first valve (1) to second and third valve (2-3) received an average rating of 7.5.

Table 1.

Mean difficulty for finger/valve transitions as judged by two trumpet players.

Valve combination for the consequent tone.
          0    1    2    3    1-2  1-3  2-3  1-2-3
0:        0.0  1.0  1.0  1.9  1.5  3.0  3.0  3.5
1:        1.0  0.0  2.0  3.0  2.0  4.5  7.5  6.0
2:        1.0  1.5  0.0  5.3  3.0  9.5  6.0  9.0
3:        2.5  4.0  4.5  0.0  7.0  4.0  4.0  5.5
1-2:      1.5  1.5  2.3  7.5  0.0  6.0  6.0  5.0
1-3:      3.5  4.0  9.5  1.5  5.5  0.0  6.0  4.0
2-3:      2.5  6.0  5.5  4.0  5.0  5.5  0.0  3.8
1-2-3:    3.0  4.0  8.5  3.5  6.0  5.0  5.0  0.0
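To make the use of Table 1 concrete, here is a minimal sketch that stores the ratings as a lookup keyed by (antecedent, consequent) and scores a short sequence of fingerings. The ratings are copied from the table; the function names and the choice to summarize a passage by the mean rating over its transitions are assumptions made for illustration, not a description of how the trumpet model actually combines its measures.

```python
# Valve combinations in the order used by the rows and columns of Table 1.
VALVES = ["0", "1", "2", "3", "1-2", "1-3", "2-3", "1-2-3"]

# Rows are the antecedent fingering, columns the consequent fingering.
RATINGS = [
    [0.0, 1.0, 1.0, 1.9, 1.5, 3.0, 3.0, 3.5],
    [1.0, 0.0, 2.0, 3.0, 2.0, 4.5, 7.5, 6.0],
    [1.0, 1.5, 0.0, 5.3, 3.0, 9.5, 6.0, 9.0],
    [2.5, 4.0, 4.5, 0.0, 7.0, 4.0, 4.0, 5.5],
    [1.5, 1.5, 2.3, 7.5, 0.0, 6.0, 6.0, 5.0],
    [3.5, 4.0, 9.5, 1.5, 5.5, 0.0, 6.0, 4.0],
    [2.5, 6.0, 5.5, 4.0, 5.0, 5.5, 0.0, 3.8],
    [3.0, 4.0, 8.5, 3.5, 6.0, 5.0, 5.0, 0.0],
]

# Flatten the matrix into a dictionary keyed by (antecedent, consequent).
DIFFICULTY = {
    (antecedent, consequent): RATINGS[i][j]
    for i, antecedent in enumerate(VALVES)
    for j, consequent in enumerate(VALVES)
}


def transition_difficulty(antecedent, consequent):
    """Rated difficulty of moving from one valve combination to another."""
    return DIFFICULTY[(antecedent, consequent)]


def mean_fingering_difficulty(fingerings):
    """Average rating over successive transitions (an assumed summary)."""
    pairs = list(zip(fingerings, fingerings[1:]))
    if not pairs:
        return 0.0
    return sum(DIFFICULTY[pair] for pair in pairs) / len(pairs)


print(transition_difficulty("1", "2-3"))            # rating cited in the text
print(mean_fingering_difficulty(["0", "1", "2-3"]))
```

Note that the table is not symmetric: moving from 1 to 2-3 is rated differently from moving from 2-3 to 1, so the order of the pair matters.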

Having collected all of this data, we constructed a computer model of the trumpet/performer interaction. For any given musical score or passage, the model is able to generate estimates of the degree of difficulty for each of seven technical aspects of performance: (1) pitch register, (2) dynamic level, (3) fingering, (4) tonguing, (5) embouchure endurance, (6) breathing, and (7) intervallic transitions. We tested the model by comparing the difficulty estimates with graded trumpet études from a well-established conservatory curriculum.

After developing our trumpet model, we applied it to several trumpet works. Some works were written by trumpet virtuosi while other works were written by non-trumpet players. The virtuoso works included Malcolm Arnold's Fantasy for trumpet, Guillaume Balay's Prélude et ballade, and Herbert Clarke's Stars in a Velvety Sky. In addition, the three movements of Paul Hindemith's trumpet sonate were examined.


Just because a work is easy to perform on a given instrument does not make it idiomatic to that instrument. The work may be easy to perform on all instruments. A gesture is idiomatic when it can be produced with comparative or relative ease. That is, given what could be the case, the actual arrangement renders the music more manageable.

Consider, by way of example, the effect of key on performance difficulty. Suppose we were to transpose a work through all twelve pitch-classes, and compare the difficulty for all keys. If a work was written in the key of Eb major, and Eb major turned out to be the most difficult of all possible keys, then we could not claim that the work is idiomatic to the instrument. On the other hand, if we found that the key of Eb major exhibited the lowest difficulty score, then this would lend weight to the claim that the work was created with the instrument in mind.
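The transposition test just described can be sketched in a few lines. The version below is only an illustration under stated assumptions: pitches are MIDI note numbers, the difficulty measure is passed in as a function, and the stand-in cost used here (counting black-key pitch classes, as on a keyboard) is purely hypothetical and has nothing to do with trumpet fingering. In practice a fingering-difficulty estimate would take its place.

```python
def transposition_profile(pitches, score, shifts=range(-6, 6)):
    """Score a passage at each transposition (in semitones).
    A shift of 0 is the key in which the passage was written."""
    return {shift: score([p + shift for p in pitches]) for shift in shifts}


def dips_at_original(profile):
    """True when the original key scores lower than both neighboring keys,
    i.e. there is a local minimum at zero transposition."""
    return profile[0] < profile[-1] and profile[0] < profile[1]


def black_key_count(pitches):
    """Hypothetical stand-in cost: number of notes on black-key pitch classes."""
    return sum(1 for p in pitches if p % 12 in {1, 3, 6, 8, 10})


# A C major scale, used here only as a toy passage.
c_major_scale = [60, 62, 64, 65, 67, 69, 71, 72]

profile = transposition_profile(c_major_scale, black_key_count)
for shift, cost in profile.items():
    marker = "  <- original key" if shift == 0 else ""
    print(f"{shift:+d}: {cost}{marker}")
print("local dip at original key:", dips_at_original(profile))
```

Comparing each key only with its immediate neighbors matters when an overall trend is present: a local dip at zero can be informative even if some distant transposition has a lower absolute score.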

The following two graphs show the effect of transposition on fingering difficulty estimates for the Arnold, Balay and Clarke works. Notice, first of all, that the fingering difficulty shows a general tendency to fall as the work is transposed up in pitch. Brass players will recognize that this is a simple consequence of the way the harmonics and valves interact. As a work is transposed higher, there is less need to use some of the more difficult finger combinations.

Superimposed on this general downward trend you can see local fluctuations in difficulty depending on the key. The point marked zero along the horizontal axis represents the original key in which each work was written. You can clearly see that, with one exception, there is a notable minimum present. (The one exception is the slow second movement in Arnold's trumpet concerto.) The predominance of local dips at zero transposition suggests that the composers chose a key that facilitates performing the work.

Figure 9.

Figure 9: Effect of transposition on fingering difficulty in Malcolm Arnold's Fantasy and Concerto for trumpet.

Figure 10.

Figure 10: Effect of transposition on fingering difficulty in Guillaume Balay's Prélude et ballade and Herbert Clarke's Stars in a Velvety Sky.

Now compare these results with those for Paul Hindemith's Trumpet Sonata shown below. Here there is no clear effect of key, nor is there any notable dip coinciding with the key chosen by Hindemith.

Figure 11.

Figure 11: Effect of transposition on fingering difficulty in Paul Hindemith's Sonate for trumpet.

Another way to examine possible idiomatic design in these works is to observe the effect of changing the tempo. In general, as the tempo is increased, tonguing becomes more difficult while breathing becomes easier. The following graphs show the effect of tempo on overall difficulty for the works written by trumpet virtuosi. In the case of Malcolm Arnold's works, tempo seems to have little effect, except for the lively first movement of his trumpet concerto, which shows a notable increase in difficulty when the tempo is increased by roughly 25 percent.

Figure 12.

Figure 12: Effect of tempo on difficulty in Malcolm Arnold's Fantasy and Concerto for trumpet.

More dramatic changes are evident in the Balay and Clarke works, where there is a sort of "brick wall" -- a point at which a slight increase in tempo causes a large increase in difficulty. Once again, the zero value along the X-axis corresponds to the original tempo specified by the composer in the score. Notice that for the Balay and Clarke works, the recommended tempo occurs just prior to the brick wall of increased difficulty.

Figure 13.

Figure 13: Effect of tempo on difficulty in Guillaume Balay's Prélude et ballade and Herbert Clarke's Stars in a Velvety Sky.

The equivalent graph for the three movements of Hindemith's Trumpet Sonate is shown below. By comparison with the works by trumpet virtuosi, the effect of tempo is rather featureless. In the first and third movements, the difficulty declines slightly as the tempo is increased, suggesting that the principal difficulty in these movements is linked to breathing rather than articulation.

Figure 14.

Figure 14: Effect of tempo on difficulty in Paul Hindemith's Sonate for trumpet.

To summarize, we've seen that the choice of key and the choice of tempo can have a considerable impact on the overall performance difficulty for a work. In the case of our sample of works by virtuoso performer/composers, we can see that the chosen keys and tempi often approach optimal values. That is, for many movements, the composer has chosen the best possible key or tempo, from the point of view of reducing the performance difficulty. In the case of a work composed by a non-trumpet player, the choice of key and tempo seems to be independent of considerations of performance difficulty.

It bears emphasizing that measures of performance ease and measures of instrumental idiomaticism cannot be regarded prima facie as indices of compositional merit. Difficult works are not necessarily better than easy works, and idiomatic works are not necessarily better than unidiomatic works. Only if the composer's explicit goal is to create a highly idiomatic work might such measures be construed as having a bearing on the evaluation of a composition. Moreover, there are occasionally good reasons for a composer to write explicitly difficult works. As Bernard Holland has pointed out, difficulty itself can be a handy muse.

The point of this analysis has not been to somehow denigrate Hindemith's music. Rather, my point is that musical works exhibit varying degrees of influence of the instrumental idioms. These idioms get reflected in the mental habits of performer/composers, and find their way into the very fabric of the music. That is, the performer's actions get embodied in the music. By paying close attention to the biomechanics and physiology of different performance resources, it is possible to observe idiomatic features present in the musical notation. A virtuoso or idiomatic composer often produces works that exhibit concrete manifestations of the cognitive structures of performance.

It should be clear that we can use this approach to address analytic, historical and cognitive issues in music. For example, this approach might provide additional pertinent evidence in debates and hypotheses related to the origin of a particular work. Did composer X originally write composition Y for instrument Z, and only later arrange the work for instrument W? Finally, this approach allows us to pinpoint those aspects of musical organization that arise from the physiological, mechanical (and possibly psychological) aspects of performance.

4. Social Mediation of Taste

Idiomaticism highlights an interesting aspect of musical experience. Two instrumentalists can have very different experiences playing the same work depending on the performance situation. Yet the sonic result may be indistinguishable to the ear. For example, a difficult passage for violin might be much easier to play using a scordatura (re-tuning of the instrument). Of course, the same divergence of experience can also occur for listeners: two listeners hearing the same music can have dramatically different experiences. Nowhere is this phenomenon more evident than in the case of musical taste. Consider the following two examples reported by Clements:

  1. A common problem for convenience stores is that they become hangouts for young teenagers. In most circumstances, the teenagers are harmless and are not breaking any law. However, store-owners regard their presence as a deterrent for other customers. It has been found that an effective strategy for minimizing loitering is to play music by the Beatles or the Beach Boys (Clements, 1993).
  2. A Chicago school uses music as a punishment during after-school detentions. Detentions last 30 minutes during which the student must listen to recordings of Frank Sinatra. Students are not allowed to do homework or to talk. However, students are invited to sing along if they wish; none do (Clements, 1993). The music has made detention hall highly unpopular, and school officials are pleased by the reduced numbers of students who receive detentions.

In the first case, music has been used as a deterrent. In the second case, music is explicitly used as a punishment. What is interesting about these cases is how the popularity of the music has changed. In the 1960s, playing the music of the Beatles or the Beach Boys would probably have attracted more teenagers to loiter around the local convenience stores. Playing Frank Sinatra in the late 1950s might have made detention hall the single most popular activity at school.

What could explain the reception of music shifting from highly desirable to highly distasteful? After all, the recordings of Sinatra, the Beatles, and the Beach Boys have not changed: they are the same recordings, with the same sequences of sonic events. The music has not changed. What has changed is the people.

It is easy here to jump to conclusions about what is going on. We should acknowledge that there are several possible explanations for such dramatic changes of taste. One possibility is that modern teenagers have a different listening history. The music that has been produced since Sinatra and the Beatles has undoubtedly transformed our hearing; the music may in some sense have been superseded or have lost its power to engage or delight. This might be called the "jaded palate" hypothesis. Although a person might have loved X at one time, X is not nearly so appealing now that one can listen to Y instead.

Of course, a more popular view is to regard such changes in taste as manifestations of peer-related social interaction, especially during post-puberty years. It seems reasonable to assume that past music cannot serve to establish a distinctive peer-group identity for any new generation, since the music will continue to evoke associations with some existing age group. I will have more to say on this topic in Lecture 2 on music's origins.

At a minimum, cases such as these raise interesting questions about the representation of taste. Are musical styles and individual works mentally represented as having specific social connotations? If so, how is music represented socially?

5. Mental Representations as Brain Representations

Perhaps the ultimate representations for music are to be found in the neural codings of human brains. At the moment, we have little understanding of how the brain represents music. However, we can observe what happens when the normal representations are disrupted. Throughout history, neurologists have learned a great deal from those unfortunate individuals who have suffered physical insults to the brain.

In the area of music, Isabelle Peretz has recently written about an especially interesting case, a woman known only as "IR". IR suffered a stroke that left her with some serious musical debilitations. IR suffered no speech-related deficits, but her music listening was severely disrupted. In particular, her stroke severely damaged her musical memory. IR is not able to name well-known melodies. Moreover, she can't even identify whether a melody is familiar or unfamiliar. This is true even for very common melodies such as the national anthem. This memory deficit is evident for both long-term and short-term memory. For example, IR cannot determine whether two three-note fragments are the same or different. She can listen to an entire musical piece, and then be unable to tell whether the same piece is being played a second time.

IR can't identify violations of pitch or temporal structure, but she can identify violations of mode (major/minor) and tempo. She can also describe the emotional character of musical excerpts.

These deficits might not be of interest except for the following fact: IR continues to take pleasure in listening to music. Dr. Peretz gave her a cassette tape containing some music. IR enjoys playing the tape in the cassette deck in her car. She is aware that she plays the tape again and again, but each time the music is fresh and new. She enjoys the music, but cannot tell you anything about it, and cannot recognize any of the tunes from the tape when they are played.

IR raises some difficult questions for music scholars. Most theories of musical aesthetics presume that some sort of short- and medium-term memory is essential for proper musical enjoyment. But IR's listening is restricted to a paper-thin musical present in which past musical events are immediately forgotten, and future musical events remain untethered to what happened earlier.

Interactions Between Biology and Culture

As should now be clear, one of my principal concerns is to bridge the divide between those who regard music as almost exclusively cultural (with little or no influence from biology), and those who regard music as principally a sensory/perceptual phenomenon (with only a minor role for culture). It is, I believe, essential to study music from both perspectives simultaneously.

Musical phenomena are not either/or when it comes to biology and culture. Depending on the phenomenon, biology or culture may have the upper hand. In many cases, there are fascinating interactions between the two.

Let me make this claim concrete by offering an example. I'll begin by talking about an issue from a biological perspective, and then I'll look at the same issue from a cultural perspective.

In most of the world's cultures, there is a notable tendency to place the principal musical line or melody in the uppermost voice or part. This tendency is not universal; in Western music, counterexamples include fauxbourdon, barbershop quartets, and descant singing. Nevertheless, in general, melodies tend to be placed in the highest part.

A plausible explanation for this practice comes from what hearing scientists have discovered about auditory masking. Masking is the tendency for one sound to obscure or render inaudible another sound. Auditory masking is known to arise due to the mechanics of the basilar membrane in the cochlea, and arises when sounds are close in frequency. Two neighboring frequencies will tend to obscure each other, but the tone with the lower amplitude is prone to being completely masked.

Consider the following illustration. Suppose that two musical parts have equal amplitudes and that they both use complex tones having identical spectral content. In general, complex tones have progressively less energy in the upper partials. Figure 13 shows declining amplitudes for the first seven harmonics of a complex tone whose fundamental is 230 Hz. The X-axis has been scaled according to the position of maximum excitation along the basilar membrane; consequently, equal horizontal distances represent equal regions of potential masking. Masking will occur only between partials that are within a millimeter of each other.

Fig. 13: Spectral content of 230 Hz complex tone.

Now consider the interaction of this tone with a 100 Hz tone having an identical spectral recipe. Partials from both tones will tend to overlap. In Figure 14, the partials of the lower tone are shown as dotted lines:

Fig. 14: Spectral interaction for two complex tones.

Notice that the upper partials of the lower-pitched tone are significantly lower in amplitude than the neighboring partials of the higher tone. Since spectral energy tends to decrease with successive partials, higher-pitched tones will tend to mask the partials of lower-pitched tones more than the reverse.

For those who understand auditory physiology, this account gives a rather satisfying explanation for why musicians might want to place the most important melodic part in the highest voice in a texture.
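
To make the masking argument concrete, here is a minimal numerical sketch in Python. It is an illustration under stated assumptions, not the method used to produce Figures 13 and 14. Partial amplitudes are assumed to fall off as 1/n, the lower tone is given 16 harmonics so that its partials span the same range, and place along the basilar membrane is approximated with Greenwood's frequency-position function for the human cochlea. The names place_mm, partials, and masking_pairs are hypothetical.

```python
import math

# Greenwood (1990) frequency-position constants for the human cochlea.
# Assumption: the lecture does not say how the axis of Figure 13 was scaled.
A_HZ, ALPHA_PER_MM, K = 165.4, 0.06, 0.88


def place_mm(freq_hz):
    """Approximate distance (mm) from the cochlear apex of peak excitation."""
    return math.log10(freq_hz / A_HZ + K) / ALPHA_PER_MM


def partials(fundamental_hz, count):
    """Harmonics as (frequency, amplitude), with an assumed 1/n amplitude recipe."""
    return [(n * fundamental_hz, 1.0 / n) for n in range(1, count + 1)]


def masking_pairs(upper, lower, max_mm=1.0):
    """Pairs of partials (one from each tone) lying within max_mm of each other."""
    found = []
    for f_up, a_up in upper:
        for f_low, a_low in lower:
            if abs(place_mm(f_up) - place_mm(f_low)) <= max_mm:
                found.append((f_up, a_up, f_low, a_low))
    return found


upper_tone = partials(230.0, 7)    # first seven harmonics, as in Figure 13
lower_tone = partials(100.0, 16)   # assumed count, enough to span the same range

pairs = masking_pairs(upper_tone, lower_tone)
for f_up, a_up, f_low, a_low in pairs:
    stronger = "higher tone" if a_up > a_low else "lower tone"
    print(f"{f_up:6.0f} Hz (amp {a_up:.2f}) vs {f_low:6.0f} Hz (amp {a_low:.2f}): "
          f"{stronger} has the stronger partial")
```

Under these assumptions, every pair of partials within a millimeter has its stronger member in the higher tone. The lower tone reaches any given frequency region only at a higher harmonic number, where its amplitude is smaller.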

Let me switch gears now, and talk about one of the most robust and pervasive social phenomena attending music: namely, the long-standing and systematic discrimination against female musicians.

There is no compelling evidence (or even suggestive evidence) that women as a group are somehow inferior to men in musicianship or musical connoisseurship. Wherever women have been given an equal opportunity to pursue their musical goals, they have shown no less ability than men. However, all of the historical evidence suggests that women have been systematically sidelined when it comes to music.

It is against a background of sustained and widespread prejudice against women that the importance of auditory masking is put in perspective. In light of this prejudice, it is remarkable that so much of the music of the past would be organized to permit women to sing the foremost vocal part. Even when women were entirely excluded from music-making, it is striking that young boys (also of comparatively low social status) still managed to command the principal melodic part.[5]

We see here a complex musical phenomenon that has both biological and socio-cultural origins. In this particular case, we see a phenomenon where physiological factors mitigated an otherwise powerful social practice. The mechanics of the basilar membrane facilitated the participation of women and children in music-making. Were it not for this physiological phenomenon, one can scarcely imagine how much more profoundly women would have been excluded from the production of music.

The take-home message is not that biological factors are more important than social and cultural factors when it comes to music. (One can easily identify musical phenomena where socio-cultural factors are preeminent.) Rather, the lesson is that biological issues broadly intersect with cultural issues in intricate and interesting ways, and that a fuller understanding of music will require attention to both realms.

This lesson has been difficult to learn, not least among cognitive musicologists themselves. In her otherwise excellent book Music As Cognition, Mary Louise Serafine clearly expressed the formerly common view that, when it comes to music, biology is not important.

"it is clear that the basilar membrane (or whatever structure) has exerted no appreciable influence on the way the world's music actually turned out." [p.59]

As we've seen, this isn't entirely accurate. In fact, one might be justified in claiming that, in the darkest periods of gender prejudice, it was the idiosyncrasies of the basilar membrane that assured a place for women and children in music-making. Serafine's statement echoes the early attitudes in cognitive psychology when physiology and psychobiology were denigrated, primarily because of their continued association with behaviorism. Most cognitive musicologists are no longer so sanguine, and like cognitive psychologists generally, pay closer attention to the developments in cognitive neuroscience, and seek to better understand some of the biological foundations for mental activity.


This brings us to the conclusion of the first lecture. In this lecture I have placed cognitive musicology within the general history of the cognitive revolution. This revolution, as you will recall, arose in response to the limitations of behaviorism. The cognitive approach eschewed the positivist fallacy of interpreting absence of evidence as evidence of absence. This approach provided greater intellectual space for entertaining theories of plausible invisible mental functions. Cognitivists paid special attention to mental representations.

As we have seen, there is excellent evidence that musically pertinent mental representations exist. Ordinary listeners have access to mental representations for music, and can introspect musically. Some representations can be accessed in the total absence of sound. We can manipulate these mental representations in a variety of ways, but we cannot manipulate them in any way we wish. We've learned there is a difference between hearing and hearing as, and that scale function is a good example of the latter phenomenon. We learned that these ways of hearing are typically automatic and unconscious, and that some ways of hearing as are considerably easier than others. We also saw that hearing as is related to culture and that the functional vocabularies are learned passively from the cultural milieu of the listener.

We've seen that listeners, even non-musician listeners, can experience passages according to rhetorical categories or types. We've noted that there exist mental habits embodied in listening styles, and that most listeners have more than one listening approach which they can apply depending on the circumstance. We've also seen evidence suggesting that the most common conscious mental activity while listening to music is daydreaming. Most of the essential aspects of music listening occur as unconscious mental processes.

We've seen that musical notations can provide useful windows to musical thought, and that modern and ancient notations can be analyzed to reveal patterns of behavior that might otherwise go unnoticed. For example, with appropriate modeling, we can see the effect of instrumental or vocal idioms on musical organization.

We've seen evidence, in the case of melodic accent, that suggests that what modern listeners hear as accented is the same as what ancient listeners heard as accented. We've seen how analyses of sound recordings point to possible social factors involved in performance practice.

We've also seen how brain injuries can sometimes give us useful clues about how mental representations are concretely coded, and how the ensuing musical changes can tell us something about the elements of musical experience. And finally, I've shown how biology and culture can interact in subtle and unexpected ways -- as when the structure of the human hearing organ tended to mitigate a pervasive sexism.

The Promise of Cognitive Musicology

What is cognitive musicology? Cognitive musicology is the study of habits of mind as they relate to music. Since minds are the products of both biology and culture, cognitive musicology is an approach to the study of music that takes both biology and culture seriously. A common ground for both biological and cultural study is found in the domain of mental representations. Consequently, much of the day-to-day research of cognitive musicologists centers on discovering and deciphering various music-related mental representations.

As you might expect, I believe that cognitive musicology has much to offer music scholarship in general.

For the historian, cognitive musicology offers (with some limitations) the possibility of reconstructing aspects of seemingly lost practices. It also offers ways to approach how musical works and practices may have held meanings for listeners and musicians of past historical periods and places.

For the ethnomusicologist, cognitive musicology offers relatively effective techniques for gaining access to the minds of others, and useful ways of pinpointing how culturally sophisticated experiences differ from culturally naive experiences. Cognitive musicology also offers the ethnomusicologist better ways for investigating how material and cultural conditions get reflected and expressed in a music.

For the performer, cognitive musicology offers ways for investigating what distinguishes inexpressive and pedestrian performances from inspired and compelling ones.

For the composer, cognitive musicology offers pointers to cognitively and perceptually rich regions of unexplored musical materials. In describing musical "habits of mind," cognitive musicology can help composers in their quests to establish new habits for the musical mind.

For the music theorist, cognitive musicology promises to address basic questions of musical organization from a more rigorous and less speculative approach.

There has been a growing interest in music cognition in recent years. I think this growth arises, at least in part, because cognitive musicology can appeal to scholars inspired both by continental and by Anglo-American philosophical traditions. For the continentally-inspired scholar, music cognition offers the opportunity to treat subjectivity as real without reifying it. Music cognition provides ways of considering the subjective without making it mystical or juxtaposing it irredeemably against the objective. Nor does it merely objectify the subjective.

For the empiricist-inspired scholar, cognitive musicology offers the opportunity to transform intuition and speculation into conjecture and hypothesis, and thereby provides a means for testing musical ideas and theories.

In the ensuing lectures, I hope to illustrate in greater detail some of the accomplishments and opportunities that cognitive musicology holds.

Thank you.


[1] At the same time, those music scholars who pursued socially-oriented studies in music (such as the Anglo-Marxist popular-music scholars) failed to pay much heed to the extant psychological research. As the anthropologist Roy D'Andrade has pointed out (regarding sociology generally), sociologists have shown an extraordinary ignorance of the extant psychological research, and have tended to devise their own psychological theories with little reference to it.

[2] In an unguarded moment, Ulric Neisser unhelpfully wrote that "every psychological phenomenon is a cognitive phenomenon." This casts a very wide net. As we will see, there are a number of themes that characterize and give some focus to cognitive psychology and cognitive science.

[3] This enthusiasm was concretely evident in research on information processing, where mental phenomena were analyzed as successively ordered stages of processing. [See, e.g., R. Lachman, J. Lachman & E.C. Butterfield, Cognitive Psychology and Information Processing.] As Ulric Neisser defined it,

"Cognitive psychology refers to all processes by which the sensory input is transformed, reduced, elaborated, stored, recovered, and used."

[4] Examples of other principles might include (1) phrase-final fall (where pitches at the ends of phrases tend to exhibit a downward contour); (2) a preponderance of small intervals, in particular repeated pitches, which are common except when it is impossible to re-articulate notes (such as with the bagpipe); and (3) the association of repeated text with repeated melodic passages (which facilitates memory).

[5] There may be other factors that also favor placing the melody in the upper-most voice. However, auditory masking appears to play the most significant role.


Arnold, M. (1969).
Fantasy for B flat Trumpet (Opus 100). London: Faber Music Ltd.
Baily, J. (1985).
Musical structure and human movement. In: I. Cross & R. West (Eds.), Musical Structure and Cognition. London: Academic Press, pp. 237-258.
Balay, G. (n.d.).
Prélude et ballade. Cornet Solo with Piano Accompaniment. New York: Belwin Inc.
The Benedictines of Solesmes (Eds.). (1963).
The Liber Usualis. Tournai, Belgium: Desclée Company.
Bowen, J.A. (1993-4).
A computer aided study of conducting. Computing in Musicology, 9: 93-103.
Bruner, C.L. (1984).
The perception of contemporary pitch structures. Music Perception, 2(1), 25-39.
Caplin, W. (1978).
Der Akzent des Anfangs: Zur Theorie des musikalischen Taktes. Zeitschrift für Musiktheorie, 8, 17-28.
Clarke, E.F. (1989).
Mind the Gap: Formal structures and psychological processes in music. Contemporary Music Review, 3, 1-13.
Clarke, H.L. (n.d.).
Stars in a Velvety Sky. Solo B-flat cornet. New York: Carl Fischer.
Collins, D. & Huron, D. (in press).
Voice-leading in Cantus Firmus-based canonic composition: A comparison between theory and practice in renaissance and baroque music. Computers in Music Research.
Densmore, F. (1929).
Pawnee Music. Washington, DC: Smithsonian Institution, Bureau of American Ethnology.
Drake, C.W., Dowling, W.J. & Palmer, C. (1991).
Accent structures in the reproduction of simple tunes by children and adult pianists. Music Perception, 8, 315-334.
Farnsworth, P.R. (1948).
Sacred cows in the psychology of music. Journal of Aesthetics and Art Criticism, 7(1), 48-51.
Francès, R. (1958/1988).
La Perception de la Musique. Translated by W.J. Dowling as The Perception of Music. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Gibson, D. (1986).
The aural perception of nontraditional chords in selected theoretical relationships: A computer-generated experiment. Journal of Research in Music Education, 34(1), 5-23.
Gibson, D. (1988).
The aural perception of similarity in nontraditional chords related by octave equivalence. Journal of Research in Music Education, 36(1), 5-17.
Gibson, D. (1993).
The effects of pitch and pitch-class content on the aural perception of dissimilarity in complementary hexachords. Psychomusicology, 12(1), 58-72.
Harwood, D.L. (1976).
Universals in music: a perspective from cognitive psychology. Ethnomusicology, 20, 521-533.
Hindemith, P. (1940).
Sonate für Trompete in B und Klavier. Mainz: B. Schott's Söhne (ED 3643).
Holland, B. (1999).
When composers make it hard, fright and strain become muses. New York Times, June 1, 1999; p.B1f.
Huron, D. (1990).
Review of Mary Louise Serafine, Music As Cognition: The Development of Thought in Sound. Psychology of Music, 18(1), 99-103.
Huron, D. & Berec, J. (1993).
The influence of performance physiology on musical organization: A case study of idiomaticism and the B-flat valve trumpet. Unpublished manuscript.
Huron, D. & Royal, M. (1996).
What is melodic accent? Converging evidence from musical practice. Music Perception, 13(4), 489-516.
Kippen, J. & Bell, B. (1989).
The identification and modelling of a percussion "language", and the emergence of musical concepts in a machine-learning experimental set-up. Computers & Humanities, 23(3), 199-214.
Krumhansl, C.L. (1990).
Cognitive Foundations of Musical Pitch. Oxford: Oxford University Press.
Krumhansl, C.L. (1995).
Music psychology and music theory: Problems and prospects. Music Theory Spectrum 17(1), 53-80.
Lannoy, C. (1972).
Detection and discrimination of dodecaphonic series. Interface, 1, 13-27.
Largent, E.J. (1972).
An investigation into the perceptibility of twelve-tone rows. Ohio State University, unpublished PhD Dissertation.
Lewin, D. (1986).
Music theory, phenomenology, and modes of perception. Music Perception, 3(4), 327-392.
Lomax, A. (1962).
Song structure and social structure. Ethnology, 1, 425-451.
MacKenzie, C.L. & Iberall, T. (1994).
The Grasping Hand. Amsterdam: North-Holland (Elsevier Science).
Millar, J.K. (1984).
The aural perception of pitch-class set relations: A computer-assisted investigation. North Texas State University, unpublished PhD dissertation.
Nam, U. (1998).
Pitch distributions in Korean Court music: Evidence consistent with tonal hierarchies. Music Perception, 16(2), 243-248.
Neisser, U. (1967).
Cognitive Psychology. New York: Appleton-Century-Crofts.
Pedersen, P.R. (1970).
The perception of musical pitch structure. University of Toronto, unpublished PhD Dissertation.
Roederer, J.G. (1987).
Why do we love music? A search for the survival value of music. In: R. Spintge & R. Droh (Eds.), Music in Medicine. Heidelberg: Springer Verlag.
Serafine, M.L. (1988).
Music As Cognition: The Development of Thought in Sound. New York: Columbia University Press.
Squire, C.R. (1901).
Genetic study of rhythm. American Journal of Psychology, 12, 546-560.
Sudnow, D. (1978).
The Ways of the Hand; The Organization of Improvised Conduct. Cambridge, Massachusetts: Harvard University Press.
Sudnow, D. (1979).
Talk's Body. New York: Alfred A. Knopf, Inc.
Temperley, D. (2000).
The question of purpose in music theory: Description, suggestion, and explanation. Current Musicology, in press.
Thomassen, J. (1982).
Melodic accent: Experiments and a tentative model. Journal of the Acoustical Society of America, 71, 1596-1605.
Thomassen, J. (1983).
Erratum. Journal of the Acoustical Society of America, 73, 373.
Thrall, B. (1962).
The audibility of twelve-tone serial structure. Ohio State University, unpublished PhD dissertation.
Tuzin, D. (1984).
Miraculous voices: the auditory experience of numinous objects. Current Anthropology, 25, 579-596.
Vos, P.G. & Troost, J.M. (1989).
Ascending and descending melodic intervals: statistical findings and their perceptual relevance. Music Perception, 6(4), 383-396.
Watt, H.J. (1924).
Functions of the size of interval in the songs of Schubert and of the Chippewa and Sioux Indians. British Journal of Psychology, 14, 370-386.
Weber, M. (1958).
The Rational and Social Foundations of Music. Carbondale: Southern Illinois University Press.
Woodrow, H. (1911).
The role of pitch in rhythm. Psychological Review, 18, 54-77.
Yung, B. (1984).
Choreographic and kinesthetic elements in performance on the Chinese seven-string zither. Ethnomusicology, 28, 505-517.