Scarlet & Grey
Ohio State University
School of Music


Conclusion

The main theoretical points of this study can be summarized as follows:

  1. The ability to anticipate future events is important for survival. It is reasonable to assume that evolution by natural selection has shaped perceptual and cognitive systems so that they endeavor to anticipate future events. "All brains are, in essence, anticipation machines" (Dennett, 1991; p.177).

  2. It is possible to form relatively accurate expectations only because real-world environments exhibit structure and are not totally chaotic.

  3. Some expectations are formed through conscious thought or reflection, as when a knowledgeable jazz listener anticipates a drum solo following a bass solo. However, most expectations are unconscious, automatic, and ubiquitous. We cannot "turn off" the mind's tendency to anticipate events, and we are usually unaware of the mind's disposition to make predictions. Except when we are surprised, or when the outcomes are important, we may not be cognizant of the specific predictions our minds make.

  4. Minds are disposed to anticipate all types of stimuli -- even those stimuli (like music) which appear to be unimportant for survival.

  5. Theoretically, expectations might have exclusively innate or learned origins. When an environment remains stable over millions of years, it is possible for efficient innate expectations to evolve. In hearing, innate functions are evident in such auditory reflexes as the orienting response. However, when an environment is highly variable, the capacity to form expectations through learning provides a better evolutionary strategy (Baldwin, 1896).

  6. The auditory environments in which humans evolved appear to have been highly variable. Sounds that in one context might indicate danger, might, in another context, indicate opportunity. Given the great variety of auditory contexts in human experience, it should not be surprising that the existing research implicates learning as the preeminent source of auditory expectations.

  7. Ideally, the principles underlying expectations would precisely reflect the actual principles that cause the environment to be a particular way (i.e., Shepard's complementarity).

  8. Whether innate or learned, expectations can be formed through exposure to an environment. Expectations arise through a process of induction, in which generalizations are formed from a finite number of specific experiences.

  9. Since inductive inference is known to be fallible, the generalizations formed through listener experience are also fallible. That is, the principles underlying expectations are likely to be imperfect approximations of the actual principles shaping the world (von Hippel, 2002).

  10. For a broad sample of melodies, several simple principles have been identified that appear to underlie the objective organization. One principle is the tendency for successive pitches to be relatively close. Experienced listeners appear to form an appropriate expectation for pitch proximity. A second principle is for pitches to exhibit a central tendency. A mathematical consequence of central tendency is the phenomenon of regression-to-the-mean. However, experienced listeners do not form an appropriate expectation for melodic regression. Instead, experienced listeners expect post-skip reversal -- which is an approximation of melodic regression. A third principle is that large intervals tend to ascend. The more common repercussion is that small intervals tend to descend. However, experienced listeners do not form the appropriate expectations. Instead, experienced listeners expect step-inertia -- which appears to arise from a combination of the tendency for pitch proximity, and the tendency for intervals to descend.

  11. In a stable environment, the most frequently occurring events of the past are the most likely events to occur in the future. A simple yet optimum inductive strategy is to expect the most frequent event. The simple frequency of isolated events ("zeroth-order distribution") forms the foundation for learned expectations.

  12. An example of frequency-dependent learning in music is listener sensitivity to the distribution of scale degrees as documented by Krumhansl and elaborated by Aarden.

  13. In addition to zeroth-order frequencies, listeners are also able to learn contingent frequencies of neighboring or co-occurring events. The distance separating contingent events can range from immediate neighbors to long-range relationships. In addition, contingent probabilities can be influenced by the number of prior events that combine to influence a particular ensuing event. These probability "frames" can range from a single preceding event (first-order probability), to many preceding events (higher-order probabilities).

  14. An example of contingent-frequency learning in music can be found in scale-degree successions, such as the tendency for chromatic tones to be anchored to neighboring diatonic tones.

  15. Expectations provoke emotional responses. Three response categories can be distinguished: (1) responses that precede the outcome (anticipatory affective responses), (2) responses evoked by the outcome itself (secondary affective responses), and (3) responses related to the accuracy of the expectation (primary affective responses). A positively valenced primary affect ensues when an expectation proves accurate, whereas a negatively valenced primary affect ensues when an expectation proves inaccurate.

  16. Expectations that prove to be correct represent successful mental functioning. Successful anticipations help us prepare appropriate motor responses, inhibit or suppress inappropriate responses, and better perceive ensuing stimuli. Successful expectations evoke a primary affective reward.

  17. Successful expectations can be measured. When a person's expectations are correct, they will be faster and more accurate in processing information related to the expectation. Accurate expectations can be regarded as functionally equivalent to perceptual priming.

  18. Expectations that prove to be incorrect represent failures of mental functioning. Unsuccessful expectations evoke a primary affective punishment in the form of stress.

  19. Stress is also evoked under situations of high uncertainty. That is, stress can ensue when we already anticipate that we will fail to anticipate events (negative anticipatory affect).

  20. Since successful predictions evoke a positive primary affective response, we may mistakenly attribute the positive feelings to the outcome itself. That is, we may prefer a predicted outcome.

  21. In addition, if we repeatedly make successful predictions for a given outcome, then the predicted outcome can itself become associated with the positive feelings.

  22. Since we are more likely to successfully predict high frequency events, it is high frequency events that tend to become associated with the primary affective reward that accompanies successful prediction. Over time, we come to prefer the high frequency events (expectancy effect).

  23. An example of the expectancy effect in music is the phenomenon of tonality. Once a tonal center is established, the listener will experience the tonic stimulus as more pleasant or preferable to other states.

  24. Another example of the expectancy effect is found in the phenomenon of meter. Once a metrical context is established, the listener will experience events that occur at the most expected moments to be more pleasant or preferable to other states.

  25. [Closure and Stability]

  26. While expected events are generally preferred, highly predictable environments can lead to reduced attention and lowered arousal -- often leading to sleepiness.

  27. Apart from the simple frequency of occurrence, we are also sensitive to the co-occurrences of various events. That is, we form expectations based on conditional probabilities.

  28. Most conditional probabilities reflect short-range moment-to-moment contingencies, as when one note tends to immediately follow another. However, long-range conditional probabilities may also be formed -- provided such long-range structures exist in the environment.

  29. Expectations can be learned dynamically. That is, listening to a passage can help listeners form expectations that arise uniquely from the immediately preceding experience.

  30. Regularities in the world are often evident only in particular contexts or environments. It is important for an organism to learn to distinguish these different environments, and to protect learned expectations within each context from the undue influence of learned associations that pertain to a different context (Cosmides & Tooby, 2000).

  31. Such cognitive firewalls permit listeners to distinguish different kinds of musical experiences. Learned expectations can be segregated into different expectational sets or "schemas."

  32. Due to lack of experience or possible cognitive deficits, it is possible that a listener fails to distinguish two forms of musical experience that other listeners experience as distinct kinds. A given listener might consequently experience a musical genre in a unique or idiosyncratic manner.

  33. Complex stimuli may unfold in an invariant way, as when we hear the succession of pitches of Happy Birthday. In this case we form veridical expectations -- given these eight notes, the ninth note will undoubtedly be ...

  34. Veridical expectations do not suppress the effects of schematic expectation (Bharucha). Schematic expectations are tenacious. This explains the apparent paradox of how some events can be simultaneously surprising and unsurprising. For example, a wholly expected deceptive cadence doesn't entirely lose its "deceptive" character.

  35. Schemas may include prediction rules, such as the rule that successive tones tend to be close in pitch. These rules arise because they are broadly successful in their predictions (though not infallible). Some prediction rules are sub-optimum. An example is the rule for post-skip reversals. This rule is generally successful in its predictions; however, the rule merely approximates a more fundamental property of musical structure, namely that melodies tend to be constrained in their ranges. A regression-to-the-mean rule would allow listeners to better predict successive melodic pitches; however, listeners appear to learn the less accurate post-skip reversal prediction rule.

  36. Expectations rely on underlying mental representations. Representations might include absolute pitch, pitch-class, scale degree, interval, contour, etc. Several representations may operate concurrently in the forming of expectations. It appears that not every listener has access to all of these representations. For example, people with absolute pitch are able to code events and expectations according to absolute pitch. A major difference between people who have AP and those who don't is that AP possessors heard musical works in early life that were always in the same key, whereas non-AP possessors typically experienced musical works in a multitude of keys. It is possible, as argued by Abramson at the beginning of the twentieth century, that the practice of singing songs in different keys reduces the value of coding absolute pitch, so that pitch height loses its predictive value for some listeners -- leading them to ignore pitch height information.

  37. Since more than one representation may be involved in forming expectations, an expectation may be mixed. For example, one element (such as pitch) may be highly unexpected, whereas another element (such as onset time) may be highly expected.

  38. When the circumstances are appropriate, listeners may come to expect the unexpected. That is, a sort of "reverse psychology" may arise. Twelve-tone music has been shown to be organized in a manner consistent with such reverse psychology.

  39. Paradoxical expectations can arise when schematic and veridical expectations differ.

  40. Different listeners may have different expectations. Individual differences may be attributable to five possible sources. (1) Listeners may differ in their underlying representation codes. For example, one listener may favor an absolute pitch representation, whereas another listener favors a scale degree representation. (2) Listeners differ in their exposure to music, and so some listeners may have had less opportunity to develop appropriate schemas. (3) A listener may fail to distinguish expectational sets that may be appropriate for different genres of music. For example, as Krumhansl has shown, a listener may continue to apply a tonal schema to an atonal listening experience. (4) Listeners may differ in the accuracy of the prediction rules. For example, it is theoretically possible that a listener experiences melodic contours in accordance with the regression-to-the-mean rule rather than the post-skip-reversal rule. (5) It is theoretically possible that existing schemas may prevent a listener from distinguishing a separate schema. For example, a hypothetical scale schema `B' might interfere with the acquiring of a similar (yet distinct) schema `A'. A listener who acquires schema `A' first may retain the ability to acquire schema `B', whereas a listener who acquires schema `B' first may be incapable of acquiring schema `A'. For example, Meyer (1956; p.46) cites Fox Strangways, who claims that some Indian music uses a scale that is very similar to the Western major scale, yet the "tonic" pitches do not coincide. The Western listener may therefore hold expectations that are wholly inappropriate to the Hindustani music (Fox Strangways, 1914; p.18).

  41. In addition to the primary affective response (which reflects the accuracy of the expectation), a listener can experience a secondary affective response that reflects the appraised value of the outcome state. Positive outcomes evoke positive secondary emotions and negative outcomes evoke negative secondary emotions.

  42. Primary and secondary affective responses interact. Highly predictable outcomes evoke less response than highly unpredictable outcomes. For example, an unexpected positive outcome will feel better than a highly expected positive outcome. Similarly, an unexpected negative outcome will feel worse than a highly expected negative outcome. In effect, increased uncertainty tends to amplify the aggregate affective response.

  43. The delaying of an outcome has the effect of decreasing its certainty. Consequently, delay amplifies the aggregate affective response. The effect of delay is most marked when events seem to be most certain.

  44. Many performance and compositional techniques can be regarded as efforts to delay expected outcomes. Such delaying techniques tend to be used in the most stereotypic musical passages.

  45. The fact that learning plays a preeminent role in forming expectations, in addition to the fact that expectations can adapt dynamically to ongoing stimuli, suggests that there exist considerable opportunities to craft a range of musics for which listeners may form appropriate expectations.
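
The frequency-based learning summarized in points 11 through 14 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the method used in this study: the function names (`zeroth_order`, `first_order`, `most_expected`) and the short scale-degree sequence `melody` are hypothetical and were invented for this example. The sketch tallies the simple frequency of isolated events (zeroth-order) and the contingent frequency of each event given the single preceding event (first-order).

```python
from collections import Counter, defaultdict


def zeroth_order(events):
    """Relative frequency of each isolated event (zeroth-order distribution)."""
    counts = Counter(events)
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}


def first_order(events):
    """Relative frequency of each event given the single preceding event."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(events, events[1:]):
        followers[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
        for prev, counts in followers.items()
    }


def most_expected(distribution):
    """The inductive strategy of point 11: expect the most frequent event."""
    return max(distribution, key=distribution.get)


# Hypothetical toy sequence of scale degrees (1 = tonic, 7 = leading tone).
# Invented for illustration; it is not data from the study.
melody = [1, 3, 2, 1, 7, 1, 2, 3, 2, 1]

isolated = zeroth_order(melody)
contingent = first_order(melody)

print(most_expected(isolated))       # most frequent scale degree overall
print(most_expected(contingent[2]))  # most frequent continuation after degree 2
```

In this toy sequence the tonic is the most frequent isolated event, while the continuation that is most expected after a given scale degree depends on that degree. Higher-order probabilities would condition on longer preceding contexts in the same manner.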

A number of questions remain to be addressed in future research concerning musical expectations. Perhaps the premier unresolved question concerns the nature of the mental representations that underlie musical expectations. What do listeners expect? Do they expect intervals, pitches, pitch-classes, scale degrees, scale degree successions, contours, rhythms, pitch-rhythms, etc.? The existing research provides evidence that mental representations for music consist of a complex combination of musical elements. There is also evidence that different listeners may make use of different representations.

Under what circumstances are new expectational sets formed? That is, when will the auditory system erect a cognitive firewall to allow the formation of a new music-related schema? Is it possible for past listening experiences to prevent a listener from forming a new musical schema? Is it possible, for example, with the right regime of musical exposure, for a modern listener to form a truly "medieval" way of hearing early music?

Finally, what types of musical structures or principles of organization will fail to evoke appropriate learning?

Footnotes

[1] The tonic is the most common pitch only for tonal music that does not contain modulations.

[2] A survey of European folksongs indicates that melodies in major keys are roughly twice as common as melodies in minor keys. This suggests that even the choice of initial schema may be sensitive to the frequency of occurrence of various contexts.