Action and Perception in Rhythm and Music edited by Alf Gabrielsson

Reviewed by David Huron

Psychology of Music, Vol. 16, No. 2 (1988) pp. 156-162.
Alf Gabrielsson, Action and Perception in Rhythm and Music; Papers given at a symposium in the Third International Conference on Event Perception and Action. Stockholm: Royal Swedish Academy of Music, 1987, No. 55; ISBN 91-85428-51-5 (with recorded disk).

In many ways this volume is a second testimonial collection to one of the modern proponents of rhythmic research, Ingmar Bengtsson. [Footnote 1] This volume marks the retirement of Bengtsson as head of the Department of Musicology at Uppsala University -- a post which he has held for a quarter of a century. As with any collection of this sort, the thirteen papers contained within its soft covers are highly varied in quality, and reflect some of the growing pains as music scholarship attempts to accommodate empirically-inspired research within its ranks. The study of rhythm is a particularly apt concern; there is a need to redress the relative imbalance of studies of temporal structure compared with studies of pitch structure. Moreover, as Alf Gabrielsson points out, timing is the one common factor which is available in all music-making endeavors -- independent of the types of resources provided.

After the Preface by Alf Gabrielsson and a historical review by Paul Fraisse, Eric Clarke continues the volume with a valuable paper demonstrating the categorical nature of rhythmic perception. This is a welcome extension to studies by Fraisse (1956) and Povel (1981) which have shown the primacy of small integer proportions in the perception and cognitive representation of rhythms. Clarke follows classic criteria laid out by Studdert-Kennedy, Liberman, Harris and Cooper (1970) for demonstrating that a class of perceptions is categorical in nature. Although the temporal dimension may be physically continuous, there are no `shades of grey' when it comes to the perception of simple and compound meters. Metrical contexts dispose listeners to classify rhythms in a manner which most reconciles the rhythm to the prevailing metrical context (simple or compound). Not surprisingly, there is a sort of mental inertia which tends to conserve the perception of meter -- even in the face of ambiguous or conflicting stimuli.

Clarke places his experimental results in a theoretical context which distinguishes two types of rhythmic information: (1) the metric category of a passage which he equates with the domain of structure, and (2) deviations from strict metricality which he identifies as the "expressive" domain.

"The separation of temporal information into a domain of structure and a domain of expression resolves the apparent paradox that small whole number duration ratios are the simplest to perceive and reproduce, but that real human performances do not conform to these integer proportions." (p.31).

Moreover, Clarke explicitly aligns the structural domain with categorical perceptions, and the expressive domain with continuous perceptions. His experimental work nicely shows that metricality conforms to the criteria for categorical perception. However, there is no similar empirical support for his complementary claim that the expressive information is perceived as continuously variable. This may or may not be the case, since there is evidence both pro and con. Neil Todd's model of metrically-linked rubato shows that deviations from metricality are stereotypically patterned, while the work of Kronman and Sundberg (see later in this review) shows that rallentandos are also stereotypic. Jazz drummers make a distinction between `hot' or `push' drummers and `cool' or `lag' drummers on the basis of deviations from strict metricality. These facts suggest that interpretive gestures may be discrete and/or may be perceived categorically -- independent of the perception of the metrical context.

In various forms, the "structure/expression" model articulated in Clarke's paper has become the most common view of musical rhythm. It has an intuitive appeal, partly because it corresponds well with traditional views in western music. But there are certain problems with this model which ought to be explicitly acknowledged. In the first instance, I think it is prudent to be cautious about any iron-clad distinction between the basic (written) musical gestures and expressive or `interpretive' transformation of these gestures. This distinction may be culture-bound, and may arise simply from the division of labor between composer and performer in western art music. Not all performing arts make such a distinction. For example, improvising dancers do not normally conceive of their work in terms of basic gestures which are then embellished or transformed by interpretive nuances. Most dancers view the gesture and the interpretation as inseparable and equivalent. Clarke's formal distinction between "a domain of structure and a domain of expression" implies that the metrical context has no expressive qualities. Accordingly, a march (simple-duple) and a gigue (compound-duple) could not be considered to differ with respect to musical expression. But surely this begs the question as to what constitutes musical expression. Few people would object to the characterization of a waltz meter by an expressive term such as "graceful" and the characterization of a rumba by an expressive term such as "sleazy". Another difficulty is that tempo must be discounted as merely a structural rather than expressive component; this suggests that slowing down a disco beat would not change its expressive content. Moreover, wouldn't it be better to characterize "metrically dead-pan" or mechanical performances as "staid" or "awkward" rather than as "neutral" or "devoid of expression"?
I am sure that Clarke would agree that metrical context is not merely a neutral wardrobe within which the expressive content hangs, but that the metrical context directly contributes to what we might agree to call the expressive content of a work. Can we unequivocally say that a composer chooses a particular metrical genre purely for "structural" reasons rather than "expressive" reasons?

There are other good reasons to be skeptical of any easy distinction between form and content. Logically, there is little reason why the roles cannot be reversed. For example, rather than viewing the gestures produced by performers as "interpretations", it is equally possible to view them as stereotypic forms by which the expressive content (the "notes" or structure of the work) is conveyed. Do notes convey gestures or do gestures convey notes? What is to be gained by either of these dichotomies?

This issue of expression-as-deviation arises in a number of other papers in this volume. Bengtsson admits that "deviation" is a slippery concept when one is unsure of what constitutes the norm. We use a mechanical norm as the temporal ruler only because we can think of no alternative. The problem gains poignancy in Peter Reinholdsson's paper on jazz rhythm. Reinholdsson asks: what justifies comparing performance timings to an abstract (notationally-derived) notion of strict metricality when the music is not notated? What is the legitimate template against which performance "deviations" are compared? Unfortunately, Reinholdsson's paper remains somewhat vague and descriptive, and I feel Reinholdsson makes the problem of a temporal ruler sound more intractable than it is. There are numerous studies concerning the perception of the passage of time which might provide alternative points of departure. And while improvised jazz may not be notated, there is no reason not to use that quintessentially jazz listening response -- toe-tapping -- as a clock against which systematic variations may be examined.

Such heady issues are abandoned for more practical concerns in a paper by Piet Vos and Stephen Handel concerning the production and perception of triplets and duplets. Vos and Handel had subjects perform various triplet/duplet rhythms and then tested the same subjects on the perception of synthesized and re-synthesized renditions. In the perceptual task, subjects were instructed to judge the more "natural" sounding stimuli. Unfortunately, the conclusions of this paper are highly equivocal; the experiments produced enigmatic results such as the fact that the subjects did not necessarily judge even their own performed versions as sounding more natural than various transformations.

Judging stimuli according to "naturalness" is an unfortunate instruction. The obvious tendency is to assume that "natural-sounding" stimuli would also be "preferred" in some sense. But one cannot claim that any systematic result would be correlated with musical preference without further research linking the two concepts. What Vos and Handel have shown is that non-metrically exact triplets and duplets are perceived as sounding more "natural" than metrically exact renditions. But this is hardly surprising, for if by "natural" we mean non-mechanical or irregular, it would make sense that subjects would not choose regular patterns as the most natural sounding. Again, we must remind ourselves that the subjects were not supposed to judge on the basis of preference. But in the end, even the authors slip into this misconception:

"... the changes found in production do not necessarily yield better sounding rhythms" (p. 46)

"... we can suggest that playing doublets and triplets accurately is not preferred." (p.46)

If the original intention of the experiment was to elucidate rhythmic preferences, then there are other problems with the experiment apart from the problem of using "natural" as a synonym for "preferred". A more serious methodological problem is the choice of subjects. Vos and Handel used non-professional musicians (which in itself is fine), but then reported that in the production task for the first experiment "many [subjects] could not accurately tap the rhythms" (p.35), and in the second experiment (using different subjects) that "half of our subjects could not maintain an alternating 2/beat and 3/beat rhythm" (p.42). The authors add:

"We might expect that professional musicians would be more able to play the alternating rhythms. It is doubtful, however, that the results would be qualitatively different." (p.42)

Such statements are not apt to inspire confidence in the reader. For most auditory studies, experimenters may legitimately go out into the street and commandeer anything adorned with ears (barring cats and dogs) as subjects; but such a laissez-faire attitude in music research cannot be sanctioned. When systematic results are absent from data, one of the first things the music experimenter should question is the skill level of the subjects. This is not to suggest that more skilled subjects would necessarily produce more systematic data; it is merely to recognize that music is a skilled behavior, and results are apt to be correlated with skill level.

On the question of musical measurement, Gerald Balzano contributes a paper that is both thoughtful and passionate. He attacks "the assumption that traditional physics, with its particular arsenal of measuring instruments, provides an exhaustive description or even privileged units of analysis for the real world." (p.179) Balzano ardently pleads for a "return [of] the reckoning of musical units and structures back to musicians, where it rightfully belongs" (p.198). Balzano places particular stress on measurements of pitch:

"The protest of some musicians that "in tune is what the player says is in tune", has been derided by scientists as the woolly-headed thinking of artists." (p.179)

The point carries an obvious truth, and bears being constantly reiterated. But I fear Balzano is being unduly harsh on scientists by virtue of failing to be equally critical of those within our own hallowed ranks of musicians. I can think of no more polemical and moralising literature based on physical rather than psychological principles than the ten centuries of musical writings concerning tuning and temperament. Harry Partch would share the derision of Balzano's hypothetical scientist against musicians satisfied with their own tuning. Where scientists are in danger of drifting into physicalist measurements devoid of any cognitive or perceptual reality, musicians have shown a similar penchant to drift into empty formalisms.

The problem Balzano identifies is exemplified in the frequent tendency to regard metrical structure as a recursively-organized hierarchy. In Mari Riess Jones's article, this hierarchy is given a central place. In order for recursive hierarchies to make sense in perceptual theory, all levels of the hierarchy must be qualitatively equivalent from the perceptual point of view -- that is, all hierarchical levels must elicit the same class of phenomenal experience. Music perception may be hierarchically ordered, but in order for the hierarchy to be recursive, all levels must maintain identical properties.

The pitfalls of this mode of thinking are vividly illustrated in the writings of Karlheinz Stockhausen. By manipulating a square-wave oscillator, Stockhausen discovered that as the frequency descended into the subsonic range, pitched tones were transformed into rhythms. Stockhausen extended this simple observation into a full-blown music theory of truly cosmic proportions. He has argued that the epicyclic periodicities of music extend through the complete spectrum of time-frames from the "big bang" to ultra-violet light and X-rays.

As a formalism, this cosmic perspective is not without interest. However, as a description of perceptual reality it is fatuous. As Balzano would point out, we must keep our understanding of physical origins from coloring our understanding of perceptions. All perceptions share the same origin in the electro-chemical stimulation of the nerves: there is nothing qualitatively different in the activation of the auditory nerve versus the optic nerve. But the perceptual and phenomenal qualities elicited by these two stimulations could hardly be more different. Loudness, saltiness, brightness, pain, and color are qualitatively different experiences despite their common neurological origin. Even within the domain of audition, there arise innumerable distinct mental phenomena -- such as loudness and pitch. As an oscillator is swept into the subsonic region, despite the constancy of the physical phenomenon, the associated mental experience is that of a pitch which is abruptly superseded by a rhythm. Any theory which unifies pitch and rhythm under a single description may be an adequate formal or physical theory, but is not a good perceptual theory, for it blithely discounts a distinction which minds readily make.

From a physical point of view, pulse/beat, meter, sub-phrase, etc. all exist within a single uniform hierarchical temporal system. But we have little evidence that this hierarchy exhibits the same uniformity from a perceptual point of view. For example, musicians might agree that when a meter is increased in tempo it begins to sound more like a beat. Such a statement should not be construed to mean that meter and beat are really part of the same phenomenal continuum -- no more than one should construe pitch as being merely speeded-up rhythm. What such a statement does suggest is that there are definite bounds within which periodic energies are perceived as a beat -- as distinct from a meter. Gregorian chant suggests that it is possible to have musical phrases without underlying beats or meter; thus chant literally exposes a hole in recursive/hierarchical thinking about rhythm. In the absence of evidence showing that beat, meter, and phrase are qualitatively similar percepts, we ought to avoid theories which equate these concepts a priori under a uniform or recursive hierarchy.

Having criticized the notion of recursive hierarchies in perceptual theory, let me at once say that I don't think such concepts are entirely out of place in music theory. Even assuming that music could be explained purely in terms of human psychology, music psychology cannot be equated solely with perception, since there are cognitive, emotional, social and other dimensions to human experience. For example, the prominent pitch-to-rhythm glissando near the mid-point of Stockhausen's Kontakte is a structural gesture which represents a colossal unifying principle in the world. The glissando has an intellectual or symbolic meaning that positively contributes to the depth of musical experience. Some ideas have a transcendent appeal -- the beauty of which may successfully overshadow their erroneous premises. Stockhausen may offend physicists by incorrectly conflating acoustic frequencies with electro-magnetic frequencies; and he may offend perceptual psychologists by incorrectly conflating pitch and rhythm; but he may nevertheless manage to put together a brilliant and moving work. We have a great deal to learn about how perception interrelates with historical, formal, symbolic, social, and other factors.

For all of the above criticisms, ultimately I am endeared to this volume because of the presence of a delightful paper entitled "Is the musical ritard an allusion to physical motion?" by Ulf Kronman and Johan Sundberg. This is an elegantly crafted paper which is destined to be a classic. The argument is simplicity itself. Kronman and Sundberg take data from 24 recorded ritardandos and compare them to a model of physical deceleration. The match is excellent. As in any good paper, the authors' discussion takes the reader along a series of paths and possible caveats -- methodically defusing hypothetical objections. There can be no doubt that what musicians mean by "slowing down" is precisely equivalent to what Isaac Newton meant by the same phrase. It suggests that the end of a musical work is an "arrival" in a very literal sense, and reinforces the experience of rhythm as a mental analog of locomotion. This paper is one of the best accounts linking a characteristic musical phenomenon to a naturalistic explanation.
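
For readers who want to see what a comparison of this kind might look like, here is a minimal sketch in Python. It assumes the simplest textbook case: a body that decelerates at a constant rate from an initial speed until it stops, so that speed as a function of distance follows v(x) = v0 * sqrt(1 - x/X), where X is the stopping distance. Tempo plays the role of speed and score position the role of distance. The function names and the idea of scoring the fit by root-mean-square error are illustrative assumptions of mine, not the formulation or the procedure used by Kronman and Sundberg.

```python
import math


def tempo_under_constant_deceleration(v0, stop_distance, x):
    """Speed at position x for a body starting at speed v0 and coming to rest
    at stop_distance under constant deceleration.

    From v^2 = v0^2 - 2*a*x with a = v0^2 / (2 * stop_distance).
    Here v0 stands for the tempo before the ritard and x for score position
    (for example, in beats); both mappings are assumptions of this sketch.
    """
    if stop_distance <= 0:
        raise ValueError("stop_distance must be positive")
    if not 0 <= x <= stop_distance:
        raise ValueError("x must lie between 0 and stop_distance")
    return v0 * math.sqrt(1.0 - x / stop_distance)


def rms_error(observed, predicted):
    """Root-mean-square difference between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def model_curve(v0, stop_distance, positions):
    """Model tempo at each score position in the ritard."""
    return [tempo_under_constant_deceleration(v0, stop_distance, x)
            for x in positions]
```

Given measured tempi at known score positions in a recorded ritard, one could call `model_curve` with the pre-ritard tempo and the length of the ritard, then pass the measured and modeled values to `rms_error`; a small error relative to the tempo range would indicate a close match to the constant-deceleration curve.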


David Huron
Ohio State University


References

Fraisse, P. Les structures rythmiques. Louvain: Editions Universitaires, 1956.

