Chaos and Music Reproduction
Benoit Baald
The end of musical reproduction is the re-creation of a musical event in a spatio-temporal location different from the one in which it occurred. The means to this end are the recording process (and devices), which stores the event for later recall, and the playback process (and devices), which transduces the stored event into a new aural reality that (hopefully) approximates the original very closely. Much of contemporary popular culture, however, seems to be ignorant of this notion of the purpose of a "stereo"; nowadays the original musical events to be reproduced are already reproductions themselves. Contemporary culture, to a large extent, lacks exposure to music with actual presence, i.e. living, breathing humans playing acoustical (1) instruments in an effort to communicate with music that which words cannot (e.g. emotion, soul). The "end" of musical reproduction for the masses, then, is to provide a rhythm for aerobics, to distract one from the boredom of one's existence, or to provide a sonic wallpaper of artificial ambiance to complement the artificial environments in which most of us live. Instead of judging the quality of reproduction by observing how closely the reproduction approximates the original, the criteria become the effectiveness of the system in boosting one's ego (due to size or cost) or how well the system performs on an irrelevant set of artificial tests.
There is a subculture, though, which does take music and its reproduction quite seriously--and literally, in that its members seek to reproduce the musical event. Members of this group are often referred to as "audiophiles", but this is a misnomer, the term implying a preoccupation with the paraphernalia of reproduction rather than the musical event itself; they refer to themselves as music-lovers or "tweaks"--a tweak being someone willing to patiently adjust one's system in order to extract from it the greatest degree of musical realism.
In order to better understand the tweak perspective, consider the nature of the musical event. Though others may describe a concert (or its reproduction) in terms of its notational aspects (melody, harmony, rhythm, instrumentation--all of which are scoreable), the coherence of the ensemble, and the emotional manipulation wrought by the experience, the tweak will expand this list considerably. Not only are categorical descriptions of instruments important--such as violin or piano--but individual characteristics of the instruments are as well, e.g. whether the violin is a Stradivari or a Guarneri, the piano made of rosewood or mahogany (each wood has its own resonant colorations). Also important is the sound of the hall or auditorium and the air within it (which may be a chaotic manifestation). Further, the size of the instruments is noted, and how that apparent size changes--"blooms" or expands--with dynamic changes of the music.
Lest it be thought that this type of critical listening is the exclusive domain of the tweak, or that sonic characteristics take precedence over appreciation of the performance, it must be pointed out that many nontweak concertgoers and music lovers pay attention to the full spectrum of aural and musical parameters; also, because critical listening is a learned technique, once proficient at it one can integrate it into the listening experience so that it is unobtrusive to the enjoyment of, and emotional connection with, the musical event.
When listening to reproduced music, however, the tweak is distinguished from other critical listeners by the fact that he expects as many aural-musical parameters as possible to be accurately reproduced in his listening room. More than the illusion that the musicians are in his room, he seeks to be aurally transported to the original concert or recording site.
Although the criteria used for evaluating reproduction equipment are numerous, and correlate to the previously mentioned criteria, they are unified under a single paradigm: the absolute sound. As Harry Pearson, editor of "The Absolute Sound", puts it, the absolute sound is the sound of live, unamplified music occurring in real space. This is the only acceptable reference for reproduced sound (for reasons that should be obvious--amplified music has no absolute, as it is already being reproduced by amplification devices); further, it implies that music reproduction is not a matter of taste, but of truth. Any deviation from the absolute in reproduced music is considered a coloration or distortion; because colorations or distortions don't occur with the absolute, the following types of distortion, coloration, and aspects of reproduced sound only have meaning within the context of the reproduction. Take, e.g., the concept of "soundstaging", which refers to a system's ability to portray the acoustic envelope--the boundaries of the recording site and the sound of the air and ambiance within it--accurately in three dimensions. This is effectively a parameter that differs with different reproduction systems, yet is constant and correct at the live event; attention will normally only be drawn to it when it is distorted, which only happens during reproduction. Though one could pay attention to the acoustic envelope at a live event, to talk about the soundstage of a particular hall would be meaningless, as it is not an (alterable) parameter within that context.
"Imaging" is a term which refers to the placement of individual sonic images within the soundstage. Imaging inaccuracies occur when the instruments seem too bunched together laterally or in depth, or if they are too spread out, leaving gaping holes. Or images may wander or drift when they should maintain placement stabillity.
"Tonal Balance" is a very fundamental aspect of reproduced music, referring to emphasis or de-emphasis of portions of the frequency spectrum. Typical tonal abberations include thinness, or lack of bass and lower midrange energy; nasality, a midrange coloration which makes singers sound as if they have a slight cold, or makes clarinets sound like oboes; brightness, or emphasis of high frequencies, makes music sound electronic--obviously reproduced. Tonal balance, especially in the midrange (where most musical energy occurs), is so important because instrumental identity--an instrument's sonic signature--is so severely affected by tonal colorations.
A system's portrayal of dynamics, or changes in loudness, is also of fundamental importance to a reproduction's degree of reality. A Mahler symphony, for example, has extremely quiet passages and passages of absolute bombast; if a system compresses the extremes of loudness, or distorts otherwise during volume extremes, a great deal of the impact of the performance is lost. On a smaller scale, the vibrato and breathing of a vocalist or saxophonist may be obscured if the system is incapable of resolving such minute changes in level. Transient and decay characteristics matter as well; they concern, respectively, the speed of a given note's rise to maximum volume and the speed with which the note fades to silence. E.g. a snare drum has a very sharp transient attack and a very distinctive, slightly longer decay. A system may be characterized as fast or slow, hard or soft, depending on its transient characteristics. Further, the pace and rhythmic feel of a piece may be affected by transient inaccuracies.
"Dimensionality" and "palpability" are some of the hardest aspects to reproduce, yet are of course taken for granted when listening to live music. They refer to the sense that instrumental images have thickness or body--that one can reach out and not only touch the images, but grab them.
"Musicality" is also a dificult aspect to get right, yet arguably the most important. A "musical" system is one that does not distort or diminish the emotional impact, connection, or nuances of a performance. A system which excells at minimizing the above distortions, yet which is amusical would probably be less compelling and realistic to a music lover than a very musical, yet otherwise colored system, which preserves the "magic" of a performance.
Taken collectively, the accurate reproduction of the above aspects of music results in an aural "virtual reality" which can be much more realistic than the computer-based virtual realities responsible for that buzzword. Additionally, accurate music reproduction systems are much more accessible to the public than virtual reality systems. Yet, for a variety of reasons, this "high-end" audio movement remains an underground phenomenon and is very controversial.
One reason high end audio (which isn't necessarily high-priced) hasn't attracted mass appeal has to do with human laziness. High end equipment generally lacks convenience features because they aren't necessary for the reproduction of music, they add to the price of a product--money which could be spent bettering the performance of the product--and they sometimes actually degrade the sound. Many people are also reluctant to take the time and energy needed to educate themselves about music and listening techniques; it seems to be a low priority item for them.
The main obstacle to the acceptance of the high end has to do with the "scientific" validity of its claims. The scientific establishment has yet to design a way to measure a system's ability to reproduce the emotional content of a performance, for example. The measurements used by the mainstream audio industry reveal amounts of different distortions (technical, not musical, ones), noise, and frequency response aberrations; these tests were of some use until the 1960s, when distortion and noise products became reducible to inaudibility, and systems became possible that revealed no measured frequency response aberrations that would be audible. Instead of congratulating themselves and moving on to different forms of analysis to improve audio equipment, the mainstream single-mindedly sought to reduce these distortions further, while not necessarily improving the products' accuracy. While new measurements are still being devised, it is (mostly) within a positivistic context, which music does not share.
Any tweak will surely affirm the sentiment that "people don't play recordings in order to view them on an oscilloscope, but to listen to music". Because so many aspects of music are not physically measurable, but are easily audible, the high end has devised its own methodology for determining consensus opinions about the accuracy of audio components. Harry Pearson calls this methodology "observational". A subjective evaluation would involve one person listening to a component in an uncontrolled environment for an unspecified period of time, then describing the sound. The evaluator may, however, be biased toward (or against) the particular brand of unit under test; his listening may be altered by mood or other psychological phenomena; he may not be familiar with the sound of live music; there may be an unusual interaction between the unit being evaluated and the rest of his system or room. An "objective" methodology, using mainly irrelevant physical measurements, has already been described. The observational methodology uses the rigor of the objectivist school and the apparatus of the subjectivist school--the ear/brain, the most sensitive device extant for evaluating sound, and possibly the only device, as sound can't exist without it (although signal and vibration can).
A typical agenda for observational evaluation includes: a multiplicity of independent, isolated evaluators who are very familiar with the sound of live music; different systems and environments in which to evaluate the unit; and a long enough period of time spent listening to the unit so that subtle colorations can become obvious as the evaluators' familiarity with the unit's effect on music grows. (Blind tests aren't used because insufficient time is allotted to grow familiar with a unit's subtleties; further, the stress levels entailed by such tests impede the relaxed listening state necessary for involved listening.) The result of such an evaluation is almost always a consensus among evaluators on the sonic character of a given unit.
Of extreme importance to the accuracy of musical reproduction, and the evaluation of that reproduction, is the accuracy of the recordings played through the reproduction system. The aural virtual reality mentioned earlier isn't as likely to occur with inferior recordings; because high end systems are designed to resolve as much information as possible, accurately, from a recording, an accurate system will let one know exactly how inaccurate a poor recording is.
Just as most high end units are built with the idea that the least amount of circuitry built with the highest quality parts will result in the least distortion of the musical signal (since everything the signal passes through will distort it somehow; no electronic device is perfect), the most successful recordings are made with a minimum of signal processing, effects, or other circuitry that would alter the musical signal as received by the microphones. Interestingly, it is the recording process which has created the most disagreement between high end and mainstream audio, namely over the merits of digital audio recording.
In order to understand the merits and failings of digital recording, let's first consider the fractal aspects of music (which are the hardest to record and reproduce) and how they are manifest aurally.
My favorite aspect of fractals has always been the boundary conditions within them. A fractal can be interpreted as an infinitely complex (functionally ambiguous) boundary between chaos and stability; I believe this boundary condition is mimicked in other dualistic systems as well. Consider the semantic dyad "yes/no". Just as areas of chaos in a fractal will appear to have islands of stability, and vice versa, until the scale is changed, so too can the term "yes" have some "no" in it, depending on what could be called semantic scaling, which is related to familiarity, intimacy and personal significance. E.g. one might find her lover's "yes" to have more "no" in it than an unattached observer would--because the scale of the term is different for each in significance. An illustrative metaphor of this concept is the Digital-Analog dichotomy. Digitally, "no" means "no" because it is an arbitrary signifier which can only refer to itself. Yet vocal inflections can reduce the definition of "no"--a reluctant "no" might seem to have some "yes" in it. Vocal inflection seems often to be analogous with intent, referring to the meaning which "no" tries digitally to signify. The important point is that the integration of the digital-analog aspects is necessary for understanding, even though definition might still be elusive.
One fractal aspect of music is that it's a system of ambiguous definition between the two sides of the analog-digital semantic dualism. The digital aspect concerns that part of music which can be symbolically represented, or scored; the analog aspect concerns how the score is played--the intent, inflection, and emotional nuance conveyed by a performer's interpretation of the score. We don't appreciate fractals aesthetically by seeking out just the chaotic or stable regions within them; we appreciate the lack of definition between the two regions, treating the fractal as an aesthetic whole. Similarly, when listening to music, one appreciates the whole musical event as a transcendence of definition between its digital and analog aspects, even though areas of one or the other may seem to leap out as definite "islands" depending on semantic scale.
There are more obvious ways in which music is fractal-ish or chaotic. Certain kinds of canonical formalism are melodically self-similar and change according to harmonic or rhythmic scale. So, a fundamental melodic line might be echoed contrapuntally with smaller note values. One could listen to a recording of such a piece and alter the playback speed so that at different speeds (temporal scalings) a given version of the echoed line would seem similar to the fundamental line. With enough such lines, the original temporal scale could be unintelligible without an external scaling reference. Or, if the line were repeated at higher pitches within the composition, a pitch scaling phenomenon would occur.
Rhythmic subdivision may also be quasi-fractal. A measure is usually subdivided into four quarter-notes. Each of these, however, is subdivided into two eighth-notes, which can be subdivided again, ad infinitum, or to the frequency which exhausts the musician's technique. (Subdivisions other than powers of 2 can be used, and are in other cultures; western music is predominantly based on powers of 2.)
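The repeated halving described above can be sketched numerically. This is only an illustration under stated assumptions: a 4/4 measure, and a hypothetical helper named `subdivide` that is not part of the original essay.

```python
from fractions import Fraction

def subdivide(levels, base=2):
    """Return note durations, as fractions of one 4/4 measure, per level.

    Level 0 is the quarter note (1/4 of the measure); each further level
    divides the previous duration by `base` (2 for the usual halving).
    """
    quarter = Fraction(1, 4)
    return [quarter / (base ** level) for level in range(levels)]

# Quarter, eighth, sixteenth, and thirty-second notes.
durations = subdivide(4)
for d in durations:
    print(f"1/{d.denominator} note: {int(1 / d)} per measure")

# A non-binary scheme (illustrative only): divide each note in three.
ternary = subdivide(3, base=3)
```

Each level has the same structure as the one above it, only at a smaller time scale, which is the self-similarity the paragraph points to. In practice the recursion stops wherever the performer's technique does.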
The waveforms of acoustic sounds also have fractal elements. Any sound will be composed of its fundamental frequency (corresponding to the longest vibrational mode of the instrument) and higher harmonics (corresponding to shorter vibrational modes within the instrument). The amplitude of the harmonics tends to decrease as their frequency increases, i.e. the 2nd harmonic of a sound is lower in amplitude than the fundamental, yet higher in amplitude than successive harmonics. If harmonics decreased by half in amplitude for every octave increase in frequency, a 1/f relationship would occur, which suggests a fractal; this doesn't happen exactly with acoustic sounds, however, but it does describe the tendency of acoustical harmonic spectra.
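The idealized 1/f falloff mentioned above can be written out as a short sketch. It assumes a perfectly harmonic spectrum whose n-th harmonic has amplitude 1/n; real instruments only tend toward this. The function names are hypothetical.

```python
import math

def harmonic_amplitudes(n_harmonics, fundamental_amplitude=1.0):
    """Amplitudes of harmonics 1..n under an idealized 1/f falloff."""
    return [fundamental_amplitude / n for n in range(1, n_harmonics + 1)]

def level_db(amplitude, reference=1.0):
    """Level relative to the reference amplitude, in decibels."""
    return 20.0 * math.log10(amplitude / reference)

amps = harmonic_amplitudes(8)
for n, a in enumerate(amps, start=1):
    print(f"harmonic {n}: amplitude {a:.3f}, {level_db(a):.1f} dB")
```

Harmonics 1, 2, 4, and 8 lie an octave apart, and in this model each has half the amplitude of the one before it, a drop of about 6 dB per octave.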
Analyzing complex music waveforms shows an even more fractal-like character. As described by John Atkinson, editor of "Stereophile", "[complex] waveforms have a wealth of fine detail, and that detail itself has an even finer-structured wealth of detail, and so on, until the crinkliness of the waveform is enveloped in the analog noise that accompanies every sound we hear. ...a casual study of waveforms reveals the fact that without having a time or amplitude scale attached, it is impossible to assign any kind of loudness or frequency value to them..." (Stereophile, v.13, #11, p.7).
The "analog [acoustical] noise" that Atkinson refers to is also important to musical reproduction, and demonstrates deterministic chaos within an aural context. When a sound occurs within an enclosed space, it creates fluctuations in the air pressure (which the ear interprets as sound) radiating out from the instrument spatially and reflecting off of walls or other surfaces. The amplitude of these fluctuations decreases as energy is dissipated, yet if another sound occurs while the air is still fluctuating, the second sound will be very slightly modulated by the first sound's remnants (reverberation)--larger fluctuations travelling through a minutely pre-fluctuating medium. The second sound repeats the process, as do all other sounds within the space, creating a positive feedback loop (the output of the system feeding back to and modulating the input). Negative feedback is provided by the friction of the air, and the absorption by the surfaces hit, damping the air pressure fluctuation so that the noise--the sound of the ambiance or air--is kept at a low level.
All enclosures, however, have resonances--reluctances to dissipate energy of a certain frequency--resulting in areas of less negative feedback. A resonance tends to amplify sounds of a certain frequency because, as the sounds dissipate more slowly, successive sounds near the resonant frequency add to the amplitude of air fluctuation at that frequency, resulting in a stronger positive feedback characteristic at that frequency. So, sounds at the enclosure's resonant frequency (or frequencies) will have a more pronounced effect on the characteristic sound of the enclosure's ambiance. Further, since nearly all instruments have an enclosure of some sort, a musical event will have very complex networks of low level deterministic noise due to these feedback effects.
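The buildup at a resonance can be illustrated with a deliberately simple toy model. It is a linear recursion, not a simulation of room acoustics, and it does not by itself exhibit chaos. It only shows why slower dissipation at one frequency leaves a higher residual level there. The name `ambient_level` and the `retention` values are assumptions chosen for illustration.

```python
def ambient_level(retention, steps=2000, excitation=1.0):
    """Iterate level[n+1] = retention * level[n] + excitation.

    `retention` (0 <= retention < 1) is the fraction of the fluctuation
    left after one step, so 1 - retention stands in for the damping
    (negative feedback) from air friction and surface absorption.
    """
    level = 0.0
    for _ in range(steps):
        # What remains of earlier sounds adds to each new sound.
        level = retention * level + excitation
    return level

damped = ambient_level(0.50)     # a strongly damped frequency
resonant = ambient_level(0.95)   # near a resonance: slow dissipation
print(damped, resonant)
```

With repeated excitation the level settles toward excitation / (1 - retention), so in this sketch the weakly damped case ends up about ten times higher than the strongly damped one.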
Analog and digital recording processes differ in the way they store musical signal information. An analog recorder takes the source signal (from the microphone e.g.) and records an analogous waveform onto a physical medium, either magnetically or mechanically. There will always be a "texture" to the medium used for storage, however, which will modulate the signal slightly at playback, resulting in excess noise (similar to the way paint on canvas reveals the texture of the canvas).
A digital recorder takes the signal and converts it into numerical data by "sampling" it at regular intervals (44,100 times per second), i.e. noting its amplitude level on a binary scale of 0-65,535 (65,536 levels), and storing the numerical data, which is reconverted to an analog signal during playback. Whereas an analog recorder records detail as fine as the textural quanta of the storage medium (molecules of vinyl for records; magnetic particles for tape; increasing the speed of either increases the quanta per time unit available to record fine detail) and plays back as much detail as the playback device is able to retrieve, the digital system is limited by the pre-defined standard at which it operates, and records a much smaller amount of music (i.e. its recording quanta are much larger and coarser than analog's); also, the circuitry necessary to deconstruct and reconstruct the original signal is complex, and thus increases the possibility of signal degradation. On the other hand, the digital data is immune to the degradation that afflicts analog data with use--physical changes in the analog storage medium (wear) translate to changes in the signal that is retrieved.
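The sampling and quantizing steps just described can be sketched in a few lines. The sample rate and the 65,536 levels (16 bits) come from the text; the 1 kHz test tone, its amplitude, and the function names are assumptions made only for illustration.

```python
import math

SAMPLE_RATE = 44_100   # samples per second, as stated in the text
BITS = 16              # 2**16 = 65,536 levels, codes 0..65,535
LEVELS = 2 ** BITS

def quantize(x):
    """Map a value in [-1.0, 1.0] to the nearest integer code in 0..LEVELS-1."""
    code = round((x + 1.0) / 2.0 * (LEVELS - 1))
    return max(0, min(LEVELS - 1, code))

def dequantize(code):
    """Map an integer code back to a value in [-1.0, 1.0]."""
    return code / (LEVELS - 1) * 2.0 - 1.0

def tone(n, freq_hz=1000.0, amplitude=0.5):
    """The 'analog' value of an assumed test tone at sample index n."""
    return amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)

# 441 samples = 10 ms of signal at 44,100 samples per second.
codes = [quantize(tone(n)) for n in range(441)]

step = 2.0 / (LEVELS - 1)   # spacing between adjacent levels
worst = max(abs(dequantize(c) - tone(n)) for n, c in enumerate(codes))
print(f"{len(codes)} samples, step {step:.2e}, worst error {worst:.2e}")
```

Whatever falls between two adjacent levels is rounded to the nearer one, so the stored value can differ from the original by up to half a step. That rounding error is the quantization the essay goes on to discuss.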
Based on the archaic measurements used by the audio mainstream, digital recording is superior to analog as a recording process for music. Yet observational evaluation tells a different story; let's examine the aural characteristics of each.
Digital audio, coincidentally, reproduces semantically digital information very well, and it excels in one aspect of reproduction in particular. Extraneous noises that afflict analog recordings, such as the noise of the medium, and wear artifacts such as pops, ticks, scratches, and dropouts, are not reproduced by a digital system, since it is immune to this type of degradation. But digital recording lacks the resolution of analog, and so doesn't reproduce the extremely fine detail that falls through its coarse quantization. Aural manifestations include lack of air and ambiance; lack of dimensionality and palpability--digital recordings tend to miss the depth dimension entirely when reproducing music, leading to a "cardboard cut-out" portrayal of instrumental images; digital recordings tend to sound very "electronic" and unnatural--singers sound as if they have vocal cords of metal instead of flesh; high-frequency overtones are lost to the severe bandwidth limiting of the digital format, which also produces phase distortions within the audio band. In general, digital recordings fail to portray the verve and life of live music.
Analog recordings tend to preserve the fractal elements of sound better than digital recordings. So, ambiance and the sound of the recording environment are present during analog reproduction, recreating the space of the original event. The quantizing distortion which plagues digital recording--a noise effect which changes along with the signal, altering instrumental tonal characteristics and adding to the electronic sound--is absent, although the constant, low-level noise of the medium's texture (which is much less musically unnatural) is present. The minutiae of subtle resonant effects and dynamics are preserved, allowing the reproduction of palpability and depth, as well as subtleties of performance which digital glosses over. "With an analog signal, the closer you get to the sound--the higher the level you choose to listen--the more detail there is to be heard, the only tradeoff being an increase in the level of the signal's intrinsic background noise. With a digital recording, this is not true. No matter how close you get, how high you set the playback level, there is no more detail to be discerned than what was captured at the time of analog to digital conversion." (ibid.)
In general, the sub-culture of dedicated listeners known as tweaks has found analog recordings to capture a more accurate slice of reality than digital recordings, which I believe is due to analog's ability to preserve more of the fractal characteristics of music. In a different sense of the fractal characteristics of music--how music is a fractal boundary between digital and analog signification--it is evident that finding a similar transcendent integration of digital and analog recording processes would provide the most accurate reproduction of all.