BASIC PROBLEMS OF SOUND REPRODUCTION
THAT APPLY TO DIGITAL AND ANALOG
©1998 The Anstendig Institute
With the new digital standard and the analog vs. digital controversy occupying the worlds of computers and sound reproduction, a review of the problems of all sound reproduction, analog or digital, is timely.
Sound reproduction was a badly flawed field long before digital came into being. For a century, people have been listening to recordings of live events that do not sound even close to the original. This is true of radio and TV as well as sound reinforcement (amplified live sound programs). It started with the awful, early-in-the-century sound associated with the original Caruso recordings and those of other famous artists of the period. The most beautiful voices and instruments sounded like a kazoo on the machines of the time and, uncorrected, still sound that way on present-day machines. But people loved these recordings because they still made them cry or gave them an emotional lift. Part of the reason is that the ear is more forgiving of such distortions at low volume levels, and most machines up to the late 60s could only play well at low volume levels. Films also had less than desirable sound, but by the 1930s, when film sound began, recordings had progressed technically.
The main problem with early sound recording was that the recording process favored some frequencies over others, so that some frequencies were reproduced much louder than others. The main imbalance was an exaggeration of the mid-range frequencies around 400 to 1000 Hz, because the recording equipment was most sensitive in this range and progressively less sensitive toward higher and lower frequencies. That gives the old recordings their characteristic “hooty” sound. There were also problems of surface and background noise. The various pieces of equipment involved in the whole recording and playback process had other frequency exaggerations of their own, called resonances, each of which would vastly overemphasize a particular small frequency range. Microphones were the biggest culprits.
To take the most famous and grossest example, most recordings from the 40s to the 80s were made with a particular Neumann microphone favored by the industry. That mic had a huge overemphasis of the frequencies in the 1200 to 2000 Hz range (centered at about 1600 Hz). This greatly distorts the sound and must be corrected if the playback is to sound like the original. Almost all of the important recordings made during that long period, numbering in the thousands or more, have this frequency emphasis, including the famous Mercury recordings and most others prized as having excellent sound. Yet millions of those recordings are in circulation, including thousands re-released on CDs, and practically no one recognizes this huge resonance that changes even what human voices sound like. This points to a lack of hearing acuity in most of the population of the world, including the critics who review these records. And it is not the only emphasis on most recordings, new and old (yes, modern recordings have them too). The extraordinary Maria Callas suffered particularly from resonances in recordings, especially a 350 to 600 Hz emphasis that gives her voice a “hooty” sound and gave her a reputation for singing with a hoot, since few got to hear her sing live. In fact, her voice production was quite excellent and can be experienced in its full glory when this emphasis and that of the Neumann mic (1600 Hz) are reduced by using equalizers.
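As a rough illustration of what reducing such an emphasis with an equalizer involves, the following Python sketch builds a single digital "peaking" band centered at 1600 Hz and reports its gain at a few frequencies. It is not the Institute's procedure. The coefficient formulas are the widely used Audio EQ Cookbook ones, and the 6 dB cut, the Q of 1.4, and the 44.1 kHz sample rate are assumptions chosen only for illustration; only the 1600 Hz center frequency comes from the text above.

```python
import cmath
import math

def peaking_eq_coeffs(center_hz, gain_db, q, sample_rate_hz):
    """Biquad coefficients for one peaking (bell) equalizer band.

    Uses the Audio EQ Cookbook formulas. A negative gain_db
    produces a cut centered on center_hz.
    """
    a_lin = 10 ** (gain_db / 40)                   # square root of the linear gain
    w0 = 2 * math.pi * center_hz / sample_rate_hz  # center frequency in radians/sample
    alpha = math.sin(w0) / (2 * q)                 # controls the bandwidth
    b = (1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin)
    a = (1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin)
    return b, a

def gain_db_at(freq_hz, b, a, sample_rate_hz):
    """Gain of the biquad, in dB, at a single frequency."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

# Assumed, illustrative settings (not taken from the text):
SAMPLE_RATE_HZ = 44100
CUT_DB = -6.0
Q = 1.4

b, a = peaking_eq_coeffs(1600, CUT_DB, Q, SAMPLE_RATE_HZ)
for f in (100, 800, 1600, 3200, 10000):
    print(f"{f:>6} Hz: {gain_db_at(f, b, a, SAMPLE_RATE_HZ):+.2f} dB")
```

With this filter design the gain at the center frequency equals the requested cut, while frequencies far from 1600 Hz are left nearly untouched. How deep and how wide the cut should be for a given recording has to be judged by ear or by measurement.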
Some of the resonances and emphasis take place when the recording is played back in the listening room because the speakers are, in reality, no different from musical instruments creating new sounds in air. They, too, create overtones and other effects associated with the process of creating sounds in air. The Anstendig Institute has described these added distortions in our paper “A Massing of Overtones”.
But, beyond the problems of recording and playback, there is another characteristic in the way we hear the frequencies that colors the sounds we hear.
In the early 1930s, two technicians in the Bell Telephone Laboratories carried out the most important experiments in the history of sound, whose results are well known throughout the industry as the Fletcher-Munson Equal Loudness Curves (see illustration below). Fletcher and Munson were carrying out research to determine how telephone handset speakers should be manufactured to sound most natural. What they found is that
1) when all the frequencies are playing equally loudly, we do not hear them equally loudly. We hear some frequencies louder than others, and not just in one particular range: we hear various frequency groups, low and high, louder than those above and below them. The region we hear loudest is that between 2000 and 4000 Hz, which is exactly the range where the frequencies of most instruments and voices peak. But there are other imbalances nearly as strong.
2) the amount of emphasis with which we hear various frequencies changes with the volume level. That means that, when all the frequencies are played at one particular volume, we hear a certain amount of imbalance, but that imbalance changes dramatically when the overall volume level of the frequencies changes.
The meaning of these findings is that, even if there were no other resonances or balance problems in the technology of the recording, the only way the playback could sound like the original performance would be to play it back at exactly the same volume level as the original performance, in a room with the same acoustics as that of the performance. But there is no standardization of volume levels and no way of knowing whether one is listening at the original volume level. So a sizeable amount of equalization would be necessary for that reason alone. Add to that the various other resonances and imbalances inherent in the recording and playback processes, and there is no way anyone is going to hear sound accurately, or even close to the original, without a large amount of equalization.
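The level dependence described above can be illustrated, very roughly, with the A and C frequency weightings used in sound level meters. A-weighting was derived as an approximation of how the ear responds at quiet levels, and C-weighting loosely corresponds to very loud levels. Both are coarse standardized simplifications, not the Fletcher-Munson measurements themselves, and the choice of frequencies printed below is only an assumption for illustration.

```python
import math

def a_weighting_db(freq_hz):
    """Standard A-weighting in dB (approximates hearing at quiet levels)."""
    f2 = freq_hz ** 2
    r = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20 * math.log10(r) + 2.00   # offset normalizes 1000 Hz to about 0 dB

def c_weighting_db(freq_hz):
    """Standard C-weighting in dB (loosely corresponds to very loud levels)."""
    f2 = freq_hz ** 2
    r = (12194.0 ** 2 * f2) / ((f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2))
    return 20 * math.log10(r) + 0.06   # offset normalizes 1000 Hz to about 0 dB

# Compare relative sensitivity at quiet (A) and loud (C) levels.
for f in (50, 100, 500, 1000, 2500, 4000, 10000):
    a_db = a_weighting_db(f)
    c_db = c_weighting_db(f)
    print(f"{f:>6} Hz   A: {a_db:+6.1f} dB   C: {c_db:+6.1f} dB   "
          f"difference: {c_db - a_db:+6.1f} dB")
```

The difference column is the point of the sketch: at low frequencies the two curves diverge by many decibels, so a bass note that balances the midrange at a loud level sounds much weaker when the same recording is played quietly. The A curve also peaks in the 2000 to 4000 Hz region.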
Furthermore, because no recording medium, digital included, can capture the whole dynamic range of real-life sound, the volume levels of the performance are changed during the recording, reducing the loud passages and increasing the soft passages. After Fletcher and Munson’s discoveries, the recording industry should have come up with a means of compensating for the EQ changes in the way we hear sound when overall volume levels change. But the industry has not done so. Therefore, there is another distortion to the sound every time the overall volume level was changed during the recording. (Compression and expansion are the terms for this technique when it is done automatically; “gain riding” is the term when it is done manually by the recording engineer.) The only way to compensate for this is to manually raise or lower the volume levels during playback.
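A minimal sketch of what automatic compression does to levels, assuming a simple static curve with a threshold, a ratio, and a fixed make-up gain. The parameter values are hypothetical and chosen only for illustration; real compressors also have attack and release behavior that is not modeled here.

```python
def compressed_level_db(input_db, threshold_db=-20.0, ratio=4.0, makeup_db=10.0):
    """Output level in dB for a given input level in dB.

    Levels above the threshold rise only 1 dB for every `ratio` dB of
    input (loud passages are reduced). The make-up gain then raises
    everything, which is what brings the soft passages up.
    """
    if input_db <= threshold_db:
        shaped = input_db
    else:
        shaped = threshold_db + (input_db - threshold_db) / ratio
    return shaped + makeup_db

# A soft, a medium, and a loud passage (hypothetical levels in dB).
for level in (-40.0, -20.0, 0.0):
    print(f"in {level:+6.1f} dB -> out {compressed_level_db(level):+6.1f} dB")
```

With these assumed settings, a 40 dB span between the softest and loudest passage shrinks to 25 dB. Every such change moves the passage to a different listening level, and therefore to a different balance of perceived frequencies, which is the additional distortion the paragraph above describes.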
Even live performances, especially of shows, suffer greatly from this problem. I attended a performance of “Les Miserables” in San Francisco in which the sound was amplified and a technician used gain riding to increase the sound to deafening levels (at least 115 decibels and probably louder) during climaxes and to decrease it to nearly inaudible levels during quiet, lyrical passages. I walked out as soon as I could, after it became apparent what was happening. In the medical community, exposure to sound pressure levels above 100 decibels is believed to cause hearing damage (some even consider levels above 90 decibels dangerous). Therefore, one could expect those who stayed to have suffered at least some hearing damage, besides suffering discomfort from the distortions of the sound.
The distortions described above do not even begin to touch on the problems of the equipment itself. The most important point to know about equipment is that, until the mid-seventies, amplifiers had very low output, usually 10 watts or lower. Loud listening is therefore a very recent development. Up to the last two decades, all radio, TV, and recordings were heard at low volume levels. As mentioned, the ear is more forgiving of the distortions at low volume levels. The louder the sound, the greater the effect of these distortions on the listening experience and the more apparent the distortions become. With the advent of high-powered amplification, volume levels have risen, and the distortions due to the sound reproduction as well as those due to our unequal perception of loudness have become much more disturbingly apparent, to the point where they usually ruin our perception of the expressive content.
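To put amplifier power and decibels on the same footing, here is a small worked calculation using the standard relation between a power ratio and a level change in decibels. The 10 watt figure and the 90 and 115 decibel levels are those quoted in the text; the 100 watt amplifier is a hypothetical comparison, and the calculation ignores loudspeaker efficiency and room acoustics, which determine the actual sound level in practice.

```python
import math

def power_ratio_to_db(new_watts, old_watts):
    """Level change in dB when amplifier power goes from old_watts to new_watts."""
    return 10 * math.log10(new_watts / old_watts)

def db_to_power_ratio(delta_db):
    """Factor by which power must grow to raise the level by delta_db."""
    return 10 ** (delta_db / 10)

# Hypothetical 100 W amplifier versus the 10 W amplifiers mentioned in the text.
print(f"10 W -> 100 W: {power_ratio_to_db(100, 10):+.1f} dB")

# Power factor needed to go from 90 dB to 115 dB, all else being equal.
print(f"90 dB -> 115 dB: about {db_to_power_ratio(115 - 90):.0f} times the power")
```

Under these idealized assumptions, a tenfold increase in power buys only 10 dB, and a 25 dB rise calls for roughly 316 times the power, which is why very loud playback only became practical with high-powered amplification.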
The Anstendig Institute has carried out year-long research demonstrating that these resonances, especially those in the 2,000 to 4,000 Hz range, destroy our perception of expressive nuance, keeping us from hearing the more delicate emotional qualities in music and sound when they are present.1
A further problem of sound reproduction and sound reinforcement is the loudspeakers. The only loudspeakers I have found that are capable of reproducing the finer nuances at most volume levels are very large, horn-loaded (the higher frequency drivers are connected to and dispersed by a horn), and vented (they have open boxes, in which the frequencies from the back of the speaker escape into the room through a vent). These are usually theater loudspeakers. But this type of speaker usually has other resonances of its own, which also have to be equalized. All speakers in closed boxes that I have heard dampened the nuances as well as the frequencies from the back of the driver.
What is the effect of all this? First of all, most people in today’s world, at least in the developed countries, hear most of their music in sound reproduction. They have been doing so for a century. Society has slowly lost its concept of natural, undistorted sound. It has also stopped listening for or expecting anything more than the relatively gross, unsubtle emotional content that comes across in typical recorded sound.
Present digital recording, which lacks much of the subtle differentiation of nuance and actually falsifies the expressive nuances, simply put the final nail in the coffin. Digital could be accepted only because people were already accustomed to not hearing all of the expressive information and did not miss it. The following is a quote from our paper “Our Loss of Emotional Richness Due To Bad Sound Reproduction”. What Dr. Ostwald says goes for old analog as well:
A noted psychiatrist at the Langley Porter Institute at the University of California in San Francisco, Dr. Peter Ostwald, M.D., recognized our Institute’s warnings, dating back to the early 1980’s, about the probable effects of the acceptance of unperfected digital technology: “I was fascinated by his (Mark’s) original theories, which included the daring proposition that due to its inability to record subtle changes between notes, the then-developing digital technology might be detracting from listeners’ perception of emotional nuances in musical instruments and the human voice.” After nearly two decades of digital recordings, that has, in fact, already happened: we now live in a society that suffers from a general impairment of its ability to perceive and experience emotional nuances. Worse, people no longer are aware of the finer differentiations of emotional qualities and no longer listen for them.
Someone of great personal sensitivity came over recently to hear Menuhin’s recording of the Beethoven Violin Concerto with Klemperer conducting. Afterwards, he confessed that he simply hadn’t expected anything like the depth, sweetness and seriousness of that experience, particularly the sweetness of the expression, which is the first thing destroyed by the uncorrected distortions of all recorded sound, analog or digital. The difference between the two is that analog at least has the full depth of expression on the recording, and the playback sound can be adjusted so that the expression can be heard. Current digital does not even capture that expression. But with either analog or digital, sound reproduction cannot be even close to accurate without compensating for the frequency imbalances during playback. The inescapable conclusion is that sound reproduction that closely matches the original sounds and allows the listener to experience the emotional content of the original is not possible, with analog or with an adequate digital system, unless the sound is equalized during playback.
1 Our papers on sound reproduction, particularly “Sound Equalization in Relation to the Way We Perceive Sound”, deal with the effect of distortions on our perception of nuances of expression, especially those resonances in the frequency ranges where we are most sensitive.
The Anstendig Institute is a non-profit, tax-exempt, research institute that was founded to investigate stress-producing vibrational influences in our lives and to pursue research in the fields of sight and sound; to provide material designed to help the public become aware of and understand stressful vibrational influences; to instruct the public in how to improve the quality of those influences in their lives; and to provide research and explanations for a practical understanding of the psychology of seeing and hearing.