Stereo imaging has come a long way from simply panning instruments between two speakers to create the illusion of a band on stage. Ernie Tello looks into state-of-the-art sound processing and what it can do for music.
TECHNOLOGY HAS COME OF AGE IN THE REPRODUCTION OF THREE-DIMENSIONAL SOUND IMAGERY - THE IMPACT THAT THIS WILL HAVE ON RECORDED MUSIC IS, AS YET, UNIMAGINABLE.
YOU MAY NOT be familiar with "spatial sound processing" by name, because it's a new and somewhat hi-tech field that's still very much under development. It is quite likely, however, that you will be hearing more and more about it as time goes on. But whether or not you've heard of SSP, you've almost certainly heard some of its effects.
Spatial sound processing is usually included in the general category of "effects" as applied to music and sound, but it's important to understand that this process is quite different from reverb and most other forms of digital sound processing. Effects like reverb and delay allow us to simulate acoustic spaces of different sizes and characteristics. Spatial sound processing allows us to simulate the effect of sound sources moving within such an acoustic space or field. For example, panning is a form of one-dimensional SSP which uses only one sound source.
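One-dimensional panning of this sort boils down to a simple gain computation. The sketch below assumes the common equal-power pan law; the function and parameter names are illustrative, not taken from any particular product.

```python
import math

def equal_power_pan(sample, position):
    """Place a mono sample in the stereo field.

    position runs from -1.0 (hard left) to +1.0 (hard right).
    The equal-power law keeps perceived loudness roughly
    constant as the source sweeps across the image.
    """
    angle = (position + 1.0) * math.pi / 4.0  # 0 .. pi/2
    return sample * math.cos(angle), sample * math.sin(angle)

# A centred source reaches both speakers at about -3dB.
left, right = equal_power_pan(1.0, 0.0)
```

Sweeping `position` over time is all a conventional auto-panner does; spatial sound processing extends this single left-right axis to two or three dimensions.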
In most amplified music, the sound source is a pair (or pairs) of speakers. These project voices and instruments that appear to come from some sound environment that is not exactly "here", but "offstage" somewhere. One of the major effects made possible by spatial sound processing is the ability to make a musical performance, whether live or recorded, inhabit the listening space in a way that sounds independent of the speakers. This is achieved by understanding the production chain of audio performances differently than conventional audio engineering does: with spatial sound processing, the end point of the production chain is considered to be the listener's ears, rather than a pair of loudspeakers.
Understanding how musical sound behaves in space has long been a major goal for acoustic engineers - both those designing concert halls with desirable sound characteristics, and those designing electronic devices intended to simulate those characteristics. Recently, several researchers have independently concluded that sideways or lateral reflections cannot be ignored in concert hall design or electronic sound processing. For example, David Griesinger of Lexicon has developed a concert hall simulator that can utilise up to eight speakers. In a system such as this - which is capable of driving separate loudspeakers on the sides - independent reverberation for front, rear, and side speakers is essential. An alternative to the use of side speakers is the use of front speakers that utilise some form of phase cancellation: eliminating the sound from the left speaker heard by the right ear, and vice versa, as headphones do.
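The phase-cancellation idea can be sketched as a recursive crosstalk canceller: each speaker emits an inverted, attenuated, delayed copy of what the other speaker is playing, timed to arrive at the far ear just as the unwanted crosstalk does. The gain and delay values below are toy assumptions, not measured head parameters.

```python
def cancel_crosstalk(left, right, g=0.6, d=3):
    """Recursive crosstalk canceller (sketch).

    g: attenuation of the head-shadowed crosstalk path.
    d: extra path delay to the far ear, in samples.
    Each output channel subtracts a delayed, attenuated copy
    of the *other* output, so the speaker-to-far-ear leakage
    is cancelled acoustically at the listener's head.
    """
    out_l, out_r = [0.0] * len(left), [0.0] * len(right)
    for n in range(len(left)):
        fb_l = g * out_r[n - d] if n >= d else 0.0
        fb_r = g * out_l[n - d] if n >= d else 0.0
        out_l[n] = left[n] - fb_l
        out_r[n] = right[n] - fb_r
    return out_l, out_r

# An impulse on the left produces a cancelling echo on the right
# (and a fainter correction back on the left, and so on).
l, r = cancel_crosstalk([1.0] + [0.0] * 7, [0.0] * 8)
```

The recursion never quite terminates - each correction needs its own smaller correction - which is one reason practical transaural systems confine the listener to a fairly narrow "sweet spot".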
One of the first impressive demonstrations of accurate spatial imaging was in John Chowning's composition Turenas. This was accomplished using special software running on a large computer. Chowning, incidentally, is the inventor of the FM synthesis technique that Yamaha later licensed for its synthesisers. Today, much the same thing can be done with microprocessor-based equipment. Some interesting research in spatial sound processing has also been done in Germany. In one recording of a radio play, the voices of the actors appear to float about the listener with no sense that the sound is emanating from speakers.
ONE OF THE paradoxes in sonic imaging is the fact that human hearing occurs in stereo, and yet stereo sound reproduction seems to be unable to faithfully recreate what we hear. To understand how spatial sound processing works, we must delve into some of the basic phenomena of psychoacoustics that affect how we determine the location of sound sources. The main cues that allow us to judge whether a sound source is coming from the right or left are the loudness or intensity, and the arrival time of the sound at each ear. However, if you imagine a vertical plane passing through the centre of your head that is equidistant from your right and left ears, then the location of sounds occurring anywhere on this plane cannot be determined by this method. The loudness and arrival time for such sounds are the same for both ears. The way in which we locate sounds above and below, behind and in front, is largely due to our outer ear, and apparently also to our experience with the way sound is treated by the shape of our heads.
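The arrival-time cue described above can be put into rough numbers with Woodworth's classic spherical-head formula; the head radius used here is an assumed average, not a measured one.

```python
import math

HEAD_RADIUS = 0.0875    # metres; assumed average adult head
SPEED_OF_SOUND = 343.0  # metres per second in room-temperature air

def interaural_time_difference(azimuth_deg):
    """Woodworth's spherical-head estimate of how much earlier a
    distant source's sound reaches the nearer ear.
    azimuth: 0 = straight ahead, 90 = directly to one side."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

# On the median plane the difference vanishes, which is why
# front/back and up/down localisation must rely on the pinnae.
itd_front = interaural_time_difference(0)
itd_side = interaural_time_difference(90)
```

Even at its maximum (a source directly to one side), the difference is well under a millisecond - yet the ear/brain system resolves it reliably.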
To prove to yourself that the outer ears, or pinnae, are the culprits that allow up/down and front/back localisation of sounds, perform this simple experiment.
Use your hands to fold over your outer ears, shut your eyes, and have a friend shake a set of keys at different places around you while you try to guess their location. Then try the same thing without covering your ears. You'll be amazed at the difference.
From this experiment, it is apparent that stereo reproduction by itself does not allow us to hear the true directional characteristics of three-dimensional audio. Some additional kind of encoding is necessary to capture the information the brain deciphers from the outer ear, so that up/down and front/back directions are detected as well.
Simply adding more channels is not the solution. Quadraphonic sound systems failed commercially because adding more channels merely increases the cost without properly addressing the problems of improving fidelity while preserving the spatial relations of sound sources to one another and to the listener. Besides, directional hearing and clear imaging are possible with stereo headphones - this has been known for some time. I have listened to stereo recordings in which a pair of scissors clipped away around my head with such realism that I was prompted to look down and see if any hair was actually being removed - strange, but true.
"ONE OF THE PARADOXES IN SONIC IMAGING IS THE FACT THAT HUMAN HEARING OCCURS IN STEREO, AND YET STEREO REPRODUCTION IS UNABLE TO RECREATE WHAT WE HEAR."
Various strategies for capturing and reproducing the directional properties of musical sound have been explored for many years. These include specially-designed microphones, speakers, and dynamic sound processing equipment. To understand just what is going on with this technology, it is necessary to focus on the production chain used to carry musical sound from its source to the listener.
TRADITIONALLY, THE PRODUCTION chain for musical sound has been seen as this: Instrument - Microphone - Recorder - Processor - Speakers. Today's production chain is rapidly becoming regarded as being something like this: Source - Microphone - Sampler/Player - Recorder - Processor - Spatialiser - Speakers - Ears. The ability to produce a final result means accepting the challenge of knowing just what to do at each step of the production chain. Needless to say, this is a field that is still too new for any accomplished masters or proven experts to have appeared as yet.
If you were to assemble a studio consisting of all available spatial sound processing equipment, it might consist of the following: binaural or sound field microphones and their control units or decoders, stereo or multichannel samplers, automated mixers, effects processors, spatial processors, audio enhancers, and speakers designed for phase cancellation of inter-aural crosstalk. However, at this point, it is not clear that all of this equipment could be made to function properly as a single system. It could very well turn out that the effects produced by some devices would defeat those produced by others, because they were not designed to be used as an integrated system.
The most common error is to treat spatial sound processing as a special effect or quick fix to be tacked on to a problem sound. Make no mistake about it, sonic imaging is not just another effect to throw in your proverbial bag of tricks. Using this technology involves a major commitment that should be considered at the very outset of a project. Ideally, music should be conceived, composed, and orchestrated with spatialisation technology in mind to obtain a purposeful, aesthetically pleasing result. Many technicians and audiophiles look upon spatial sound processing as the only proper means of faithfully reproducing the sound characteristics of the concert hall. Does this sound familiar? It's the very same issue that we've had to deal with for so long in music synthesis.
Faithful reproduction is a great testing ground to try out new technology - and it's a valid artistic tool for many purposes. But to leave it at that is like discovering a new planet and then bringing back examples of things readily available on Earth. Although bringing the true sound of a concert hall into your living room or studio is a perfectly legitimate goal, to stop there is to miss out on a tremendous world of creative opportunity. The real future of this technology is the creation of dynamic 3D worlds of musical sound that otherwise could not exist.
FOR SEVERAL YEARS, engineers have been trying out innovative design ideas for microphones intended to capture the directional properties of sound in a three-dimensional environment. The sound field microphone is a multi-microphone assembly of subcardioid capsules arranged in a pyramidal or tetrahedral array. It is designed to work with a control unit that decodes the signals coming from the array.
Recently, I had a chance to test a low-end stereo microphone set from Sonic Engineering that puts dimensional recording in the hands of the average musician or engineer. The key to these microphones is their size - they are small enough to be placed close to your ears in order to capture the way in which sound is conditioned by your head and outer ears. There is some disagreement as to just why they work, but I was able to obtain some very impressive recordings with them.
Another popular approach is to build microphone assemblies in the shape of the outer ear or even the entire human head. These are generally referred to as "artificial head" recording systems and have been used in the past by such artists as Tangerine Dream's Edgar Froese. Artificial heads usually require a binaural mixing console, and are quite expensive. Typically, the microphones are placed inside models of the ear canals. This technique works best when played back over headphones that are matched to the microphone assembly. A number of people have shown that, when models of the outer ear are used in the vicinity of the microphone, the vertical position of a sound source can be localised even when reproduced with just two stereo speakers or headphones.
"THE WAY WE LOCATE SOUNDS ABOVE AND BELOW, BEHIND AND IN FRONT, IS LARGELY DUE TO OUR OUTER EAR AND ALSO, APPARENTLY BY THE SHAPE OF OUR HEADS"
As pointed out above, it may be more economical to use an actual human head outfitted with tiny, specially prepared microphones. The two tiny microphones from Sonic Engineering are fitted with small loops that allow them to be conveniently placed over the stems of your sunglasses and positioned as close to the ear as desired. The best results seem to be achieved when they are not too close to the ear (perhaps individual differences in ear shape become unimportant at a certain distance).
THE MOST WELL-KNOWN spatial sound processing scheme today is the Dolby Laboratories Surround Sound system that is installed in many major cinemas. Home units for decoding the Surround Sound signal are also appearing on the market. In response to this, RCA Records in the States have announced the first CD album mixed in the Surround Sound format. A very simple decoding logic for surround sound is to send the left channel to the left speaker, right to right, left plus right to the centre front, and left minus right to the rear.
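That decoding logic amounts to a two-line matrix applied sample by sample. The sketch below implements only the simple sum/difference decode described above; a real decoder adds delay, filtering, and steering logic to the rear channel, all omitted here.

```python
def passive_surround_decode(left, right):
    """Passive matrix decode as described in the text:
    left and right pass straight through, the centre speaker
    gets the sum, and the rear gets the difference - so any
    material common to both channels vanishes from the rear."""
    return left, right, left + right, left - right

# A centred (mono) sound appears in front but not behind.
l, r, centre, rear = passive_surround_decode(0.5, 0.5)
```

The flip side is that anything recorded out of phase between the channels ends up entirely in the rear - which is exactly how surround information is encoded into an ordinary stereo mix.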
Spatial sound processing is something to be used in conjunction with effects processing - like reverb and delay. Establishing the direction of the sound source is one thing that can be accomplished with spatial sound processing. If this has been achieved, then the next goal is often to create the effect of one or more sound sources moving through a sound field in one, two, or three dimensions. This kind of processing is effective for both live performance and for recordings. There are a number of distinct techniques that can be used for recording that allow spatialised musical sound to be used in stereo, surround and other multi-channel formats.
A sound spatialiser like Spatial-Sound's SP1 is capable of handling multiple sound sources in multiple dimensions with a variety of different multi-channel speaker setups. In general, you must choose between more sound sources or more spatial dimensions, as both of these require significant processing power. If you must have control over many sources and dimensions, multiple units can be used to get the most dramatic effects that the human ear and brain can handle. Once you have created moving patterns for two or three sound sources in a three-dimensional sound field, these moving patterns can be rotated about one or more axes. And if that isn't all your tender brain can stand, you might choose to have the entire sound field expand and contract at a speed synchronised with the beat of the music. Try listening for these effects in your everyday environment, and think about how this kind of processing might enhance your own music. The creative possibilities are truly unlimited.
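Rotating a whole sound field about an axis is, at bottom, a coordinate rotation applied to every source position on every control update; a spatialiser would then recompute gains and delays from the new positions. A minimal sketch, with names of my own choosing:

```python
import math

def rotate_about_vertical(position, angle_deg):
    """Rotate a source position (x, y, z) about the listener's
    vertical (z) axis by angle_deg. Applying this to every
    source on each control tick spins the entire sound field;
    modulating angle_deg with the tempo synchronises the spin
    to the beat."""
    x, y, z = position
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

# A source straight ahead ends up directly to one side
# after a quarter turn; its height is unchanged.
moved = rotate_about_vertical((1.0, 0.0, 0.0), 90.0)
```

Expanding and contracting the field is the same idea with a scale factor in place of the rotation matrix.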
Blauert, J, Spatial Hearing, MIT Press, Cambridge, 1983.
Chowning, J, The Simulation of Moving Sound Sources, Journal of the AES, May 1970.
Cooper, D and Bauck, J, Prospects for Transaural Recording, Journal of the AES, Jan/Feb 1989.
Gerzon, M and Barton, G, Ambisonic Surround Sound Mixing for Multitrack Studios, Journal of the AES, May 1984.
Griesinger, D, Theory and Design of a Digital Audio Signal Processor for Home Use, Journal of the AES, Jan/Feb 1989.
Sommerwerck, B, Ambisonics: Everything You Know About Stereo is Wrong!, Stereophile, Volume 8, No. 6.