Every Little Bit
Article from Music Technology, November 1987
Sound quality has long been a grey area in the world of sampling. Chris Meyer finds out it takes more than "bits" to make a sampler sound good.
How many bits make a sampler sound good? Unfortunately it's not quite that simple, as sampling quality is dependent on a host of other factors - not the least of which is yer ears.
LET'S START WITH an easy one - why samplers with different resolutions (loosely translated to numbers of bits) sound different. Linear encoding is not only the easiest to explain, it's the most common - where the number of bits directly translates into the resolution of the sample. The whole purpose of digital recording is to make a smooth, "round-peg" analogue signal fit into a notched, "square" digital hole.
For the technically minded, the number of bits we have to play with translates into how many discrete "notches" we have to fit the sample into. The number of notches (signal levels) may be calculated by raising 2 to the power of "x" where "x" is the number of bits. Another way to translate bits into a more meaningful value is multiplying them by 6dB - this is the theoretical dynamic range (softest to loudest) of the sound we're trying to record.
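The two rules of thumb above can be tried out in a few lines of Python. This is only a sketch: the function names are made up for illustration, and the "exact" figure assumes the usual 20 x log10 definition of decibels, which the article rounds to 6dB per bit.

```python
import math

def levels(bits: int) -> int:
    """Number of discrete signal levels: 2 raised to the number of bits."""
    return 2 ** bits

def rule_of_thumb_db(bits: int) -> int:
    """The article's shortcut: roughly 6dB of dynamic range per bit."""
    return 6 * bits

def exact_db(bits: int) -> float:
    """Ratio of the number of levels to one level, expressed in decibels."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 12, 16):
    print(bits, "bits:", levels(bits), "levels,",
          rule_of_thumb_db(bits), "dB (rule of thumb),",
          round(exact_db(bits), 1), "dB (exact)")
```

The shortcut and the exact figure stay within half a decibel of each other up to 16 bits, which is why the 6dB rule is good enough for comparing machines.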
We'll start by looking at the range of an eight-bit analogue-to-digital converter (ADC), which gives 256 levels numbered 0-255, and a dynamic range of roughly 48dB. The analogue signal fed to the A/D converter typically ranges from -10 to +10 volts, or 20 volts overall. In this case, each step in the ADC's output will represent (20V/255 steps) 78.4 millivolts.
This resolution presents a couple of problems. For example, what if the input signal to the ADC falls at 5.70 volts? The digital representation output from an eight-bit ADC would fall somewhere between 200 and 201. So, if we represent the 5.70 volt signal with the digital number 200, or 5.68 volts, an error - distortion of the signal - occurs. This error is called quantisation error. The maximum error is obviously half the resolution between levels (in the worst case of a signal falling precisely between them), and the average error is about one quarter of this resolution. The audible effect is referred to as quantisation noise. This error makes a sound audibly less smooth, with the quantisation noise sounding like a cross between ordinary noise and a balloon being squeaked.
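The 5.70 volt example can be worked through in code. The sketch below assumes the same idealised converter as the text (a -10 to +10 volt input mapped linearly onto codes 0-255, rounding to the nearest level); the names are illustrative and no real ADC is being modelled.

```python
V_MIN, V_MAX = -10.0, 10.0  # assumed input range, as in the text

def step_size(bits: int) -> float:
    """Voltage between adjacent levels: 20V spread over (2^bits - 1) steps."""
    return (V_MAX - V_MIN) / (2 ** bits - 1)

def quantise(volts: float, bits: int):
    """Return (code, reconstructed_volts) for an ideal linear converter."""
    step = step_size(bits)
    code = round((volts - V_MIN) / step)
    code = max(0, min(2 ** bits - 1, code))  # clip to the available codes
    return code, V_MIN + code * step

code, stored = quantise(5.70, 8)
print("step size:", round(step_size(8) * 1000, 1), "mV")
print("code:", code, "represents", round(stored, 3), "V")
print("quantisation error:", round((5.70 - stored) * 1000, 1), "mV")
```

The input lands on code 200, which stands for about 5.686 volts, so roughly 14 millivolts of the original signal has been thrown away.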
This noise gets worse as the signal level goes down. In our eight-bit case, if the input signal was down 42dB from its loudest output (a common occurrence at the tail end of sounds, such as percussion), we have only one bit left to represent the signal. Here the error is practically as large as the signal itself - with a subsequent drastic rise in distortion and quantisation noise. This can be heard as the squeaky balloon effect at the end of tom tom samples, for example.
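To see how badly things fall apart at low levels, one can quantise a test tone at full level and again 42dB down, then compare signal power with error power. This is a rough sketch under assumptions of my own choosing (a 440Hz sine, a 44.1kHz sampling rate, an ideal eight-bit converter over -10 to +10 volts), not a measurement of any real sampler.

```python
import math

V_MIN, V_MAX = -10.0, 10.0
BITS = 8
STEP = (V_MAX - V_MIN) / (2 ** BITS - 1)

def quantise(volts: float) -> float:
    """Round to the nearest eight-bit level and return its voltage."""
    code = round((volts - V_MIN) / STEP)
    code = max(0, min(2 ** BITS - 1, code))
    return V_MIN + code * STEP

def snr_db(level_db: float, n: int = 10000,
           freq: float = 440.0, rate: float = 44100.0) -> float:
    """Signal-to-quantisation-noise ratio for a sine at level_db below full scale."""
    amp = V_MAX * 10 ** (level_db / 20)
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        v = amp * math.sin(2 * math.pi * freq * i / rate)
        err = quantise(v) - v
        signal_power += v * v
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)

print("full level :", round(snr_db(0.0), 1), "dB")
print("42dB down  :", round(snr_db(-42.0), 1), "dB")
```

At full level the error is a tiny fraction of the signal; 42dB down, the tone is only about one step tall and the error is nearly as big as the tone itself.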
As you can well guess, quantisation error decreases significantly as the number of bits in A/D conversion increases. For a 16-bit machine in our theoretical case, the step between levels is down to about 0.305 millivolts (20V/65,535 steps), so the quantisation error is below the capabilities of most humans' perception. As far as dynamic range goes, the laboratory measured response of undamaged human ears (which doesn't apply to most of us) to music is around 130dB. Practically, it's around 90dB for most people. Using the formula for finding dynamic range we get 16 x 6 = 96dB. We can see that 16 bits more than adequately takes what most music can dish out.
IT IS A common misconception of our western minds that dragons, magic and other sorts of compromise are ultimately bad. This is not strictly true and we won't write off lower resolution machines at this point. One reason is that the cost of producing a "true" 16-bit machine may prevent it becoming a practical consideration. In designing any given instrument, keyboard manufacturers, in addition to dealing with the overall hassle of just running a business, must deal with the issues of user friendliness, manufacturing, marketing and of course, the bottom line - cost, probably the most important to many potential buyers. So, while we know that a 16-bit sampler should satisfy us musically, the cost of building such a machine may be prohibitive, depending on the area of the intended buyer - amateur, semi-pro, or professional.
In order to meet the needs and demands of the user, a company might decide that a machine doesn't necessarily have to handle the 96dB dynamic peaks that a 16-bit sampler can. When the signal-to-noise ratio, or dynamic range, is measured in the presence of the audio signal, some claim that about 60dB (10 bits) of A/D resolution are all that are really needed to keep most listeners happy. There exist several other methods of sample data storage and encoding that deliver at least this much range at significantly lower cost. Therefore, some manufacturers resort to these various forms of magic to get adequate sound quality at lower cost. This usually means less memory (RAM) in the machine. Some of these magics are companding, floating point, and delta code modulation:
Next to linear, companding is the most common form of encoding. For example, virtually every digital drum machine announced in the past year uses this technique, as does the sound chip in the new Macintosh II computer. This method uses fewer bits (usually 8) stretched over a wider dynamic range by placing more space between the highest levels. To do this, the signal must be compressed upon sampling at the input into an eight-bit (or 48dB) dynamic range. Upon playback, the output electronics have to re-expand this eight-bit signal into something more - typically 72dB (or 12 bits worth). This compression/expansion process is where the term companding originates. Expansion of the signal is either done in the analogue domain (by a circuit similar to the one you'd use to compress a guitar) or by a special DAC (digital-to-analogue converter) known as a COMDAC, which makes this round-to-square remapping of bits to voltage.
Nothing comes free - there is still quantisation noise generated when the signal falls at a voltage level in between those the COMDAC represents. By the nature of the system, this error is larger at higher levels (since there is more space between them), but the sheer loudness of the signal tends to cover it. But there is less error at lower levels, which need the higher resolution. The system works best for sounds that are loud for a short section of their overall period, such as drums and percussion. Often, major equalisation is still necessary to cover the faults - this takes the form of high frequency boost at the input and corresponding cut at the output designed to cut the quantisation noise with the excess signal.
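The trade-off described above (less error at low levels, more at high levels) can be illustrated with a companding curve. The article doesn't say which curve any particular machine uses, so this sketch borrows the well-known mu-law formula with mu = 255 purely as a stand-in; the signal is normalised to the range -1 to +1 and all names are illustrative.

```python
import math

MU = 255.0  # assumed companding constant (mu-law), not taken from the article

def compress(x: float) -> float:
    """Squash a -1..+1 signal so quiet levels get more of the code space."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y: float) -> float:
    """Inverse of compress: stretch the stored value back out on playback."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def companded_roundtrip(x: float, bits: int = 8) -> float:
    """Compress, store as a signed code of the given width, then expand."""
    top = 2 ** (bits - 1) - 1
    code = round(compress(x) * top)
    return expand(code / top)

def linear_roundtrip(x: float, bits: int = 8) -> float:
    """Store the same signal with plain linear codes for comparison."""
    top = 2 ** (bits - 1) - 1
    return round(x * top) / top

for x in (0.01, 0.9):
    print("input", x,
          "| linear error", abs(linear_roundtrip(x) - x),
          "| companded error", abs(companded_roundtrip(x) - x))
```

With eight bits either way, the quiet input comes back far more accurately through the companded path, while the loud input comes back slightly worse, which is the bargain the text describes.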
Another piece of white magic is known as floating point. In this case, most of the bits are used to describe the signal as if it were linear, and the remaining bits are used to scale the signal's loudness. Taking our eight-bit example above, imagine adding on three more bits (range of values 0-7), with 0 representing "off", 7 representing "+/-20 volts", and values in between having different ranges. In this way, one can fake higher resolution at lower amplitudes (you always have eight bits to describe the signal). Since the same number of bits have less range to cover, the quantisation error (and therefore quantisation noise) comes down at low levels. Yes, it takes more hardware and software (and therefore more chance of error) to pull this off, but you can get by with reasonable resolution with fewer bits.
Again for the technically minded, it works like this: a simple hardware translator prescales the signal level entering the ADC circuit and stores this input scaling in, say, two- or three-bit format. This gives a significant cut in RAM costs by representing a 90dB S/N ratio that normally needs 16 bits of linear encoding in 14 or 15 bits of memory. Oddly enough, the one commercially available machine to use this scheme - the Kurzweil 250 - actually uses 18 bits - 10 for the signal (there's our 60dB again), and 8 to scale it. The new Kurzweil 1000 series also uses a form of this, and with the current emphasis on higher sound quality, more manufacturers are likely to try this scheme.
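A toy version of the prescaling idea looks like this. The numbers are assumptions for illustration only: a 10 volt full scale, an eight-bit signed signal value, and a three-bit scale where each step halves the range (about 6dB). It is not a description of the Kurzweil format or any other real machine.

```python
FULL_SCALE = 10.0   # assumed peak input voltage
MANT_MAX = 127      # largest signed eight-bit signal value
MAX_SHIFT = 7       # three scale bits give eight ranges (0-7)

def encode(volts: float):
    """Pick the smallest range that still holds the input, then quantise in it."""
    shift = 0
    while shift < MAX_SHIFT and abs(volts) <= FULL_SCALE / 2 ** (shift + 1):
        shift += 1
    span = FULL_SCALE / 2 ** shift
    return shift, round(volts / span * MANT_MAX)

def decode(shift: int, mantissa: int) -> float:
    """Rebuild the voltage from the stored scale and signal value."""
    return mantissa / MANT_MAX * (FULL_SCALE / 2 ** shift)

def linear_roundtrip(volts: float) -> float:
    """Plain eight-bit linear storage over the full range, for comparison."""
    return round(volts / FULL_SCALE * MANT_MAX) / MANT_MAX * FULL_SCALE

for v in (9.0, 0.05):
    shift, mantissa = encode(v)
    print("input", v, "V -> scale", shift, "value", mantissa,
          "| floating error", abs(decode(shift, mantissa) - v),
          "| linear error", abs(linear_roundtrip(v) - v))
```

A loud input is stored exactly as it would be in a linear system, but a quiet one gets the full eight bits spread over a much smaller range, so its error shrinks accordingly.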
Another commonly used method to cut ADC and memory costs is a sampling method called Delta Modulation. This is the storage in memory of the difference in level from sample to sample, as opposed to the absolute value of each one. Several older DDLs along with the E-Mu Systems Emax and Emulator II use this scheme. The Emax, for example, uses eight bits to represent the differences of roughly 12-bit resolution samples taken by the ADC. This not only decreases the amount of memory needed for a corresponding 12-bit linear system by one third, but significantly decreases the cost of the A/D conversion circuit (it is more expensive to accurately translate from audio into digital than the other way around).
Again, there are drawbacks to this approach to sampling. One is the fact that when the input signal increases rapidly, the eight-bit representation often cannot follow it in a single sampling period, and must take several samples to "catch up". This causes an overload in the natural slope of the input signal, which is referred to as slew rate limiting. Another drawback is the error that occurs when a sample differs less than a full positive or negative step from the previous sample - the same as the linear quantisation problem.
However, these sources of error can be reduced. Increasing the sampling rate helps with the first: doubling it doubles the slew rate of the machine so that it takes less time to catch up (after all, a signal can only change so quickly). Reducing the minimum step size in the system increases the number of bits and hence reduces quantisation noise.
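Slew rate limiting is easy to demonstrate. The sketch below stores each sample as a signed eight-bit difference from the running total, which is one plausible reading of the scheme described above; the actual encoding inside the Emax or any other machine is not documented here, and the function names are invented for the example.

```python
def delta_encode(samples, delta_bits: int = 8):
    """Store each sample as a clamped difference from the tracked value."""
    lo = -(2 ** (delta_bits - 1))
    hi = 2 ** (delta_bits - 1) - 1
    deltas = []
    tracked = 0
    for s in samples:
        d = max(lo, min(hi, s - tracked))  # a big jump gets clipped here
        deltas.append(d)
        tracked += d
    return deltas

def delta_decode(deltas):
    """Rebuild the waveform by adding up the stored differences."""
    out = []
    value = 0
    for d in deltas:
        value += d
        out.append(value)
    return out

gentle = [0, 50, 100, 150]        # changes small enough to follow exactly
sudden = [0] + [1000] * 9         # a jump far bigger than one step can cover

print(delta_decode(delta_encode(gentle)))
print(delta_decode(delta_encode(sudden)))
```

The gentle ramp comes back untouched, but the sudden jump can only climb 127 per sample, so the output takes eight samples to reach 1000. Doubling the sampling rate would halve the time that climb takes.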
Another reason for differences in the sound quality between various sampling machines is the rate at which the sound is actually sampled. Some years ago Nyquist declared that sampling at twice the bandwidth of a signal would permit the capture of all information necessary for recording and reproducing it. However, in the years since, it's transpired that a sampling rate of more than twice the bandwidth, about two-and-a-half times, is actually necessary. So, for the 20kHz audio spectrum the sampling rate needed would be around 50kHz. Interestingly enough, almost all hi-fi applications use a sampling rate less than this theoretically necessary rate. CDs and PCM recorders both use 44.1kHz as their sampling rate (some CDs use only 14-bit A/D too). So, differences in sampling rates of machines with similar encoding systems and resolution can limit the amount of information they are able to extract from the input signal, thereby affecting the sound quality. Try sampling the same sound at two different rates - say, 31kHz and 42kHz - on the same machine, and see if people can really hear the difference.
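The arithmetic behind those figures is simple enough to check. This sketch takes the article's two-and-a-half-times figure as its default; the function names are illustrative, and the factor is a practical rule of thumb rather than a hard limit.

```python
def required_rate(bandwidth_hz: float, factor: float = 2.5) -> float:
    """Sampling rate needed for a given audio bandwidth."""
    return bandwidth_hz * factor

def usable_bandwidth(rate_hz: float, factor: float = 2.5) -> float:
    """Audio bandwidth a given sampling rate can comfortably carry."""
    return rate_hz / factor

print("20kHz audio needs about", required_rate(20000), "Hz")
for rate in (31000, 42000, 44100):
    print(rate, "Hz sampling:",
          usable_bandwidth(rate), "Hz practical,",
          usable_bandwidth(rate, 2.0), "Hz theoretical limit")
```

By this yardstick a 44.1kHz system carries a little under 18kHz comfortably, against a theoretical ceiling of 22.05kHz.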
THE INSIDE OF a digital sampler is no place to find pure analogue signals. There are a lot of strange radio and clock frequencies floating around trying to get at and spoil our virgin sound. A lot of attention goes into laying out the circuit boards and electrical shielding. This has a lot to do with how many unwanted noises and disturbances mingle with the sound between the input and output.
SO FAR, WE'VE eliminated the differences of A/D resolution and sampling rate (and at least explained different encoding schemes) in an attempt to explain why samplers don't all sound the same. So, let's use the same A/D resolution, the same sampling rate and try not to compare apples to oranges by comparing a LinnDrum to an Emulator II. Result? Our samplers still sound different. Why, why, why?
The final difference comes down to our old friends noise, frequency response and distortion. Some of these demons have familiar manifestations; others are new.
As you may or may not know, low-pass filters are used in most samplers between the input and the ADC in order to prevent aliasing. This occurs when frequencies are present in the input signal which are greater than half the sampling rate, and are mistakenly characterised at a lower frequency. Imagine that pictures are being flashed in front of you while you are opening and closing your eyes at one second intervals. If your eyes are open for a half second, and closed for a half second, and the picture is changed every half second (once while your eyes are opened, and once while they are shut), you will see only one picture every second. It gets more complicated if the rate of change of the pictures is not sync'd with your own "sample" rate. This is similar to what happens to an ADC when those high frequencies are input and are digitised at a lower frequency. This colours the sample taken.
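Where an out-of-band frequency ends up can be predicted: it folds back around half the sampling rate. The sketch below assumes an ideal sampler with no input filter at all, and the function name is made up for the example.

```python
def alias_frequency(f_in: float, sample_rate: float) -> float:
    """Frequency at which an input tone appears after ideal sampling."""
    f = f_in % sample_rate
    if f > sample_rate / 2:
        f = sample_rate - f   # fold back below half the sampling rate
    return f

for f_in in (10000, 20000, 25000):
    print(f_in, "Hz sampled at 31kHz appears at",
          alias_frequency(f_in, 31000), "Hz")
```

A 10kHz tone sampled at 31kHz is left alone, but a 25kHz tone turns up at 6kHz, well inside the audible range and with no musical relation to the original.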
Enter low-pass filters. These are placed before the ADC to filter out frequencies higher than we can sample. However, because the filters are built of "analogue" components, there's bound to be noise, frequency response changes, and distortion involved. The seriousness of this depends on the quality of components used, which is reflected in the cost. Also, there may be a certain amount of equalisation added at this input stage, to cover the deficiencies of the system or to brighten the signal.
The noise present in this filter circuit is what governs the lowest noise ("noise floor") of the instrument. Even if there is no signal present, this is as quiet as your sampler is ever going to be. Therefore it has a significant effect on the resulting signal/noise ratio and, from that, the dynamic range.
This noise floor can be measured by how many bits on the ADC "toggle" with no input signal present. As more bits of noise appear, a greater noise characteristic is recorded along with the desired signal, lowering the sound quality. In other words, the more bits of noise present, the fewer bits are available to record the input signal. And that's one figure we haven't seen listed on any spec sheet.
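Since no spec sheet gives the figure, here is one hedged way it could be estimated from the codes an ADC produces with nothing plugged in. The sketch judges the noise by the peak-to-peak spread of those codes, which is a simplification of counting toggling bits; the idle readings are invented for the example and the 6dB-per-bit rule comes from earlier in the article.

```python
def noisy_bits(idle_codes) -> int:
    """Bits needed to cover the spread of codes seen with no input signal."""
    spread = max(idle_codes) - min(idle_codes)
    return spread.bit_length()

def effective_bits(total_bits: int, idle_codes) -> int:
    """Bits left over for the signal once the noise has taken its share."""
    return total_bits - noisy_bits(idle_codes)

idle = [2047, 2048, 2050, 2049, 2047]   # hypothetical 12-bit readings at rest
left = effective_bits(12, idle)
print("noisy bits:", noisy_bits(idle))
print("effective bits:", left, "or roughly", 6 * left, "dB")
```

In this made-up case a 12-bit converter with two bits of noise behaves like a 10-bit one, bringing its dynamic range down from 72dB to about 60dB.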
The input is not the only place you'll find a filter. "Reconstruction filters" smooth our digital signal back into analogue, and other filters are sometimes used for timbral changes - you know, that voltage-controlled job that you used to find on real synthesisers. These too have different frequency response, distortion, and noise characteristics. Their shortcomings are not as noticeable on synthesisers because the sound source - the oscillator - was always running at full volume, masking noise. However, with a sample, changes and colorations to the original are much more noticeable.
So, we're back to the same differences that we had back in the days of analogue Moogs, ARPs, Korgs and Rolands - the differences in components and circuits used in sound generating.
For a while, it looked as if the manufacturers should print frequency, distortion, and signal-to-noise specs along with their samplers, and those would tell us which sampler sounded better. This still isn't a bad idea - it would at least give us an indication which sampler was more "accurate". But better? Well, which guitar amp, microphone, or synthesiser sounds better? It's time to go trust your ears again: get down to your local music store with a handful of sound sources that you think you'll be using (or CD recordings of them - it's much easier to carry around than a grand piano), sample them into different instruments and listen.
The author would like to thank Scott Peer for his assistance in compiling the technical data for this feature.
Feature by Chris Meyer