Describing the Delay
There's a lot more to the humble digital delay than meets the eye. Paul White investigates.
DDLs have been with us so long now that we tend to take them very much for granted. They are, however, very versatile pieces of studio equipment, capable of producing all sorts of effects.
The humble digital delay line has been, for many years, a standard piece of equipment in any studio. Not only may it be used to create conventional delay and echo effects, but it can also be made to supply effects such as phasing and flanging. However, before examining the various ways in which a digital delay line (DDL) can be used, it's advantageous to gain some insight into how it works.
Any digital signal processor designed to act on what are essentially analogue input signals, such as music, must first convert the input into a digital format. An analogue input is one where an electrical voltage varies in proportion to the occurrence of an original event. In the case of acoustically generated music, the change in voltage is proportional to a change in air pressure; a rapidly vibrating string, for example, will create rapid air vibrations which a microphone will convert into variations in voltage.
A digital system, on the other hand, functions on a binary basis; 1s and 0s, which are represented in the circuit by the presence or absence of a fixed voltage.
To change an analogue signal into a digital one, therefore, we have to measure the analogue voltage of the input at regular intervals and convert each measurement into a binary number represented by a series of electrical pulses.
One second of sound may be represented by several tens of thousands of these binary numbers, each relating to a particular instant in time. If this concept is difficult to grasp, think of a cine film, where each frame is slightly different to the next. By running these through the projector in quick succession, the illusion of smooth movement is created. So it is with digital sound. If you have enough instantaneous measurements per second, the original sound can be recreated. This process of measuring and digitising minute sections of the input signal is known as sampling.
Figure 1 shows what happens when a signal is sampled; it's cut into slices rather like a loaf of bread and the height of each slice is measured. Each slice, or sample, has a flat top corresponding to the voltage at the start of the sample time, so it doesn't accurately follow the curve of the waveform. From this you might deduce that the thinner the slices, the more accurate, or less distorted, the sound will end up, and you'd be right. The rate at which the slices are taken is known as the sampling frequency. Sampling theory (far too complex to go into here) states that you must sample at a minimum of twice the frequency of the highest harmonic you are likely to encounter if the output is to be reconstructed accurately, and because of the inevitable differences between theory and practice (often known as Sod's Law), this figure ends up being nearer three times. So, to sample an input signal containing harmonics reaching up to 10kHz, you would have to sample at 30kHz. To create a delay of one second, you have to find somewhere to store these 30,000 samples, and the ideal place is in an area of random access memory similar to that used in home computers. Assuming each sample occupies one byte (an 8-bit system), a 30kbyte memory can hold up to one second of sound with a 10kHz upper frequency response limit, and by continually updating and outputting the contents, a one second delay is created. It therefore follows that you would need 60 kilobytes of memory to do the same job if you wanted to process signals extending up to 20kHz.
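The memory arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are invented here, the factor of three is the article's practical figure rather than a fixed law, and each sample is assumed to occupy a whole number of bytes.

```python
def delay_memory_bytes(top_frequency_hz, delay_seconds,
                       bits_per_sample=8, oversample_factor=3):
    """Estimate the RAM needed to hold `delay_seconds` of sound.

    Assumptions (not from any real product): the sampling frequency is
    `oversample_factor` times the highest input frequency, and samples
    are stored unpacked, so a 12-bit sample takes two bytes.
    """
    sample_rate_hz = top_frequency_hz * oversample_factor
    samples = round(sample_rate_hz * delay_seconds)
    bytes_per_sample = (bits_per_sample + 7) // 8  # round up to whole bytes
    return samples * bytes_per_sample


print(delay_memory_bytes(10_000, 1.0))                      # 30000
print(delay_memory_bytes(20_000, 1.0))                      # 60000
print(delay_memory_bytes(10_000, 1.0, bits_per_sample=16))  # 60000
```

The last line shows why a 16-bit machine needs twice the memory of an 8-bit one for the same delay and bandwidth.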
There's more to sampling than just choosing the right sampling frequency, though: there's also 'resolution'. The digital numbers used to represent individual samples aren't just any old numbers; they go in steps, and the number of steps available depends on how many bits your analogue to digital converter can manage. Eight bits will give you only two to the power of eight steps, which works out at 256. This means that your loudest signal could span 256 steps but quieter ones will have considerably fewer, and this poor resolution causes 'quantisation' distortion, which actually sounds like noise, the main difference being that, unlike most sources of noise, it disappears in the absence of signal. Using 12 or 16 bits gives a vast improvement in resolution and most of the newer systems use 16 bits: the same as used by compact disc. For those interested in such things, each bit used in sampling yields a maximum of 6dB of dynamic range, so an 8-bit system can only give you 48dB of dynamic range, which is about as noisy as a cassette recorder without Dolby. 16 bits, on the other hand, gives you a dynamic range of 96dB, which is superb. 12 bits gives 72dB: adequate for many effects applications. This formula is not exact, as you have to add 1.8dB or so to the result to get the actual figure, but as a rule of thumb, it's near enough.
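For readers who like to check the sums, here is a short Python sketch of the resolution figures. The function names are made up for this example. The "fuller" figure uses the usual textbook expression for an ideal converter handling a full-scale sine wave, roughly 6.02dB per bit plus 1.76dB, which is where the extra 1.8dB or so comes from; real converters will fall short of it.

```python
import math


def quantisation_steps(bits):
    """Number of distinct levels an ideal converter of this width offers."""
    return 2 ** bits


def dynamic_range_rule_of_thumb_db(bits):
    """The 6dB-per-bit rule of thumb quoted in the text."""
    return 6 * bits


def dynamic_range_db(bits):
    """Textbook figure for an ideal converter and a full-scale sine wave:
    20*log10(2**bits) + 10*log10(1.5), i.e. about 6.02*bits + 1.76 dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (8, 12, 16):
    print(bits, quantisation_steps(bits),
          dynamic_range_rule_of_thumb_db(bits),
          round(dynamic_range_db(bits), 1))
# 8 256 48 49.9
# 12 4096 72 74.0
# 16 65536 96 98.1
```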
The higher the sampling frequency, the higher the input frequencies the system can cope with, which means a better frequency response. The trade-off here is that the faster you sample, the more samples per second you have and so the more memory is needed to store one second of sound, and that means higher cost. It would be possible to keep this faster sampling rate and keep the cost low, but this might involve cutting down on memory, resulting in shorter delays. Early systems compromised both on frequency response and on delay time to keep the cost down, but most modern machines offer a 15kHz or greater frequency response with at least one second of delay time for an affordable price. Once you have the capacity to create a long delay, it's a simple matter to set up a shorter one by switching out some of the memory or by increasing the sampling frequency, since the same memory is then filled and read out in less time. A typical modern machine uses both methods, the memory being switched in and out by the Range switch and the sampling rate changed by the Fine control.
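The relationship between memory, sampling rate and delay time is simple enough to put into a one-line Python function. The name `delay_seconds` and the figures used are purely illustrative and don't describe any particular machine.

```python
def delay_seconds(memory_samples, sample_rate_hz):
    """Delay time is the number of stored samples divided by the clock rate."""
    return memory_samples / sample_rate_hz


# Full memory at the nominal clock rate: the longest delay.
print(delay_seconds(30_000, 30_000))  # 1.0

# "Range" approach: switch out half the memory.
print(delay_seconds(15_000, 30_000))  # 0.5

# "Fine" approach: keep the memory but run the clock twice as fast.
print(delay_seconds(30_000, 60_000))  # 0.5
```

Both routes halve the delay, but only the second one changes the sampling rate, and with it the frequency response.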
The final consideration in this area is the number of bits the system uses. 8-bit machines use only half the memory of 16-bit machines, if all other parameters are equal, and an 8-bit converter costs less than a 12- or 16-bit one, so 8-bit machines are usually cheap. However, they are too noisy for serious use. For home recording, a 12-bit machine with a 12kHz bandwidth should be considered as the minimum for applications where quality is of importance. Such a machine should be expected to have a maximum delay time of at least one second and a modulation section, the purpose of which will be discovered later in this article.
Figure 2 shows the block diagram of a typical system and, though it may look daunting, it's really quite straightforward if we take it one bit at a time. The signal comes in through a gain control which usually has some kind of metering system so you can get the levels just right. Failure to set the level properly may result in either excess noise or distortion. Just after the gain control, the signal splits and some passes direct to the output Mix control. This is so that you can combine a proportion of the untreated sound with the delayed sound for certain effects. The input to the delay line first encounters the ADC or analogue to digital converter, where it's turned into a continuous sequence of byte-sized numbers before being passed to the memory for short term storage. The writing into and reading out of this memory is controlled in most cases by a microprocessor, and this also takes instructions from the Range control, so that more or less memory can be brought into play, depending on the amount of delay needed. Also interacting with the memory part of the system is the sampling clock and the modulation oscillator that acts upon it. These are shown as separate blocks for the sake of clarity, but are more likely than not to be incorporated into the microprocessor section in an actual production model. By varying the sample clock rate, the delay can be fine-tuned, usually over at least a 2:1 range, and the modulation controls allow the user to set up a cyclic change in pitch of any desired depth or speed, which is necessary for the creation of chorus, flanging or vibrato.
A short time later, the digital representation of the signal emerges from the memory and passes through the DAC or digital to analogue converter where it's turned back into an analogue, electrical waveform and subsequently mixed with the desired proportion of the undelayed signal. I have deliberately omitted to mention the filtering needed in the input and output stages as this would only confuse the issue at this stage.
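The signal path just described (memory, then a Mix control blending direct and delayed sound) can be modelled in software as a circular buffer. The sketch below is a simplified illustration in Python and is not taken from any real unit: the names `DelayLine` and `delay_with_mix` are invented, samples are plain floating point numbers rather than converter codes, and the filtering and metering stages are left out.

```python
class DelayLine:
    """A fixed-length memory that is read and overwritten in a loop."""

    def __init__(self, length_samples):
        self.memory = [0.0] * length_samples
        self.position = 0

    def process(self, sample):
        # Read the oldest stored sample, then overwrite it with the new one.
        delayed = self.memory[self.position]
        self.memory[self.position] = sample
        self.position = (self.position + 1) % len(self.memory)
        return delayed


def delay_with_mix(signal, length_samples, mix=0.5):
    """Blend the direct signal with its delayed copy (mix=1.0 is delay only)."""
    line = DelayLine(length_samples)
    return [(1.0 - mix) * dry + mix * line.process(dry) for dry in signal]


# A single click followed by silence, delayed by three samples.
click = [1.0, 0.0, 0.0, 0.0, 0.0]
print(delay_with_mix(click, 3, mix=0.5))  # [0.5, 0.0, 0.0, 0.5, 0.0]
```

Each value that goes in comes out again exactly `length_samples` steps later, which is all a basic delay does.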
That leaves us with one more major aspect to cover, and that is feedback. This sends some of the output back to the input of the delay line so that repeating echoes can be set up. The feedback gain must be less than unity though, otherwise the echoes will build up in level rather than decaying, resulting in an uncontrollable howl. On some models there is a phase invert switch in the feedback path which gives a subtle change in sound at very short delay times, and this is used mainly to vary the sound of the flanging effect.
The easiest effect to set up is the single delay. The Modulation Depth and Rate and the Feedback controls should be set to minimum and the delay Range control set up for the length of delay you had in mind. You can then use the Fine control to adjust the delay time to match it to the speed of the song being worked on (if that's the effect you want). This kind of straight delay is effective and ranges from a very short slapback echo (around 20ms) to a distinct repeat occurring a second or more after the original sound. It's a short step to convert this single repeat into a true repeating echo by turning up the Feedback control. Whatever appears at the output is then fed back to the input and so goes through the delay again, the time the echoes take to die away being set by the amount of feedback. With the feedback up full, the echoes may go on indefinitely or even build up uncontrollably, so care must be taken in the setting up of this control if working near the maximum setting.
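To see how the Feedback control turns one repeat into a decaying train of echoes, here is a minimal Python sketch. The function name `feedback_echo` is invented for the example, only the delayed (wet) signal is returned, and a feedback gain of 1.0 or more is refused outright rather than allowed to run away as real hardware would.

```python
def feedback_echo(signal, delay_samples, feedback=0.5):
    """Return the delayed signal with part of it fed back into the memory."""
    if not 0.0 <= feedback < 1.0:
        raise ValueError("feedback gain must be at least 0 and below 1")
    memory = [0.0] * delay_samples
    position = 0
    wet = []
    for dry in signal:
        delayed = memory[position]
        # The new input and a scaled copy of the output are stored together.
        memory[position] = dry + feedback * delayed
        position = (position + 1) % delay_samples
        wet.append(delayed)
    return wet


click = [1.0] + [0.0] * 6
print(feedback_echo(click, 2, feedback=0.5))
# [0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.25]
```

Each repeat is the previous one multiplied by the feedback gain, so a higher setting means the echoes take longer to die away.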
Chorus is obtained by setting the delay to a few tens of milliseconds and then introducing a little modulation at around 3Hz. An equal mix of direct and delayed sound will give the effect, and when it is applied to guitar, bass or keyboards, the result is instantly recognisable. Don't use too much modulation depth though, or the effect will sound dreadful. Chorus is so named because it creates the illusion of two or more instruments or voices playing together. It does this by simulating the difference in timing and pitch that always occurs when two or more people try to play exactly the same thing at the same time. By the same token, it also makes electronic instruments sound more natural, as their waveforms otherwise tend to be far more rigidly structured than they would be in naturally occurring sounds. For example, a cheap electronic organ can be made to produce a good impression of a pipe organ just by adding a little chorus. In stereo mixes, panning a chorused sound to one side and a totally dry sound to the other creates a sense of movement, as this arrangement simulates some of the psychoacoustic clues our brains use subconsciously to determine the position of sounds.
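A software chorus along these lines can be sketched as a delay whose length is swept by a slow sine wave. The Python below is an illustrative model, not a description of how any hardware DDL does it: a real unit varies its sample clock, whereas this sketch keeps the clock fixed and reads the memory at a moving, fractional position using linear interpolation. The function name and default settings are assumptions chosen to match the figures in the text.

```python
import math


def chorus(signal, sample_rate_hz, base_delay_ms=25.0, depth_ms=2.0,
           rate_hz=3.0, mix=0.5):
    """Mix the direct sound with a copy whose delay wobbles at `rate_hz`."""
    size = int((base_delay_ms + depth_ms) * sample_rate_hz / 1000) + 2
    memory = [0.0] * size
    write = 0
    out = []
    for n, dry in enumerate(signal):
        memory[write] = dry
        # Current delay in samples, swept either side of the base setting.
        sweep = math.sin(2 * math.pi * rate_hz * n / sample_rate_hz)
        delay = (base_delay_ms + depth_ms * sweep) * sample_rate_hz / 1000
        # Read from a fractional position behind the write point.
        read = (write - delay) % size
        i = int(read) % size
        frac = read - int(read)
        older, newer = memory[i], memory[(i + 1) % size]
        wet = older + frac * (newer - older)
        out.append((1.0 - mix) * dry + mix * wet)
        write = (write + 1) % size
    return out


# One second of a 440Hz tone at an assumed 30kHz sampling rate.
tone = [math.sin(2 * math.pi * 440 * n / 30_000) for n in range(30_000)]
chorused = chorus(tone, 30_000)
print(len(chorused))  # 30000
```

Because the delay is constantly changing, the delayed copy is alternately stretched and squeezed in time, which is what produces the slight pitch drift against the direct sound.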
By shortening the delay further to only a few milliseconds and then removing the untreated portion of the sound using the mix control, a true pitch vibrato may be set up, which may be applied to instruments or vocals as required.
Adding the direct sound again will give an effect something like a phaser pedal and gently increasing the feedback will produce flanging. Flanging is difficult to describe but it is instantly recognisable. It found favour in the 60s and early 70s as a psychedelic effect and is reminiscent of a 747 trying its best to fly down a sewer pipe. Both phasing and flanging can sound effective using the 'effect one side, dry the other' technique to add width and movement.
Flanging sounds best using a slower modulation rate, say one cycle per second, and a little more depth than you would use for chorus. Changing the delay time will affect the notes and harmonics that the flanging picks out, and the feedback phase switch, if one is fitted, will change the sound noticeably, giving you an extra option.
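Why the delay time decides which harmonics are picked out can be shown with a little arithmetic. The Python sketch below treats two idealised cases separately: an equal mix of direct and delayed sound with no feedback, which cancels frequencies whose half-cycles fit the delay an odd number of times, and the feedback loop on its own, which reinforces frequencies whose whole cycles fit the delay (or the in-between ones when the feedback phase is inverted). The function names are invented, and a real flanger combines both mechanisms and sweeps the delay, so treat the numbers as a guide only.

```python
def mix_notches_hz(delay_ms, count=4):
    """Frequencies cancelled when direct and delayed sound are mixed equally."""
    return [(k + 0.5) * 1000.0 / delay_ms for k in range(count)]


def feedback_peaks_hz(delay_ms, inverted=False, count=4):
    """Frequencies reinforced by the feedback loop considered in isolation."""
    offset = 0.5 if inverted else 0.0
    return [(k - offset) * 1000.0 / delay_ms for k in range(1, count + 1)]


print(mix_notches_hz(1.0))                    # [500.0, 1500.0, 2500.0, 3500.0]
print(feedback_peaks_hz(1.0))                 # [1000.0, 2000.0, 3000.0, 4000.0]
print(feedback_peaks_hz(1.0, inverted=True))  # [500.0, 1500.0, 2500.0, 3500.0]
print(mix_notches_hz(2.0))                    # [250.0, 750.0, 1250.0, 1750.0]
```

Doubling the delay from 1ms to 2ms halves every frequency in the list, which is why sweeping the delay makes the characteristic notches slide up and down the spectrum.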
Take care not to overload the input when using a lot of feedback because the input sound is being added to the fed-back sound before the input ADC, so there's more going through the system than just the original input.
One extra feature which you will find on a good many DDLs is the Hold button. This freezes any sound currently stored in the random access memory and recycles it, rather like a tape loop. At this time no new sounds are added to the memory. This is of limited creative use but a further refinement available on some machines is a single-shot trigger facility whereby the sound stored in the hold mode can be played back once each time a trigger pulse or MIDI signal is received. This forms the basis of a very crude sampler which allows short percussive sounds to be triggered by a drum machine or similar trigger source. The next logical step on from this is to add keyboard control of pitch and this is exactly how the early samplers were developed. These, of course, have progressed to incorporate all sorts of other functions, such as sound operated loading, sample editing and looping. However, sampling is another story...
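The Hold and single-shot functions amount to very little extra logic on top of the basic delay memory, as this Python sketch suggests. The class and method names are hypothetical, the trigger is represented by a simple method call rather than a pulse or MIDI message, and pitch control is left out entirely.

```python
class HoldDelay:
    """A delay memory that can be frozen and replayed."""

    def __init__(self, length_samples):
        self.memory = [0.0] * length_samples
        self.position = 0
        self.hold = False

    def process(self, sample):
        delayed = self.memory[self.position]
        if not self.hold:
            # Normal operation: new sound replaces the oldest stored sample.
            self.memory[self.position] = sample
        self.position = (self.position + 1) % len(self.memory)
        return delayed

    def play_once(self):
        """Single-shot trigger: return the stored sound once, oldest first."""
        return self.memory[self.position:] + self.memory[:self.position]


unit = HoldDelay(3)
for value in (1.0, 2.0, 3.0):
    unit.process(value)

unit.hold = True  # freeze: the input is now ignored
print([unit.process(9.0) for _ in range(6)])  # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
print(unit.play_once())                       # [1.0, 2.0, 3.0]
```

With Hold engaged the stored sound simply goes round and round, like the tape loop mentioned above, while `play_once` stands in for the triggered playback that makes a crude sampler.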
Feature by Paul White