Modular Synthesis (Part 7)
The sound of the human voice is one of the most difficult acoustic tones to reproduce electronically, but reading Steve Howell's step-by-step guide should set you on the right track.
Steve Howell takes a look at how to synthesise what is perhaps the most complex acoustic sound of all - that of the human voice itself.
It could be argued that if there is one sound that is almost impossible to synthesise accurately using analogue techniques, it is the human voice. This is because, although the mouth is nothing more than an elaborate lowpass filter and envelope shaper, it can be manipulated in many sophisticated ways that a synthesiser's VCF just cannot come close to in terms of versatility. Apart from this, the mouth can switch instantly from being an oscillator to a noise generator; combine with these the resonance of the nasal cavities and the incredible control we have over all these parameters, and the analogue synthesiser appears somewhat humble by comparison. Synthesised speech is often almost entirely unintelligible, even when vocoders are used, and sampling (at the moment) can only really handle one small element of the vast range offered by the voice.
What an analogue synthesiser can do, however, is to recreate the effect of vocal sounds, and these effects can be employed in many styles of music.
The easiest vocal sound to synthesise is that of the solo female soprano. The patch is given in Figure 1 and, as you can see, it could be patched up on even the simplest of monosynths. It utilises a pulse wave with a mark/space ratio (pulse width) of about 25/75. This is fed into a standard VCLPF whose cutoff frequency is set at about two-thirds and whose resonance is set so that it is a tweak away from oscillating - in other words, high. The EG controls are set as required, but I would recommend a slowish attack with full sustain and a release of about 1½ seconds for the legato effect this sound normally requires. Vibrato can be delayed or left on permanently as you wish, while portamento is essential to create the 'wailing' effect, though it shouldn't be excessive. Add to this copious quantities of echo and/or reverb and you have an ethereal vocal effect that should be quite atmospheric. Adjustment of the cutoff frequency will give you the whole range of 'ooohs' and 'aaahs', depending on where it is set. Keyboard track must be on, and should you find the sound too 'shrieky' at the top end of the keyboard, backing this control off should remedy the problem as fewer harmonics will be passed through.
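For anyone who wants to see what a 25/75 mark/space ratio actually means, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names, the 100Hz pitch and the 48kHz sample rate are all hypothetical, and the waveform is a naive one with no band-limiting. The second function uses the standard Fourier result for an ideal pulse wave, where the relative level of harmonic n at pulse width d is |sin(n.pi.d)|/n.

```python
import math

def pulse_wave(freq_hz, duty, sample_rate, n_samples):
    """Return n_samples of a naive (non-band-limited) pulse wave, values +1.0 or -1.0."""
    samples = []
    for n in range(n_samples):
        phase = (n * freq_hz / sample_rate) % 1.0  # position within the cycle, 0..1
        samples.append(1.0 if phase < duty else -1.0)
    return samples

def relative_harmonic_level(harmonic, duty):
    """Relative level of one harmonic of an ideal pulse wave: |sin(n*pi*d)| / n."""
    return abs(math.sin(harmonic * math.pi * duty)) / harmonic

# One full cycle of a 100 Hz pulse at an assumed 48 kHz sample rate is 480 samples.
cycle = pulse_wave(100.0, 0.25, 48000, 480)
mark = sum(1 for s in cycle if s > 0)
space = len(cycle) - mark
print(mark, space)  # 120 360, i.e. a 25/75 mark/space ratio

# With a 25% pulse width, every fourth harmonic all but vanishes.
for n in range(1, 9):
    print(n, round(relative_harmonic_level(n, 0.25), 3))
```

The missing fourth, eighth and so on harmonics are part of what gives this pulse width its slightly hollow, nasal character before the filter gets to work on it.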
For a more 'choral' sound, two or more VCOs detuned as necessary should fit the bill. I suggest you use a sawtooth wave as the other waveform and you could, if your synthesiser allows simultaneous waveform output, mix in a pulse wave whose pulse width is being swept by the sine or triangle output of an LFO. Chorusing, a mild flange or a harmoniser will also thicken the sound, especially if run in stereo. So, not a particularly difficult sound to set up, but it may require some delicate tweaking to get exactly the sound and effect you require.
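How much detuning is 'as necessary' is a matter of taste, but the arithmetic behind it is simple enough to sketch. The Python below is a rough illustration only: the function names and the 10-cent figure are my own assumptions rather than recommended settings. It uses the usual definition of a cent (1200 to the octave) and the fact that two oscillators a few hertz apart beat at the difference between their frequencies.

```python
def detuned_frequency(freq_hz, cents):
    """Frequency of an oscillator detuned by a number of cents (1200 cents = 1 octave)."""
    return freq_hz * 2.0 ** (cents / 1200.0)

def beat_rate(freq_hz, cents):
    """Beat frequency in Hz between an oscillator and a copy detuned by `cents`."""
    return abs(detuned_frequency(freq_hz, cents) - freq_hz)

# Hypothetical example: the same 10-cent detune at three different pitches.
for note_hz in (220.0, 440.0, 880.0):
    print(note_hz, round(beat_rate(note_hz, 10.0), 2))
```

Note that a fixed detune in cents beats twice as fast for every octave you go up the keyboard, which is one reason a chorus patch that sounds lush in the middle can sound hurried at the top.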
Male voices are, likewise, fairly easy and require only a change in pitch, a decrease in resonance, adjustment of the cutoff frequency to suit and slight modification of the EG controls. You can add a touch of EG modulation of the VCF using the second EG. If you do decide to do this, the controls of the second EG should be set to give attack, decay and release times of about 500ms and the sustain set to about two-thirds - this will give a slight 'wow' effect which can be quite useful. You could also use the second EG (or yet another EG if you still have one to spare!) to sweep the pulse width very slightly. As with the soprano sound, be prepared to fiddle a bit to get the sound you want as it won't come instantly. Choral sounds can be obtained in the same way as before by using detuned VCOs and/or chorus, harmoniser, etc.
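The second EG settings suggested above (attack, decay and release of about 500ms, sustain at about two-thirds) can be sketched as a simple function of time. This is a minimal Python model under stated assumptions: the function name is hypothetical, the segments are straight lines whereas most analogue EGs produce exponential curves, and the output is normalised to the range 0 to 1.

```python
def adsr_level(t, gate_len, attack=0.5, decay=0.5, sustain=2.0 / 3.0, release=0.5):
    """Envelope level (0..1) at time t seconds for a key held for gate_len seconds.

    Linear segments are an assumption made for simplicity.
    """
    def held(t_on):
        # Level while the key is still down.
        if t_on < attack:
            return t_on / attack
        if t_on < attack + decay:
            return 1.0 - (1.0 - sustain) * (t_on - attack) / decay
        return sustain

    if t < 0.0:
        return 0.0
    if t <= gate_len:
        return held(t)
    # Release starts from whatever level had been reached when the key came up.
    t_rel = t - gate_len
    if t_rel >= release:
        return 0.0
    return held(gate_len) * (1.0 - t_rel / release)

# A note held for two seconds, sampled every quarter of a second.
for step in range(0, 11):
    t = step * 0.25
    print(t, round(adsr_level(t, 2.0), 3))
```

The rise to full level and the fall back to two-thirds is what produces the 'wow' when this voltage is applied to the filter cutoff.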
Those, then, are two sounds which can be obtained with a fairly modest synthesiser. If you have more in the way of hardware, more possibilities are open to you.
For instance, if you listen to almost any singer, be he (or she) of the rock, pop or operatic persuasion, you will notice that there is usually a slur up to each note, and this can easily be obtained by using the output of an EG routed to the CV input of the VCO. The attack should be set to around 100ms so that there is a slight 'swoop' upwards. You can either set the sustain full up so that the pitch will stay constant after the attack cycle, or you can back it off a bit so that the pitch slides down. In the latter case, the decay control should also be set to 100ms or so. In either event, you'll have to retune the VCO using a combination of the VCO frequency control, EG sustain level and EG modulation level. For an extreme slur the pitch has to be set fairly high, but if you only want a hint of sweep then, naturally, the EG modulation level needs to be set quite low - either way, be prepared to jiggle with the respective controls for the optimum effect. Release of the pitch sweep EG should be set longer than that of the amplitude shaping EGs, so that you don't end up with a 'clunk' at the end of the note as the pitch drops abruptly before the sound has died away (unless, of course, that's precisely what you want!).
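To put some numbers on the swoop, here is a small Python sketch of the full-sustain case. It rests on several assumptions of mine: a VCO with an exponential (volts per octave) pitch response, a linear attack, a hypothetical modulation depth of a tenth of an octave, and a VCO whose own tuning has been offset so that the fully open envelope lands on the wanted pitch. That last point is the 'retuning' described above. The function name is also hypothetical.

```python
def swoop_frequency(t, target_hz, depth_octaves=0.1, attack=0.1):
    """Pitch in Hz at time t seconds after the key is pressed (full sustain).

    depth_octaves is the assumed EG modulation depth; the VCO tuning is
    assumed to be offset so that the note settles on target_hz.
    """
    env = min(max(t / attack, 0.0), 1.0)  # linear attack, then held at 1.0
    return target_hz * 2.0 ** (depth_octaves * (env - 1.0))

# Pitch of an A (440 Hz) over the first 100 ms and beyond.
for ms in (0, 25, 50, 75, 100, 200):
    print(ms, round(swoop_frequency(ms / 1000.0, 440.0), 1))
```

Increasing depth_octaves exaggerates the slur, while lengthening attack turns the swoop into more of a slide.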
If you opt to use more than one VCO for a more choral effect, you could try sweeping only the one VCO and keeping the other 'straight'. Depending on how you balance the two VCOs level-wise, you can create a variety of commonly encountered vocal sounds, from the comic to the menacing. An extension of this is to use three VCOs, with two of them being swept and the other left untouched. You can then tune the two swept VCOs apart and bring them to unison using a combination of modulation levels and sustain amount - yet again, experimentation will yield the best results. In both these examples, the VCF and amplitude shaping EG can be adjusted to taste.
Probably the most outstanding feature of the human voice is its ability to change its tonal characteristics, often quite drastically, for each new note, and whilst we can't get synthesisers to actually come up with words, we can use the VCF for some fairly drastic tonal changes. Perhaps the most famous example of this is the comic male voice so beloved of Japanese synthesist Tomita. This sound is actually quite easy to create, but you will need at least a sequencer or a sample and hold that can be stepped through with an external trigger pulse. The patch is shown in Figure 4 and the method is as follows.
Set the basic vocal sound up as you require (in this case, the male voice patch). Next, program some voltages into your sequencer, setting each one about a volt apart. If you're using an analogue sequencer you can simply tune the controls, but if you've gone digital, you'll have to connect the keyboard to your sequencer and play, say, a C and another C an octave up. Now connect the sequencer's CV output to the CV input of the VCF, and connect the gate output of the keyboard either to the 'step' or the 'external clock' input of the sequencer. Whenever you play a note, the sequencer will step through the two voltages you have programmed into it and will open and close the filter accordingly. By varying the level of modulation at the filter and by adjusting the cutoff frequency and resonance, you should be able to create a whole host of vocal sounds that would probably make Tomita proud! You can, of course, also modulate the VCF with an EG for a touch of 'wow', and if you find the jumps in voltages too abrupt you can rectify this by routing the CV output of the sequencer via a lag time integrator which will smooth the changes out.
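The logic of this patch is easy to model. The Python below is a sketch under stated assumptions, not a description of any particular sequencer: the class and function names are hypothetical, the filter is assumed to respond at one volt per octave, the 500Hz base cutoff is arbitrary, and the lag time integrator is reduced to a simple one-pole smoother.

```python
class StepSequencer:
    """Steps through preset voltages, advancing once per incoming gate."""

    def __init__(self, voltages):
        self.voltages = list(voltages)
        self.index = -1  # so that the first gate selects the first voltage

    def gate(self):
        self.index = (self.index + 1) % len(self.voltages)
        return self.voltages[self.index]

def cutoff_hz(cv_volts, base_hz=500.0):
    """Filter cutoff for a given control voltage, assuming 1 V per octave."""
    return base_hz * 2.0 ** cv_volts

def lag(previous, target, coeff=0.5):
    """One step of a one-pole smoother standing in for a lag time integrator."""
    return previous + coeff * (target - previous)

# Two voltages a volt apart, as in the text; four notes played on the keyboard.
seq = StepSequencer([0.0, 1.0])
for note in range(4):
    cv = seq.gate()
    print(note, cv, cutoff_hz(cv))
```

Each key press alternates the cutoff between 500Hz and 1kHz in this example; feeding successive values through lag() rounds off the jump between them.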
If you don't have access to a sequencer, you can use a sample and hold circuit in its place just as effectively, except that in this instance the tonal changes will be random instead of preset. If, however, you have an old ARP analogue sequencer, you have the best of both worlds in that you can preset the voltages and then, by switching it to the 'random' position, step through those preset voltages so that they are picked out at random.
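The difference between the two approaches can be shown in a few lines of Python. This is a sketch only: the function names and the voltage range are my own assumptions, and Python's random module stands in for the noise source that a real sample and hold would sample.

```python
import random

def sample_and_hold(rng, low=0.0, high=2.0):
    """A fresh random voltage on each trigger: any value in the range can turn up."""
    return rng.uniform(low, high)

def random_preset(rng, presets):
    """Random selection from preset voltages, in the manner of a sequencer's 'random' mode."""
    return rng.choice(presets)

rng = random.Random(7)  # seeded only so that a run can be repeated
presets = [0.0, 0.5, 1.0, 2.0]  # hypothetical preset voltages

for trigger in range(4):
    print(round(sample_and_hold(rng), 3), random_preset(rng, presets))
```

The first column wanders anywhere between the limits, while the second only ever lands on tonal settings you have already chosen by ear.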
If you want to, of course, you can program many more voltages into your sequencer to give the sound more variation, and you could also use more than one VCO, sweeping it with another EG as outlined above.
The main point to watch when playing these sounds is the same as for any imitation of a sound that requires breathing: for total realism, you have to phrase the music properly, allowing plenty of time for 'breaths'. Of course, the beauty of synthesised vocal sounds is that you don't have to worry about such things, but if realism is your aim then it's a detail you've more or less got to bear in mind.
Because of the very high resonance of some synthesised vocal effects, in particular the female soprano, you could well run into problems during recording, whereby on certain notes the level is boosted incredibly high and wraps the needles round the end stops of your poor VU meters! If this does happen, the use of a compressor/limiter will help - even if it's only a little footpedal type - otherwise you'll just have to watch your recording levels closely. These sounds are also fairly pure, and you may therefore experience some problems getting them to cut through a mix: again, if you can use a compressor/limiter it will certainly help.
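What a compressor/limiter does to those runaway notes can be sketched with its static level curve. The Python below is a simplified illustration under my own assumptions: the function name, the -10dB threshold and the 4:1 ratio are hypothetical, and real units also have attack and release times which are ignored here.

```python
def compress_db(level_db, threshold_db=-10.0, ratio=4.0):
    """Output level in dB for a given input level, using a hard-knee static curve."""
    if level_db <= threshold_db:
        return level_db  # below the threshold the signal passes unchanged
    return threshold_db + (level_db - threshold_db) / ratio

# A quiet note, a normal note and a resonant peak 12 dB over the threshold.
for level in (-20.0, -10.0, 2.0):
    print(level, compress_db(level), compress_db(level, ratio=float("inf")))
```

With the ratio set to infinity the curve becomes a limiter, and nothing comes out above the threshold however hard the filter rings.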
You can, of course, experiment with other types of filter such as highpass or bandpass, and you could also try routing the sound through a graphic or parametric equaliser, boosting the mid frequencies in particular. Most modern-day mixers have reasonably versatile quasi-parametric EQ sections and such a facility will usually suffice if you don't have access to larger units. If your mixer's EQ is a bit limited, however, a simple six-band graphic EQ pedal will do and, since one could always come in useful for other sounds as well, it might be well worth investing £50 or so in a suitable model...
Reverb and echo can be added in whatever quantity you wish - I prefer to use quite a bit and usually add it to the sound as I record. This not only helps me to play the sound in the first place but also enables me to set up a unique acoustic environment for that sound, which in turn helps it stand out in the mix.
Different echo speeds can produce startlingly different effects: long echoes on the female voice should make it particularly ethereal and heavenly whilst a short slap-back echo on the male variant can give an almost 'computer' feel to the sound. Likewise, a chorus unit and harmoniser, as mentioned before, will augment a choral sound, especially if run in stereo.
Meanwhile, vibrato can be added to any of the sounds as you wish, and if you find the cyclic effect of a low frequency sine or triangle too repetitive you could always inject a shade of 'human error' into the process by using the random vibrato technique explained last month, whereby the output of the sample and hold is routed via a lag time integrator to create a smooth but random pitch modulation.
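The random vibrato idea is simple to simulate: a stepped random voltage is passed through a smoother so that the pitch drifts rather than jumps. The Python sketch below makes several assumptions of its own: the function name is hypothetical, the lag time integrator is modelled as a one-pole smoother, and the hold length and smoothing coefficient are arbitrary values in samples rather than settings from any real module.

```python
import random

def random_vibrato(n_samples, hold_samples=100, coeff=0.02, seed=1):
    """Smoothed random modulation signal in the range -1..+1.

    A new random value is sampled every hold_samples samples and the
    result is passed through a one-pole lag.
    """
    rng = random.Random(seed)
    held = 0.0
    smoothed = 0.0
    out = []
    for n in range(n_samples):
        if n % hold_samples == 0:
            held = rng.uniform(-1.0, 1.0)  # sample and hold picks a new value
        smoothed += coeff * (held - smoothed)  # lag glides towards the held value
        out.append(smoothed)
    return out

mod = random_vibrato(1000)
print(round(min(mod), 3), round(max(mod), 3))
```

Scaled down to a few cents and added to the VCO's control voltage, a signal like this gives the slightly unsteady pitch of a real singer; a smaller coeff means a longer lag and a lazier drift.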
In conclusion, I think of all the 'acoustic' sounds available to the modern synthesist, vocal effects are the ones most likely to make an audience sit up and take notice. There's something about a synthesised vocal passage that people find quite fascinating - I well remember my own reaction when I heard Tomita's version of Debussy's Golliwog's Cakewalk for the first time - so it's worth experimenting with vocal effects and trying to make them as interesting as possible.
If done well, they can turn a mediocre piece of music into a reasonable one, and a good one into something rather special.
Feature by Steve Howell