Advanced Music Synthesis
In the 1950s Karlheinz Stockhausen generated new, complex timbres by splicing small sections of tape containing sine waves recorded at different frequencies into a single loop and replaying it at high speed, while Pierre Schaeffer took the sounds of natural instruments and transformed them by separating out different portions of their envelopes, again by tape splicing. Hence the old concept of synthesis was born, recognising two distinct approaches: additive synthesis, where separate elements are assembled to give a complex result; and subtractive synthesis, where components of the original are removed, yielding new elements. When applied to timbre generation using electronics, additive synthesis becomes the mixing of separate tones at different frequencies to generate sounds with new overtone series and, more specifically, the adding of sine waves to build up a timbre harmonic by harmonic. Subtractive synthesis becomes the filtering of waveforms, generating new timbres by eliminating unwanted harmonics. Processes that generate new components from an input signal correspond to the early techniques of intermodulation, whereby the timbre, amplitude or pitch of one sound transforms similar properties of another.
Clearly in many ways this concept is out-of-date in the context of the modern analogue synthesiser. For example, the commonest modern application of multiple oscillators controlled as a group with their outputs mixed is to produce rich sounds by tuning them in unisons and/or octaves, rather than successive harmonic intervals for the synthesis of new harmonic structures. In 'classical' additive synthesis, the oscillators must either be very stable or have some method of locking their frequencies together, since where one harmonic is made up of components from more than one oscillator, a slightly changing phase difference will destroy the integrity of the resultant harmonic series. Sine waves or other low harmonic content waveforms were used, giving independent control over the amplitude of each harmonic, and also minimising the effect of frequency drifts. By contrast, the modern unisons-and-octaves technique relies upon beating between oscillators for the richness, and without a high harmonic content the sweeping cancellations that are the most useful effect of beats would not be heard.
Additive synthesis for timbre generation was practical in the early classical electronic music studios, where the only pieces of equipment built specifically for music synthesis were custom mixers, modulators etc. Like some of the theory, most items were borrowed from radio: banks of sinewave oscillators (not voltage-controllable) provided the best method of synthesising arbitrary timbres. Pulse generators and octave filter banks gave an alternative, but something like a rampwave oscillator was considered a luxury, the commonest substitute being a sine wave oscillator driving one or more valve amps to overload. Nowadays, synthesisers use multiple oscillators to create new timbres by various kinds of modulation, for example frequency and amplitude modulation for non-harmonic tones, and waveform shape modulation, including sync and self-modulation, for harmonic tones. Hence it is possible to design a small, low-cost synthesiser which is more powerful than a complete classical electronic music synthesis system, while larger synthesisers can incorporate features to take full advantage of the voltage-controllability of the oscillators and other modules. These two approaches to timbre generation, classical using parallel organisation and contemporary using series organisation, are illustrated in Figure 1. For the purposes of this comparison, doubling up of tones to thicken the sound is not considered a basic part of timbre generation, since it does not fundamentally affect the partials of the sound and is usually achieved by simple duplication of the oscillator part of the patch. Nevertheless, like reverb and echo, it is still an important factor in determining overall tone quality.
Returning to the concept of synthesis and the need for an update to include integrated electronic systems, I propose that the various tone-forming techniques of synthesis, both analogue and digital, be divided into four groups, according to their effects on the time domain (waveform) and frequency domain (spectrum) representations of the signal. These four groups are:
1. Frequency domain additions.
All that produce new components not necessarily harmonically related to those of the original, e.g. frequency and amplitude modulation, frequency shifting.
2. Time domain transformations.
All that introduce new harmonically related components or alter the amplitudes of existing ones by non-linear processing of the original, e.g. waveform shape, clipping.
3. Frequency domain transformations.
All that alter the amplitude of existing components according to their frequencies, e.g. filters.
4. Time domain additions.
All that reintroduce the original signal after a time delay, e.g. reverb, echo.
Where one device can perform in different ways depending upon the input, it falls into more than one group. For example rectification can be used for waveform shape modulation, as it is on many analogue synthesisers where a variable triangle-to-ramp waveform is available; or for amplitude modulation where one of the signals is a square or pulse wave. Chorusing can be considered a member of Group 3, since phase cancellation alters the strength of harmonics, or Group 4, because a distinct separate image may be heard.
The techniques of Group 2 are best described as waveform shape modulation: the shape of the waveform is controlled directly, and independently of the frequency. This latter point is a useful property of this type of timbre generation and distinguishes it from filtering (Group 3), where the effect is purely based on frequency, so that a given waveform gives different results at different frequencies when processed by a fixed filter. Another important difference is that shape modulation has a more general effect on the harmonics of the waveform. Unlike a simple filter, which has a single cutoff or centre frequency that relates directly to the frequencies of the harmonics affected, altering the shape of a waveform affects the strengths of many harmonics simultaneously. This means that it cannot be used to control specific bands of harmonics, and the nearest it comes to a filter's functionality in these terms is with synchronisation, which can increase the energy near the natural frequency of the slave oscillator. This makes shape modulation no less powerful: filters cannot produce many of the shape modulation effects, for example the gradual introduction of even harmonics in a square wave, but are still invaluable. (In fact seemingly ultimate digital synthesis systems which allow each harmonic its own key-gated ADSR for amplitude are unable to produce the sound of a low-pass filter sweeping the spectrum, since this requires sequential control of each harmonic.) Hence waveform shape modulation and filtering are excellent complementary timbre-forming techniques, and together provide easy and versatile static and dynamic control of the timbres of harmonic sounds.
The commonest type of waveform shape modulation (WSM) is pulse width modulation (PWM), which potentially provides any pulse width between 0 and 100%. Since a pulse wave with a width greater than 50% is merely the corresponding wave with a width less than 50% but inverted, the range from infinitesimally thin (delta wave) to equal mark/space ratio (square wave) contains all the different timbres. However, in the same way that a mix of two unison ramp waves of the same sense sounds very different from a mix with one inverted, the effect of inverting a pulse wave can be important when it is heard with other waveforms. Also its control properties are different, and the effect of sweeping a pulse width through 50% could not easily be duplicated if only one half of the range were available, so it is useful to have access to the complete 0-100% range. The spectra of a square wave and two different pulse waves are shown in Figure 2. Note that the pulse wave spectra show periodic dips: these occur where the harmonic number is a multiple of n, where n is 100/width (%). In the case of the square wave they occur at every second harmonic, eliminating all the even ones. The odd-looking sharpness of these dips is a result of the fact that alternate inter-dip groups of components are of opposite phase relative to the fundamental, hence the envelope of a pulse wave's spectrum can be thought of as a decaying sine wave where below-the-axis components have reversed phase. This makes the difference between a square wave and a triangle wave, which also has odd harmonics only, more than just a matter of harmonic strength, since the square wave's components alternate in phase, and this has a more important effect when considering waveform shape modulation by linear self-modulation in frequency domain terms.
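The position of the dips can be checked numerically. The following is a minimal sketch, assuming an idealised pulse wave switching between +1 and -1 and centred on time zero, for which the standard Fourier series gives harmonic k a signed amplitude of 4·sin(πkd)/(πk), where d is the width as a fraction of the cycle. The function names are illustrative only.

```python
import math

def pulse_harmonic(k, width_percent):
    """Signed amplitude of harmonic k of an ideal +/-1 pulse wave centred on t = 0.

    The sign shows the phase reversal between successive groups of harmonics.
    """
    d = width_percent / 100.0
    return 4.0 * math.sin(math.pi * k * d) / (math.pi * k)

def null_harmonics(width_percent, max_harmonic=20, tol=1e-9):
    """Harmonic numbers (up to max_harmonic) whose amplitude is effectively zero."""
    return [k for k in range(1, max_harmonic + 1)
            if abs(pulse_harmonic(k, width_percent)) < tol]

if __name__ == "__main__":
    for width in (50, 25, 20):
        print(width, "% nulls at harmonics", null_harmonics(width))
    # Signed amplitudes for a 25% pulse: harmonics 1-3 and 5-7 have opposite signs.
    print([round(pulse_harmonic(k, 25), 3) for k in range(1, 9)])
```

For a 25% pulse the nulls fall on harmonics 4, 8, 12 and so on (n = 100/25 = 4), and the sign of the amplitude flips from one inter-dip group to the next, which is the "decaying sine wave" envelope described above.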
The complex nature of pulse wave spectra makes PWM unique among common WSM techniques, none of the others having nulls at related harmonics. This has little consequence when PWM is used as a source of different static timbres where the most important feature is the increasing amplitude of harmonics, odd and even, in relationship to that of the fundamental, as the width is made more extreme. But when the pulse width is varied dynamically the nulls sweep the frequency spectrum, giving a marked phasing effect that is virtually impossible to obtain by other means not incorporating time delays. It is hence very useful in creating multiple oscillator sounds with just one, and is the basis of most chorus features on preset monophonic synthesisers. In fact periodic shallow PWM at about 5Hz is the easiest and sometimes the only way of getting a sustained tone that is pleasing to the ear on many cheap commercial single VCO instruments.
The reason why PWM with a low frequency oscillator waveform sounds like phasing is that it very nearly is phasing, of a sort: mixing a rising rampwave and a falling rampwave of the same pitch and amplitude produces a pulse wave, because the slopes cancel and the resets produce rising and falling edges respectively. As the phase relationship changes due to a slight pitch difference, the pulse width becomes more extreme until perfect cancellation occurs, then the advancing phase produces a thin pulse of the opposite sense. This process is illustrated in Figure 3.
So, the dips in a pulse wave's spectrum correspond to cancellations of harmonics of two rampwaves with the particular phase difference required for that pulse width.
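The ramp-cancellation argument can be verified with a few lines of code. This is a sketch under stated assumptions: both ramps are ideal, run from -1 to +1, and have exactly the same frequency, with the second one lagging by a fixed fraction of a cycle (`offset`, a name chosen here for illustration). Mixing a rising ramp with a falling one is the same as subtracting one rising ramp from another.

```python
def ramp(phase):
    """Ideal rising ramp, -1 to +1 over each cycle (phase measured in cycles)."""
    return 2.0 * (phase % 1.0) - 1.0

def ramp_pair_difference(offset, n=1000):
    """One cycle of a rising ramp mixed with a falling ramp that lags by `offset` cycles."""
    return [ramp(i / n) - ramp(i / n - offset) for i in range(n)]

def describe(samples):
    """Return the distinct output levels and the fraction of the cycle spent high."""
    levels = sorted({round(s, 6) for s in samples})
    high_fraction = sum(1 for s in samples if s > 0) / len(samples)
    return levels, high_fraction

if __name__ == "__main__":
    for offset in (0.1, 0.25, 0.5):
        levels, high = describe(ramp_pair_difference(offset))
        print(f"offset {offset}: levels {levels}, high for {high:.0%} of the cycle")
```

Only two output levels ever appear, so the mix is a pulse wave whose width is set by the phase offset. The levels shift with the width so that the average stays at zero.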
The major reason why sweeping the pulse width doesn't sound exactly like phasing is the way it is used: the reversal of the modulating waveform before 0 or 100% is reached gives it away as not being the real thing. It is used like this because a cutoff of the sound at each extreme is often undesirable, and in any case difficult to arrange precisely. What is really required is the sound of two identical signals with a changing delay, without one inverted. Since a pulse wave is the difference between two ramps, we only have to add twice as much of one of the imaginary original ramps to get a more realistic phasing effect.
The resultant waveform is indistinguishable from a mix of two rampwave oscillators, and remember so far we have only used one oscillator, with PWM.
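As a rough check of this claim, the sketch below builds the pulse wave directly as a two-level signal and then adds twice the delayed ramp. It assumes ideal waveforms, a pulse with no DC offset (as drawn in Figure 3), and a ramp amplitude of ±1; the function names and the `width` parameter (low portion of the cycle, as a fraction) are illustrative choices, not taken from any particular instrument.

```python
def ramp(phase):
    """Ideal rising ramp, -1 to +1 over each cycle (phase measured in cycles)."""
    return 2.0 * (phase % 1.0) - 1.0

def zero_mean_pulse(phase, width):
    """Two-level wave with no DC offset: low for `width` of the cycle, high for the rest."""
    if (phase % 1.0) < width:
        return 2.0 * width - 2.0
    return 2.0 * width

def pulse_plus_ramp(phase, width):
    """Pulse wave plus twice the delayed ramp (the 'imaginary original')."""
    return zero_mean_pulse(phase, width) + 2.0 * ramp(phase - width)

def two_ramps(phase, width):
    """Reference: two same-sense ramps, the second delayed by `width` cycles."""
    return ramp(phase) + ramp(phase - width)

def max_error(width, n=1000):
    """Largest difference between the two constructions over one cycle."""
    phases = [(i + 0.5) / n for i in range(n)]
    return max(abs(pulse_plus_ramp(p, width) - two_ramps(p, width)) for p in phases)

if __name__ == "__main__":
    for width in (0.1, 0.25, 0.5, 0.8):
        print(f"width {width}: max error {max_error(width):.2e}")
```

The error is at the level of floating-point rounding for every width, so sweeping the pulse width in this arrangement is equivalent to sweeping the delay between two ramps of the same sense.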
In just about all analogue synthesisers, the variable width pulse wave is derived from another waveform using a comparator, a device which has two inputs, and an output which is at one of two values depending upon which of the inputs is more positive at any instant. A ramp or triangle wave is compared with a control voltage, and a pulse wave is generated with a width according to the proportion of time the waveform spends above the value of the control voltage. Hence by sweeping the control voltage to the comparator, which will usually be the sum of a voltage from the pulse width control and an external CV, between the limits of the waveform, the width can be varied between 0 and 100%. This is shown in Figure 4. Note that the limits of the resultant pulse waves are independent of their widths, so for values other than 50% there is a DC offset equal in magnitude but opposite in sign to the control voltage (assuming that the input waveform has no offset). The pulse waves in Figure 2 are shown with no offset and, since the ramp waves of Figure 3 are symmetrical about 0V, the resultant waveform has no offset, so the peak levels slope up or down, to be returned to zero when the pulse wave 'turns over'. Offset that varies with shape is a feature of most kinds of waveform shape modulation, and can cause problems when the output is used as a CV, e.g. for FM.
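The comparator behaviour and the resulting DC offset can be illustrated with a short simulation. This is a minimal sketch with assumed normalised levels: the input waveform and the comparator output both swing between -1 and +1, and the control voltage `cv` is expressed on the same scale. Real circuits will use different voltages.

```python
def ramp(phase):
    """Ideal rising ramp, -1 to +1 over each cycle."""
    return 2.0 * (phase % 1.0) - 1.0

def triangle(phase):
    """Ideal triangle, -1 at phase 0 rising to +1 at phase 0.5."""
    return 1.0 - 4.0 * abs((phase % 1.0) - 0.5)

def comparator(signal, cv):
    """Output +1 when the signal input is the more positive, otherwise -1."""
    return 1.0 if signal > cv else -1.0

def pulse_stats(wave, cv, n=10000):
    """Pulse width (fraction of the cycle spent high) and DC offset for one cycle."""
    out = [comparator(wave((i + 0.5) / n), cv) for i in range(n)]
    width = sum(1 for v in out if v > 0) / n
    dc_offset = sum(out) / n
    return width, dc_offset

if __name__ == "__main__":
    for cv in (-0.5, 0.0, 0.5):
        for name, wave in (("ramp", ramp), ("triangle", triangle)):
            width, dc = pulse_stats(wave, cv)
            print(f"{name:8s} cv={cv:+.1f}  width={width:.2%}  offset={dc:+.2f}")
```

With these normalised levels the width works out as (1 - cv)/2 for either source waveform, and the average output is -cv, matching the statement that the offset is equal in magnitude and opposite in sign to the control voltage.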
As Figure 4 shows, there are important differences between triangle- and ramp-derived pulse waves. One edge of a ramp-derived pulse will coincide with the reset of the ramp, so there is a phase difference which is proportional to pulse width. This is most obvious when the PW is modulated by an LFO signal, since the phase modulation can become great enough to be perceived as vibrato (the frequency shift equals the differential of the phase shift). Since no phase error occurs if a triangle wave is used as the basis of the pulse wave, this is more suitable for chorus effects, where a sine wave of around 5Hz is used to modulate the pulse width over a small range. Considering the pulse wave as the sum of two ramps again, ramp-derived PWM gives vibrato to one of the ramps only, whereas triangle-derived PWM modulates the frequencies by the same amount but in opposite phase, retaining a constant 'average pitch'. The vibrato effect of the former is very obvious with a fast LFO and large depth, rendering the tone unusable for melodic playing, and since the frequency shift is constant with changing pitch it is more noticeable at low frequencies, where even an envelope sweep of width can cause an audible 'wow'.
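The phase behaviour of the two derivations can be compared by measuring the phase of the fundamental of each pulse wave at several widths. The sketch below is an illustration under assumptions: ideal ±1 waveforms, a comparator that goes high when the waveform exceeds the control voltage, and a triangle drawn with its peak at phase zero so that the measured phase sits near zero. The fundamental is extracted with a single-bin discrete Fourier sum.

```python
import cmath
import math

def ramp(phase):
    """Ideal rising ramp, -1 to +1, resetting at the end of each cycle."""
    return 2.0 * (phase % 1.0) - 1.0

def triangle(phase):
    """Ideal triangle with its +1 peak at phase 0 and its -1 trough at phase 0.5."""
    return 4.0 * abs((phase % 1.0) - 0.5) - 1.0

def fundamental_phase(wave, width, n=4096):
    """Phase of the pulse wave's fundamental, in cycles, for a given width (0..1)."""
    cv = 1.0 - 2.0 * width          # control voltage giving this width for a +/-1 input
    acc = 0j
    for i in range(n):
        p = (i + 0.5) / n
        level = 1.0 if wave(p) > cv else -1.0
        acc += level * cmath.exp(-2j * math.pi * p)
    return cmath.phase(acc) / (2.0 * math.pi)

if __name__ == "__main__":
    for width in (0.1, 0.25, 0.4, 0.5):
        r = fundamental_phase(ramp, width)
        t = fundamental_phase(triangle, width)
        print(f"width {width:.2f}: ramp-derived {r:+.3f} cycles, "
              f"triangle-derived {t:+.3f} cycles")
```

In this model the ramp-derived pulse shifts by half the width (in cycles), because one edge is pinned to the ramp reset, while the triangle-derived pulse stays centred on the triangle peak and shows no shift. Modulating the width therefore phase-modulates only the ramp-derived version.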
LFOs that are equipped with PWM are particularly useful for audio since, apart from the sense (inverted or non-inverted) of the output, the comparator is symmetrical in its response to the waveform and CV. Hence we can just as easily modulate an LFO by a ramp or triangle wave from a VCO to generate a pulse wave with swept width, appearing at the LFO output. Using more than one LFO in this way gives pulse waves with independently varying widths, producing some beautiful multiple chorus/phase sounds, depending on the LFO rates. Unlike more conventional techniques, this requires only one VCO. Also, the basic waveform is accessible, so the rampwave could be mixed in as previously described. However, a triangle wave gives best results since all components are then of independent phase, whereas in the case of the ramp, the resultant pulse waves have simultaneous falling edges and therefore degenerate components. A basic patch demonstrating multiple PWM is shown in Figure 5.