Martin Russ breaks off from his 20th century studies way in the future to report on some alternative goings-on which may soon materialise in the form of a new generation of powerful sampling synthesizers.
Recently I got a phone call from one of my contacts who lives in the Alternative Universe. Aan Zapriski told me that apart from the different shaped gear stick on the Mini Metro, there is one other thing over there that has got them all talking - resynthesis! Never one to admit complete ignorance, I managed to bluff my way through the call by supplying the necessary 'um's, 'er's, 'yeah's, 'really?'s and 'OK's. After he rang off, I got a phone call from the Alternative Universe Operator who asked me if I would accept the call from Aan - seems that there is a distinct problem in the time relationship between us and them! (My next phone bill should be interesting!) Luckily, my tutor and I had managed to hack access onto one of the better database mega-computers here in the future, and so I spent the next couple of free periods cramming on this mysterious subject of 'resynthesis'. My tutor suggested that, rather than keep it to myself, I should write it up and push it into the same time-warp as all the other material I had been sending back to the 20th century. So here we are - welcome to Resynthesis!
Let's start with something familiar - a cup of coffee. To make a cup of instant coffee, you take a cup, some hot water and some coffee powder/granules. You put the coffee into the cup and add the hot water. A splash or two of milk completes the sequence resulting in the required cup of coffee. This is not a cookery lesson but rather a good example of how you can go from a set of instructions to a finished product. If I changed the ingredients and the instructions, we could make almost anything - from a baby to a time machine!
Synthesizing sounds is very similar - you take a list of parameters or requirements and run them through a series of processors, with the sound you specified appearing at the end. With some analogue synthesizers the number of parameters and controls is small, whilst for FM synths the number of things that need to be specified can be astronomically large!
But can we reverse our coffee-making sequence? Can we take a cup of coffee and deduce how it was produced? (I have been in a few cafes where I don't think that would be possible!) Unless you have access to some pretty specialised equipment, it might be quite difficult to 'undo' some objects. This problem of working out how something is produced is called analysis, and it is the reverse of the synthesis process. You take an object and analyse it into a set of parameters which describe its component parts and how to process them into the object. In sound terms, we would have a device which could take the sound of a violin being crushed under a bus and produce a set of parameters a bit like these:
Analysis of Input Sound: Violin + Bus
Ingredients: (i) violin (ii) bus
Procedure: Place violin in front of wheel of bus. Move bus forwards and backwards until violin has been suitably modified.
But why? So far we seem to have discovered a wonderful tool for mangling Stradivarii and little else. What purpose is there in analysing a sound? The answer lies in that other subject which has been on everyone's lips this past year or so - sampling.
When you first buy a sampler, you start by playing all the sounds in the supplied sample library, work your way through the embarrassment of 'N-n-n-nineteen' and barking dogs, eventually ending up listening to classical music CDs for good orchestral thumps and pop CDs for good snares!
Once you have sampled everything you can think of, from the obvious to the rude, you begin to get the first inklings of a couple of minor problems. First, that dog barking sounds really silly as you move away from the original pitch, and that tambourine sample only sounds like a tambourine on one note! Secondly, that whoopee cushion sound lacks a bit of raspiness at the start and no amount of filtering, looping or editing makes any appreciable difference to the sound. Yep, this is the problem with samplers - they give you very good snapshots of sounds, but are usually completely hopeless at letting you change or edit the sound in any way that makes sense in the context of the actual sound. Filters on samplers can sound really funny - "It sounds like a violin being squeezed through a VCF..." being one of my collection of quotes from one of those rare people who speaks their mind honestly.
This problem of what to do after you have a sampler with 16-bit resolution, twice CD sampling rate, hours of sample time etc, has already been faced by the biggies in the sampling world (that's Synclavier, Fairlight and Kurzweil in no particular order) and at least two of them have opted for resynthesis-type functions as the method of enabling meaningful sound sample alteration. The basic idea combines analysis and synthesis (Figure 1): you analyse a sound to produce a list of parameters, and then use that list to synthesize the sound again, except that whereas changing the original sound is very difficult, changing the parameters is much easier. Choosing the right set of parameters is another matter though, as we shall see.
Interestingly, notice that in our resynthesis instrument we only need to analyse the original sound, since we can replay it by synthesizing it - so we are not talking about a sampler here; what we are talking about is a much better replacement for a sampler! By using a resynthesizer you will be able to edit the sampled sounds in many ways, exploiting much of the expertise you already use in synthesis. This is why those people who knew what the limitations of samplers would be have not bought one, but have been waiting in the wings for resynthesizers - myself included.
All you need to do to reproduce any sound you like is to analyse it, and resynthesize it, editing the definitions within the parameters if you want to change the sound (Figure 2). We are talking big potatoes here - this really is the technique where you can say 'I want a sound a bit like...' and get that sound, not the one vaguely like it in the sample library!
Despite the fact that all this analysis-then-synthesis stuff sounds very hi-tech and new, it is actually a very familiar technique. The simplest analogue example I can think of is the humble synthesizer 'patch sheet'. You listen to a sound and tweak the knobs on your synth to a rough approximation, writing down the positions of the knobs at the end. The synth then produces the sound whenever you want. Changing the values of the parameters on the patch sheet changes the sound when you make the same changes to the synth controls. On a slightly more complex level of analogue resynthesis, there is the vocoder. This device splits the incoming sound into frequency bands and allows you to impose the resulting analysis onto another signal. By changing the allocation of the frequency bands you can make all sorts of changes to the processed sound. I bet you never thought of a vocoder as a resynthesizer!
The vast quantity of calculations and data manipulation required for the analysis and synthesis process makes the use of digital technology an obvious choice here. Digital examples of resynthesis are easy to find and are just as obscure. A simple example is a digital delay line (DDL). A DDL digitises the incoming signal and stores it as a list of numbers in memory (RAM). To output the delayed sound you just wait a while and read the stored numbers, converting them back to sound in the process. Depending on the delays and how you read the RAM, you can simulate sounds in reverberant or echoing rooms, and even reverse the sound. It is the parameters determining how you read the numbers out of the RAM which determine the effect you perceive at the output, which explains why modern DDLs can generate a lot more than just echo effects. In essence, reverb is simply lots of closely-spaced echoes coupled with some feedback; chorus is just echoes whose time delay is varying slowly at two different rates. By applying more complex operations to the numbers in the RAM, you can perform equalisation, pitch changing, compression and expansion; in fact, the only limitations are due either to insufficient computing power or lack of processing speed.
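For the curious, here is a minimal sketch of the delay line idea in Python. It is an illustration under my own assumptions rather than a description of any real DDL: the names `delay_line`, `feedback` and `mix` are made up for the example, the 'RAM' is just a list used as a circular buffer, and the delay is counted in samples rather than milliseconds.

```python
def delay_line(samples, delay, feedback=0.0, mix=0.5):
    """Store each incoming number in a circular buffer (the 'RAM') and
    read it back `delay` samples later. All names here are illustrative."""
    buffer = [0.0] * delay   # the RAM holding the stored numbers
    pos = 0                  # current read/write position in the RAM
    out = []
    for x in samples:
        delayed = buffer[pos]                  # number stored `delay` samples ago
        buffer[pos] = x + feedback * delayed   # store input plus recirculated echo
        pos = (pos + 1) % delay                # wrap around the end of the RAM
        out.append((1.0 - mix) * x + mix * delayed)
    return out


def reverse(samples):
    """Reading the stored numbers out backwards reverses the sound."""
    return samples[::-1]


# A single click followed by silence, heard only through the delay (mix=1.0).
impulse = [1.0] + [0.0] * 9
echoes = delay_line(impulse, delay=3, feedback=0.5, mix=1.0)
print(echoes)  # the click returns every 3 samples, halving each time
```

Changing only `delay`, `feedback` and `mix` changes what you hear at the output, which is the point made above: the effect lives in the parameters that control how the RAM is read, not in the stored numbers themselves.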
The patch sheet example above serves a dual purpose: it shows a simple example of everyday analysis (by ear, using trial and error) and synthesis (using a simple synthesizer). Unfortunately, as we all know, some synthesizer sounds are only good approximations, and some are very poor impressions of real sounds. The major technological and intellectual problem of resynthesis is how to make the analysis and synthesis accurate enough to make the synthesized sounds sufficiently 'real' to be able to replace the original sound. To paraphrase a tape manufacturer's famous advert: Is it a sample, or is it resynthesized? Until this goal is achieved you would probably be better off with a sampler.
The really bad news for the sampler manufacturers is that given time and enough processing power you can make the resynthesis sufficiently accurate to use on a large range of sounds. Although this is expensive at the moment - the Synclavier's resynthesis option costs more than most people's mortgage! - the history of the computer industry is littered with examples of the horrendous price drops achievable by technology. As an example, in 1980 the cost of producing a machine capable of doing in real time what a modern DX7 can do, using the technology then available, would have stretched into the millions of pounds. (It would probably have had to be a Cray computer - 6- and 7-figure sums!) Expect to be able to pick up a resynthesizer for a couple of thousand pounds in the near future, and way less than a thousand in a couple of years.
True resynthesis takes an incoming sound and produces from it a parameter list good enough to enable the synthesis of sounds indistinguishable from the original. It also enables editing of the parameter list, thereby changing the produced sound. The effectiveness of the process is dependent on the careful choice of method for each of the stages of resynthesis.
The analysis can be made in the time domain, where you split the incoming sound into short time segments, or frames, and match these to stored library frames which can be synthesized easily. Alternatively, you can look at the frequency content of the incoming sound in similarly short time frames and match these to stored spectra. A more complex technique uses a process called deconvolution, which attempts to untwist the incoming complex sound into lots of simpler components.
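To make the time-domain idea concrete, here is a small Python sketch of matching frames against a library. Everything in it is an assumption made for illustration: the three library shapes, the frame size of 16 samples, the squared-difference score, and the function names. It also leaves out any stretching of the library shapes, so it only works when the frames line up neatly with the waveform.

```python
import math


def split_frames(samples, size):
    """Chop the sound into consecutive frames of `size` samples."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def distance(a, b):
    """Sum of squared differences: smaller means a closer match."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_frames(samples, library, size):
    """For each frame, return the name of the closest library frame."""
    names = []
    for frame in split_frames(samples, size):
        best = min(library, key=lambda name: distance(frame, library[name]))
        names.append(best)
    return names


SIZE = 16  # hypothetical frame length
library = {
    "sine": [math.sin(2 * math.pi * n / SIZE) for n in range(SIZE)],
    "square": [1.0 if n < SIZE // 2 else -1.0 for n in range(SIZE)],
    "saw": [2.0 * n / SIZE - 1.0 for n in range(SIZE)],
}

# An 'incoming sound': one cycle of each shape, slightly quieter than the library.
incoming = [0.9 * s for s in library["square"] + library["sine"] + library["saw"]]
matched = match_frames(incoming, library, SIZE)
print(matched)
```

The list of names in `matched` is the parameter list: to play the sound back you would string the named library frames together again.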
I have listed these three of the many possible techniques of analysis in an approximate order of complexity. Time matching of frames is just relatively simple pattern matching, although some stretching of the library shapes may be needed to give a good match (this is known as 'time warping', honestly!). Frequency matching needs to produce lots of spectra for the time frames and this requires lots of Fast Fourier Transform operations to be performed. (FFTs, as they are known, are the basis of modern spectrum analysers - see Figure 3.) The matching of spectra uses very similar techniques to the time frame matching. Deconvolution uses some very complicated maths which involves specifying the simpler waveforms and attempting to unravel them out of the more complex one - a bit like trying to find out what numbers you need to multiply to get the answer 12. It could be 2 x 6 or 3 x 4 or even 1 x 12 (seems relatively easy until I mention things like 24 x ½ or -12 x -1, or any of the other countless ways of generating 12!).
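The frequency-domain version can be sketched in the same way. Note the swap: the Python below uses a plain, slow discrete Fourier transform written out longhand, not a true FFT, because it is shorter to read and gives the same spectra for small frames. The frame length of 32 and the two test tones are my own choices for the example.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude of each frequency bin from 0 up to half the frame length.
    A slow, direct DFT standing in for the FFT a real analyser would use."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags


def spectra(samples, size):
    """One spectrum per consecutive frame of `size` samples."""
    return [dft_magnitudes(samples[i:i + size])
            for i in range(0, len(samples) - size + 1, size)]


SIZE = 32  # hypothetical frame length
low = [math.sin(2 * math.pi * 4 * t / SIZE) for t in range(SIZE)]   # 4 cycles per frame
high = [math.sin(2 * math.pi * 7 * t / SIZE) for t in range(SIZE)]  # 7 cycles per frame

frames = spectra(low + high, SIZE)
# The strongest bin in each frame tells us which frequency dominates it.
peaks = [max(range(len(m)), key=m.__getitem__) for m in frames]
print(peaks)
```

Each spectrum in `frames` could then be compared against stored spectra using the same kind of closest-match search used for time frames.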
Once the sound has been analysed into parameters, these can be used as the basis of the synthesis or, alternatively, as the starting point for more analysis. The suitability of the parameters for the synthesis method used is also important. If you are using an analogue synthesizer as the synthesis method then it is no use producing FM-type parameters! The mapping of the parameters produced by the analysis to the required format of the synthesizer can be very complex and involve additional processing. Typical parameter formats include:

Spectral content information - this gives details of the time variation of the amplitudes of the frequencies in a sound. It is much like the sort of information you need to supply to a Kawai K5 synth - you need to describe how each frequency component behaves in time.

FM parameters - these are an alternative format, although their derivation can be difficult. Approaches vary from simple matching of time frames to simple FM library waveforms, to complex deconvolutions.

Analogue-type synthesis parameters - here you need information about the waveforms available and how they change when filtered. Because of the limited number of VCOs and VCFs available in many simple analogue synthesizers, this is usually not a very satisfactory technique, although with large numbers of modular synth elements remarkable results can be obtained (as per Walter/Wendy Carlos and Tomita).
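As a rough illustration of the spectral content format, the Python sketch below holds a parameter list as one amplitude value per time frame for each harmonic, and synthesizes the sound by adding sine waves together. The layout of `params`, the function name `synthesize` and all the numbers are assumptions made for this example; it is not the data format of the K5 or of any other instrument, and the amplitudes simply step from frame to frame rather than being smoothed.

```python
import math


def synthesize(partials, fundamental, frame_len, rate):
    """Add up sine waves, one per partial, using that partial's amplitude
    for whichever time frame the current sample falls in."""
    n_frames = len(partials[0]["amps"])
    out = []
    for n in range(n_frames * frame_len):
        frame = n // frame_len
        t = n / rate
        out.append(sum(
            p["amps"][frame] * math.sin(2 * math.pi * fundamental * p["harmonic"] * t)
            for p in partials
        ))
    return out


# Hypothetical parameter list: how each harmonic's level changes over four frames.
params = [
    {"harmonic": 1, "amps": [1.0, 0.8, 0.5, 0.2]},
    {"harmonic": 2, "amps": [0.5, 0.2, 0.1, 0.0]},
    {"harmonic": 3, "amps": [0.3, 0.1, 0.0, 0.0]},
]
original = synthesize(params, fundamental=220.0, frame_len=100, rate=8000)

# Editing the sound means editing the list: here the third harmonic is doubled.
edited_params = []
for p in params:
    p = dict(p)
    if p["harmonic"] == 3:
        p["amps"] = [a * 2.0 for a in p["amps"]]
    edited_params.append(p)
edited = synthesize(edited_params, fundamental=220.0, frame_len=100, rate=8000)
```

Only the first two frames differ between `original` and `edited`, because the third harmonic has already died away by the third frame. That sort of targeted change is very hard to make on a raw sample.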
All this processing of data involves compromises in accuracy, precision and time - some of the more esoteric and unusual number-crunching can take lots of time. The end result may be usable, or may be completely untenable. However, the analysis/resynthesis process is not a fixed one time/one way process. It is perfectly OK to analyse the synthesized sound and see where it differs from the original sound input. It is possible to run the resulting analysis/synthesis loop as many times as you like, until no further improvement can be made - this is known as 'successive approximation'. The idea is that you successively move nearer and nearer to the original sound, although sometimes it is equally possible to move further away!
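The analysis/synthesis loop can be sketched in a few lines of Python. This is a toy under stated assumptions: the 'sound' is one cycle made of three harmonics, the analysis estimates each harmonic's level by correlation, and the loop deliberately applies only half of each measured correction (`step=0.5`) so that the gradual approach is visible. The names `synth`, `analyse` and `resynthesize` are invented for the example.

```python
import math

N = 64                 # samples in the (hypothetical) single-cycle sound
HARMONICS = [1, 2, 3]  # the only components this toy synthesizer can produce


def synth(amps):
    """Synthesis: build the sound from a list of harmonic amplitudes."""
    return [sum(a * math.sin(2 * math.pi * h * t / N) for a, h in zip(amps, HARMONICS))
            for t in range(N)]


def analyse(sound):
    """Analysis: estimate each harmonic's amplitude by correlation."""
    return [2.0 / N * sum(s * math.sin(2 * math.pi * h * t / N) for t, s in enumerate(sound))
            for h in HARMONICS]


def error(a, b):
    """Total squared difference between two sounds."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def resynthesize(original, step=0.5, max_passes=50, tolerance=1e-12):
    """Repeat analyse-then-synthesize until the error stops improving."""
    amps = [0.0] * len(HARMONICS)
    history = [error(original, synth(amps))]
    for _ in range(max_passes):
        # Analyse where the synthesized sound differs from the original.
        residual = [o - s for o, s in zip(original, synth(amps))]
        trial = [a + step * c for a, c in zip(amps, analyse(residual))]
        err = error(original, synth(trial))
        if err >= history[-1] - tolerance:  # no further improvement: stop
            break
        amps = trial
        history.append(err)
    return amps, history


target = [1.0, 0.5, 0.25]
amps, history = resynthesize(synth(target))
print([round(a, 4) for a in amps], len(history) - 1, "passes")
```

The `history` list records the error after each pass. In this well-behaved toy it shrinks every time, but the check inside the loop is there because, as noted above, a pass can just as easily move you further away, in which case it is discarded and the loop stops.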
So far I have described a product which uses a similar front-end to a sampler, i.e. it digitises and stores incoming sounds. The data is then processed to produce a parameter list which is used to drive a synthesis section - some sort of digital synthesizer is used to generate the finished sound. We now have a complex amalgam of parts of a sampler and synthesizer, glued together with a very fast and powerful computer - in other words, a resynthesizer! As Aan Zapriski said to me: "It sort of follows a sequence - the drum, the guitar, the piano, the MiniMoog, the DX7, the Resynthesizer..." As a parting thought, how about a quick trailer: 'Coming to a music shop near you very soon - a resynthesizer for about £1500 - all the sounds you ever dreamed of, as well as those you never imagined!' I'm saving my money already!
Feature by Martin Russ