Bits 'n' Pieces (Part 2)
How tapeless recording works and what advantages it has over conventional recording systems.
With the pace at which technology moves, it won't be too long before affordable hard disk recording becomes genuinely flexible enough to rival tape. When it does — and some would say that time is almost upon us — you'll need to know what it's all about. Yasmin Hashmi puts you wise.
Over the past seven years, around 90 different tapeless systems have been launched onto the market and thousands are currently in use. In order to appreciate the full potential these systems have to offer, it helps to have a basic understanding of the principles involved. Part 1 of this introduction dealt with digitisation of audio and the merits of various recording media. This month, we'll look at the advantages of instant access and the principles behind non-destructive editing.
A tapeless system will basically consist of three main elements, namely the computer, the storage medium and the user interface. The computer forms the heart of the system and is analogous to a brain without knowledge. It may consist of off-the-shelf hardware, custom-designed hardware, or a combination of both, and the speed and processing power of the computer will determine the potential capabilities of the system.
The knowledge which makes the 'brain' useful and takes advantage of its capabilities is software. This can also be off-the-shelf, custom-designed, or a combination of both. Software determines how efficiently the computer's power is used and which of its potential capabilities are used. It will control the flow of data to and from the storage medium and will determine which recording, editing and processing features are made available to the user.
The vast majority of tapeless systems use hard disk as the recording medium. The hard disk drive, computer, inputs, outputs and any connectors for interfacing with other devices will normally be housed in one hardware unit. The user interface will be connected to the hardware unit by a cable and could be positioned some distance away.
The user interface provides the user with a means of control over the system. Its hardware can again be off-the-shelf, custom-designed, or a combination of the two. Off-the-shelf user interfaces normally consist of a personal computer with monitor, keyboard and mouse. A typical PC-based user interface is shown in Figure 1. The most popular brands of personal computer (from now on shortened to PC) are Macintosh and IBM or IBM compatibles. In many cases, the PC provides the hardware platform for the complete tapeless system. In these cases, all that is required is one or two cards plugged into the PC's computer unit (to provide inputs/outputs, and so on) and some software to turn the PC into a tapeless editor, recording to the PC's own hard disk.
Systems with custom-designed user interfaces generally provide a control surface with a screen or display, dedicated and soft keys, transport controls and a jog wheel, and perhaps one or more faders for level control. An alpha-numeric keyboard is often supplied but is only used for labelling purposes. A typical example of a custom-designed user interface is shown in Figure 2. The jog wheel is often multifunction, and can normally be used for scrolling a cursor or marker on the display, incrementing or decrementing values, jogging through audio and scrubbing (imitating reel rocking).
With the PC-based user interface, these functions are generally performed using the mouse. Other controllers, such as transport controls or faders, are represented as icons on screen (which are activated by selecting them with the mouse). In some cases, function keys on the alpha-numeric keyboard can be used for transport control, and/or an optional third-party remote can be connected, which usually provides transport controls and a jog wheel.
The way in which information is presented on screen, what function a key performs or which methods of recording/editing are used (in other words how a system operates) depends on software.
Each manufacturer of a tapeless system has its own approach to system operation, and no two are the same. Nonetheless, most utilise a number of common strategies, and once these are understood, it is not too difficult to move from one system to another.
The two most common ways of displaying audio on screen are by using waveforms or tape representation. Waveforms are more likely to be used with systems which are only capable of stereo recording and editing and which have a PC-based user interface. There will normally be a waveform for each channel (one for left and one for right) which will be displayed horizontally on the screen. There will be a cursor displayed as a vertical line across both waveforms, which represents the current point of replay. In addition to using transport controls, the cursor can be dragged, using the mouse, to change the point at which audio commences replay or to find an edit point. In some cases, the mouse can be dragged back and forth across the waveform to perform scrubbing in order to find an edit point.
Editing consists of selecting a portion of the waveform and cutting, deleting or copying it and then perhaps pasting it elsewhere within the waveform. The section to be edited can be selected by dragging the cursor across the waveform to highlight the section or by finding the start of the section by moving the cursor, marking it, then finding the end, marking it and then pressing a key which selects the audio between these 'in' and 'out' points. This is now called a cue (or segment) and can be labelled with a name if necessary. Another key is then pressed which performs an edit function such as cut or copy. Each time an edit takes place, the waveform redraws to reflect the change.
The tape representation method is commonly adopted for multichannel systems and is used with both PC-based and custom-designed user interfaces. The channels of audio are represented by horizontal tracks, with audio highlighted as blocks within the tracks. A vertical line across the tracks represents the play head and is fixed. The audio moves across the screen and passes underneath the play head. In order to find an edit point, transport controls can be used, or the jog wheel can be used for scrubbing the audio back and forth under the play head. Editing generally consists of locating and marking in and out points and selecting the edit function to be performed.
The strength of the tapeless system is the fact that editing is non-destructive. In other words, audio can be edited without actually destroying or physically moving the original recording. The key to non-destructive editing is random access. This is the ability of the computer to accurately and virtually instantly jump to any point in a recording.
With tape, one way of knowing where you are, apart from aurally, is by looking at the counter. However, this has the disadvantage of being independent of the tape: if you accidentally run off the end of a spool, the counter will no longer refer to the same places it did before. A more reliable method of reference is to physically record time code onto one of the tape's tracks. This is fixed and can be output to a display to show the current position of the play head and can also be used to automatically locate a point on the recording. However, the time it takes to locate a point (ie the spool time) is too long to be considered random access.
Disk-based systems use the formatting process to physically record a position reference throughout the disk, so that any position on the disk can be precisely located. In general, a disk only has to be formatted once (and this is usually done at the factory). Every sample recorded onto a disk will have its own unique position reference (or address) and these addresses can be given labels such as 'First verse' or 'Middle 8' or 'Mistake'. So if you've found the start and end of the first verse in the recording and have told the computer to label this section 'First verse', it will take those start and end addresses and store them with the label. Thereafter, irrespective of where you are in the song, if you tell the system to play the 'First verse' it will find the corresponding start address on the disk and play from there to the end address. Since the disk is rotating at high speed and the head can move back and forth above the surface very quickly, the time it takes to locate any point in a recording is almost instantaneous.
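To make the labelling idea concrete, here is a minimal sketch in Python. It treats a recording as a plain list of samples and a cue as nothing more than a label plus a start and end address. The names CueTable, mark and play are invented for this illustration and are not taken from any real system.

```python
# Hypothetical sketch: a cue is a label that points at addresses in a recording.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    label: str
    start: int  # address of the first sample in the cue
    end: int    # address one past the last sample in the cue


class CueTable:
    def __init__(self, recording):
        self.recording = recording  # the original audio, never modified
        self.cues = {}

    def mark(self, label, start, end):
        """Store the start and end addresses under a label."""
        if not 0 <= start < end <= len(self.recording):
            raise ValueError("cue addresses must lie inside the recording")
        self.cues[label] = Cue(label, start, end)

    def play(self, label):
        """Jump straight to the cue's start address and read to its end."""
        cue = self.cues[label]
        return self.recording[cue.start:cue.end]


recording = list(range(100))          # stand-in for 100 recorded samples
table = CueTable(recording)
table.mark("First verse", 10, 30)
first_verse = table.play("First verse")
```

Playing a cue only reads from the recording, so the same section can be fetched from anywhere in the song without the original being touched.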
Let us consider that we are using a disk-based system to mark out further salient positions in a song we have just recorded. Listen through the whole song and stop at the end. Label this as 'End of song'. Then tell the system to copy the 'First verse' and place it immediately after the 'End of song'. This should take very little time and, depending on the system, may involve simply moving the jog wheel and pressing a couple of keys, or clicking with a mouse a few times. Now tell the system to play the new arrangement.
To you, it appears that some audio (in the form of the first verse) has been added to the end of the song, but in fact the disk head is simply flying around the original material and throwing it out when, and as often as, you told it to. When the system gets to the 'End of song' label it sees a command to play the first verse again. It knows the address of the first verse so it jumps straight back and plays the first verse again. This is illustrated in Figure 3.
It follows, then, that if you have sectioned up your original recording with labels, you can play these 'cues' in any order you wish — totally reshuffling the order in a matter of seconds. So the question 'What if we swapped the first and second verses with each other?' no longer presents a major headache, the task consisting only of marking out cues and pressing a few keys to change the order in which they are played.
Random access is also very useful when the chorus of a song is always the same — it is necessary only to record one chorus, label it, and then sequence the cue as many times as the chorus is required.
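The reshuffling and repeating described above can be sketched as a playlist of cue labels. In this hypothetical Python example, each letter stands for one sample, and the cue names and addresses are made up for illustration.

```python
# Hypothetical sketch: an arrangement is just an ordered list of cue labels.
recording = list("AAAABBBBCCCCDDDD")  # each letter stands for one sample

# label -> (start address, end address), end exclusive
cues = {
    "Verse 1": (0, 4),
    "Chorus": (4, 8),
    "Verse 2": (8, 12),
    "Outro": (12, 16),
}


def render(playlist):
    """Read the cues from the untouched recording in playlist order."""
    out = []
    for label in playlist:
        start, end = cues[label]
        out.extend(recording[start:end])
    return "".join(out)


original = render(["Verse 1", "Chorus", "Verse 2", "Outro"])
swapped = render(["Verse 2", "Chorus", "Verse 1", "Outro"])
repeated = render(["Verse 1", "Chorus", "Verse 2", "Chorus", "Outro"])
```

Swapping the verses or repeating the chorus changes only the playlist. The chorus is stored once on disk however many times it is sequenced.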
Although the head of a hard disk may take virtually no time to move from one place to another, there are cases where the small amount of time taken can be critical, and can introduce undesirable delays. To avoid this problem, the audio is first loaded into a buffer memory of RAM before being output. Figure 4 shows an analogy between buffer RAM being filled from disk and a barrel being filled by a bucket.
The bucket represents the play head and the barrel represents the RAM. The bucket finds water (audio) from various locations in the pool (the hard disk). It empties the water into the barrel which has a tap (an output). The water flows out of the tap continuously, eliminating any breaks caused by the bucket having to move to different locations in the pool. In order to ensure that the barrel does not run out of water before it should, it is a good idea to maintain a certain level of water in the barrel. In order to do this, the bucket must do its job fast enough to maintain the level.
However, if another tap is added (a stereo output) the water will be running out of the barrel twice as quickly, therefore the bucket will have to do its job twice as quickly. Adding more taps (for multi-channel output) will force the bucket to work even faster. Eventually, adding yet another tap would be the final straw: the bucket would not be able to move fast enough to keep up with the output flow. In practice, the head of a hard disk can operate fast enough to support eight simultaneous outputs, although some manufacturers manage to achieve more.
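The bucket-and-barrel analogy can be tried out with a toy simulation. The Python sketch below is a deliberately crude model, not a description of any real disk: the delivery rate of eight units per tick is an assumption chosen only to mirror the eight-output figure above, and the buffer size is arbitrary.

```python
# Hypothetical sketch of the bucket (disk) filling the barrel (RAM buffer).
def simulate(outputs, disk_rate, capacity, ticks):
    """Return the number of ticks on which the buffer could not feed every output.

    outputs   -- number of taps, each draining one unit per tick
    disk_rate -- units the disk can deliver into the buffer per tick (assumed)
    capacity  -- size of the buffer in units (assumed)
    """
    level = capacity  # start with a full barrel
    underruns = 0
    for _ in range(ticks):
        level = min(capacity, level + disk_rate)  # the bucket tops up the barrel
        if level >= outputs:
            level -= outputs                      # every tap runs normally
        else:
            underruns += 1                        # at least one tap ran dry
            level = 0                             # whatever was left is drained
    return underruns


results = {
    n: simulate(outputs=n, disk_rate=8, capacity=32, ticks=1000)
    for n in (1, 2, 8, 9)
}
```

In this model the buffer keeps up with one, two or eight outputs indefinitely. With nine, the full barrel hides the shortfall for a while, then the level reaches zero and the output starts to break up.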
As has already been shown with copying, editing does not actually involve physically moving the originally recorded audio. Consider the example illustrated in Figure 5. Label where the second verse begins and ends. Now tell the system to cut the second verse out. To you, it appears that the second verse has been removed and the end and beginning of the adjacent sections have been joined. However, what actually happens is that when the system reaches the beginning of the second verse, it knows it must immediately skip to the end of the second verse and continue playing. So if you change your mind and want the second verse back in, don't worry, it's still there, just 'undo' the cut command.
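A cut of this kind can be sketched as an edit list: a set of playback instructions kept separately from the audio. The EditList class in this Python example is hypothetical, and for simplicity its cut points are given as addresses in the original recording.

```python
# Hypothetical sketch: a cut changes the playback instructions, not the audio.
class EditList:
    def __init__(self, length):
        self.segments = [(0, length)]  # (start, end) address ranges to play
        self.history = []              # earlier segment lists, kept for undo

    def cut(self, start, end):
        """Skip addresses start..end (end exclusive) on playback."""
        self.history.append(list(self.segments))
        kept = []
        for seg_start, seg_end in self.segments:
            if seg_start < start:      # keep the part before the cut
                kept.append((seg_start, min(seg_end, start)))
            if seg_end > end:          # keep the part after the cut
                kept.append((max(seg_start, end), seg_end))
        self.segments = kept

    def undo(self):
        """Restore the playback instructions from before the last cut."""
        if self.history:
            self.segments = self.history.pop()

    def render(self, recording):
        return [s for a, b in self.segments for s in recording[a:b]]


recording = list("AAAABBBBCCCCDDDD")   # "CCCC" stands for the second verse
edits = EditList(len(recording))
edits.cut(8, 12)
after_cut = "".join(edits.render(recording))
edits.undo()
after_undo = "".join(edits.render(recording))
```

Because the recording itself is never altered, undoing the cut is simply a matter of going back to the previous list of segments.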
Let us now imagine that we wish to insert four beats of silence between the second chorus and repeated first verse. This will require a few keystrokes or simply dragging the repeated first verse to a later position with the mouse.
Compare these editing procedures with tape editing where, in order to get the same results, audio would have to be recorded onto another machine and re-recorded back onto the first, at the correct position. If any mistakes were made, the result could be catastrophic. In any case, such processes would be very time consuming and would probably not be attempted in the first place.
These editing examples illustrate what can be done with mono or stereo music recordings — however, the same techniques can be applied to speech and sound effects. There are a number of systems which allow multitrack operation, the most common configuration being eight outputs. These can be used to record in a linear way and can then be used to edit the recordings as shown in the previous examples, but on a multi-channel basis (or they can allow multichannel arrangements to be built up of mono and stereo tracks). However, because disk-based systems allow audio to be played in any sequence from any location on disk, there is no reason why the audio should be recorded in a linear fashion in the first place.
This concept is in direct contrast to tape recording. Since tape does not have random access, audio must be recorded in the physical position in which it is to be replayed. For example, if there is a minute's worth of silence between two sections of audio, there must be a minute's worth of tape between them. Disk-based systems can be used to record in this linear way yet still take advantage of random access, or they can flout the convention altogether. Consider the comparison shown in Figure 6.
The disk system need not record the silences in order to space the sections of audio the correct distances apart. As with Figure 6b, the system will merely wait the required time before playing the next section of audio.
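The idea of waiting rather than recording silence can be sketched as follows. In this hypothetical Python example each stored section carries its own start position, and the gaps only come into being at playback. The sample values and positions are invented for illustration.

```python
# Hypothetical sketch: only the audio is stored; the gaps are just start positions.
def render_timeline(events, total_length):
    """Place each stored section at its start position; gaps play as silence."""
    out = [0] * total_length              # 0 stands for silence
    for start, samples in events:
        out[start:start + len(samples)] = samples
    return out


events = [
    (0, [5, 6, 7]),    # first section starts at position 0
    (10, [8, 9]),      # second section starts after a gap of seven positions
]
stored = sum(len(samples) for _, samples in events)
timeline = render_timeline(events, total_length=12)
```

Here a 12-position timeline is produced from only five stored samples, whereas tape would need all 12 positions' worth of material.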
Most disk-based systems slave to time code. This allows the system to be synchronised to an external device such as a sequencer, tape machine or VTR. In the case of tape-based systems, the time code most commonly used is LTC (longitudinal time code) which counts in hours, minutes, seconds, frames and subframes. It will be recorded onto tape alongside the audio or picture and will therefore give a time reference for any location in the recording. In the case of a sequencer, MTC (MIDI time code) is often used. This carries the same time information as LTC in the form of MIDI messages and, again, allows the tapeless system to follow the playback of the sequencer.
The tapeless system will normally offer the operator a choice of timing references to work in. These can include LTC, absolute time, feet and frames (for film), measures and beats and even sample numbers. An example of the use of LTC is the placing of sound effects against picture. The tapeless system will read and display the incoming code. The operator can then see from the code (usually displayed in the picture as well) where the beginning of a sound effect is supposed to occur. This time can then be used to correctly place the cue within the playback sequence by typing the value in, or by 'grabbing' it from the display.
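Moving between timing references is largely a matter of arithmetic. The Python sketch below converts a time code value to a frame count and a sample position. It assumes 25 frames per second, a 44.1kHz sample rate and no drop-frame counting. These figures are illustrative assumptions rather than anything specified above, and the function names are invented.

```python
# Hypothetical sketch: converting HH:MM:SS:FF time code to frames and samples.
def timecode_to_frames(timecode, fps=25):
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if not (0 <= minutes < 60 and 0 <= seconds < 60 and 0 <= frames < fps):
        raise ValueError(f"invalid time code: {timecode}")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames, fps=25):
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def timecode_to_samples(timecode, fps=25, sample_rate=44100):
    # 44100 / 25 gives a whole number of samples per frame (1764)
    return timecode_to_frames(timecode, fps) * sample_rate // fps


effect_start = "01:00:10:12"
frame_number = timecode_to_frames(effect_start)
sample_position = timecode_to_samples(effect_start)
```

A value typed in or 'grabbed' from the display could be turned into a position on disk in this way before the cue is placed in the playback sequence.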
Most systems also provide an event list, which sequentially lists the cues which are used in a sequence and at what time position each is triggered. A cue can be repositioned by selecting it in the events list and entering a new time code value. On playback, the cue will be triggered at the new time.
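An event list of this kind can be sketched as a simple table of cue names and trigger times. The Python example below is hypothetical: the cue names and time code values are invented, and real systems will store rather more than this.

```python
# Hypothetical sketch: an event list pairs each cue with the time it is triggered.
events = [
    {"cue": "Door slam", "time": "00:00:05:00"},
    {"cue": "Footsteps", "time": "00:00:02:10"},
    {"cue": "Car pass", "time": "00:00:08:20"},
]


def sorted_events(event_list):
    # Fixed-width, zero-padded HH:MM:SS:FF strings sort correctly as text.
    return sorted(event_list, key=lambda event: event["time"])


def reposition(event_list, cue, new_time):
    """Give the named cue a new trigger time; the audio itself is untouched."""
    for event in event_list:
        if event["cue"] == cue:
            event["time"] = new_time
            return
    raise KeyError(cue)


before = [event["cue"] for event in sorted_events(events)]
reposition(events, "Door slam", "00:00:09:00")
after = [event["cue"] for event in sorted_events(events)]
```

Entering a new time code value changes a single entry in the list, and the playback order follows from the times alone.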
Recording studios often use disk-based systems in conjunction with a digital multitrack tape machine. Editing is usually completed on disk, with the final product transferred to multitrack tape for mixing. Other examples of applications for which tapeless systems are particularly useful include: