Bits 'n' Pieces (Part 1)

An Introduction to Digital Audio

Digital audio explained from the ground up — bit by bit!


To those of us brought up with analogue tape recording technology, it's difficult to imagine that it could ever completely disappear — but it almost certainly will, and its place will be taken by ever cheaper and more accessible digital techniques. Yasmin Hashmi introduces this important subject.


In the not too distant past, the transistor was responsible for a revolution in communications. One of the offshoots of transistor technology, the personal computer, has had just as great an effect on modern media editing. Take publishing, for example. The word processor (a computer with a bent for text editing) has been responsible for completely redefining the market. Publications of high quality can be designed, edited and printed from the desktop in a fraction of the time it would take to complete the process conventionally. In addition, the non-destructive editing that computers provide means freedom to try out any number of designs and arrangements without incurring high costs in terms of time and materials. Who could imagine giving up their word processor and going back to the manual typewriter? The personal computer has had the same effect for audio, allowing the musician to record and edit audio and arrange it in a number of different ways.

The first musical task the personal computer (from now on called PC) tackled was sequencing. By adding the appropriate software, the PC could be turned into a compositional tool, allowing the musician to notate a score using non-destructive word-processing style operation. By adding a MIDI interface to the PC and connecting it to a MIDI-compatible sampler or synthesizer, the score could automatically trigger the selected sound in the sampler/synth, telling it when to play a note (or notes if polyphonic), at what pitch and level and for how long.
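As a rough illustration of what the sequencer sends, the sketch below turns a short score into timed MIDI messages. The note-on and note-off status bytes (0x90 and 0x80) are standard MIDI; the function names and the (start, length, pitch, level) layout of the score are assumptions made for this example and do not describe any particular sequencer.

```python
NOTE_ON = 0x90   # MIDI status byte: start a note (channel in the low 4 bits)
NOTE_OFF = 0x80  # MIDI status byte: stop a note

def note_messages(start_s, length_s, pitch, level, channel=0):
    """Return the two timed messages that play one note."""
    if not (0 <= pitch <= 127 and 0 <= level <= 127 and 0 <= channel <= 15):
        raise ValueError("pitch and level must be 0-127, channel 0-15")
    on = bytes([NOTE_ON | channel, pitch, level])
    off = bytes([NOTE_OFF | channel, pitch, 0])
    return [(start_s, on), (start_s + length_s, off)]

def score_to_messages(score):
    """Flatten a score into one list of messages ordered by time."""
    messages = []
    for start_s, length_s, pitch, level in score:
        messages.extend(note_messages(start_s, length_s, pitch, level))
    return sorted(messages, key=lambda message: message[0])

# Hypothetical score: (start in seconds, length in seconds, pitch, level).
score = [
    (0.0, 0.5, 60, 100),  # middle C
    (0.5, 0.5, 64, 100),  # the E above it
    (0.0, 1.0, 48, 80),   # a held bass note sounding at the same time
]

for time_s, message in score_to_messages(score):
    print(f"{time_s:4.1f}s  {message.hex()}")
```

Each note becomes a pair of three-byte messages, which is why a single PC can drive several samplers and synths at once: it only has to send instructions, not audio.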

With a library of sounds and a sufficient number of samplers and/or synths, the musician could now create complete backing tracks at leisure. In addition, the PC could be synchronised with a tape machine which would be used for recording live performances. Furthermore, the number of tape tracks normally required could be greatly reduced. The overall cost savings, flexibility and freedom provided by the PC, MIDI, sampler, and synth combination led to a marked increase in the number of musicians creating arrangements at home — with a commensurate decrease in business for multitrack recording studios.

Because the computer provides editing features which are difficult, if not impossible, to achieve with tape, its capabilities have inevitably been applied to the live recording and editing process itself — in many cases replacing tape altogether, whilst in others complementing it. This series of articles will explain the basic principles behind tapeless recording and editing and how the technology is applied.

Digitisation



All tapeless recording and editing systems require audio to be digitised, and there are a number of advantages to using digital audio. However, in nature, all that we see and hear is analogue. That is, a sound or image consists of a signal which is continuous for the entire duration we hear or see it. An analogue signal is shown in Figure 1.

Figure 1: An analogue waveform.


It is, however, possible to fool the eye or the ear into thinking that something is continuous when in fact it isn't. A prime example is film, where a moving image actually consists of discrete pictures (or snapshots) which have been taken of an analogue image. The number of pictures shot per second must be more than the eye can detect, so that when the snapshots are run past the eye, the brain does not have enough time to distinguish between one snapshot and the next and so thinks that the image is continuous (or uninterrupted). The same can apply to sound. If sufficient snapshots are taken of a sound, they can be run past the ear at a rate which makes the sound seem continuous.

Figure 2a: Expanded analogue waveform.


Figure 2a shows an expanded view of part of an analogue signal (or waveform). Snapshots of sound are taken by sampling the analogue waveform at regular intervals, as shown in Figure 2b. The voltage level of the waveform when the sample is taken is converted into a number and stored on a suitable medium. The device which carries out the sampling process of converting voltage levels into numbers is called an analogue to digital converter (or A/D — pronounced A to D).

Figure 2b: Sampling the voltage level.
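The sampling process can be sketched in a few lines of Python. This is only a model of what an A/D does, under stated assumptions: the waveform is a function returning a level between -1.0 and 1.0 (standing in for a voltage), and the names `sample_waveform` and `tone` are invented for the example.

```python
import math

def sample_waveform(signal, duration_s, sample_rate_hz, bits=16):
    """Take snapshots of `signal` at regular intervals and turn each
    level (-1.0 to 1.0) into a whole number, as an A/D would."""
    max_code = 2 ** (bits - 1) - 1           # 32767 for 16 bits
    count = round(duration_s * sample_rate_hz)
    samples = []
    for n in range(count):
        t = n / sample_rate_hz               # time of the n-th snapshot
        level = max(-1.0, min(1.0, signal(t)))
        samples.append(round(level * max_code))
    return samples

def tone(t):
    """A 1kHz sine wave standing in for the analogue input."""
    return math.sin(2 * math.pi * 1000 * t)

# One millisecond of the tone, sampled 8000 times per second.
samples = sample_waveform(tone, duration_s=0.001, sample_rate_hz=8000)
print(samples)
```

The result is simply a list of numbers, one per snapshot, which is all that needs to be stored.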


Once the audio has been digitised, the stored numbers (or samples) can be manipulated by computer, edited (if necessary) and replayed. However, before they are replayed, the numbers must first be converted back into voltages in order to produce a waveform which we can hear. The device which performs this conversion is called a digital to analogue converter (D/A). Figure 2c shows the waveform created by converting the samples taken from the waveform in Figure 2b back into voltages. Depending on how many samples are taken, a good or bad approximation of the original waveform will result. Figure 2c shows that taking a sufficient number of samples results in a reasonable approximation of the original waveform. The waveform is further improved by smoothing it out with a filter after the D/A. Figure 2d shows the result of taking too few samples — the reconstructed waveform barely resembles the original.

Figure 2c: Reconstructed waveform.


Figure 2d: Reconstructed waveform at too low sampling rate.


Thus the more samples taken (i.e. the closer they are together), the better the approximation of the original waveform will be. For compact-disc quality audio, for example, 44,100 samples are taken for each second of audio, i.e. the sampling rate is 44.1kHz (kilohertz).
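To put a number on "better approximation", the following sketch compares a 1kHz sine wave with its reconstruction at two sampling rates. It assumes the simplest possible D/A, one that holds each sample until the next arrives and applies no smoothing filter, so the figures it produces are illustrative rather than a measure of any real converter. The function name is invented for the example.

```python
import math

def reconstruction_error(freq_hz, sample_rate_hz, duration_s=0.01,
                         checks_per_sample=20):
    """Average difference between a sine wave and a staircase built by
    holding each sample until the next one (no filter after the D/A)."""
    def original(t):
        return math.sin(2 * math.pi * freq_hz * t)

    total = 0.0
    points = 0
    for n in range(round(duration_s * sample_rate_hz)):
        held = original(n / sample_rate_hz)   # the stored sample
        for k in range(checks_per_sample):
            t = (n + k / checks_per_sample) / sample_rate_hz
            total += abs(original(t) - held)
            points += 1
    return total / points

coarse = reconstruction_error(1000, 4000)    # 4 samples per cycle
fine = reconstruction_error(1000, 44100)     # CD sampling rate
print(f"error at 4kHz:    {coarse:.3f}")
print(f"error at 44.1kHz: {fine:.3f}")
```

The error shrinks as the samples move closer together, which is the behaviour shown in Figures 2c and 2d.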

The samples are not stored as ordinary decimal numbers, but rather as binary numbers, that is, numbers which are represented by 1s and 0s (for example, the decimal number 141 would be represented by the binary number 10001101). The beauty of 1s and 0s is that they are very easy to represent and recognise. Taking a light bulb as an example, a '1' can be represented by 'on' and a '0' by 'off'. Time and effort do not have to be wasted in determining how bright the light is; we're only interested in whether it is on or off. If we had a row of eight lightbulbs, we could easily and unmistakably display the number 141 to someone else who can read binary by switching the appropriate lightbulbs on according to 10001101. In fact, with eight lightbulbs (or eight bits), we could represent a total of 256 different numbers.
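The lightbulb example translates directly into code. In this small sketch the function names are made up for illustration; the arithmetic is just the usual conversion between decimal and binary.

```python
def to_bits(value, width=8):
    """Return `value` as a row of 1s and 0s, leftmost bulb first."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]

def from_bits(bits):
    """Read a row of 1s and 0s back into an ordinary number."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value

bulbs = to_bits(141)
print("".join(str(bit) for bit in bulbs))  # 10001101
print(from_bits(bulbs))                    # 141
print(2 ** 8)                              # 256 different numbers
```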

Computer Control



In digital electronics, a '1' is represented by a voltage (or simply anything above a certain voltage) and a '0' represented by no voltage (or anything below a certain voltage) as shown in Figure 3.

Figure 3: Voltage representing 1s and 0s.


Figure 4: A chip.

Much of the circuitry inside a computer consists of chips (integrated circuits or semi-conductors), an example of which is given in Figure 4, and 1s and 0s are extremely easy for such devices to recognise and deal with. Inside the chip are arrays of microscopic transistors which are arranged in a particular way so as to give a chip a particular function. The transistors can be likened to lightbulbs, in that they can be switched on or off, but rather than being switched on or off by hand, they are switched by applying 1s or 0s to the metal pins (or legs) along the chip's sides.

Some chips are designed simply for storing binary numbers; others are designed for performing calculations on the numbers — accepting two different sets of numbers and comparing them, multiplying the numbers by other numbers, adding numbers together, and so on. An advantage of binary numbers is that computers use them. A computer makes no distinction between binary numbers which represent audio, video, text or anything else — the number 10001101 could, for example, uniquely represent the letter 'Q', or even a sample of air pressure. The only way in which this information is distinguishable is in how it is presented to the outside world. Digitised audio therefore readily lends itself to computer control and this opens up a new dimension in editing and processing — in the same way as the word processor has done for text.
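The point that a binary number has no meaning of its own can be shown by reading the same eight bits in several ways. In the sketch below the code table that maps 10001101 to 'Q' is purely hypothetical, echoing the example in the text, and treating the byte as a signed audio level is likewise just one possible convention.

```python
pattern = 0b10001101
raw = bytes([pattern])

# Read as a plain count.
as_unsigned = int.from_bytes(raw, "big")

# Read as a signed number (two's complement), as a sample might be.
as_signed = int.from_bytes(raw, "big", signed=True)

# Read as a fraction of full scale, e.g. an audio level.
as_level = as_signed / 128

# Read as text, using a hypothetical code table invented for this example.
code_table = {0b10001101: "Q"}
as_text = code_table[pattern]

print(as_unsigned)  # 141
print(as_signed)    # -115
print(as_level)     # -0.8984375
print(as_text)      # Q
```

The bits never change; only the way they are presented to the outside world does.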

Digital Sound Quality



Another advantage of digital audio is that it cannot easily be degraded. When an analogue signal is recorded and mixed, it can be affected by other (usually smaller) signals generated in and around the circuitry through which the audio must pass. In other words, analogue recordings are susceptible to unwanted noise, and copying or bouncing tracks down, for example, adds successively more noise (or hiss). In addition to hiss, analogue recording to tape can suffer from dropout, wear and track bleeding. Domestic media, such as cassette or vinyl record, introduce further limitations. The master may be relatively noise free, but the disc-cutting process is regarded as fairly crude and can introduce noise and bandwidth restrictions. Cassette tape also suffers from bandwidth restrictions, as well as tape dropout. In addition, both media are susceptible to wear, which can also cause unwanted noise.

Figure 6: Analogue waveform with noise.


With an analogue signal, noise actually affects the waveform that you hear, as shown in Figure 6. With digital, however, noise can be likened to a finger tapping on a lightswitch with insufficient force to throw it from on to off: noise is unlikely to turn a 1 into a 0 (or vice versa). Successive circuitry does not care how clean the 1s and 0s are, only whether they are above a certain voltage or not, since that is what switches its transistors on or off. Provided that circuitry adds no noise of its own, the 1s and 0s it generates will be clean, as shown in Figure 7. This means that digital audio can be copied any number of times without degradation; the copying circuitry will faithfully duplicate the 1s and 0s (and may even clean them up!).

Figure 7: Digital information unaffected by noise.
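A small simulation makes the lightswitch analogy concrete. The voltages used here (5 volts for a 1, 0 volts for a 0, a threshold halfway between, and up to 1 volt of noise) are assumptions chosen for the example, not values taken from any particular circuit.

```python
import random

HIGH_VOLTS = 5.0   # assumed voltage for a '1'
LOW_VOLTS = 0.0    # assumed voltage for a '0'
THRESHOLD = 2.5    # anything above this is read as a '1'

def send_with_noise(bits, noise_volts, rng):
    """Turn bits into voltages and add random noise to each one."""
    return [(HIGH_VOLTS if bit else LOW_VOLTS)
            + rng.uniform(-noise_volts, noise_volts)
            for bit in bits]

def regenerate(voltages):
    """Decide whether each voltage is a 1 or a 0, discarding the noise."""
    return [1 if volts > THRESHOLD else 0 for volts in voltages]

rng = random.Random(0)
original = [1, 0, 0, 0, 1, 1, 0, 1]

# Copy the copy, a thousand generations deep, with noise at every stage.
copy = original
for generation in range(1000):
    copy = regenerate(send_with_noise(copy, noise_volts=1.0, rng=rng))

print(copy == original)  # True
```

Because the noise is never large enough to cross the threshold, each stage hands on clean 1s and 0s and the thousandth copy is identical to the first. If the noise were allowed to exceed the threshold, errors would appear, which is why the text says noise is "unlikely", rather than unable, to flip a bit.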


In any case, the 1s and 0s are not the actual waveform that you hear, but will ultimately be converted back into an analogue waveform at the last stage (the output of the system). Therefore, in the case of compact disc, for example, depending on the quality of your amplifier and monitors, you will have the opportunity to hear the audio with the same quality as when it was mixed in the recording studio.

Storage Media



Digital audio can be stored on tape, hard disk, floppy disk, optical disk, memory chips or compact disc. However, in order to take advantage of the recording and editing capabilities a computer can offer, the storage medium must satisfy certain criteria. As we will see in Part 2, the key to computer-based editing is random access. This is the ability of the computer to access any part of the digitised recording almost instantly. It is therefore essential that the recording medium allows random (or instant) access. It is also essential that the recording medium has sufficient capacity to store a practical amount of information — in the case of audio, for compiling lengthy arrangements such as entire albums or the soundtrack for a film, this can amount to hours rather than seconds or minutes. The medium must also be affordable and, in this green and cost-conscious world, should be reusable (erasable).

Tape



Tape has sufficient storage capacity in terms of both tracks and time. It is also erasable and affordable. However, it is unsuitable for our purposes because it does not allow random access (because it takes too much time to spool back and forth). It is, therefore, best suited to mastering and/or recording which does not need much editing.

Compact Disc



Compact disc is currently the only medium available on a large scale for domestic digital audio reproduction, but is not a suitable medium for recording/editing purposes since it is not erasable and its access, although much faster than tape, is not fast enough.

RAM



RAM (Random Access Memory) chips provide static storage (there is no physical movement involved), and so have the fastest possible access time. Although professional audio generally converts samples into 16-bit binary numbers, commercially available chips generally support 8-bit numbers (or bytes). A 16-bit sample will therefore be stored as two bytes. Figure 5 is a simplified representation of how RAM works. Inside the chip, bytes are stored in horizontal rows. The storage capacity of the chip depends on the number of rows provided: if there are 1024 rows, for example, the chip will have a storage capacity of 1Kbyte. For compact disc quality, this amounts to just over a hundredth of a second (since for one second, CD requires 44,100 16-bit samples = 44,100 x 2 bytes = 88,200 bytes. Therefore 1Kbyte provides 1024/88,200 seconds = approximately 0.012 seconds).
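The arithmetic above is easy to check and to extend to other capacities. The sketch below follows the text in counting a single channel of 16-bit samples at 44.1kHz; the function names are invented for the example.

```python
SAMPLE_RATE_HZ = 44_100   # samples per second, as for compact disc
BYTES_PER_SAMPLE = 2      # one 16-bit sample is stored as two bytes

def bytes_per_second():
    """Storage needed for one second of audio (one channel)."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE

def seconds_of_audio(capacity_bytes):
    """How long a recording fits in the given number of bytes."""
    return capacity_bytes / bytes_per_second()

print(bytes_per_second())                  # 88200
print(round(seconds_of_audio(1024), 4))    # 0.0116
print(bytes_per_second() * 3600)           # 317520000 bytes for one hour
```

One hour of a single channel therefore needs over 300 million bytes, which gives a feel for why the choice of storage medium matters so much.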

Figure 5: Simplified representation of how RAM works.


The pins on the left side of the chip consist of address lines, and those on the right serve as both inputs and outputs. Each row in the chip has a unique address (in the form of a binary number), and a row can be selected by setting up its address on the address lines on the left. There is also a pin for read/write (not shown) and if a 1 is applied to this pin, the pins on the right become outputs and the contents of the selected row will be output to the pins. If a 0 is applied to the read/write pin, the pins on the right become inputs and whatever is applied to them will be transferred to the selected row. Since there are no physically moving parts, the time it takes to change addresses, input or output information, change from read to write, and so on, is so short as to be virtually instantaneous.
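The behaviour of the address lines and the read/write pin can be modelled as follows. This is a deliberately simplified sketch of the chip in Figure 5: the class and function names are invented, and storing the high byte of a 16-bit sample in the first of two rows is an assumption made for the example.

```python
class RamChip:
    """A toy model of a RAM chip: numbered rows, one byte per row."""

    def __init__(self, rows=1024):
        self.rows = [0] * rows            # 1024 rows = 1Kbyte

    def access(self, address, read_write, data=None):
        """Select a row by address. read_write=1 outputs its contents;
        read_write=0 stores `data` (one byte) in it."""
        if not 0 <= address < len(self.rows):
            raise IndexError("no such row")
        if read_write == 1:
            return self.rows[address]
        if data is None or not 0 <= data <= 0xFF:
            raise ValueError("data must be a single byte")
        self.rows[address] = data
        return None

def store_sample(chip, address, sample):
    """Store a 16-bit sample as two bytes in consecutive rows
    (high byte first, an assumption of this sketch)."""
    chip.access(address, 0, (sample >> 8) & 0xFF)
    chip.access(address + 1, 0, sample & 0xFF)

def load_sample(chip, address):
    """Read two consecutive rows back as one 16-bit sample."""
    return (chip.access(address, 1) << 8) | chip.access(address + 1, 1)

chip = RamChip()
store_sample(chip, 10, 0x8D51)
print(hex(load_sample(chip, 10)))  # 0x8d51
```

Any row can be reached in a single step, whatever its address, which is what random access means in practice.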

RAM therefore satisfies the need for instant access, but the amount of storage space provided by a chip is very small. This can be increased by using multiple chips, but RAM is relatively expensive and takes up space. This means that RAM is more suited for storing seconds/minutes of material rather than long recordings. In addition, RAM is volatile, which means that if the power source is removed, the contents of the RAM are lost (although cartridges are available which have long-lasting battery backup). Nonetheless, because of its instant access, RAM is highly suited to being a temporary work area, where audio can be loaded from another medium for processing of some kind and then either output or loaded back into the original medium. Don't confuse RAM with ROM (Read Only Memory); ROM is also a chip, but the information inside has been permanently 'blown' and can only be read. Once blown, a ROM chip cannot be erased or further recorded to.

Floppy Disk



The floppy disk is designed to be a cheap and convenient storage medium. It stores information magnetically and is inserted into a drive which can be likened to a vinyl record player that can record as well as play. A record player allows quick access to any part of a recording by lifting the head and placing it elsewhere, although this can be somewhat hit and miss. A fresh floppy disk must first be formatted (also using binary codes) by the computer, so that the record/play head can precisely find its way around the disk. The disk is divided into tracks (not one continuous track as with vinyl) and information is stored as blocks within a track. Preceding each block is an address which uniquely identifies that particular location on the disk. To find a section of audio, the head will move across the tracks with the disk rotating underneath. When it sees the address associated with that section of audio, it will read the block of information which follows.
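The way a formatted disk is laid out and searched can be sketched as below. The figures of 80 tracks and 9 blocks per track are assumptions chosen for illustration, as are the class and method names; the model ignores everything about a real drive except tracks, addressed blocks and head movement.

```python
class FloppyDisk:
    """A toy model of a formatted disk: tracks made of addressed blocks."""

    def __init__(self, tracks=80, blocks_per_track=9):
        # Formatting: write an address in front of every (empty) block.
        self.tracks = [
            [{"address": (track, block), "data": b""}
             for block in range(blocks_per_track)]
            for track in range(tracks)
        ]
        self.head_track = 0
        self.steps_taken = 0

    def _seek(self, track):
        """Step the head across to the wanted track."""
        self.steps_taken += abs(track - self.head_track)
        self.head_track = track

    def _find(self, address):
        """Move to the track, then watch blocks pass under the head
        until the one with the matching address comes round."""
        track, _ = address
        self._seek(track)
        for block in self.tracks[track]:
            if block["address"] == address:
                return block
        raise KeyError(address)

    def write(self, address, data):
        self._find(address)["data"] = data

    def read(self, address):
        return self._find(address)["data"]

disk = FloppyDisk()
disk.write((40, 3), b"snare hit")
print(disk.read((40, 3)))   # b'snare hit'
print(disk.steps_taken)     # 40
```

The head reaches any block by stepping to its track and waiting, rather than by spooling through everything recorded before it, as tape would require.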

The relatively cheap materials, and the way in which the drive operates, mean that the density of information stored on disk is not very high, providing seconds or minutes of storage rather than minutes or hours. The head touches the disk, which means that the track width cannot be very narrow, and if damage from overheating or dirt is to be avoided, the disk cannot rotate very fast. This means that there is no point in using sophisticated mechanics to quickly move the head across the tracks if it must wait a relatively long time for the correct address to pass underneath it. Thus a simple and cheap stepper motor will do — adding to the floppy's affordability. However, because of its slow access time and low density, the floppy is not suited for real time (live) recording or playback of full 16-bit, 44.1kHz audio. It is therefore more commonly used as a non real-time backup medium for short sounds, as in RAM-based samplers, for example.

Hard Disk



This is more expensive than floppy disk but stores much more information and is more cost-effective than RAM. It uses different materials to floppy, but the general principles are similar. However, the head rests just above the surface of the disk, so there is no physical contact, which means that neither the disk nor the head is subject to wear. More importantly, this allows the hard disk to rotate much faster than floppy, and using superior mechanics for head control means that the time taken for the head to move from one position to another is extremely short. Its operation also means that the track width can be much narrower than floppy, allowing more tracks in total and increasing the density of information.

Because of the very high rotational speed of the disk, any debris or dirt particles caught between the head and the disk surface could cause severe damage. In order to avoid this, hard disks are sealed in the drive and are not removable. The entire drive itself can be removed, but this is a rather expensive and unsatisfactory solution, and hard disk is generally not considered a removable medium. This has proved to be one of its major drawbacks, since once the disk is full, the recorded material (if it is to be kept and further recordings are to be made to disk) must be transferred to another storage device. This takes time, which can become significant if a great deal of material has been recorded.

Bernoulli Disk



This type of disk may be described as a cross between floppy and hard disk. It is removable, and has a much higher capacity than floppy, but a much lower capacity than hard disk.

Optical Disk



Optical disks have always been removable, but are now also erasable and have large recording capacities. The problem with optical has been its slow access time compared with hard disk, although the technology is ever-improving and optical is now being used as the primary recording medium for a number of tapeless editing systems. Even so, where the choice is between hard disk and optical, the majority of manufacturers still prefer hard disk because it is faster, and recommend optical as an archiving medium.

Next month, in part two of this introduction to digital audio, we'll look at the advantages of instant access and the principles behind non-destructive editing.

Data Compression

One way to increase the amount of storage available or to overcome access time constraints is to decrease the amount of data to be stored, by using data compression. The latest generation of data compression techniques generally rely on a phenomenon called psychoacoustic masking. This is the phenomenon whereby frequencies either side of one which is relatively louder will not be heard and therefore represent data which is not essential; the data is therefore ignored and not stored. Such compression can reduce the amount of data by as much as a factor of 12, thus effectively increasing a medium's storage capacity by the same amount. It is claimed that data compression can produce almost CD quality audio, but subjective tests have so far produced mixed results. Manufacturers whose tapeless systems are aimed at complex editing and mixing are reluctant to use data compression, since the effects of processing compressed audio can be unpredictable. However, manufacturers of systems aimed at applications such as radio broadcast, where editing and processing requirements are low and storage requirements high, are eagerly adopting the technique. Some are even using it to provide real-time recording to and playback from floppy disk.
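To see what a factor of 12 means in practice, the sketch below works out how much audio fits on a floppy disk with and without compression. The capacity used is that of a common high-density 3.5-inch floppy (1,474,560 bytes), which is an assumption of this example; the text does not say which disks the systems in question use. As elsewhere, a single channel of 16-bit, 44.1kHz audio is assumed.

```python
FLOPPY_BYTES = 1_474_560        # assumed: a high-density 3.5-inch floppy
BYTES_PER_SECOND = 44_100 * 2   # 16-bit samples at 44.1kHz, one channel
MAX_COMPRESSION = 12            # "as much as a factor of 12"

def seconds_stored(capacity_bytes, compression_factor=1):
    """Length of recording that fits, given a data reduction factor."""
    return capacity_bytes * compression_factor / BYTES_PER_SECOND

plain = seconds_stored(FLOPPY_BYTES)
squeezed = seconds_stored(FLOPPY_BYTES, MAX_COMPRESSION)
print(f"uncompressed: {plain:.1f} seconds")
print(f"compressed:   {squeezed / 60:.1f} minutes")
```

Under these assumptions, under 20 seconds of audio becomes a little over three minutes, and the rate at which data must be read from the disk falls by the same factor.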


Recording Musician - Copyright: SOS Publications Ltd.
The contents of this magazine are re-published here with the kind permission of SOS Publications Ltd.

 

Recording Musician - Nov 1992

Donated by: Mike Gorman, Colin Potter

Scanned by: Mike Gorman


Feature by Yasmin Hashmi


