The Unholy Marriage
MIDI and SMPTE
Two industry standards settle their timing differences and agree to work together. Chris Meyer checks it out and discusses proposals for the MSMPTE.
The SMPTE standard is now as important for the recording and film industries as MIDI is to the musical instrument field. As the two areas move closer together, communicating between them has become easier. We report on the latest proposals for a MIDI-SMPTE marriage.
About a year ago, my brother Ron, who has been doing live audio and audio for video for longer than I can remember, gave me a call. He told me he had a client who wanted to use MIDI'd synthesisers for backing music and sweetening on some small video projects, and wanted to know if I could recommend anything. I honestly didn't know what to say.
About the same time, a marketing person at Sequential asked me if SMPTE timecode was going to become the next MIDI - nobody understood it, but everybody had to have it. That time I knew what to say: yes.
There are a lot of good reasons to combine the use of MIDI and SMPTE timecode. MIDI is basically a standard way to control musical instruments (and these days effects units, mixing boards, lighting units, and so on). SMPTE timecode is basically a standard way of knowing precisely what time it is, whether it be on film, videotape, or audio tape. As soon as you place the two ideas next to each other, it becomes desirable to control musical events based on what time it is (whether it's in conjunction with visual or other musical events). So that's what we're going to be discussing here - using SMPTE to synchronise MIDI with other musical events, and some recent advances and ideas that use MIDI to synchronise musical and non-musical events to film and video.
First, some history. Musicians have been synchronising musical events for ages - it's called playing together. If the musicians were not proficient enough to keep time with each other, then there was always the conductor, or at least a trusty metronome. When music started being recorded on tape, so did the metronome click, which became known as the 'click track'. The click told you where the quarter- or eighth-notes fell, and with luck, the musicians were good enough to listen to each other and figure out at least which bar and beat they were on (if not always which verse or chorus).
Then came machines. Drum machines and sequencers that wanted to play along too. Because the machines were not very good at listening to other musicians to see where they were, the musicians were forced either to follow the machines, or to let some warlock (known as a 'studio engineer' today) lay down a click track on tape for the machines to follow. Since the machines were also not good enough to figure out all the sub-divisions of time between quarter- or eighth-notes, they needed a special faster click track (ranging from 24 to some ridiculous number of clicks per quarter-note) to keep up.
Suddenly, all in the garden was rosy. Machines could come in later and lay their parts down without other musicians around, just like real musicians. And just like real musicians, they could also do all sorts of crazy overdubs after the fact.
When MIDI came along, it contained, among other things, a special code that was the equivalent of a drum machine or sequencer's click (standardised at 24 pulses per quarter note). It was a relatively simple task to build boxes that converted from normal audible or electrical clicks to MIDI clicks (or 'clocks', as they were now called).
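To put a number on those 24 pulses, here is a small sketch in Python of how far apart the MIDI clocks fall at a given tempo. The function name is mine, purely for illustration, and it assumes a constant tempo.

```python
MIDI_CLOCKS_PER_QUARTER_NOTE = 24  # fixed by the MIDI standard


def midi_clock_interval_ms(tempo_bpm: float) -> float:
    """Return the gap between successive MIDI clocks, in milliseconds.

    Hypothetical helper: assumes the tempo is constant and is given in
    quarter-notes per minute.
    """
    if tempo_bpm <= 0:
        raise ValueError("tempo must be positive")
    # One quarter-note lasts 60000 / tempo milliseconds and is split
    # into 24 equal clock periods.
    return 60_000.0 / (tempo_bpm * MIDI_CLOCKS_PER_QUARTER_NOTE)


if __name__ == "__main__":
    for bpm in (60, 120, 180):
        print(f"{bpm} bpm: one clock every {midi_clock_interval_ms(bpm):.2f} ms")
```

At 120 beats per minute that works out at a clock roughly every 21 milliseconds, which is the sort of resolution a converter box has to deliver.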
But there were still problems. Unlike real musicians, machines were (and for the most part still are) not able to listen to other musicians to tell which beat or bar everybody else was on. In many cases, they couldn't even listen to other machines to figure out where they were supposed to be. That meant they always had to be started from the very beginning of a piece, and from that point count quietly to themselves to keep track of where they were. So the song always had to be taken from the top - annoying for real musicians, and wasteful of expensive studio time. And real musicians still had to play to them, because they were incapable of following other musicians.
An improvement came when manufacturers started implementing what's known as the MIDI Song Position Pointer. Now, a machine could at least tell where other machines were within a song (resolving to the nearest sixteenth-note), but it still had no clue as to where the humans were.
Luckily, the warlocks (studio engineers) came to the rescue again. The practice of recording SMPTE timecode on one track of a multitrack's audio tape became a commonly used method of synchronising two tape machines together. From there, it took just a small leap of the imagination to start using SMPTE to synchronise musical machines to the tape as well - and thus to the real musicians recorded on it. At the very least, the click track was no longer needed, because boxes such as the Friend Chip SRC could translate from SMPTE to normal clicks or MIDI, at whichever constant tempo was desired. Timings could be offset slightly or drastically by changing the SMPTE time that the machine considered to be the 'start'.
On a more intriguing level, machines could now pick up a song from the middle, too. Since each slice of SMPTE time is distinct from every other (SMPTE timecode rolls over every 24 hours, which is long enough even for Philip Glass song cycles), each moment in a piece of music has a unique time. Combining this with the aforementioned MIDI Song Position Pointer, a box can look at what SMPTE time it is, calculate precisely how far into the song that time corresponds to, and forward the machines to that point.
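The sum such a box has to do can be sketched in a few lines of Python. This is not how any particular unit does it, just the arithmetic under some stated assumptions: a constant tempo, non-drop-frame timecode, and a known SMPTE time for the start of the song. The function names and the 25 frames-per-second default are my own choices for illustration. The Song Position Pointer counts in sixteenth-notes (six MIDI clocks each).

```python
from fractions import Fraction

SIXTEENTHS_PER_QUARTER_NOTE = 4  # Song Position Pointer counts sixteenth-notes


def timecode_to_seconds(hours, minutes, seconds, frames, fps=25):
    """Convert a non-drop-frame timecode to seconds since 00:00:00:00."""
    return Fraction(hours * 3600 + minutes * 60 + seconds) + Fraction(frames, fps)


def song_position(now, start, tempo_bpm, fps=25):
    """Return the Song Position Pointer value for the timecode `now`.

    `now` and `start` are (hours, minutes, seconds, frames) tuples and
    `start` is the SMPTE time treated as the top of the song. Assumes a
    constant tempo; a real box would also have to follow tempo changes.
    """
    elapsed = timecode_to_seconds(*now, fps=fps) - timecode_to_seconds(*start, fps=fps)
    if elapsed < 0:
        raise ValueError("timecode falls before the start of the song")
    quarter_notes = elapsed * Fraction(tempo_bpm) / 60
    # Round down to the last sixteenth-note already passed.
    return int(quarter_notes * SIXTEENTHS_PER_QUARTER_NOTE)


if __name__ == "__main__":
    start = (1, 0, 0, 0)   # song begins at 01:00:00:00
    now = (1, 0, 10, 12)   # tape is parked at 01:00:10:12
    print(song_position(now, start, tempo_bpm=120))
```

Changing `start` is the same trick as offsetting the SMPTE time that the machine considers to be the beginning of the piece.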
Some manufacturers are starting to integrate this feature into their drum machines and sequencers. And some outboard machines, such as the Fostex 4050, promise to integrate this feature with the already accepted practice (known as 'autolocating') of forwarding and starting tape machines to specific points referenced to the SMPTE timecode on the tape. Now everybody - tape machines, sequencers, drum machines and humans - can know which beat, bar and (with the exception of some musicians) verse everybody is on.
But what about that last point of machines playing along with musicians, instead of musicians playing along with machines? That brings us along to a personal fave, the Roland SBX80 Sync Box. With this cute little toy, a bass player or drummer (even a sloppy one) can come in and lay down his or her parts first, and then have the studio engineer either use a constant rhythm track (eg. cowbell) or even hand tap the tempo along with the beat. Then, the SBX80 recreates a high-resolution MIDI click track internally to follow this rhythm, and can make all the other machines follow suit. It can also do all the other things mentioned above, along with changing the programmed tempo at any given measure.
So far, all we've talked about is audio. As audio started to get teamed up with video, it became important to synchronise audible events (such as orchestra strikes, car doors slamming, and dialogue) to visual events. You may not realise it, but well over 90% of the sounds you hear on pre-recorded programmes are recorded after the actual video or film. And all of these audible events have to be lined back up with their corresponding visual events.
Editing sound for programmes produced strictly on film is a real ordeal. Sound effects, dialogue and so on, must be recorded on magnetic tape stock with sprocket holes just like the film, and then synchronised mechanically. Sliding audio events against visual ones involves slipping the mag stock a couple of sprocket holes one way or the other.
Only in the last few years has the film industry been dragged (kicking and screaming, some would say, but I actually think they've embraced the technology quite easily) into the Electronic Age. The editing stage now tends to be in video, and videotape has no sprocket holes. You can't even see the images on it. Therefore, SMPTE (The Society of Motion Picture and Television Engineers) eventually standardised a timecode to mark videotape. Each SMPTE slice or 'word' defines the start of a specific video or film frame (think of a frame as a picture, screen, or snapshot). Having thus labelled each individual frame of a continuous video image, it becomes far easier to edit it electronically.
With this has come the use of multitrack tape locked to the video to construct, edit, and synchronise the audio elements. This is a definite improvement on what went before, but life still isn't all that easy - sound effects and dialogue must still be carefully matched to the video, with 'time' being slipped back and forth by triggering cartridge machines and other tape decks based on electronic cues and SMPTE times. In addition, the musician, band, or orchestra must still play at precisely the right tempo to match musical events with visual ones properly. If varying the tape speed won't rematch the tempo to the visuals, you have to cut out or add in individual frames of the film, and splice the thing back together to make it fit.
The same techniques mentioned earlier to match drum machines and sequencers to real musicians can also be used to match them to video or film. Once a piece of music is recorded on a sequencer, its tempo, along with where it starts, stops, speeds up, and slows down, can be manipulated to match the visuals. For example: what if a cymbal crash happens just after the car door slam it was supposed to be synchronised with? Simple. Just speed up the tempo by the right amount, and you won't have to add frames, or change the pitch of the piece by varying tape speed.
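'The right amount' is simple arithmetic, at least in the easy case. The Python sketch below assumes a single constant tempo from the start of the sequence up to the event, and that you know how many quarter-notes into the sequence the cymbal crash sits. The function name and the figures are made up for the example.

```python
def tempo_to_hit(beats_to_event: float, seconds_to_event: float) -> float:
    """Return the tempo (in bpm) that lands an event on a given time.

    beats_to_event:   quarter-notes from the start of the sequence to the event
    seconds_to_event: time from the sequence start to the visual cue, as
                      measured from the SMPTE times of the two points

    Hypothetical helper: assumes one constant tempo over the whole stretch.
    """
    if beats_to_event <= 0 or seconds_to_event <= 0:
        raise ValueError("both arguments must be positive")
    return 60.0 * beats_to_event / seconds_to_event


if __name__ == "__main__":
    # The crash is 32 quarter-notes in; at 120 bpm it arrives after 16 seconds,
    # but the car door slams 15.5 seconds after the sequence starts.
    print(f"{tempo_to_hit(32, 15.5):.2f} bpm")
```

In this made-up case the piece has to speed up from 120 to just under 124 beats per minute, a change few listeners would notice.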
Beyond music comes the dropping of sound effects (and even dialogue) onto tape synchronised and triggered by SMPTE timecode. Synthesised and real sound effects can be recorded onto a sampler such as an Emulator, Prophet 2000, Fairlight, or whatever. This sample can then be recorded as the first note in a sequence. Next, a SMPTE/MIDI converter can be used to start the 'sequence' at the SMPTE time the effect is supposed to happen. It's a slightly roundabout way of doing things, but it works, and it's seeing more and more use.
Dedicated units, such as the Polyphonic FX System or customised CD players, are even starting to appear to perform as sound effects playback units. Storing sound effects as samples on disk has a couple of advantages over using cartridges to do the same job (broadcast 'carts' are little more than the eight-track cartridges of old). Paramount among these is that it's much more convenient and often sounds better. And the editing power of a Fairlight or Digidesign's Sound Designer package helps customise or edit sound effects and dialogue for each individual track.
So, what we have so far is a situation vastly improved over what engineers, composers and musicians have put up with in the past - but which isn't perfect yet.
To come right up to the present day, there's a new proposal currently being tossed about inside the MMA (MIDI Manufacturers Association) and JMSC (Japan MIDI Standards Committee) known as MSMPTE - for MIDI/SMPTE. Put simply, it's designed to bring the worlds of MIDI and SMPTE closer together. It contains essentially two separate proposals: one for the transmission of SMPTE timecode over MIDI, and one for the transmission of something called MSMPTE Set-Up Information.
Transmitting SMPTE timecode in the form of MIDI data carries several advantages. First, all the advantages of MIDI clocks over electrical clicks will be realised - hardware compatibility (ie. no varying voltage levels, connector types, and so on), greater reliability (it's far easier to recognise a byte over MIDI than a signal off an analogue tape), and wider varispeed range (as tape slows down, a click's strength diminishes; as it speeds up the clicks start smearing into each other - over MIDI, the events merely happen further apart or closer together). Both hardware standards of SMPTE timecode (Longitudinal and Vertical Interval) are converted to the same MSMPTE messages. What's more, the cost of the actual SMPTE-to-MIDI conversion is now carried by just one device - ideally the master audio or video tape deck itself, as the reliable reading of SMPTE timecode is notoriously finicky, and boxes such as the Friend Chip, Roland and Fostex all cost well over £500. Now, the tricks of sliding events against each other in time, fast-forwarding to a specified cue point and so on can be included (with luck, more easily) in the sequencer or drum machine itself, allowing closer integration of those tricks with the sequenced data.
The Set-Up Information goes beyond normal 'what time is it?' functions. As mentioned earlier, the current practices for lining up audio with video or film include: a) synchronising multiple cart and tape machines; b) chaining together a SMPTE-to-MIDI converter, sequencer and sampler; or c) slipping sprocket holes on mag tape stock relative to a piece of film. By contrast, the most basic use of the Set-Up Information will be to tell the slave units in advance when to trigger certain events. Then, when the machine receives the appropriate time, it performs the required action.
An immediately obvious application of this is to use this information to set up when samples of sound effects or dialogue should 'fire' (playback).
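To make the idea concrete, here is a Python sketch of a slave unit that is handed its cues in advance and fires each one when the matching time arrives. It says nothing about the actual MSMPTE messages, which are still only a proposal. The class names, the use of an absolute frame count for the time, and the text labels for the actions are all inventions of mine for illustration.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Cue:
    """One pre-loaded event: a time (as an absolute frame count) and an action."""
    frame: int
    action: str = field(compare=False)  # cues are ordered by time only


class CueSlave:
    """Hypothetical slave unit: told its cues first, then fed the time."""

    def __init__(self, cues):
        self._pending = sorted(cues)  # earliest cue first

    def on_time(self, frame):
        """Handle an incoming time and return the actions that are now due."""
        due = []
        while self._pending and self._pending[0].frame <= frame:
            due.append(self._pending.pop(0).action)
        return due


if __name__ == "__main__":
    slave = CueSlave([Cue(250, "door slam"), Cue(100, "thunder")])
    for frame in (99, 100, 300):
        print(frame, slave.on_time(frame))
```

Firing everything at or before the current time, rather than only on an exact match, means a cue is not lost if the tape is started part-way through or a frame goes unread.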
But the Set-Up Information proposal goes further. It has provision for handling up to 127 separate machines, each of which can then have up to 16,384 punch-in and punch-out points, 16,384 event start and stop times, and 16,384 marked cue times, all with their own individual SMPTE times at which they're supposed to occur. The punch-in and punch-out times can refer to tape machines, sequencers, samples, effects, and even special effects and effects units. 'Cue' events could include the aforementioned triggering of carts and CD players, one-shot samples, changing effects programs or mixing configurations, even lighting flash pots - or just the plain marking of edit points.
Hold your breath. This is the one I've been building up to. Since all the above happens on the same format over the same medium, it can all be programmed from the same master unit. Up until this point, even with SMPTE and MIDI, each device had to be programmed separately. The idea of controlling and orchestrating an entire studio from just one machine - be it a terminal, computer, or whatever - is, ahem, at least mildly interesting.
What's amusing to me is I've heard that SMPTE (the organisation, not the timecode) themselves had created a proposal for the Interconnection of Tributary Systems about the time that MIDI was being created, with at least as many hardware similarities (same baud rate and so on). Although I've not seen the document, I'm told that the proposal was several hundred pages long. To the best of my knowledge, though, it died on the vine. It's amusing because MIDI manufacturers have had to put up with occasional polite scoldings from members of ANSI standards committees for not going through the 'official' standardisation procedures with MIDI, and put up with the odd heretic/user who claims in screaming paranoia that manufacturers are purposely holding back advances from the users.
'Taint so, brothers and sisters. In the meantime, prepare for the time when audio and visual really get together. If you have only one tape deck, and all you want to make is perfect-tempo music that follows machines, or uses nothing but machines (or no machines at all), you don't need SMPTE. If you have ambitions beyond that, then SMPTE, or the combination of MIDI and SMPTE, is in your future - if not already in your present.
Throughout this feature, I've been purposely light on technical details and concentrated on the concepts themselves. Since MSMPTE itself is still just a proposal, it would not be of much value (and possibly even misleading or damaging) to publish details about it now. If and when it is adopted (optimistically, that could be later this summer or autumn), we'll be publishing a separate article on the subject. If you're after more technical information on MIDI and SMPTE themselves, here are some references.
There have been a number of articles published explaining MIDI; unfortunately, I don't really feel comfortable with any that I've read. Therefore, I suggest you get a copy of the MIDI 1.0 Detailed Explanation, written by the JMSC and MMA, and distributed by the IMA (International MIDI Association) at (Contact Details). Cost is $30 to members and $35 to nonmembers.
The best article I've read explaining SMPTE is a 30-page pamphlet put out by EECO (a manufacturer of timecode equipment, and inventor of 'ON-TIME', the father of SMPTE timecode). An excellent layman's article (including both technical details and applications) is 'Everything But The Kitchen...' published in E&MM February 1985.
The official SMPTE document explaining timecode is SMPTE 12 (also known as ANSI V98.12M-1981). The SMPTE Tributary Systems proposal I referred to is known as T-14.10/7-651 (if anybody manages to get one, please send me a copy!). SMPTE's address is (Contact Details).