Standard MIDI Files
In Search Of The Truth
Despite the fact that Standard MIDI Files have been around for quite a while, there is, for many users, still more than a little confusion about their purpose and use. Paul Overaa sheds some light on the subject...
Unlike the MIDI standard itself, about which much has been written, the MIDI File is, for many, still surrounded by an air of mystery. This article sets out to demystify the subject.
Standard MIDI Files (often called SMF files or just plain 'MIDI Files') are used primarily to transfer time-stamped MIDI data — sequence data — between different programs. In other words they are computer disk files which have been created in such a way that other programs are able to read and understand their contents. Having this sort of 'data portability' is obviously good news for MIDI users because it means that they can change sequencers or even change computers without having to worry about sequencer file compatibility problems — at least that's the way it should work!
Today a great many software packages provide some level of MIDI File support, and more and more users are beginning to realise the benefits of such support. Unfortunately, things have, through the years, gone rather less than smoothly on the MIDI File development front, and it's possible that early snags are partly to blame for many users' continuing suspicions about using them. To start with, the file format initially proposed was limited to just a single stream of MIDI data, a single track. This was felt by many software houses to be too restrictive, and nowadays the MIDI File standard, under the control of the International MIDI Association, has grown to allow multiple streams, plus messages that are specific to particular programs, and a host of other goodies.
The penalty paid for these new benefits is that the standard has changed somewhat from that originally proposed. Most of these changes were to be expected and from the users' viewpoints they have, in the main, been fairly transparent. Some transitional problems have, however, been evident and problems have come to light particularly when users have tried to read newer MIDI files with programs which expected older style MIDI files to be presented to them.
When you add such problems to all the others that people encountered as they moved into the world of MIDI, then it's not surprising that many users didn't place learning about MIDI Files very high on their list of early priorities. If as a MIDI user you were new to computers (and all of their related jargon) then you had your work cut out anyway. Once happy with the computer-related aspects of this new technology then MIDI itself, and another load of jargon, had to be tackled.
Another reason for the lack of knowledge about MIDI Files is perhaps that there's simply less published information about how programs use them. When you go out and buy, say, a Yamaha sound module, you'll get a comprehensive manual which contains a lot of technical MIDI info. You'll learn not only how to use the beast from the front panel but also how the various remote MIDI facilities and SysEx control options can be used. One of the reasons that you are invariably provided with the manufacturer-specific SysEx information is that there is a general directive within the MIDI specification for manufacturers to provide this type of information in their manuals — most do, and make a very good job of it.
Now you might not at first make much sense out of all the technical material, but it's there if you need it. What's more, if you hit a snag, or want something explained in more detail, all you have to do is write to Yamaha and they should be able to help you out — and all of the other major musical equipment companies do likewise. Many organisations have even issued introductory guides and books explaining what MIDI is, how it can be used, and these include outlining the MIDI standard and its purpose. The end result is that, nowadays, it's no struggle to become MIDI-literate.
When you move into the MIDI file arena, however, this situation changes drastically. Very few software companies will discuss internal details of their sequencer files, and are far from happy about discussing the sequencer-specific events which many embed in such files despite the fact that, like the MIDI standard itself, the MIDI File standard includes a general directive to software companies to include such details. Software companies seem to regard any such file format information as a strictly private area, and do all they can to discourage you from delving too deeply into it. There are exceptions of course, and companies like Dr. T do publish a certain amount of technical info about their file formats.
The trouble with MIDI Files is, to a large extent, quite simply that not enough has been written about them. If the same barriers had existed with MIDI itself then there would have been a good chance that MIDI would never have taken off in the way that it did.
My solution to this particular technical black hole (and I wouldn't recommend it unless you have a lot of time available) was to sit down and, using the official standard as a guide, write my own MIDI File diagnostic software. The benefit is that I'm now in a position to understand and write about MIDI Files in some depth. That is exactly what I will be doing in this article, but before going any further there's something I need to point out. I have, quite deliberately, included some rather technical file structure material. Don't panic: it's there to provide both an anchor point for the things I want to discuss and some reference material. If you are already MIDI-literate you will be able to get an overall picture of the MIDI file arena in as much depth as you want. If you've only just been introduced to MIDI then don't get paranoid about the fact that there's so much to learn. We've all gone through the same stages, so just absorb what you can and then file the article away until you get to the point where you need to know a bit more about the subject!
MIDI Files, as I've already mentioned, are designed to allow MIDI data (ie. your sequences, track data and songs) to be stored in a standardised way. Such portability magic, providing a standard file arrangement is available, should in theory be fairly straightforward, but problems can arise even before you get anywhere near the data itself. The first snag of course is the physical size of the disks you are using. Some dedicated hardware sequencers use 3.5" disks and personal computers use either 5.25" or 3.5" disks. If you can't physically put the source disk into the new computer or sequencer you are using, you'll be stuck before you even start.
Equally important, however, is the disk format, which dictates the methods and arrangements by which data files are physically stored on the disk. In order to easily copy a computer file from one disk to another, or to read it on a new computer, it should be pretty obvious that the disk containing the file needs to be the same physical size and be of a compatible format. (Some PC-compatible computers have both 5.25" and 3.5" disk drives, but these aren't that common amongst MIDI users).
The two early solutions to this problem were to resort to copying the MIDI data in real time (ie. by connecting the computers/sequencers via a MIDI lead and playing one sequencer whilst recording with the other) or, if you were using computer-based sequencers, transferring the data files using the computer serial ports. Linking two computers via their serial port connections is not hard, but it's beyond the scope of this article: if you ever need to adopt this approach, then either find yourself a friendly computer freak, or use one of the many disk duplication companies who will copy your data onto new size/format disks for a (usually reasonable) fee.
Nowadays disk drives are quite flexible devices, and many computers can read and write more than one type of disk format, which has eased many file transfer problems. Atari STs, for instance, can read and write to 3.5" PC (MS-DOS) disks. With a utility like Consultron's CrossDOS so can the Commodore Amiga, and it's therefore relatively easy to transfer MIDI Files between PC, ST and Amiga-based sequencers using 3.5" disks — you just shove the alien disk into the drive, and your software will read it just as if it were a disk which had been written by the machine itself.
The drives on all recent Macs can also read PC disks, via Apple File Exchange software. You can also use this software to initialise a disk in MS-DOS format, pop it into an Atari drive, save a file from the Atari, and then read that file on the Mac. So 720k PC disks can be read by PCs, STs, Amigas and Macs, providing a fairly universal interchange medium.
It's important to realise, however, that none of the above jiggery pokery is specific to MIDI Files as such — they are steps which would be taken to transfer any type of computer file (text, sound samples, graphics etc.).
Before talking about some of the snags that can occur, I'm afraid there is yet another technical hurdle to cope with. Just as it is not possible to discuss MIDI without having some idea of the types of MIDI messages which exist, and the information they hold, so it isn't possible to discuss MIDI files unless you know a bit about how they are built and the events they can contain.
A MIDI File, just like any other computer file, consists basically of a series of bytes. The MIDI file standard specifies the interpretation and arrangement of those bytes. At the highest level MIDI Files consist of identifiable blocks of data called 'chunks'. Each chunk consists of a 4-character identifier followed by a 32-bit number which specifies the byte-length of the data held in the chunk, ie. all chunks adopt this type of arrangement:
| <chunk-identifier> | <chunk-size> | <actual chunk data> |
| 4 bytes | 4 bytes | chunk-size bytes |
Only two types of chunks are currently defined: Header chunks, which have a 'MThd' identifier, and Track chunks, which have a 'MTrk' identifier. It is, however, highly likely that new chunk types will come into existence in years to come, so any programs which read MIDI Files have to assume that they will, one day, come across chunks which they cannot interpret. The idea then, if you were writing your own software, would be to write programs which looked at the chunk identifiers and skipped over any chunks that couldn't be recognised.
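The 'look at the identifier and skip what you don't recognise' idea can be sketched in a few lines of Python. This is only an illustration, not part of any particular sequencer: the function names and the 'XYZ1' chunk are invented for the example, and it assumes the 32-bit length is stored most significant byte first (which is how the published standard stores it, although the layout above doesn't say so).

```python
import io
import struct

KNOWN = {b"MThd", b"MTrk"}  # the only two chunk types currently defined


def iter_chunks(stream):
    """Yield (identifier, data) for every chunk in a binary stream."""
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return  # end of file (or a truncated chunk header)
        # 4-character identifier, then a 32-bit length (assumed big-endian)
        ident, size = struct.unpack(">4sI", head)
        yield ident, stream.read(size)


def known_chunks(stream):
    """Keep the chunks we understand; silently skip any others."""
    return [(ident, data) for ident, data in iter_chunks(stream) if ident in KNOWN]


# A tiny hand-built file: a header, a made-up 'XYZ1' chunk, and one track.
demo = (
    b"MThd" + struct.pack(">I", 6) + struct.pack(">HHH", 0, 1, 96)
    + b"XYZ1" + struct.pack(">I", 3) + b"abc"
    + b"MTrk" + struct.pack(">I", 4) + b"\x00\xff\x2f\x00"
)
chunks = known_chunks(io.BytesIO(demo))
print([ident for ident, _ in chunks])
```

Because every chunk carries its own length, the reader never needs to understand the contents of an unknown chunk in order to step over it.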
At the moment the two chunk types (Header and Track chunks) can be arranged in three ways and these lead to the three types of MIDI files:
• Format 0 type files contain a Header chunk followed by a single Track chunk. It's the simplest and most portable of all the MIDI file arrangements, and is used for storing a sequence as a single stream of events.
• Format 1 type files allow multiple simultaneous Track sequences to be handled. These files will contain a Header chunk followed by any number of separate Track chunks.
• Format 2 files allow sets of independent sequences to be stored. A sequencer might save the individual sequences (verse, chorus etc.) which make up a complete song as a single Format 2 type MIDI file.
The MIDI File standard guarantees that all MIDI files will start with a Header chunk, and that even if this Header chunk is extended, existing fields will not be re-arranged. Programs can therefore assume that even though they may find Header chunks larger than they anticipated, the fields defined to date will remain in the same relative positions.
As I've said, the 'MThd' Header chunk is always the first chunk in a MIDI file. Like all chunks it starts with the identifier, followed by four bytes which specify the chunk's size. Current header chunks have six bytes of data, arranged as three 16-bit words: the first word gives the file format (0, 1, or 2); the second word tells you how many track chunks are present in the file; and the last word contains timing division information.
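As a rough Python sketch, the six header data bytes can be unpacked as three 16-bit words. The function name and dictionary keys are my own invention, and big-endian byte order is assumed. Only the first six bytes are read, so a longer header from some future revision would still be handled. The division word is decoded only when its top bit is zero, the case explained in the next paragraph; otherwise it is passed back untouched.

```python
import struct


def parse_header(data):
    """Decode the data portion of an 'MThd' chunk (hypothetical helper)."""
    # Three 16-bit words; any extra bytes beyond the first six are ignored.
    file_format, track_count, division = struct.unpack(">HHH", data[:6])
    info = {"format": file_format, "tracks": track_count}
    if division & 0x8000 == 0:
        # Bit 15 clear: bits 14-0 give the ticks per crotchet.
        info["ticks_per_crotchet"] = division & 0x7FFF
    else:
        # Bit 15 set: a different interpretation applies; left raw here.
        info["division_raw"] = division
    return info


header = parse_header(struct.pack(">HHH", 1, 3, 480))
print(header)
```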
The only tricky item to interpret is the 'division' field, because its contents and format may vary. If bit 15, ie. the most significant bit, is zero then bits 14-0 give a 15-bit number which specifies how many delta time ticks (see below) make up a crotchet. (If bit 15 is set, the field instead holds SMPTE-based timing information.) The bottom line as far as these header chunks are concerned is simple: they provide the software reading the file with some indication of the MIDI data to come.
Track chunks are the file sub-sections which hold the real file data. They consist of a 4-byte Track chunk identifier 'MTrk', a 32-bit length field which identifies how much data the chunk contains, and one or more MTrk 'events'. The events themselves take a standardised form which starts with a time field that specifies the amount of time which should pass before the specified event occurs. These time fields, which are incidentally an integral part of the 'grammatical description' (syntax) of a MIDI file event, are called 'delta times'.
Like several other MIDI File items, delta times are stored using a variable length format containing seven real bits per byte. The most significant bit (bit 7) is used to indicate either the continuation, or the end, of the possibly multi-byte number.
Why such a complicated arrangement? It's simply to save space. Using this number form, inter-event times which are less than 128 (ie. the majority of delta times) can be stored using just a single byte. The number 127, for example, can be stored simply as binary 0111 1111. Once the time value gets above 127, more bytes are needed to store the number.
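For anyone who wants to try this out, here is a small Python sketch of the variable length format. The function names are mine. It assumes the convention used by the published standard: the most significant 7-bit group comes first, and bit 7 is set on every byte except the last.

```python
def encode_varlen(value):
    """Encode a non-negative integer as a variable length quantity."""
    if value < 0:
        raise ValueError("variable length numbers cannot be negative")
    groups = [value & 0x7F]          # final byte: bit 7 clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: bit 7 set
        value >>= 7
    return bytes(reversed(groups))   # most significant group first


def decode_varlen(data, pos=0):
    """Return (value, next_position) for the number starting at pos."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80 == 0:         # bit 7 clear marks the last byte
            return value, pos


print(encode_varlen(127).hex())  # 7f
print(encode_varlen(128).hex())  # 8100
```

So 127 fits in one byte, while 128 needs two: the first byte carries the upper bits with its continuation flag set, and the second carries the remaining seven bits.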
MIDI File events themselves can be one of three types: a MIDI event, a SysEx event, or something called a Meta event. MIDI events should already be familiar to you so I'll deal with these first.
Nowadays these are defined as being any MIDI channel message. This implies that MIDI Files can only contain channel voice or channel mode MIDI messages. In this respect the MIDI File standard would seem to have changed because, as far as I am aware, the early (pre-IMA) standard used to allow storage of both channel messages and system common messages.
Normal SysEx messages use a modified form which includes an additional length, ie. byte-count, field (stored as a variable length number):
<F0 hex> <length> <data bytes>
If the SysEx message is sent as a single packet then the last data byte should be the EOX (F7 hex) SysEx terminator. This may appear to be unnecessary since a SysEx message length field is also included. In the original MIDI File standard it was indeed unnecessary and the terminal F7 byte was not required. The reason that the F7 terminator has been re-introduced is that a new MIDI File SysEx message has been devised which allows large SysEx messages to be broken up into time-stamped packets.
The new message actually starts with the F7 hex terminator and takes this general form:
<F7> <length> <data bytes>
If a program wants to split a SysEx message into time-stamped packets it does it by using the F0 form for the first data packet, and F7 forms for any subsequent packets. The last data byte of the last packet is then a 'real' terminal F7 hex byte.
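To show how a reader might put such packets back together, here is a hedged Python sketch. It assumes the packets have already been pulled out of the track as (status, data) pairs with the length field stripped; the function name is invented, and the sketch simply ignores an F7 packet that turns up when no message has been started.

```python
def join_sysex_packets(packets):
    """Rebuild complete SysEx messages from F0/F7 file packets."""
    messages = []
    current = None  # message being assembled, or None if none is open
    for status, data in packets:
        if status == 0xF0:
            # The F0 form starts a message; the F0 itself is not in the data.
            current = bytearray([0xF0]) + data
        elif status == 0xF7 and current is not None:
            current += data          # continuation packet
        else:
            continue                 # stray F7 packet: ignored in this sketch
        if current[-1] == 0xF7:      # a 'real' terminator ends the message
            messages.append(bytes(current))
            current = None
    return messages


single = [(0xF0, bytes([0x7E, 0x7F, 0x09, 0x01, 0xF7]))]
split = [
    (0xF0, bytes([0x7E, 0x7F])),
    (0xF7, bytes([0x09, 0x01])),
    (0xF7, bytes([0xF7])),
]
print(join_sysex_packets(single) == join_sysex_packets(split))  # True
```

Both forms produce the same bytes for transmission; the split version merely lets each part carry its own delta time in the file.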
The current MIDI File standard supports a number of 'non-MIDI' events known as Meta Events. All of these start with an FF hex byte as the primary Meta Event identifier, and this is followed by a Meta Event 'type' field, a byte count, and finally the data itself:
<FF hex> <Meta Event type> <length> <data bytes>
The Meta Event type field is a 1-byte value between 0 and 127 and the length field is stored in the same variable length format as is used for delta time values. In a sense the type field byte performs the same job as a MIDI status byte but it is of course being used to classify an event type, not a MIDI message type.
The above table provides a summary of some of the meta events currently defined.
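The Meta Event layout can be read with a few lines of Python. This is a sketch with invented function names. The two example events, End Of Track (type 2F hex, no data) and Set Tempo (type 51 hex, three data bytes giving microseconds per crotchet), use type numbers taken from the published standard.

```python
def read_varlen(data, pos):
    """Return (value, next_position) for a variable length number."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80 == 0:
            return value, pos


def parse_meta_event(data, pos=0):
    """Return (type, payload, next_position) for the Meta Event at pos."""
    if data[pos] != 0xFF:
        raise ValueError("not a Meta Event: missing FF identifier")
    meta_type = data[pos + 1]                  # 1-byte type, 0 to 127
    length, pos = read_varlen(data, pos + 2)   # byte count, variable length
    payload = data[pos:pos + length]
    return meta_type, payload, pos + length


end_of_track = parse_meta_event(bytes([0xFF, 0x2F, 0x00]))
tempo_type, tempo_data, _ = parse_meta_event(bytes([0xFF, 0x51, 0x03, 0x07, 0xA1, 0x20]))
print(end_of_track)                        # (47, b'', 3)
print(int.from_bytes(tempo_data, "big"))   # 500000
```

A reader that meets a type number it doesn't know can use the byte count to step over the data, in much the same way as unknown chunks are skipped.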
That then is the basis of the current MIDI File standard as adopted by the International MIDI Association. It's worth mentioning that running status (ie. the use of implied status bytes) is also allowed within a stream of MIDI events, but this must not be carried across non-channel events. If a stream of running status MIDI messages is interrupted by one or more Meta or SysEx events then a new status byte must be present in the first of any MIDI messages which follow.
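The running status rule is easiest to see in a small track reader. The Python sketch below is illustrative only, with names of my own choosing: it walks the data portion of a track chunk, remembers the last channel status byte, and forgets it whenever a Meta or SysEx event goes by. The data byte counts per channel message (one for program change and channel pressure, two for the rest) come from the MIDI specification itself.

```python
def read_varlen(data, pos):
    """Return (value, next_position) for a variable length number."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80 == 0:
            return value, pos


def data_length(status):
    """Number of data bytes that follow a channel message status byte."""
    return 1 if (status & 0xF0) in (0xC0, 0xD0) else 2


def parse_track(data):
    """Return a list of (delta_time, kind, bytes) for one track's data."""
    events, pos, running = [], 0, None
    while pos < len(data):
        delta, pos = read_varlen(data, pos)
        first = data[pos]
        if first == 0xFF:                       # Meta Event
            meta_type = data[pos + 1]
            length, pos = read_varlen(data, pos + 2)
            events.append((delta, "meta", bytes([meta_type]) + data[pos:pos + length]))
            pos += length
            running = None                      # running status is cancelled
        elif first in (0xF0, 0xF7):             # SysEx packet
            length, pos = read_varlen(data, pos + 1)
            events.append((delta, "sysex", data[pos:pos + length]))
            pos += length
            running = None                      # running status is cancelled
        elif first >= 0xF0:
            raise ValueError("unexpected system message in track data")
        else:                                   # channel message
            if first & 0x80:
                running = first                 # explicit status byte
                pos += 1
            elif running is None:
                raise ValueError("data byte with no status byte in force")
            n = data_length(running)
            events.append((delta, "midi", bytes([running]) + data[pos:pos + n]))
            pos += n
    return events


track = bytes([0x00, 0x90, 0x3C, 0x40,   # note on, full status byte
               0x60, 0x3C, 0x00,         # running status: 90 hex is implied
               0x00, 0xFF, 0x2F, 0x00])  # End Of Track meta event
for event in parse_track(track):
    print(event)
```

If a bare data byte appears straight after a Meta or SysEx event, this reader raises an error rather than guessing at the missing status byte.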
To my mind at least it is evident that several changes have occurred which may make MIDI Files written with early sequencer programs incompatible with sequencers designed to read and write the current standard. If nothing else this might be a useful warning for many people who seem to believe that early MIDI files are equivalent to the new Type 0 MIDI files. Whilst similar, they are clearly not identical: the End Of Track events, which used to be optional, must now always be present, and a number of other events have changed size or been redefined.
Despite odd hiccups, MIDI Files are well worth experimenting with and using. As well as ensuring that your own song data remains portable, there are other benefits to be had: MIDI songs, programmed by professional musicians, are now available from many sources, and whether your interest lies in classics or rock this is an up-and-coming avenue worth exploring. Sometimes you may have to do a bit of hacking to get the arrangement into a suitable form for your own MIDI gear (for example, you'll probably have to re-map the drum kit notes) but if you are into 'instant music', or just plain lazy or short of time, then canned MIDI file arrangements could be a godsend.
Perhaps the best news of late is that new MIDI File applications are emerging, in areas which are not strictly just 'sequencer A to sequencer B' type uses: Freestyle, for instance, the ST arranger program, uses MIDI files to store its styles and patterns. Once you've gained a bit of experience with the package you can customise Freestyle riffs and patterns and create new ones. Another area where MIDI Files are coming into their own is live playing. No-one wants sequencers on stage; it's bulletproof, dedicated on-stage playback units that are needed. The most versatile units (eg. the MIDITemp MP44) read MIDI Files, and I'm certain that over the next year more such units will appear.
Using MIDI Files, like MIDI itself, gets easier with practice, so get stuck in and have a go. You certainly don't need to understand the technical file details to use MIDI Files but, having said that, hopefully some of the 'inner details' I've covered will make the standard MIDI File seem just a little less like magic!
Feature by Paul Overaa