Protocol (Part 5)
Paul Overaa continues his series on the MIDI standard with a look at the MIDI File Format
With so many sequencers on the market the problem of transferring files from one to another has never been so acute - Paul Overaa gets to the nuts and bolts of this topic
In an ideal world all sequencers would store their information in a standardized way, so that you could read any sequencer's data file with any sequencer. There's no reason why such arrangements could not be adopted - other than the fact that different software houses and sequencer manufacturers are already committed to their own data formats and cannot afford to change (or have no real incentive to do so).
On the Amiga computer a data-file specification called the Interchange File Format (IFF) standard has proved very successful at producing portable files. You can, for example, create a picture using a drawing program and then incorporate that picture into a text-file being prepared with a word processor. You can do similar things with IFF sound samples and Amiga music programs. It's a great approach and it's all made possible because the various programs understand a common file structure. The IFF standard relies on people using blocks of data, called chunks, which are designed to be of general use, although private chunks, i.e. chunks of less general use, are allowed - programs can ignore them if they don't understand them! MIDI data doesn't have that degree of portability, and even Music X, the new Amiga sequencer, has opted for IFF definitions which are not of general use as a sequencer file standard.
Things on the MIDI front are not however quite as bad as they may seem: Moving data between sequencers is easy enough in real time... you just link two sequencers together via the MIDI IN and MIDI OUT terminals and play one sequencer whilst you record with the other. The only disadvantage with this method is that it takes a bit of time (even if you do speed the tempo up a bit). You do of course also need access to both sequencers simultaneously.
Another thing that has helped is the creation of an intermediate file standard called the MIDI File Specification. The idea is that software producers, in addition to their own chosen file formats, can either provide additional options for writing data files in a standard MIDI File Format (SMF) or provide utility programs which translate their own files into SMF form. The format has been adopted by Opcode's Sequencer 2.5, Intelligent Music's Jam Factory, Steinberg's Pro 24 and many others. The latest sequencer to provide support is Microillusions' Music X package, and the new Steinberg Amiga 24000 package, due out any time now, will also provide SMF support. As well as commercial offerings there's a growing number of public domain utilities which are using SMF files.
An SMF file consists of a series of events. Each event consists of two fields - a time, and the event itself. In the specification you'll see an event defined like this...
The delta-time is usually stored in 96ths of a beat, although as we'll see later it is possible to specify other clock resolutions.
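As a rough illustration of what a delta-time means in practice, the sketch below turns a run of delta-times into running totals and then into beats and seconds. It assumes the default resolution of 96 per beat mentioned above; the function names and the sample numbers are made up for the example and are not part of the specification.

```python
from itertools import accumulate

TICKS_PER_BEAT = 96  # default resolution described in the text


def ticks_to_beats(ticks, resolution=TICKS_PER_BEAT):
    """Convert a tick count to beats at the given resolution."""
    return ticks / resolution


def ticks_to_seconds(ticks, beats_per_minute, resolution=TICKS_PER_BEAT):
    """Convert a tick count to seconds at a fixed tempo."""
    return ticks_to_beats(ticks, resolution) * 60 / beats_per_minute


# Delta-times are gaps between events, so absolute positions are running sums.
deltas = [0, 48, 48, 96]            # hypothetical delta-times read from a file
positions = list(accumulate(deltas))

print(positions)                               # [0, 48, 96, 192]
print([ticks_to_beats(p) for p in positions])  # [0.0, 0.5, 1.0, 2.0]
print(ticks_to_seconds(96, 120))               # 0.5
```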
Because you're not involved with real-time MIDI data, serial port handling etc., it's quite an easy job to write your own programs for reading and writing SMF files. You just open a file, read characters from it, do some processing, and then write the modified file back to disk. If you're a programming whizzo you can use C or assembler etc., but it's not necessary to do this and for most people Basic is more than adequate. Before you can do anything useful with an SMF file you do however need to know a bit about the overall format, i.e. how MIDI messages are stored inside the file. The delta-times are a bit of a problem area so we'll have a look at the layout of these first.
They're stored in a variable length format containing 7 real bits per byte. The most significant bit (bit 7) is clear in the end byte of a number and set in every other byte, i.e.
0 *** ****
end byte of the number
Numbers which are less than 128 can be stored in a single byte using this arrangement, e.g. the number 127 is written as...
0 111 1111
We'll know that this byte is also the last byte because bit 7 is clear.
Once we get above 127 we need more bytes to store the number. The variable length format is a bit tricky so here's an example which will show you how such numbers are formed. Basically we take the binary form of the number, peel off bits seven at a time, and then push each group of seven bits into an eight bit (i.e. single byte) arrangement. Once a suitable number of bytes have been formed we then set the high bits (bit 7) in every byte except the last one. See figure A.
So, a delta-time of 128 will be found in the file as two bytes: 81 hex followed by 00 hex! Why is this method adopted? It's to save space. Most MIDI delta-times, which are usually the times between various MIDI events, are small numbers (less than 128) and this arrangement allows a single byte representation to be used for these delta-times. If a fixed format had been chosen it would have had to be a size suitable for ALL delta-times - the result would have been that all of the small value delta-times would probably have ended up being stored as four byte fixed length numbers, which would have meant a lot of wasted space.
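The peel-off-seven-bits procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the specification, and the function names encode_varlen and decode_varlen are invented for the example.

```python
def encode_varlen(value):
    """Encode a non-negative integer in the variable length format."""
    if value < 0:
        raise ValueError("value must be non-negative")
    groups = [value & 0x7F]                   # last byte: bit 7 stays clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: bit 7 set
        value >>= 7
    return bytes(reversed(groups))            # most significant group first


def decode_varlen(data, pos=0):
    """Decode one number starting at data[pos]; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)  # append the 7 real bits
        if not byte & 0x80:                   # bit 7 clear marks the end byte
            return value, pos


print(encode_varlen(127).hex())             # 7f
print(encode_varlen(128).hex())             # 8100
print(decode_varlen(bytes([0x81, 0x00])))   # (128, 2)
```

The decoder returns the position of the next unread byte as well as the value, which is handy because the reader never knows in advance how many bytes a number occupies.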
Now that we've taken care of the delta-time fields we can look at the events themselves. Three types of events occur in SMF files... ordinary MIDI events, SYSTEX events and META events.
Ordinary MIDI events can be any channel-voice or system common message, and running status, i.e. the use of implied status bytes, is allowed. These events are easily recognized in an SMF file because they're just like a normal MIDI message, e.g. the two byte program change 20 on channel 4 event (figure B) would appear in the SMF file as C3 hex followed by 14 hex!
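To see where C3 hex and 14 hex come from, here is a small sketch that builds and takes apart a program change message. The helper names are hypothetical; the only facts relied on are that the upper four bits of a status byte hold the message type (C hex for program change) and the lower four bits hold the channel, counted from 0 in the byte but from 1 in everyday use.

```python
PROGRAM_CHANGE = 0xC0  # message type held in the upper four bits of the status byte


def program_change(channel, program):
    """Build a two byte program change; channel is given as 1-16."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    if not 0 <= program <= 127:
        raise ValueError("program must be 0-127")
    return bytes([PROGRAM_CHANGE | (channel - 1), program])


def split_status(status):
    """Split a channel status byte into (message type, channel 1-16)."""
    return status & 0xF0, (status & 0x0F) + 1


message = program_change(4, 20)
print(message.hex())                 # c314
print(split_status(message[0]))      # (192, 4)
```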
SYSTEX events are slightly different to the normal transmitted MIDI form of F0 hex (data) F7 hex because a count value is included after the F0 identifying byte. The reason for this arrangement is that it allows a program reading the file to work out how big the message is without having to read all the data. Now for the bad news... the count is in the variable length form described earlier and you'll need to decode it even if you want to ignore the SYSTEX message because the terminal F7 byte is dropped, i.e. the format used is...
F0 (variable length count) (data)
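A reader that wants to ignore SYSTEX data can use the count to jump straight over it. The sketch below assumes the F0 byte has already been read from the file; the names read_varlen and read_systex are invented for the example, and the three data bytes in the test stream are arbitrary values, not a real manufacturer message.

```python
import io


def read_varlen(stream):
    """Read one variable length number from a binary stream."""
    value = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("file ended inside a variable length number")
        byte = chunk[0]
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:                 # bit 7 clear: this was the end byte
            return value


def read_systex(stream, keep=True):
    """Call after the F0 byte has been read; return the data, or None if skipped."""
    count = read_varlen(stream)
    if not keep:
        stream.seek(count, io.SEEK_CUR)  # hop over the data without reading it
        return None
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("file ended inside a SYSTEX event")
    return data


# F0, a count of 3, three arbitrary data bytes, then a program change event.
stream = io.BytesIO(bytes([0xF0, 0x03, 0x43, 0x12, 0x00, 0xC3, 0x14]))
assert stream.read(1) == b"\xf0"
read_systex(stream, keep=False)
print(stream.read(2).hex())             # c314
```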
META events carry various non-MIDI data items. All META events start with an FF hex followed by an event-type byte. Following that comes the length of the data (again in variable length form) and the data itself.
The standard distinguishes between header events, i.e. those which come at the start of the file, and other timed events which can occur anywhere. Currently only one header event is defined - the Beat Time Base event - and this has a type code of 08 hex. If such an event is not present then delta-times default to 96ths of a beat, i.e. the resolution is based on the standard MIDI clock. It's pretty obvious why an event such as this must come right at the start of the file - if a modified resolution was being used you'd need to know it before you could interpret any of the other events in the file.
Three other META events are also defined; here are the necessary details in hex form...
Time Signature FF 58 02 nn dd
The bytes nn and dd hold the time signature. The denominator dd is expressed as a power of 2, so the META event read as FF 58 02 03 02 would indicate 3 over 2 squared, i.e. 3/4 timing.
Set Tempo FF 50 02 tt tt
The tempo tttt is defined in 1/128ths of a beat per minute. If this event occurs at the start of a file it defines the starting tempo.
End Of Track FF 2F 00
The use of this event is optional. If it's not present then the end of the track is based on the last event of the file.
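The three META events listed in this article can be picked apart along the following lines. This is a sketch built only on the byte layouts given here (FF, type, variable length count, data); the function names are hypothetical, and the assumption that the two tempo bytes are stored most significant byte first is mine, not something stated in the text.

```python
def read_varlen(data, pos):
    """Decode a variable length number at data[pos]; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos


def parse_meta(data, pos=0):
    """Parse one META event starting at its FF byte; return (info, next_pos)."""
    if data[pos] != 0xFF:
        raise ValueError("not a META event")
    event_type = data[pos + 1]
    length, pos = read_varlen(data, pos + 2)
    body = data[pos:pos + length]
    pos += length
    if event_type == 0x58 and length == 2:
        numerator, power = body
        info = ("time signature", numerator, 2 ** power)  # dd is a power of 2
    elif event_type == 0x50 and length == 2:
        raw = (body[0] << 8) | body[1]      # ASSUMPTION: high byte comes first
        info = ("set tempo", raw / 128)     # stored in 1/128ths of a beat per minute
    elif event_type == 0x2F and length == 0:
        info = ("end of track",)
    else:
        info = ("unrecognised", event_type, bytes(body))  # safe to skip
    return info, pos


print(parse_meta(bytes.fromhex("ff58020302")))  # (('time signature', 3, 4), 5)
print(parse_meta(bytes.fromhex("ff50023c00")))  # (('set tempo', 120.0), 5)
print(parse_meta(bytes.fromhex("ff2f00")))      # (('end of track',), 3)
```

Because the length is always present, an event type the program doesn't know about can still be stepped over cleanly.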
That then is the layout of the Standard MIDI File. A growing number of sequencers are providing facilities for reading and writing these files and there are public domain programs scattered around for doing things like KCS <-> MIDI File conversion etc. Standard MIDI Files (or SMF files as they're more commonly known) are simple serial files which are easily handled. We can't get too involved with the programming aspects here, but the following guidelines might help a bit...
Essentially the files consist of a series of events each having an associated delta-time which you can decode! The times correspond to the time gaps between events and in the default resolution are a number representing the MIDI clocks which have elapsed since the previous event. When you read an event there are four possibilities:
1: It will be a normal MIDI event starting with a status byte
2: It will be a MIDI event with an implied status byte, i.e. running status is being used and the first byte you see will be the 0 - 7F hex data byte.
3: It's a SYSTEX event starting with F0.
4: It's a META event starting with FF hex.
To identify the event type look at bit 7 of the first byte of the message. If it's not set then you have a running status message to deal with. If it is set then examine the whole of the first byte - there will be three possibilities....
a: First byte is F0 hex signifying a SYSTEX message.
b: First byte is FF hex signifying a META event.
c: If the first byte is neither of the above the event will be an ordinary MIDI message (i.e. one with a status byte).
Once you know what type of event you are dealing with you can process or skip over it as you see fit!
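Putting the guidelines above together, a reader loop might look something like the sketch below. It is an illustration under stated assumptions rather than a complete SMF reader: it works on a bytes object already holding the event data, it only knows the data lengths of channel-voice messages (system common messages raise an error), and it leaves the remembered status byte untouched after a SYSTEX or META event, a point this article does not cover. All names are invented for the example.

```python
# Number of data bytes that follow each channel-voice status type.
DATA_BYTES = {0x80: 2, 0x90: 2, 0xA0: 2, 0xB0: 2, 0xC0: 1, 0xD0: 1, 0xE0: 2}


def read_varlen(data, pos):
    """Decode a variable length number at data[pos]; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:
            return value, pos


def classify(first_byte):
    """Apply the bit 7 / F0 / FF tests to the first byte of an event."""
    if first_byte < 0x80:
        return "running status"
    if first_byte == 0xF0:
        return "systex"
    if first_byte == 0xFF:
        return "meta"
    return "midi"


def walk_events(data):
    """Return a list of (delta_time, kind, body) tuples for every event in data."""
    pos, status, events = 0, None, []
    while pos < len(data):
        delta, pos = read_varlen(data, pos)
        kind = classify(data[pos])
        if kind == "systex":
            count, pos = read_varlen(data, pos + 1)
            body = data[pos:pos + count]
            pos += count
        elif kind == "meta":
            event_type = data[pos + 1]
            count, pos = read_varlen(data, pos + 2)
            body = bytes([event_type]) + data[pos:pos + count]
            pos += count
        else:
            if kind == "midi":
                status = data[pos]          # remember it for running status
                pos += 1
            elif status is None:
                raise ValueError("running status with no earlier status byte")
            count = DATA_BYTES.get(status & 0xF0)
            if count is None:
                raise ValueError("system common messages not handled in this sketch")
            body = bytes([status]) + data[pos:pos + count]
            pos += count
        events.append((delta, kind, body))
    return events


# Program change, note on, a running status note on, a SYSTEX, end of track.
sample = bytes.fromhex("00c314" "60903c40" "603c00" "00f0020102" "00ff2f00")
for delta, kind, body in walk_events(sample):
    print(delta, kind, body.hex())
```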
Many thanks to EVENLODE Soundworks for supplying details of the MIDI File specification.
Feature by Paul Overaa