Everything You Ever Wanted To Know About System Exclusive (Part 9)
(But Were Too Afraid To Ask!)
PART 9: Martin Russ continues his mammoth in-depth analysis of MIDI's SysEx codes with an explanation of 'checksums'.
As the official DX7 Voice Librarian for the Yamaha X Club, one of my assigned tasks has been to distribute the club voices to members and any other interested people. This was done originally for the BBC computer, and more recently for the Atari ST computer.
Yamaha's aim with these public domain voices is to make them widely available to any user of Yamaha 6-operator FM synthesizers (if only all manufacturers adopted the same policy!). The result is that the Yamaha X Club 6-operator voice library is available from SOS Software (see page 38). There are currently two 3.5" disks, each containing about 50 sets of 32 voices, making 3,200 FM voices in all. Included on each disk is a simple GEM program, which allows the voice data files to be transferred to any 6-operator Yamaha synthesizer. As more voices become available I will add further disks as necessary, if there is sufficient interest.
You are probably wondering why I have started by talking about distributing voices for synthesizers, and as usual there is a very good reason. I have discussed many aspects of using System Exclusive in this series, but have so far avoided any thoughts on the actual implementation of storage. When I described the Movie program, I briefly mentioned compatibility with other programs but have omitted any details so far. As often happens, following one line of enquiry leads nicely to another, and in this case, looking at the formats you can use to save System Exclusive information leads to some very useful information about 'checksums'.
Having mentioned the DX7 already, now seems like a good time to talk about Yamaha's standard 6-operator 32-voice bulk dumps, as found in the voice files on the SOS Shareware disks. This will form the background information for what follows; bear in mind that the ways in which you can store such dumps are probably common to both computer and hardware systems, irrespective of the piece of equipment supplying the System Exclusive data.
The format for these bulk dumps is the same as for any System Exclusive message, and follows Yamaha's original standard format for such transfers, as established by the first generation of DX7s. (A newer 'universal' bulk dump format is also used by the DX7 MkII and other second generation FM instruments, but all 6-operator FM synths will correctly recognise and respond to the original format dumps.) The basic format looks like this:
$F0 start of System Exclusive
$43 Yamaha Identification
$00 Sub-status and channel (ssss nnnn, where n = MIDI channel)
$09 Format number
$20 Byte count MSB
$00 Byte count LSB (two 7-bit values: $20 x 128 + $00 = 4096)
... first data byte
... last data byte
$XX checksum of data bytes
$F7 end of System Exclusive
Six bytes are added before the voice data and two bytes after it. These are called the 'header' and 'tail' of the bulk dump. The six header bytes indicate the start of System Exclusive and the manufacturer (Yamaha, in this example), whilst the two tail bytes contain just the checksum and the end of Exclusive message. (Checksums are explained in more detail in a separate panel.) The total size of the System Exclusive bulk dump is thus 4096+6+2 = 4104 bytes. The important thing to observe here is that the header is unlikely to change from one dump to another - the status, format and byte counts are fixed. The $F0, $43 and $F7 bytes are also common to all Yamaha System Exclusive messages.
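The header, data and tail structure can be sketched in a few lines of Python. This is only an illustration and not one of the ST programs from this series: the function name split_bulk_dump is made up for the example, and the test dump is 4096 zero bytes rather than real voice data. Note how the byte count is rebuilt from two 7-bit values.

```python
SYSEX_START = 0xF0
SYSEX_END = 0xF7
YAMAHA_ID = 0x43


def split_bulk_dump(image: bytes):
    """Split a bulk dump image into (header, data, tail).

    Hypothetical helper: assumes the six-byte header and two-byte tail
    described in the text.
    """
    header, body = image[:6], image[6:]
    if len(header) < 6 or header[0] != SYSEX_START or header[1] != YAMAHA_ID:
        raise ValueError("not a Yamaha System Exclusive message")
    # The byte count travels as two 7-bit values, MSB first.
    count = (header[4] << 7) | header[5]
    data, tail = body[:count], body[count:]
    if len(data) != count or len(tail) != 2 or tail[1] != SYSEX_END:
        raise ValueError("length does not match the byte count in the header")
    return header, data, tail


# Made-up dump: 4096 zero data bytes, a placeholder checksum byte, then $F7.
image = bytes([0xF0, 0x43, 0x00, 0x09, 0x20, 0x00]) + bytes(4096) + bytes([0x00, 0xF7])
header, data, tail = split_bulk_dump(image)
print(len(header), len(data), len(tail), len(image))  # 6 4096 2 4104
```

The sketch checks only the framing; it does not verify the checksum byte.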
The data bytes themselves are organised in a similar way to the general outline explained last month: a list of the parameters for the operators used in each sound are arranged in a fixed order. This time we do not need to know the details of the organisation of the bytes, only the structure of the message into header, data and tail.
There are basically two different and incompatible ways of storing these 6-operator FM voice dumps (and many other bulk dumps as well) in a computer. They are called by a wide variety of names, but I will use just two: 'Headerless' and 'Image'.
Headerless files consist of just the voice data for each of the 32 voices in a 6-operator bulk dump. The System Exclusive header and tail information is not stored, since it can easily be added for re-transmission, and omitting it saves storage space. The checksum is also left out, since it is relatively easy to calculate (see 'Checksum' panel). The file size reflects the sum of 32 voices, each of 128 bytes: ie. 4096 bytes.
Removing just eight bytes from a total of 4104 may not seem like a worthwhile gain, as 513 lots of eight bytes need to be saved in order to equal just one extra voice dump. 513x32 equals 16,416 voices, which is more than I have in my entire collection, and I would need just over two megabytes of storage to contain them!
In practice, because of the way that most storage is designed, it is usually advantageous to have files of a particular length which matches or is slightly less than the basic unit of storage.
As an example, suppose that we store the 4096 bytes in a system which utilises basic storage blocks of 1024 bytes. 4096 bytes would occupy exactly four blocks, whilst 4104 bytes would require five - wasting most of the fifth block. This wasted storage space is not usable, because the minimum addressable storage block is 1024 bytes. Four image files of 4104 bytes therefore occupy 20 blocks, where four headerless files would need only 16: the four extra blocks would have held a fifth set of 4096 bytes! Losing 1/5th of the storage capacity is far more serious than the eight bytes per file suggest, and minimising the size of the file you store goes some of the way towards minimising such losses.
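The block arithmetic is easy to check for yourself. The following Python sketch assumes the 1024-byte block size used in the example above (real disk formats vary), and blocks_needed is a name invented for the illustration.

```python
def blocks_needed(file_size: int, block_size: int = 1024) -> int:
    """Number of whole storage blocks a file of file_size bytes occupies."""
    return -(-file_size // block_size)  # ceiling division


for name, size in (("headerless", 4096), ("image", 4104)):
    blocks = blocks_needed(size)
    wasted = blocks * 1024 - size
    print(f"{name}: {blocks} blocks, {wasted} bytes wasted")
# headerless: 4 blocks, 0 bytes wasted
# image: 5 blocks, 1016 bytes wasted

# Four files of each kind: 16 blocks against 20.
print(4 * blocks_needed(4096), 4 * blocks_needed(4104))  # 16 20
```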
Image files are so called because they are an exact image of the MIDI data which was originally sent - nothing is removed, and so they can be sent as they are to a suitable synthesizer. Thus the 4096 bytes of data have the header and tail intact, resulting in a total file size of 4104 bytes.
This is the file format used for programs like Steinberg Pro24 - .SND files are suitable for loading into it. Movie's .MVI files are also images, and so are compatible with any other image-based storage method.
At this point I would normally say that I have just finished writing a program to convert between the two types of file (headerless and image), but in fact, one of the first programs offered by SOS does the job already! It can only be used for converting between .DTX and .SND formats of DX7 bulk voice files. The MIDI Bulk Dump Convertor is available from SOS Shareware (Disk 07) and converts voice files from either format to the other, using a friendly user interface. I offer no prizes as to the author!
Converting between headerless and image formats for other pieces of equipment is only a matter of deciding which format it is (use ASCHEX to look at the contents of System Exclusive data files), and then supplying the header and tail, or removing them as appropriate.
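For the Yamaha 32-voice dump, the conversion might look something like the Python sketch below. It is not the code of the MIDI Bulk Dump Convertor. The function names are invented, the header is fixed at MIDI channel 1 ($00) as in the listing earlier, and the checksum rule is an assumption here because the Checksum panel is not reproduced: the commonly documented Yamaha rule is to take the low seven bits of the two's complement of the sum of the data bytes.

```python
HEADER = bytes([0xF0, 0x43, 0x00, 0x09, 0x20, 0x00])  # as in the listing above
DATA_SIZE = 4096


def yamaha_checksum(data: bytes) -> int:
    """Assumed rule: low 7 bits of the two's complement of the data sum."""
    return (-sum(data)) & 0x7F


def headerless_to_image(data: bytes) -> bytes:
    """Wrap 4096 bytes of voice data in the header and tail."""
    if len(data) != DATA_SIZE:
        raise ValueError("expected 4096 bytes of voice data")
    return HEADER + data + bytes([yamaha_checksum(data), 0xF7])


def image_to_headerless(image: bytes) -> bytes:
    """Strip the six header bytes and the two tail bytes."""
    if len(image) != DATA_SIZE + 8 or image[0] != 0xF0 or image[-1] != 0xF7:
        raise ValueError("not a 4104-byte System Exclusive image")
    return image[6:-2]


# Made-up voice data, just to exercise the round trip.
data = bytes(i % 128 for i in range(DATA_SIZE))
image = headerless_to_image(data)
print(len(image), image_to_headerless(image) == data)  # 4104 True
```

With this rule, adding the checksum to the sum of the data bytes always gives a result whose low seven bits are zero, which is what the receiving instrument would test.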
The above two file types are stored as raw bytes in memory or in disk files, and so printing them out on the screen or on paper produces meaningless rubbish or upsets the printer, because many of the values sent to the printer/screen represent control characters rather than printable characters. This can be cured either by using a special program to view the file contents, by converting the bytes to a hexadecimal or decimal representation, or by storing the file data in a printable form in the first place. The last of these is called 'ASCII format' and usually converts each byte into two hex characters followed by a space character. For example:
F0 43 00 09 20 00 32 43 54 72 44 00 00 12 23 34 76 45 00 00 46 F7
Some alternative schemes convert the data to three decimal characters with spaces:
240 067 000 009 032 000 004 057 013 064 078 068 045 000 002 102 247
In each case the resulting file is printable, although it is three or more times as long as the more compact but unprintable version. Superconductor, the ST sequencer program by Michtron (available from Microdeal), stores System Exclusive information in this format, enabling an ordinary word processor to be used to edit the information.
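Both printable forms are simple to produce and to read back. The Python sketch below uses invented function names and only illustrates the general idea. It makes no claim about the exact layout that Superconductor or any other program writes.

```python
def to_hex_text(data: bytes) -> str:
    """Two hex characters per byte, separated by spaces."""
    return " ".join(f"{b:02X}" for b in data)


def to_decimal_text(data: bytes) -> str:
    """Three decimal characters per byte, separated by spaces."""
    return " ".join(f"{b:03d}" for b in data)


def from_hex_text(text: str) -> bytes:
    return bytes(int(token, 16) for token in text.split())


def from_decimal_text(text: str) -> bytes:
    return bytes(int(token) for token in text.split())


message = bytes([0xF0, 0x43, 0x00, 0x09, 0x20, 0x00])
print(to_hex_text(message))      # F0 43 00 09 20 00
print(to_decimal_text(message))  # 240 067 000 009 032 000
print(from_hex_text("F0 43 00 09 20 00") == message)  # True
```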
After the revised versions of SYSEX which have appeared in the last two installments, it may not surprise you to learn that this month's software is yet another variation on the same utility. SXCHEK is almost exactly the same as SYSEX in its use, except that it has two screen areas in which to place values. The contents of the upper box are not included in the checksum, whilst the contents of the lower box are - this means that you can choose which parts of the message are to be in the header and which parts are to be in the data. The checksum byte is added to the message after the data and just before the $F7 end of Exclusive byte. SXCHEK also adds a few more manufacturer ID numbers to those found in SYSEX V0.3.
Since this part of the series has a panel which covers the topic of checksums, the practical content this time continues the editing theme of the last few episodes and gives an example of using checksums whilst editing.
Roland have a consistently high standard of documentation, but their MIDI System Exclusive information can be difficult to interpret because it splits the explanation into two parts: first a general discussion of the System Exclusive information for all Roland instruments, and then the specific details of the instrument in question. This means that you need to correlate two different sources together in order to figure out exactly what messages are needed. To help you, here is my interpretation of what the Roland U20 documentation means...
If we wish to alter parameters we only need to send messages to the U20, and so only one MIDI cable is needed: from the ST's Out to the U20's In. This is thus a 'One-Way Transfer', and Roland define two types of message which can be used:
- RQ1 is a Request for data message, and is used for interrogating the instrument about the current value of parameters. Since you need MIDI cables to send data back in response, this should not really be part of a 'one-way' system, but it is included because the implied use of a 'One-Way Transfer' is for the transmission of small amounts of data - like single parameters. For larger amounts of data, such as bulk dumps, the 'Handshake Transfer' mode would be used. We will look closely at handshaking in a future article.
- DT1 is a Data Transfer message, and is used to send information about parameter values to and from an instrument. DT1 messages are always less than 256 bytes long (remember that 'One-Way' transfers are for short messages). It can be sent from the Atari ST to the U20 to alter a parameter value, or it can be output from the U20 in response to an RQ1 message. The basic format of the DT1 message is:
$F0 start of System Exclusive
$41 Roland ID
$dd Device ID
$mm Model ID
$12 DT1 command
$a1 Address MSB
$a2 Address (middle byte)
$a3 Address LSB
$vv data value
$cc Checksum
$F7 end of System Exclusive
The Device ID is used to uniquely identify the instrument. This means that several U20s could utilise the same MIDI channel but could each be controlled separately. The ID is factory set to 17 for the U20 and is included within messages in the standard MIDI channel form, ie. one less than the number shown. So the message byte will be 16 or $10. The Model ID is used to identify the type of instrument - the byte for the U20 is $2B. The $12 in the Command byte position shows that this is a DT1 message: a $11 here would indicate an RQ1 message.
The Address bytes are used to show which parameter inside the U20 is being accessed by the message - most Roland equipment uses three bytes for the address. The data value is always seven bits or less, and so occupies only a single un-nibbleised byte. The Checksum byte is used to ensure that the message has been correctly received by the U20. It is a number derived from the sum of the bytes sent in the message, such that the U20 can verify that the message has not been corrupted during transmission. Checksums are more normally found in bulk dumps where there are large numbers of bytes and errors might be encountered. In the case of short parameter edits, corruption is unlikely, and the need for a checksum has an unfortunate side-effect for last month's programs, as we shall see.
The U20 documentation shows the address of the parameter and other information in a series of tables, with almost no further explanation. For example:
Address    Parameter       Values (byte range : meaning)
10 01 25   Key Transpose   28...100 : -36...+36
10 01 29   Arpeggio Type   0...3 : Up, Down, Up&Down, Random
10 01 2A   Arpeggio Rate   0...127 : 0...127
10 01 2B   Transpose       0...1 : Off...On
10 01 2D   Arpeggio        0...1 : Off...On
10 10 19   Tone Number     0...127 : 1...128
60 00 03   Part Number     0...5 : 1...6
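Putting the pieces together, a DT1 message for one of these parameters could be assembled along the lines of the Python sketch below. The function names are invented, and this is not how SXCHEK works internally. The checksum rule is an assumption in that it is not spelled out above: Roland's usual method sums the address and data bytes and picks the checksum so that the low seven bits of the total are zero. The example sets Key Transpose to a value of 64, which the table maps to a transposition of 0.

```python
ROLAND_ID = 0x41
U20_MODEL_ID = 0x2B
DT1_COMMAND = 0x12


def roland_checksum(payload: bytes) -> int:
    """Assumed Roland rule: checksum + sum(address and data) has low 7 bits zero."""
    return (128 - sum(payload) % 128) % 128


def dt1_message(device_number: int, address: bytes, value: int) -> bytes:
    """Build a one-parameter DT1 message (hypothetical helper).

    device_number is the number as shown on the instrument (17 from the
    factory); the byte sent is one less, as explained in the text.
    """
    if len(address) != 3 or not 0 <= value <= 127:
        raise ValueError("need a three-byte address and a 7-bit value")
    payload = bytes(address) + bytes([value])
    head = bytes([0xF0, ROLAND_ID, device_number - 1, U20_MODEL_ID, DT1_COMMAND])
    return head + payload + bytes([roland_checksum(payload), 0xF7])


# Key Transpose lives at address 10 01 25; a value of 64 means no transposition.
msg = dt1_message(17, bytes([0x10, 0x01, 0x25]), 64)
print(" ".join(f"{b:02X}" for b in msg))
# F0 41 10 2B 12 10 01 25 40 0A F7
```

Here the address and data bytes add up to 118, so the checksum is 128 - 118 = 10, or $0A.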
Although the specific information described here applies only to 6-operator Yamaha synthesizers, the basic principles of headers, checksums and tails surrounding the data is the same for all System Exclusive bulk dumps. In particular, the methods for storing and displaying the files, as well as calculating the checksum, apply to all dumps.
So far, all the information on bulk dumps has avoided any detailed mention of 'handshaking'. This will be rectified in the next part of this series.
The programs mentioned are available on the SysEx Toolkit 1 and SysEx Toolkit 2 3.5" disks (Atari ST only) and cost £7 each inc. postage.
SOS Software, (Contact Details).
Feature by Martin Russ