The New Standard
MIDI Sample Dump Standard
Chris Meyer reports on a new system that could revolutionise the way we treat sound samples - the MIDI Sample Dump Standard - and uncovers its usage in the Prophet 2000 sampler.
Transferring sound samples between MIDI machines in digital form is now a reality, thanks to the Sample Dump Standard - SDS for short. We look at the development of SDS, and how it's implemented on the Prophet 2000 sampler.
THE MIDI SAMPLE Dump Standard was born during the development stages of Sequential's Prophet 2000 in May 1985. I was in charge of specifying the MIDI handler on the 2000, and the project manager came up to me wanting a way to transfer files between the Prophet 2000 and some custom development software he had written on an IBM XT personal computer.
At this time, the E-mu SP12 sampling drum machine had already been announced, and much splash was being made about it using a 12-bit linear data format - the same as we intended for the P2000. Knowing this, I wanted to support both Sequential's own internal sample dump protocol and also be able to receive samples from the SP12 (in the name of compatibility, but also to steal their samples as soon as possible to help get our library started). Unfortunately, the SP12 sample dump routines did not yet exist.
So, with the best poker face I could put on (Sequential hadn't yet announced the existence of the P2000), I approached Dave Rossum of E-mu in the name of trying to come up with a common Sample Dump Standard.
Starting with the in-house protocol I had created, Dave Rossum and I lashed together an initial proposal for the Sample Dump Standard in June/July 1985. Pushed to make a shipping deadline of September 1985 for the Prophet, I implemented it on the first version of the firmware as a trial run - much to the consternation of other manufacturers, since the "Standard" was still only a proposal and not yet fully approved.
With the experience of making the Prophet 2000 and the Sample Dump Standard work with early versions of Digidesign's Sound Designer software, and after discussion and consultation with other companies (including Octave-Plateau, JL Cooper, Ensoniq, SSL, After Science, E-mu and others), the Sample Dump proposal was updated to its current level.
It was approved by both the MMA (MIDI Manufacturers' Association) and the JMSC (Japanese MIDI Standards Committee) at Winter NAMM in January 1986.
THIS STANDARD IS the first set of protocols to use the new MIDI Universal System Exclusive area. This area is a set of three System Exclusive ID numbers set up by the MMA and JMSC, and approved at the same meeting in January 1986 as the Sample Dump Standard. Normally, a manufacturer owns their own System Exclusive ID number, and puts any messages or MIDI protocol specific to their devices behind it. The purpose of the Universal System Exclusive area is to have a set of messages agreed upon and used by a variety of manufacturers, as a way to extend the MIDI specification to handle common, more complex duties.
The SDS (Sample Dump Standard) was created to work in a simple or complex system. In a simple system, only one MIDI cable is connected from the master to the slave. The master sends out short bursts of information (called "packets"), pausing between each one to give the slave a chance to read it and accept the next one. There is no communication back from the slave to the master, so the code to dump (or receive) a sample is not much more difficult to write than a normal program dump routine.
The complex system is referred to as a "closed loop" system. In this, there is two-way communication between the master and the slave (two MIDI cables connecting the two together). The master sends the slave a packet (complete with an error-checking byte on the end), and the slave tells the master either to Wait, that there was an error with the packet and it's therefore refused it ("NAK", or Not Acknowledge), or that the last packet was received intact and it is ready for another ("ACK", or Acknowledge).
If the last packet was refused, the master has the option of resending the last packet, or ignoring the slave and sending the next one (the packets are numbered, so the slave can see if it's getting a new packet, or a copy of the old one).
Packets were held down in size to 127 bytes. MIDI buffers tend to come in sizes that are powers of 2, with 128 bytes being a common and more or less comfortable size. This helped prevent sending so much information in a string that the slave would overflow and miss some of it.
Small packets also allow regular checking of, and recovery from, errors. Almost everyone agrees that error checking is a theoretical must - but as an interesting aside, neither Sequential (with the SDS) nor Ensoniq (with their own protocol) have ever seen a case where a byte was dropped and an error has actually occurred as a result.
The data format for the SDS was chosen as linear. This is the simplest of formats, and seemingly the one of preference for higher-resolution applications. Every manufacturer is responsible for converting from their private, internal format into linear. This is much better than each slave, upon receiving a sample, being responsible for converting it from any of a variety of formats into its own - a master, instead, only has to convert from one.
For example, converting from formats such as eight-bit COMDAC to linear is easy - a 256-value look-up table is all that is needed, and there's no tricky internal arithmetic to be done. It was thought the minimum resolution worth supporting was eight-bit linear, with the maximum being the AES standard 24-bit linear. Since MIDI data likes to come in multiples of seven bits (7, 14, 21, 28, and so on), 28-bit linear was chosen as the upper limit.
The SDS also includes a way to specify which resolution is being used, and the slave is supposed to switch to that resolution. Therefore, each individual "sample" is encoded into 2, 3, or 4 MIDI data bytes. To simplify receiving routines, 120 bytes was chosen as the common length of data per packet (it's nicely divisible by the 2, 3 or 4 bytes it takes to make up a sample word).
Following the same philosophy of using data that was multiples of 7 bits, 21 bits was chosen to define the length of samples, along with the addresses of the loop points. This translates to a maximum length of 2,097,151 samples per sound. In the not-so-distant future, 2048K of sample memory may not seem as outrageous as it did then (or even now), but considering that's per sample, it should prove to be sufficient. Totally beside the point is that it would also take well over 23 minutes to transmit 2,097,151 samples of even eight-bit data via MIDI. So now you know.
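The 23-minute figure can be checked with a little arithmetic. The sketch below assumes the standard MIDI rate of 31.25 kbaud with ten bits on the wire per byte (3,125 bytes per second), an open loop dump made up entirely of full 127-byte packets, and no allowance for the header or the pauses between packets. The function name `dump_seconds` is made up for illustration.

```python
import math

MIDI_BYTES_PER_SECOND = 31_250 // 10   # 31.25 kbaud, 10 bits on the wire per byte
PACKET_BYTES = 127                     # one complete data packet, F0 through F7
DATA_BYTES = 120                       # sample data carried inside each packet

def dump_seconds(num_samples: int, bytes_per_word: int = 2) -> float:
    """Rough open loop transmit time, ignoring the header and all pauses."""
    words_per_packet = DATA_BYTES // bytes_per_word
    packets = math.ceil(num_samples / words_per_packet)
    return packets * PACKET_BYTES / MIDI_BYTES_PER_SECOND

max_samples = 2**21 - 1                # largest value a 21-bit length can hold
print(max_samples)                                  # 2097151
print(round(dump_seconds(max_samples) / 60, 1))     # 23.7 (minutes, 8-bit data)
```

Real dumps would take somewhat longer, since the estimate leaves out every pause and handshake.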
Deciding how to represent sample rate was more sticky. Representing it as simple cycles per second does not give sufficient resolution. For example, the difference between C0 and C#0 is only two cycles per second. To achieve sufficient resolution would require at least fiftieths of a cycle per second (giving 1 cent). A sample rate of 100kHz (a goal of Fairlight and Synclavier) translates to a number over 5 million - taking 23 bits to represent.
So, the sample rate was flipped upside-down and represented as seconds per sample (actually, nanoseconds). This gave exceptional range and resolution within 21 bits, and is actually better suited to some samplers, since they tend to think internally in terms of how long it is between samples - not how many samples to put out in a given length of time.
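As a rough illustration of the two schemes, the sketch below first counts the bits needed for 100kHz in fiftieths of a cycle per second, then encodes a sample period as three seven-bit bytes, LSB first. The 31,250Hz rate is an arbitrary example chosen because it divides evenly into nanoseconds, and the helper names are hypothetical.

```python
def to_7bit_lsb_first(value: int, count: int = 3) -> bytes:
    """Split value into `count` seven-bit MIDI data bytes, least significant first."""
    if not 0 <= value < 1 << (7 * count):
        raise ValueError("value does not fit in the field")
    return bytes((value >> (7 * i)) & 0x7F for i in range(count))

def from_7bit_lsb_first(data: bytes) -> int:
    """Rebuild the integer from seven-bit bytes sent least significant first."""
    return sum(byte << (7 * i) for i, byte in enumerate(data))

# Hertz-based scheme: 100 kHz counted in fiftieths of a cycle per second.
print((100_000 * 50).bit_length())       # 23

# Period-based scheme: 31,250 Hz is exactly 32,000 ns per sample.
period_ns = round(1e9 / 31_250)
encoded = to_7bit_lsb_first(period_ns)
print(encoded.hex(" "))                  # 00 7a 01
print(from_7bit_lsb_first(encoded))      # 32000
```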
Finally, room for defining up to 16,384 different samples and 128 different loop types was allowed. Loop points are the one current shortcoming of the SDS - only one set was provided for, and only two loop types were initially defined. However, I am happy to report that proposals are already in motion to remedy these two problems.
WHAT FOLLOWS IS a technical description of the SDS itself, given in the context of the implementation on the Prophet 2000, so as to get a sort of handle as to what a sampler would do with all these messages. Prior knowledge of MIDI is, of course, assumed.
F0 7E cc 03 ss ss F7
(cc = channel number)
(ss ss = sample requested, LSB first)
Upon receiving this Dump Request message, the Prophet checks "ss ss" to see if it is within legal range (00 00 - 0F 00, since the 2000 holds up to 16 samples). If it is, the requested sample becomes the Prophet's current sound number, and it is dumped to the requesting master following the standard outlined below. If it is not within range, the request is ignored.
The channel number was added to all Universal System Exclusive messages to accommodate multiple samplers in the same MIDI system. It allows up to 127 devices to be individually addressed (channel number 128 = "for everybody in the system"). The actual definition of what the channel number does came after the room was originally saved aside for it.
As a result of this timing, the P2000 properly selects the channel number, but ignores it. Luckily, this has yet to be a problem since in most applications there are only two devices - the master and the slave (but in the future ...who knows?). In the case of the Prophet 2000, on transmit, the channel reflects the Prophet's current MIDI base channel. Future devices will probably have a separate Device ID for this channel.
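A dump request is simple enough to build by hand. This minimal sketch assembles the message from the byte layout given above and applies the Prophet's range check; the function names are invented for the example, and nothing here actually talks to a MIDI port.

```python
def dump_request(channel: int, sample_number: int) -> bytes:
    """Build F0 7E cc 03 ss ss F7, with the sample number sent LSB first."""
    if not 0 <= sample_number < 1 << 14:
        raise ValueError("sample number must fit in 14 bits")
    lsb = sample_number & 0x7F
    msb = sample_number >> 7
    return bytes([0xF0, 0x7E, channel & 0x7F, 0x03, lsb, msb, 0xF7])

def prophet_accepts(message: bytes) -> bool:
    """Range check described in the text: the 2000 holds samples 0 to 15."""
    sample_number = message[4] | (message[5] << 7)
    return 0 <= sample_number <= 15

request = dump_request(channel=0, sample_number=15)
print(request.hex(" "))             # f0 7e 00 03 0f 00 f7
print(prophet_accepts(request))     # True
```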
F0 7E cc 7F pp F7
(cc = channel number)
(pp = packet number)
One of four handshaking flags, this one (ACK) means "Last data packet received OK; start sending next one". The packet number reflects which packet is being acknowledged as correct. This will be explained in context later on.
F0 7E cc 7E pp F7
(cc = channel number)
(pp = packet number)
Another of the four handshaking flags. This one (NAK) means "Last data packet received had an error; please resend". The packet number reflects which packet is being rejected. This too will be explained in context later.
F0 7E cc 7D pp F7
(cc = channel number)
(pp = packet number)
The third of our four handshaking flags. This one (Cancel) means "abort dump". The packet number reflects on which packet the 2000 decided to abort the dump. And you've guessed it: this will be explained in context later on.
F0 7E cc 7C pp F7
(cc = channel number)
(pp = packet number)
Fourth of the handshaking flags. This one (Wait) means "do not send any more packets until I have finished with this one". It helps systems where the slave (ie. terminal support computer) may need to perform other functions, such as disk access, before receiving the remainder of the dump.
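The four handshaking flags share one six-byte layout and differ only in the fourth byte, so a single pair of routines can build and recognise all of them. The sketch below is one way this might be done; the table and function names are hypothetical.

```python
SUB_IDS = {"ACK": 0x7F, "NAK": 0x7E, "Cancel": 0x7D, "Wait": 0x7C}
NAMES = {value: name for name, value in SUB_IDS.items()}

def handshake(name: str, channel: int, packet_number: int) -> bytes:
    """Build F0 7E cc <flag> pp F7 for one of the four handshaking flags."""
    return bytes([0xF0, 0x7E, channel & 0x7F, SUB_IDS[name],
                  packet_number & 0x7F, 0xF7])

def parse_handshake(message: bytes):
    """Return (name, channel, packet_number), or None if not a handshake."""
    if len(message) != 6:
        return None
    if message[0] != 0xF0 or message[1] != 0x7E or message[5] != 0xF7:
        return None
    name = NAMES.get(message[3])
    if name is None:
        return None
    return name, message[2], message[4]

print(handshake("ACK", 0, 0).hex(" "))           # f0 7e 00 7f 00 f7
print(parse_handshake(handshake("NAK", 0, 5)))   # ('NAK', 0, 5)
```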
F0 7E aa 01 bb bb cc dd dd dd ee ee ee ff ff ff gg gg gg hh F7
(aa = channel number)
(bb bb = sample number, LSB first)
(cc = sample format - significant bits, from 8-28)
(dd dd dd = sample period (1/sample rate), in nanoseconds, LSB first)
(ee ee ee = sample length, in words, LSB first)
(ff ff ff = sustain loop start point (word number), LSB first)
(gg gg gg = sustain loop end point (word number), LSB first)
(hh = loop type - 00 = forwards only; 01 = backwards/forwards)
F0 7E aa 02 ii [120 bytes] jj F7
(aa = channel number)
(ii = running packet count - 00-7F)
(jj = checksum - EXOR of previous 7E, aa, 02, ii, [120 bytes])
The total size of a data packet is 127 bytes. As mentioned earlier, this was an attempt not to overflow MIDI input buffers in machines that may want to receive an entire message before processing it.
You may notice that the data packet is enclosed as its own System Exclusive message. This was done for two reasons. First, if a byte was dropped in a packet without some way of telling when a packet started and stopped, it would not be detected until the next packet started - much too late. Second, there are devices such as the Roland MPU401, which expect each System Exclusive message to be wrapped with a typical F0/F7 in order to recognise it. If it was not, it would be very difficult for the MPU401 to handshake it.
The checksum includes the header (minus the "F0"), in keeping with other communication protocols. It was decided not to include the packet length in the header, to reduce needless complexity (the packet is always of a fixed length).
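Putting the data packet layout and the checksum rule together, a transmit routine might look something like this sketch. It assumes the 120 data bytes have already been prepared as seven-bit values; `data_packet` is an invented name.

```python
from functools import reduce
from operator import xor

def data_packet(channel: int, packet_number: int, data: bytes) -> bytes:
    """Build F0 7E aa 02 ii [120 bytes] jj F7 around 120 seven-bit data bytes."""
    if len(data) != 120:
        raise ValueError("a data packet always carries exactly 120 bytes")
    if any(byte > 0x7F for byte in data):
        raise ValueError("MIDI data bytes must have the top bit clear")
    body = bytes([0x7E, channel & 0x7F, 0x02, packet_number & 0x7F]) + data
    checksum = reduce(xor, body)          # EXOR of everything after the F0
    return b"\xF0" + body + bytes([checksum, 0xF7])

packet = data_packet(channel=0, packet_number=0, data=bytes(120))
print(len(packet))             # 127
print(hex(packet[-2]))         # 0x7c (7E EXOR 02, all other bytes being zero)
```

Since every byte being EXORed has its top bit clear, the checksum is itself always a legal MIDI data byte.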
Once a dump has been requested, either over MIDI or from the front panel, the Header is sent. As mentioned earlier, the channel number equals the 2000's base channel on transmit, and is ignored on reception. The sample number reflects the current sound number selected on the Prophet (00 00 - 0F 00). The sample format in the 2000 is 12 (0CH, for 12-bit linear - again, as mentioned earlier, all samples dumped via the SDS are encoded in linear format). Sample period for the Prophet 2000 follows Table 1.
Sample length and the sustain loop start and end points are in words, with the first word being called word #00 00 00. Since the 2000 allows only forwards-only sustain loops, the loop type for it is always = 00.
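To see how the header fields fit together, here is a sketch that assembles a header the way a Prophet-like device might. The 12-bit format and forwards-only loop type come from the text; the sample number, period, length and loop points are invented values for illustration only, and the helper names are hypothetical.

```python
def seven_bit(value: int, count: int) -> list:
    """Split value into `count` seven-bit bytes, least significant first."""
    return [(value >> (7 * i)) & 0x7F for i in range(count)]

def dump_header(channel, sample_number, bits, period_ns,
                length, loop_start, loop_end, loop_type=0x00) -> bytes:
    """Build F0 7E aa 01 bb bb cc dd dd dd ee ee ee ff ff ff gg gg gg hh F7."""
    body = [0x7E, channel & 0x7F, 0x01]
    body += seven_bit(sample_number, 2)   # bb bb
    body += [bits & 0x7F]                 # cc: significant bits, 8 to 28
    body += seven_bit(period_ns, 3)       # dd dd dd
    body += seven_bit(length, 3)          # ee ee ee, in words
    body += seven_bit(loop_start, 3)      # ff ff ff
    body += seven_bit(loop_end, 3)        # gg gg gg
    body += [loop_type & 0x7F]            # hh: 00 = forwards only
    return bytes([0xF0] + body + [0xF7])

header = dump_header(channel=0, sample_number=3, bits=12, period_ns=32_000,
                     length=1000, loop_start=200, loop_end=999)
print(len(header))         # 21
print(header.hex(" "))
```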
THIS IS HOW it all works out in practice. If the 2000 is receiving a data dump, it'll ignore the sample number and the sample rate in the header and use the currently selected one, to facilitate cross-loading between machines with different sample rates (the sample can always be retuned) or between different sample numbers. The Prophet has the facility to force the sample rate as part of a different command (used by Sound Designer and other P2000s).
After sending the header, the master must pause for at least two seconds, allowing the receiver to decide if it will accept this sample (it may not if there isn't enough memory, say). If it receives a Cancel within this time, it will abort the dump immediately. In the case of the Prophet 2000, any other inappropriate MIDI message (eg. a note-on) within this time will also abort the dump.
If it receives an ACK, it will start sending data packets immediately. If it receives a Wait, it pauses indefinitely until another message is received (for example, an ACK, which will continue the dump). If nothing is received within the two-second pause, the master assumes an open loop system, and should start sending packets.
If the 2000 is receiving a data dump, it'll ignore the length if over the space allocated to the current sample, and take in as much of the sample as possible. If either of the loop points is also beyond the end, they will be set to the end of the sample. If the sample received is shorter than the space currently allocated for it in the 2000, this leftover space will remain allocated (the Prophet has other means by which it can recover this memory).
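The truncation rule can be sketched in a few lines. This assumes that "the end of the sample" means the last word actually stored (words being numbered from zero), which the text does not spell out; `fit_to_allocation` is a made-up name.

```python
def fit_to_allocation(length: int, loop_start: int, loop_end: int,
                      allocated: int):
    """Clamp an incoming sample to the space allocated for the current sample.

    Returns (words_kept, loop_start, loop_end). Treating the "end" as the
    last stored word is an assumption, not something the text confirms.
    """
    kept = min(length, allocated)
    end = max(kept - 1, 0)
    return kept, min(loop_start, end), min(loop_end, end)

print(fit_to_allocation(1000, 200, 900, allocated=500))   # (500, 200, 499)
print(fit_to_allocation(300, 10, 200, allocated=500))     # (300, 10, 200)
```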
A data packet consists of its own header, a packet number, 120 bytes, a checksum, and an End of Exclusive message (EOX).
On transmit, the channel number equals the 2000's base channel (this is ignored on reception). The packet number will start at 00 and increment with each new packet, resetting to 00 after it reaches 7F. As mentioned briefly earlier, this is used by the receiver to distinguish between a new data packet, and a resend of the previous one (in the latter case, the packet number will be the same as the previous one). This will be followed by 120 bytes of data, which form 60 words (MSB first).
Each data byte holds seven bits. If the sample format is 8-14 bit, 2 bytes form a word; 15-21 bits require 3 bytes/word (giving 40 words/packet), and 22-28 bits require 4 bytes/word (30 words/packet). The receiver should be able to adjust depending on the sample format in the header.
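The bytes-per-word rule is just the sample format divided by seven and rounded up. A small sketch, with illustrative function names:

```python
DATA_BYTES_PER_PACKET = 120

def bytes_per_word(bits: int) -> int:
    """Number of seven-bit data bytes needed for one sample word."""
    if not 8 <= bits <= 28:
        raise ValueError("sample format must be between 8 and 28 bits")
    return (bits + 6) // 7                # bits / 7, rounded up

def words_per_packet(bits: int) -> int:
    return DATA_BYTES_PER_PACKET // bytes_per_word(bits)

for bits in (8, 12, 14, 15, 21, 22, 28):
    print(bits, bytes_per_word(bits), words_per_packet(bits))
```

The loop prints 2 bytes and 60 words for 8, 12 and 14 bits, 3 bytes and 40 words for 15 and 21 bits, and 4 bytes and 30 words for 22 and 28 bits.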
Information is left justified within the seven-bit bytes, and unused bits will be filled out with zeroes. For example, a sampled word of FFFH will be sent as 01111111B 01111100B. A word of FFFH happens to represent a full positive value (000H represents full negative).
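Left justification amounts to shifting the word up until it fills the available seven-bit bytes, then slicing it MSB first. The sketch below reproduces the FFFH example; `pack_word` and `unpack_word` are hypothetical names.

```python
def pack_word(value: int, bits: int) -> bytes:
    """Left justify a sample word into seven-bit bytes, MSB first."""
    count = (bits + 6) // 7                       # bytes needed for this format
    shifted = value << (7 * count - bits)         # unused low bits become zero
    return bytes((shifted >> (7 * (count - 1 - i))) & 0x7F
                 for i in range(count))

def unpack_word(data: bytes, bits: int) -> int:
    """Reverse of pack_word: rebuild the word and drop the padding bits."""
    value = 0
    for byte in data:
        value = (value << 7) | byte
    return value >> (7 * len(data) - bits)

packed = pack_word(0xFFF, 12)
print(" ".join(f"{byte:08b}" for byte in packed))   # 01111111 01111100
print(hex(unpack_word(packed, 12)))                 # 0xfff
```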
The checksum is the digital logic EXOR (EXclusive OR) of 7E, the channel number, 02, the packet number, and the 120 data bytes.
If the Prophet is receiving a data dump, and the specified format is over 12-bit, it will adjust to the correct byte/word count, round up the 13th bit, and throw away the unused bits. It keeps a running checksum during reception. If the checksums match, it will send an ACK and wait for the next packet. If they do not match, it will send an NAK (see above) and wait for the next packet.
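On the receiving side, the two jobs described here are the running checksum and the reduction to 12 bits. The following is only a sketch: clamping at full scale after rounding is an assumption (the text does not say what the Prophet does if rounding overflows), and the function names are invented.

```python
def check_packet(message: bytes):
    """Return ("ACK", n) or ("NAK", n) for one 127-byte data packet."""
    number = message[4]
    running = 0
    for byte in message[1:-2]:     # 7E, channel, 02, packet number, 120 data bytes
        running ^= byte
    return ("ACK" if running == message[-2] else "NAK", number)

def reduce_to_12_bit(word: int, bits: int) -> int:
    """Round on the 13th bit and discard the rest (formats above 12-bit only)."""
    if bits <= 12:
        raise ValueError("only formats over 12-bit need reducing")
    rounded = (word + (1 << (bits - 13))) >> (bits - 12)
    return min(rounded, 0xFFF)     # assumption: clamp rather than wrap around

body = bytes([0x7E, 0x00, 0x02, 0x03]) + bytes(120)
good = b"\xF0" + body + bytes([0x7E ^ 0x02 ^ 0x03, 0xF7])
print(check_packet(good))              # ('ACK', 3)
print(reduce_to_12_bit(0b110, 14))     # 2 (the 13th bit was set, so round up)
```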
After sending a packet, the 2000 will watch its MIDI In port. If an ACK is received, it will start sending the next packet immediately. If it receives an NAK, and the packet number matches the packet just sent, it will resend the previous packet (if the packet numbers don't match, it ignores the NAK). If no activity occurs in over 20 msecs, it will assume an open loop situation, and send the next packet.
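The sender's behaviour after each packet can be modelled without any MIDI hardware by feeding it a scripted list of replies. In this sketch `None` stands for 20 msecs of silence, a stale NAK is taken to mean "keep watching", anything else unexpected aborts, and Wait is left out for brevity; all of the names are hypothetical.

```python
def transmit_order(packet_count: int, replies) -> list:
    """Return the packet numbers a sender would transmit for scripted replies.

    Each reply is ("ACK", n), ("NAK", n), ("Cancel", n) or None for silence.
    When the script runs out, the sender behaves as if nothing more arrives.
    """
    events = iter(replies)
    sent = []
    index = 0
    while index < packet_count:
        number = index % 128               # packet numbers wrap after 7F
        sent.append(number)
        while True:
            reply = next(events, None)
            if reply is None or reply[0] == "ACK":
                index += 1                 # open loop timeout or ACK: move on
                break
            if reply[0] == "NAK":
                if reply[1] == number:
                    break                  # resend the packet just sent
                continue                   # NAK for another packet: ignore it
            return sent                    # Cancel or anything else: abort
    return sent

print(transmit_order(3, [("ACK", 0), ("NAK", 1), ("ACK", 1), None]))   # [0, 1, 1, 2]
print(transmit_order(3, []))                                           # [0, 1, 2]
```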
If a Wait is received, the Prophet will watch its MIDI In port indefinitely for another message, and process it like a normal ACK, NAK, Cancel, or illegal message. By using the Wait command, a host computer can stop a sampler in the middle of a dump while it saves part of the sample to disk, and so on.
The packet numbers are included in the handshaking commands (ACK, NAK, Wait, and Cancel) to accommodate future machines that might have the intelligence to re-transmit specific packets after the entire dump is finished, or if synchronisation is lost.
If a receiving Prophet 2000 sends an NAK, but the next packet has a different packet number, it assumes the NAK was missed (open loop situation), will ignore the error, and will continue as if the checksum had matched. Other more intelligent machines may be able to transmit (and receive) packets out of sequence.
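The receiver's side of this can be modelled in the same spirit. The sketch below assumes the receiver stores every packet as it arrives and simply overwrites the last one when a resend turns up; that storage strategy is a guess made for the example, not a description of the Prophet's firmware.

```python
def assemble(arrivals):
    """Model a simple receiver.

    arrivals: list of (packet_number, payload, checksum_ok) tuples.
    Returns (stored_payloads, replies_sent).
    """
    stored = []
    replies = []
    last_number = None
    for number, payload, checksum_ok in arrivals:
        if number == last_number:
            stored[-1] = payload       # same number: a resend replaces the old copy
        else:
            stored.append(payload)     # new number: carry on, even after a NAK
        last_number = number
        replies.append(("ACK" if checksum_ok else "NAK", number))
    return stored, replies

stored, replies = assemble([(0, "a", True), (1, "b?", False),
                            (1, "b", True), (2, "c", True)])
print(stored)      # ['a', 'b', 'c']
print(replies)     # [('ACK', 0), ('NAK', 1), ('ACK', 1), ('ACK', 2)]
```

If the master never resends the rejected packet, the damaged copy simply stays in place, which matches the "continue as if the checksum had matched" behaviour.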
This process continues until there are less than 121 bytes to send. The final packet will still consist of 120 data bytes, regardless of how many significant bytes remain, and the unused bytes will be filled out with zeroes.
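Splitting the encoded sample into fixed 120-byte payloads, with the last one padded out with zeroes, might be sketched like this (`split_into_payloads` is an invented name):

```python
def split_into_payloads(data: bytes, size: int = 120) -> list:
    """Cut encoded sample data into 120-byte payloads, zero-filling the last."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [chunk.ljust(size, b"\x00") for chunk in chunks]

payloads = split_into_payloads(bytes([1]) * 130)
print(len(payloads))                    # 2
print([len(p) for p in payloads])       # [120, 120]
print(payloads[1][:12].hex(" "))        # 01 01 01 01 01 01 01 01 01 01 00 00
```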
On the receiving end, the slave should take in and handshake on the last packet (the 2000 will Cancel as soon as its memory is full - it will not handshake the last packet). The Cancel is useful if the master is trying to dump more data than the slave can accept - it stops the dump as soon as it is no longer needed. On the Prophet 2000, any unexpected messages (eg. a note-on) will abort a dump.
IT IS WIDELY realised that the Sample Dump Standard (like any "standard") has compromises. However, everyone hopes it'll be comprehensive and flexible enough to cover most people's basic needs. By having a Standard in place as the majority of new "affordable" sampling machines are entering the market, there is a better chance that it'll be used by several manufacturers, instead of waiting for second or third generation machines to come along.
In the meantime, the more ambitious (read: crazed and bored) of you can start hacking now. Have fun.
Feature by Chris Meyer