Talking MIDI (Part 2)
Who says you can't use those hi-fi DIN cables to connect all your MIDI gear together? This month, Jay Chapman takes a logical look at the hardware side of the Musical Instrument Digital Interface.
WHAT IS MIDI?
MIDI is a communications medium (no, not the ouija board variety) designed to facilitate the interconnection of musical equipment. Indeed, it is not overstating the case to say that MIDI forms the basis of a computer network. The MIDI 1.0 Specification explicitly defines the hardware interface which will be used to connect musical equipment together and explicitly defines the protocol governing the messages that can be sent around a MIDI network. MIDI also implicitly defines a set of facilities that MIDI synthesizers might well provide.
Note that software wasn't mentioned above. This is because MIDI does not attempt to define what the computer programs running in your musical equipment and microcomputers will do for you beyond understanding the MIDI protocol where necessary and mapping any relevant control information onto the facilities provided by the equipment. There are hints and suggestions, of course, since MIDI was designed with musical activity in mind! In this article we will prepare for considering the MIDI protocol specification in some detail.
The preparation involves becoming familiar with the skills of representing information by numbers since the whole of MIDI, both the hardware and the protocol, revolves around agreeing methods of converting the internal state of our musical equipment into numbers that can be transmitted down MIDI connecting cables. Once we understand where we get these numbers from and what they represent, we can play with them to our heart's content and thereby open up the tremendous potential that MIDI offers us.
We will also have a brief look at the MIDI hardware specification in this article so that we know how the numbers just mentioned are physically sent over the MIDI cables.
Hey, aren't those peculiar words something to do with computers? I'm afraid so! It is possible to skim the surface of MIDI without getting too deeply into computer terminology but quite frankly, your understanding of the capabilities of MIDI, and your competence in using it, can be severely limited if you don't bite the bullet and admit a little computer science into your soul.
To give an analogy, you don't have to be an expert car mechanic to drive a car but using the clutch and changing gear make a lot more sense when you know how the gearbox works. If you don't know the why and how of the gearbox, you might spend some time learning a set of rules to work from (clutch in; engage 1st gear; clutch out (slowly); when the engine starts screaming change up; etc, etc). When circumstances change - you try towing a caravan for example - the rules probably need amending and if you have no real understanding, you're in trouble. Whilst I'm not suggesting you need to know how to actually design and build a gearbox, or a MIDI interface for that matter, I am confident that understanding is much better than learning by rote.
Since the musical equipment we will use needs to communicate information over a MIDI network, there must be some definition of how basic units of information are to be stored, transmitted and received. We are not going to spend a lot of time discussing the insides of computers but it is worth discussing how data is represented from MIDI's point of view. This involves an understanding of some of the terminology mentioned above.
All the information MIDI is to deal with is represented by numbers. Some numbers will simply represent numeric values (now there's a really clever trick!): for example, a volume level of 57 would be represented by the number 57. Other numbers will represent some encoding of the information being represented. For example, if a key is pressed, a note number rather than the note name (eg. 60 for middle C) will be transmitted. There is an obvious correspondence in this case between note names and numbers (Figure 1) and we only need to agree on i) the representation of one note, ii) whether the note number rises with the pitch or vice versa and iii) whether we are counting in semitones, to have specified the data representation in full.
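For readers with a computer to hand, the note number scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table NOTE_NAMES and the function note_name are names made up for this example, the black notes are written as sharps purely by choice, and octaves are counted relative to middle C rather than given absolute numbers.

```python
# Note names within one octave, rising in semitone steps.
# (Writing the black notes as sharps is an arbitrary choice for this sketch.)
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

MIDDLE_C = 60  # from the article: middle C is note number 60


def note_name(number):
    """Return (name, octaves above or below the octave starting at middle C)."""
    semitones_from_middle_c = number - MIDDLE_C
    octave_offset, step = divmod(semitones_from_middle_c, 12)
    return NOTE_NAMES[step], octave_offset


print(note_name(60))  # ('C', 0)   middle C itself
print(note_name(61))  # ('C#', 0)  one semitone up
print(note_name(59))  # ('B', -1)  one semitone down, in the octave below
print(note_name(72))  # ('C', 1)   the C an octave above middle C
```

The three agreements listed above are all visible here: one fixed reference note (60), numbers rising with pitch, and a step of one semitone per number.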
Further numbers will be more arbitrary encodings of data.
If our low frequency oscillator is capable of sine, triangle, forward and reverse sawtooth, square and sample and hold outputs, it is not quite so obvious what number should correspond to each waveform. We simply number them in any order and publish a table (Figure 2) to let anybody interested know what correspondence we decided upon!
We should also bear in mind the fact that some of our data is not digital in nature. The volume control mentioned above, as well as such devices as the pitch and modulation wheels on most synthesizers for example, are really analogue devices.
That is, they represent data which is continuous rather than step-like as its value changes. Keyboard data represents naturally digital quantities (each key is separate from every other key) and we can find other examples, such as control panel switches, which are either On or Off, ie. there is only a finite number of different states of the data. Where the data is analogue we have to first convert it to digital in order to be able to deal with it over MIDI.
Consider volume control, via MIDI, on a Yamaha TX7 expander module. Rather than using a voltage level to represent the volume (which is exactly what a volume pedal does on most synthesizers if you think about it), numbers in the range 0 (volume Off) to 127 (maximum volume) are used. There are therefore 128 possible volume levels catered for and we cannot specify any volume settings in between these integer (whole number) steps. Since most, if not all, of us couldn't hear any difference in volume between smaller steps than 1/128ths anyhow, this representation is quite satisfactory. However, I can think of at least one synthesizer where insufficient ranges of numbers have been allocated for filter control parameters, for example, and the synthesizer cannot be controlled to a sufficient level of accuracy which is bad news...
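The conversion from a continuous control to one of 128 steps can be sketched as follows. This is a hedged illustration, not how any particular synthesizer does it: the function names are invented for this example, and the control position is assumed to arrive as a fraction between 0.0 (fully down) and 1.0 (fully up).

```python
def to_midi_value(position):
    """Map a control position in the range 0.0..1.0 onto the integers 0..127."""
    position = min(max(position, 0.0), 1.0)  # clamp out-of-range readings
    return round(position * 127)


def from_midi_value(value):
    """Map a received number 0..127 back to a position in the range 0.0..1.0."""
    return value / 127


print(to_midi_value(0.0))   # 0   (volume Off)
print(to_midi_value(1.0))   # 127 (maximum volume)

# Two slightly different positions can land on the same step:
# whatever lies between two whole numbers is lost in the conversion.
print(to_midi_value(0.300) == to_midi_value(0.301))  # True
```

The last line shows the price of the conversion: any position that falls between two steps is rounded to the nearest one, which is exactly why a parameter given too few steps cannot be controlled finely enough.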
The basic unit of information is the bit (or Binary DigIT) which has two possible states taken variously as representing a value of Off or On, No or Yes, False or True, 0 or 1. Without discussing this in incredible detail we can see that if we can store, transmit and receive bits, then we can deal with the state of an On/Off switch on the synthesizer's front panel. If the switch is On then we transmit a 1 and if it is Off we transmit a 0 (or perhaps vice versa as we see fit!). If a volume control had only two states, eg. soft and loud, then we could deal with it in the same way as the On/Off switch since it would then be a two-state, digital piece of data.
What do we do if our data has more than two states? Imagine a volume control which 'thinks' in terms of only four volume graduations - see the 'volume slider' in Figure 3, where the four graduations are marked A (minimum volume), B, C and D (maximum volume). We could use one bit to tell us whether we are in the upper or lower half of the slider's travel, ie. the high volume end (C, D) or the low volume end (A, B). Having now selected two out of the four graduations we started with (and therefore deselected the other two!), a second bit then tells us which of these two graduations is the one we actually want.
If we now always transmit the two bits together then the synthesizer (or whatever) receiving them can decipher them into the correct volume graduation A, B, C or D. Figure 3 shows the two bits to the left of the slider. The pairs of bit patterns 00, 01, 10 and 11 correspond to counting from 0 through to 3 in BINARY rather than day to day DECIMAL. If we then replace the letters A, B, C and D with the volume levels 0, 1, 2 and 3 (Figure 3 again - to the right of the slider), you can see that we have a direct correspondence between the count and the level of the volume control. If four volume levels is not accurate enough then perhaps eight or sixteen (or more) would be better: the greater the accuracy required (or the number of states that need representing), the greater the number of bits that will be required.
Figure 4 shows the correspondence when sixteen levels are used. We now need four bits to represent our volume graduations which run from 0 to 15. The first (leftmost) of the four bits tells us whether we are in the first half or the second half of our sixteen values (we have now selected 1/2 of the possible slider values). Knowing this, we look at the second bit which tells us whether we are in the first or second half of the group we just selected (we have now narrowed the choice to 1/4 of the possible slider values). The third and fourth bits are now used to select 1/8th and then 1/16th of the possible slider values in turn. Of course, since there were 16 values to start with, we have now selected one of them exactly and so we know what the volume setting is. Hopefully, you can see that the binary patterns 0000, 0001, 0010, 0011... 1110, 1111 correspond to counting from 0 to 15 in decimal.
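The halving procedure just described can be tried out directly. The following is a minimal sketch in Python; the function name select_by_halving is made up for this example and the bits are supplied as a string of '0' and '1' characters, leftmost bit first.

```python
def select_by_halving(bits):
    """Narrow a range of slider values down to one, halving it once per bit."""
    low, high = 0, 2 ** len(bits)  # the candidates are low .. high - 1
    for bit in bits:
        middle = (low + high) // 2
        if bit == "1":
            low = middle   # keep the upper half of the current group
        else:
            high = middle  # keep the lower half of the current group
    return low  # only one candidate is left


print(select_by_halving("0000"))  # 0
print(select_by_halving("1010"))  # 10
print(select_by_halving("1111"))  # 15

# The result always agrees with reading the pattern as a binary number.
print(all(select_by_halving(format(n, "04b")) == n for n in range(16)))  # True
```

The same function works unchanged for two bits (four graduations) or for any longer pattern.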
Whilst I'm not pretending that I've just given you a comprehensive course in the use of the binary number system, I hope that you can see how bit patterns can represent data. If you need more accuracy you just use more bits. Five bits will let you count from 0 to 31, six bits from 0 to 63 and seven bits from 0 to 127 (which should ring a bell! Remember the TX7 volume parameter?).
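The relationship between the number of bits and the range of values is simply a doubling for each extra bit. A small sketch (the function name largest_value is invented for this example):

```python
def largest_value(bits):
    """Largest number that can be counted to with the given number of bits."""
    return 2 ** bits - 1  # 2**bits different patterns, counting starts at 0


for bits in range(1, 9):
    print(f"{bits} bits: {2 ** bits} patterns, counting from 0 to {largest_value(bits)}")

print(largest_value(7))  # 127
```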
You will have noticed that we group a number of bits together to represent our data. Many of the current microcomputers that manipulate MIDI data tend to group bits eight at a time and this is referred to as a byte. A byte can represent any one of 256 states using the patterns 00000000, 00000001, 00000010... 11111110, 11111111 which we more commonly think of as the decimal numbers 0 to 255. MIDI was designed taking the byte as its basic unit of data and it is 8-bit bytes that are transmitted round a MIDI network. If you want to send more than 8-bits-worth of data, you send multi-byte messages; if you want to send less than 8-bits-worth you will have to send 8 bits and ignore some of them!
Sometimes we will want to split a byte up into smaller groups of bits. We could send the states of (say) four On/Off switches as four individual bits which we group together to promote efficient use of storage and transmission time. Since we always store and send a minimum of a byte at a time, it would be wasteful to store each single-bit representation in a byte of its own (with the other seven bits ignored). A sub-grouping of a byte that is particularly useful is the four bit nybble. It's called a nybble because it's half a byte (get it?!).
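Grouping four On/Off switches into a single nybble might look like this in Python. It is a sketch only: the function names are hypothetical, and putting the first switch in the leftmost bit is an arbitrary choice that both ends of the cable would have to agree on.

```python
def pack_switches(switches):
    """Pack On/Off states (True/False) into one number, first switch leftmost."""
    nybble = 0
    for state in switches:
        nybble = (nybble << 1) | (1 if state else 0)
    return nybble


def unpack_switches(nybble, count=4):
    """Recover the individual On/Off states from a packed nybble."""
    return [bool((nybble >> shift) & 1) for shift in range(count - 1, -1, -1)]


packed = pack_switches([True, False, True, True])
print(format(packed, "04b"))    # 1011
print(packed)                   # 11
print(unpack_switches(packed))  # [True, False, True, True]
```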
This section is probably going to be the low point of this article for a lot of readers! I've taken you from good old decimal numbers into binary and now I'm going to lead you gently by the hand into the HEXADECIMAL or base 16 (rather than base 10 or base 2) number system. The reason for forcing such brain ache upon you is quite simple - it will help you avoid more brain ache in the future! Hexadecimal is a convenient shorthand method of representing strings of bits. It is generally easier to remember a hexadecimal number than its binary equivalent.
Each hexadecimal digit corresponds to exactly one nybble which consists of four bits and can therefore represent 0 (binary 0000) to 15 (binary 1111). We use the normal digits from 0 to 9 to represent the values 0 to 9 (binary 1001) and the letters A to F to represent the values 10 (binary 1010) to 15. The correspondence between decimal, binary and hexadecimal is shown in Figure 5.
Unless we work in the same base all the time, we need to know which base a given number is in. Consider the digit string 1111 for example. We can't tell which base it is in just by looking at it! Note that we can tell that the number F8 is hexadecimal (or at least that it's illegal in decimal or binary) whereas we can only tell that 88 is illegal binary but we don't know if it's meant to be decimal or hexadecimal. To solve this problem in these articles, we will introduce the prefix '&' to denote hexadecimal and '%' to denote binary. A little brain ache will convince you that &66, %01100110 and 102 are three ways of representing the same thing!
Notice how the two nybbles of an 8-bit byte are represented cleanly by the two digits of the corresponding hexadecimal number. For example, &12 tells us that the left nybble must be %0001 and the right nybble %0010 and therefore the whole number in binary is just the concatenation (linked series) of these two bit strings ie. %00010010. Similarly, the bit string %00110000 gives two hexadecimal digits &3 (%0011) and &0 (%0000) and therefore the corresponding byte written in hexadecimal is &30. This knowledge will make life easier when we deal with MIDI message Status bytes later on, (honestly!).
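If the brain ache gets too much, a few lines of Python will do the conversions for you. This is a sketch using the '&' and '%' prefixes adopted in these articles (Python itself writes 0x and 0b instead); the function names are made up for this example.

```python
def article_notation(byte):
    """Return a byte written with the '&' (hexadecimal) and '%' (binary) prefixes."""
    return f"&{byte:02X}", f"%{byte:08b}"


def split_nybbles(byte):
    """Split a byte into its left (high) and right (low) nybbles."""
    return byte >> 4, byte & 0x0F


print(article_notation(102))      # ('&66', '%01100110')
print(split_nybbles(0x12))        # (1, 2)  ie. %0001 and %0010
print(article_notation(0x12))     # ('&12', '%00010010')
print(int("00110000", 2) == 0x30) # True
```

Because each hexadecimal digit stands for exactly one nybble, split_nybbles never needs anything more complicated than a shift and a mask.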
So, now that we have some idea about data representation, we can proceed to discuss MIDI. The hardware side will move the bits about for us and the protocol will tell us what strings of bits to ask the hardware to move!
Knowing that all our MIDI information is made up of numbers, we can look at the hardware required to receive and transmit these numbers around our MIDI network. As I've already said, MIDI receives and transmits a byte at a time which means that we have to arrange for each of the eight bits to get from some point A to some point B whenever a transmission is required. We have two obvious choices: we can either send each of the eight bits down their own personal piece of wire at the same time (parallel) or send the bits one after the other down a single piece of wire (serial). In either case, we would need a common return wire to act as a reference and it would be a good idea to screen the wires to avoid any transmission or reception of electrical interference.
MIDI uses the serial 'one bit after the other' transmission method. Parallel has the distinct advantage of being faster than serial (all other things being equal!), since eight times as much information gets passed in each transmission. Serial, whilst slower, means that much less expensive and more robust cables can be used. Forcing people to buy, use and maintain expensive multicore cables is not a good way to make something accessible to your average musician. In fact, normal hi-fi DIN to DIN leads are often pressed into service because although they don't match the specification (!) they work and they're both cheap and available.
Your MIDI cables should be a maximum of 50 feet (15 metres) long and consist of a shielded twisted pair of wires. The cables are terminated at both ends by 5 pin DIN 180 degree plugs. The shield is connected to (pin 2 on) both plugs so that it doesn't matter which way round it's plugged in as the shield will still be connected. Your average hi-fi DIN lead won't have a shield and the signal wires won't be a twisted pair but I manage with them at home with no problems. If I was paying a lot of money in some mega-studio though, I think I'd prefer cables made up to the proper specification: they would hardly dent the budget and could well reduce noise problems (don't forget that the MIDI signals flashing down the cables definitely aren't musical and you don't want any audio circuits picking them up!).
The connectors on the equipment are not unnaturally the female counterparts of those on the cable and one expects to see at least one of, and very likely all three of, the ubiquitous MIDI In, MIDI Out and MIDI Thru sockets on the back panels of all those synthesizers advertising their manufacturers' names on Top Of The Pops (you know the ones with no mains or signal leads plugged in!).
An important point here is that the MIDI specification is very careful on the subject of shielding and ground loops. As studio users will be well aware, you are tempting fate as far as ground loop hums are concerned once you start connecting lots of pieces of equipment together, so MIDI does two things.
Firstly, pin 2 of the MIDI In DIN socket is not connected to ground (or anything else) inside the equipment so although the cable will be shielded (pin 2 of the Out or Thru socket at the far end is connected to ground), there is no actual connection between the pieces of equipment via the cable shield. Secondly, the signal wires are connected via an opto-isolator (again on the MIDI In port) so that there is no actual physical connection between the pieces of equipment at all! Damn cunning, what?!
For those who appreciate the fine detail, the two wire circuit is effectively a 5 mAmp current loop where a logical 0 is represented by the current flowing. Coming back to our bits that need communicating, we can now see that if we want to send a 0 we turn the current on, and if we want to send a 1 we turn the current off.
I'm not going to discuss the MIDI interface from an electronics point of view but I do want to give an idea of the timing of events and the limitations that are therefore built into MIDI.
MIDI's serial transmission is asynchronous. To cut a long story short, asynchronous in this context means that the receiver doesn't know when the transmitter will send the next byte along the line. The line sits idle for quite a lot of the time and so one bit (known as a start bit) is sent to wake the receiver up. Both transmitter and receiver have their own clocks which should be running at 31.25 kBaud (plus or minus 1%) ie. 31,250 bits per second. When the start bit arrives the receiver synchronises its clock and then 'reads' each data bit value at the appropriate time - the receiver assumes that the transmitter is sending out each bit according to the same clock rate of course. A stop bit is sent after the last data bit to help the receiver check that the synchronisation worked. The next byte might arrive immediately or the line might now go idle for a while.
Figure 6 shows the state of the line going from idle (in the 1 state) through the start bit (the first 0 bit), through the data bits representing %00001111 (the righthand bit as we read it is transmitted first and is therefore next to the start bit) and finally through the stop bit (a 1) and back to the idle state. The conversion of the byte that arrives from the MIDI software into this 10 bit stream and its conversion back on reception at the far end is handled in each case by a Universal Asynchronous Receiver Transmitter (UART) chip.
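The framing described here (start bit, eight data bits with the righthand bit first, stop bit) can be imitated in software. The sketch below is only a model of the bit order on the line, not a description of what goes on inside a UART chip; the function names frame_byte and unframe are invented for this example.

```python
def frame_byte(byte):
    """Return the 10 line states for one byte: start bit, 8 data bits, stop bit."""
    data_bits = [(byte >> i) & 1 for i in range(8)]  # righthand bit goes first
    return [0] + data_bits + [1]                     # start bit is 0, stop bit is 1


def unframe(bits):
    """Rebuild the byte from 10 received line states, checking start and stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))


frame = frame_byte(0b00001111)
print(frame)           # [0, 1, 1, 1, 1, 0, 0, 0, 0, 1]
print(unframe(frame))  # 15
```

Reading the printed list from left to right gives the order in which the states appear on the line after it leaves the idle (1) state.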
Let's consider the timing of some MIDI messages to see where any limitations might arise. As we will see when we discuss the MIDI protocol, it is possible to take up three bytes with a message saying Key On or Key Off. Imagine you are playing a 10 finger chord and then release it and play it again very quickly: what happens over MIDI?
As we know, each byte consists of eight bits. However, when the byte is transmitted, a start bit and a stop bit are added so that there are actually 10 bits transmitted per message byte. Each Key On and Key Off message takes 3 bytes to represent and so takes 3x10=30 bits. There are 20 messages (10 Key On and 10 Key Off) in all, which therefore take up 20x30=600 bits. We can transmit a maximum of 31,250 (approximately 30,000) bits per second if we keep the line constantly busy, which means that our 20 messages will take about 600 bits divided by 30,000 bits per second = 1/50th of a second.
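The same sum can be done with the exact clock rate of 31,250 bits per second rather than the rounded figure. A minimal sketch, with names made up for this example and the 3 bytes per message taken from the worst case described above:

```python
BAUD_RATE = 31_250          # bits per second on the MIDI line
BITS_PER_BYTE_ON_WIRE = 10  # start bit + 8 data bits + stop bit


def transmission_time(messages, bytes_per_message=3):
    """Seconds needed to send the messages back to back on a constantly busy line."""
    bits = messages * bytes_per_message * BITS_PER_BYTE_ON_WIRE
    return bits / BAUD_RATE


seconds = transmission_time(20)  # 10 Key On plus 10 Key Off messages
print(f"{seconds * 1000:.1f} ms")  # 19.2 ms
```

So the exact figure is a shade under the 1/50th of a second (20 ms) estimated above.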
You can soon see that if anything else is going on down this particular MIDI cable, then delays are going to start becoming a problem. More about this in a future instalment.
This month's article has dealt with but two subjects: the representation of information in numeric form and a logical rather than electronic view of the hardware side of the MIDI specification.
The pace of the article has deliberately been kept slow. I have talked to quite a few intelligent musicians who would dearly like to understand MIDI but who do not have sufficient computer science knowledge to have gained much insight into MIDI from more concentrated technical articles on the subject. Generally writers assume a great deal of background knowledge of their readers (and I have been just as guilty as the rest!) because otherwise the subject matter could not be dealt with in just one or two articles. So, it is all too easy for the musician to find MIDI largely obscured behind terminology or notation with which he is totally unfamiliar.
In this series of articles I will try to make fewer assumptions about what the reader knows and attempt to explain the terminology and notation as I go along, which will no doubt slow the presentation of the material down but will, I hope, also make it far more accessible. If you have any comments (positive or negative!) on this approach, please write to me via the SOUND ON SOUND offices.
Next month I will begin looking at the MIDI protocol. We now know something about how the bits and bytes get between the pieces of equipment so we will go on to look at the messages we can send.
Readers please note: if you are having problems with any aspect of MIDI, drop us a line and let SOS help.
Feature by Jay Chapman