Bryan Parkoff wrote:

> Clock Bits On Disk Question
>
> I read both Beneath Apple DOS and Beneath Apple ProDOS.  Chapter 3 of
> Beneath Apple DOS says that the head arm reads clock bits before it
> begins to read data.
>
> Now, it looks like a conflict: Beneath Apple ProDOS says not to use
> clock bits.  It shows the first page of Appendix D -- The Logic State
> Sequencer.

The use of clock bits on Apple II 5.25" disks varies depending on the type of data being accessed.

The address field for each sector is encoded using a 4-and-4 method, where every eight raw bits on the disk encode four actual data bits. The other four bits are clock bits (always "1"). This is the same bit-level encoding as the FM technique.

The data field for each sector is encoded using a 6-and-2 method (or 5-and-3 for DOS 3.2.1 and earlier), which does not use clock bits. With 6-and-2 encoding, every eight raw bits on the disk encode six actual data bits, using a lookup table to convert between valid bit patterns on the disk and the corresponding data values.

The 6-and-2 name comes from the way each data byte is broken up before being written to disk: six of the bits are converted and written as an 8-bit value on the disk, and the remaining two bits are combined with the corresponding bits from two other bytes to form another six-bit value, which goes through the same conversion process and is written as another 8-bit value. The end result is that three data bytes are written as four disk bytes (a 33% increase in size).

The earlier 5-and-3 technique worked on the same principle, but only used 32 distinct disk byte values instead of 64, which means it could only encode 5 data bits per 8-bit value written to the disk. I don't recall the exact method by which the other three bits are grouped with other bytes, but the end result is that five data bytes are written as eight disk bytes (a 60% increase in size).
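The three-bytes-in, four-values-out arithmetic of 6-and-2 can be sketched in a few lines of Python. This is illustrative only: the bit ordering DOS actually uses differs in detail, and each 6-bit value would then be translated to a valid disk byte through a 64-entry lookup table.

```python
def split_6and2(b0, b1, b2):
    """Split three data bytes into four 6-bit values (a sketch; the
    real DOS bit order and translate table differ in detail)."""
    # the high six bits of each byte become one 6-bit value apiece
    hi = [b0 >> 2, b1 >> 2, b2 >> 2]
    # the three leftover 2-bit remainders pack into a fourth 6-bit value
    lo = ((b0 & 3) << 4) | ((b1 & 3) << 2) | (b2 & 3)
    return hi + [lo]

def join_6and2(v0, v1, v2, v3):
    """Inverse: rebuild the three data bytes from the four 6-bit values."""
    return ((v0 << 2) | ((v3 >> 4) & 3),
            (v1 << 2) | ((v3 >> 2) & 3),
            (v2 << 2) | (v3 & 3))
```

Applying this grouping across a whole 256-byte sector is how it ends up as 342 six-bit disk bytes (plus a checksum) in the real format.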
This is still better than the 4-and-4 method, which generates two disk bytes for every data byte (a 100% increase in size). If a single-sided, single-density 5.25" disk were written using 4-and-4, it would only be able to hold about 10 sectors per track, at 256 bytes per sector (87.5 KB per disk, assuming 35 tracks). The 5-and-3 method used by early versions of DOS allowed 13 sectors per track (113.75 KB per disk). The 6-and-2 method used by later versions of DOS (as well as ProDOS, Pascal and CP/M) allows 16 sectors per track (140 KB per disk).

The reason the address field uses 4-and-4 encoding is that address fields need to be decoded very quickly (so that the correct data field can be located for reading or writing a sector), and it is much faster for software to decode 4-and-4 data than 5-and-3 or 6-and-2. (Decoding two 4-and-4 bytes takes two instructions: rotate one of the bytes left with the carry set, then AND it with the other to produce the data byte.) Not much space is wasted, because the address fields are very small compared to the data fields.

> It says that Disk II Drivers can write data bytes without using clock
> bits.  Please explain what it means.

The general principle of writing data to a floppy disk is that the disk records "flux reversals", i.e. inversions in the magnetic field. In the raw data on the disk, a flux reversal represents a "1" bit, and the absence of a flux reversal represents a "0" bit, with fixed timing for each bit cell (4 microseconds for the single-density 5.25" drive, 2 microseconds for the double-density 3.5" drive, 1 microsecond for the high-density 3.5" drive).

The problem is that the disk can reliably reproduce consecutive flux reversals when read back, but a prolonged absence of flux reversals produces unreliable data readback. In other words, a raw disk "1" bit is 100% reliable, but there is a limit to the number of consecutive raw disk "0" bits which can be read back again.
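The 4-and-4 encoding and its two-instruction decode described above are simple enough to sketch, with Python standing in for the 6502's rotate-and-AND sequence:

```python
def encode_44(data):
    """Encode one data byte as two 4-and-4 disk bytes.

    Bits 7,5,3,1 of the data go in the first byte, bits 6,4,2,0 in
    the second; the interleaved clock bits are always 1 (the 0xAA mask).
    """
    return (data >> 1) | 0xAA, data | 0xAA

def decode_44(first, second):
    """Decode two 4-and-4 bytes.

    This mirrors the 6502 sequence: rotate the first byte left with
    the carry set, then AND with the second byte.
    """
    return ((first << 1) | 1) & second & 0xFF
```

Because every second bit of each encoded byte is a clock "1", neither disk byte can ever contain two consecutive "0" bits.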
A single "0" bit immediately after a "1" bit is OK. If the disk drive hardware is of good enough quality, it should be possible to read two consecutive "0" bits after a "1" bit (as long as they don't occur too close to other "00" pairs), but three or more "0" bits are unreliable.

FM was the earliest solution to this. It is typically used on single-density disks, e.g. the standard single-density 8 inch floppy format used by CP/M is encoded using FM. With FM, data bits are interleaved with clock bits. Clock bits are always written as "1" and data bits may be either "0" or "1". This means there is never more than a single "0" bit after a "1" bit, and the disk will read back reliably. (There are some special exceptions, e.g. a clock bit written as a "0" is used as a timing reference.) Eight data bits require sixteen bit-times on the disk, consisting of eight clock bits interleaved with the eight data bits.

With the original GCR technique (5-and-3), a raw byte on the disk is not interpreted as clock and data bits. Since the disk can reliably reproduce single "0" bits, 5-and-3 allows "0" bits to appear in any of the lower seven bits of each disk byte, as long as they are never adjacent. Using this rule, there are at least 32 unique disk byte values available, so a table lookup method can be used to encode 5 data bits in 8 raw bits on the disk.

Here is a comparison with 4-and-4 (FM). The valid 4-and-4 codes are:

  10101010  10101011  10101110  10101111
  10111010  10111011  10111110  10111111
  11101010  11101011  11101110  11101111
  11111010  11111011  11111110  11111111

Note that the leftmost bit and every second subsequent bit is a "1". These are the clock bits.
All of the above are valid 5-and-3 codes, but if we also allow "0" bits to appear in the third, fifth and seventh columns while still avoiding two consecutive "0" bits, the following codes are also valid:

  10101101  10110101  10110110  10110111
  10111101  11010101  11010110  11010111
  11011010  11011011  11011101  11011110
  11011111  11101101  11110101  11110110
  11110111  11111101

That is 18 additional codes, for 34 in all. Two of them (11010101 = D5 and 10101010 = AA) were reserved for use as unique bytes in the sector prologues, leaving 32, which is sufficient to encode 5 data bits.

The later GCR technique (6-and-2) also allows a pair of consecutive "0" bits to appear in the byte, but only once, and not immediately after the leading "1" bit (this keeps pairs of "0" bits well separated). I won't list out the values, but there are enough additional codes (around another 30) to encode 6 data bits in 8 disk bits.

All of these techniques make no change to the timing of the bits on the disk (four microseconds per bit cell).

MFM uses a different technique. It is based on FM (using clock and data bits), but the raw disk bits are written at twice the speed (2 microseconds per bit cell on a 5.25" double-density disk). The disk isn't actually able to record flux reversals that close together, so MFM writes a "0" instead of a "1" for any clock bit adjacent to a "1" data bit. The data written to the disk looks like this, assuming the preceding data bit was a "0":

  Data  Disk
  0000  10101010
  0001  10101001
  0010  10100100
  0011  10100101
  0100  10010010
  0101  10010001
  0110  10010100
  0111  10010101
  1000  01001010
  1001  01001001
  1010  01000100
  1011  01000101
  1100  01010010
  1101  01010001
  1110  01010100
  1111  01010101

Note that in some cases there are three consecutive zero bits, but this is only six microseconds on the disk, which is shorter than the two consecutive zero bits written by GCR (eight microseconds).
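The MFM rule behind the table can be expressed in a few lines of Python (a sketch of the encoding rule, not of a real controller):

```python
def mfm_encode(data_bits, prev=0):
    """MFM-encode a list of data bits into a string of raw bit cells.

    A clock cell is written as 1 only when the data bits on both sides
    of it are 0; otherwise it is suppressed to 0.  `prev` is the data
    bit that preceded this group (0 in the table above).
    """
    cells = []
    for d in data_bits:
        cells.append(1 if (prev == 0 and d == 0) else 0)  # clock cell
        cells.append(d)                                   # data cell
        prev = d
    return "".join(str(c) for c in cells)
```

For example, mfm_encode([0, 0, 1, 0]) gives "10100100", matching the 0010 row of the table.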
MFM can store twice as much data on the same disk as FM, so it is better than GCR's 6-and-2 method (but it requires more complex disk controller hardware).

FM and MFM are reasonably simple to encode and decode, and this is typically done in hardware using a dedicated floppy disk controller chip. The original chips of this type used in microcomputers were made by Intel, and were very expensive. Steve Wozniak's design of the Apple II disk controller card was a major breakthrough because it was considerably cheaper than a traditional disk controller, and it allowed greater disk capacity than FM encoding (while not being as good as MFM). GCR typically requires encoding and decoding to be done in software, so it imposes more overhead on the host machine. (It can be done in hardware using complex logic circuitry, as was done in the IIc+, for example.)

> Do MFM disks have address fields and data fields like GCR disks?

Yes, though the details are somewhat different. The sector formats used on FM and MFM encoded 5.25" and 8" disks were established by IBM. Apple's GCR sector format is based on the same principles.

> Where can I find more information on how MFM disks are encoded?

You could try looking for data sheets on FM or MFM floppy disk controller chips, or for documentation on the IBM floppy disk formats.

-- 
David Empson
dempson@actrix.gen.nz


David Empson wrote:

Again, a lovely and complete description of disk encoding--this should really find its way into the FAQ. It certainly comes up regularly. ;-)

>The problem is that the disk can reliably reproduce consecutive flux
>reversals when read back, but prolonged absence of a flux reversal
>produces unreliable data readback.

It may be useful to say why the prolonged absence of flux reversals is an issue. Magnetic coatings are applied to the substrate by a mechanical process that can produce nonuniform thickness and properties.
As a result, the amplitude of the signal read back from a disk may vary considerably with rotational position, and these variations may be relatively short-term. The drive electronics must be able to cope with these rapid amplitude variations in order to correctly recover the magnetic transitions. The method of coping is to vary the read-signal gain so as to keep the readback levels approximately constant in the face of wide variation in read signal levels.

This Automatic Gain Control (AGC) works by looking at the short-term average of the head signal amplitude. If the amplitude drops, the AGC increases the gain to compensate, with a time constant that depends on the anticipated data rates and media characteristics. In the Shugart drive which was the basis of the Apple Disk ][, the AGC time constant was chosen so that the gain would rise or fall significantly within several bit cells.

As a result, when there are no magnetic transitions for three or four bit times, or whenever the number of transitions within a half dozen bit times falls below the average, the AGC turns up the gain enough that read-signal noise can begin to look like an actual transition--a false 1. This is why both the number of consecutive 0 bits and the frequency of occurrence of multiple 0 bits must be limited to ensure reliable recovery of the transitions written to the disk. The AGC must be kept supplied with a signal so that it keeps the gain properly adjusted.

There is a design tradeoff between the quality of the anticipated disk media and the time constant of the AGC. The choices that Shugart Associates made were appropriate for the rather low uniformity of media prevalent in the mid-70s, but coating technology and composition have improved somewhat since then. It might be interesting to increase the capacitance of the AGC filter on the Disk ][ analog card, which would make it less tolerant of short-term read-level fluctuations, but more stable in dealing with longer strings of 0 bits.
The controller would still need the high bit set as a "start" bit, but the remaining seven bits might be usable without further encoding--providing a 7-and-1 encoding scheme. ;-)

-michael

Check out 8-bit Apple sound that will amaze you on my Home page:
http://members.aol.com/MJMahon/