1541: The floppy disk
What is it.
Almost nobody thinks about how data is stored on a floppy disk. Most people only start paying attention to the matter the moment they see the very unpopular message "LOADING ERROR" on the screen. This document gives you the ins and outs of how data is stored on a disk. Part of what you will read here is also covered by "1541: Transferring data".
First there were bits....
All the data is stored as bits on the floppy. For this purpose magnetic particles on the surface of the floppy are magnetised. As long as the little magnets point in the same direction, a "0" is read. The moment the direction is reversed, a "1" is read.
And then there were bytes.
OK, we have the bits but, as you probably know, most of the time we work with bytes. A byte is made by combining 8 bits in a row. So, where should we start combining them?
The 1541 uses a counter to check at regular intervals whether there is a "0" or a "1" under the head. Another counter keeps track of how many bits have been read. The hardware is designed in such a way that reading ten "1"s in a row causes both counters to be reset. At that moment the drive also stops reading bytes. The first "0" after ten (or more) "1"s causes both counters to start counting again. From that moment on bytes are read again as well.
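The synchronisation rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the drive's hardware logic; the name `find_data_start` and the idea of holding the bit stream in a string of "0" and "1" characters are assumptions made for the example.

```python
from typing import Optional


def find_data_start(bits: str, sync_length: int = 10) -> Optional[int]:
    """Return the index of the first "0" that follows a run of at least
    sync_length "1"s, or None if no such run is found.

    Hypothetical helper: bits is a string such as "1111111111010".
    """
    ones = 0  # number of consecutive "1"s seen so far
    for index, bit in enumerate(bits):
        if bit == "1":
            ones += 1
        else:
            if ones >= sync_length:
                # This "0" restarts the counters: byte reading begins here.
                return index
            ones = 0
    return None
```

With this sketch, a stream holding only eight "1"s in a row is never mistaken for a synchronisation mark, while ten or more are.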
Two or more $FFs in a row will therefore cause the drive to stop reading. I can hear you think: "But if I store 5 $FF bytes in a row on the disk, I can read them again afterwards. How about that?"
The fact is that data to be stored on the disk is not stored "as it is". It first goes through an encoding scheme called "binary to GCR (Group Code Recording) conversion". Every nibble (= 4 bits) is turned into a block of 5 bits using the following scheme:
0000 - 01010
0001 - 01011
0010 - 10010
0011 - 10011
0100 - 01110
0101 - 01111
0110 - 10110
0111 - 10111
1000 - 01001
1001 - 11001
1010 - 11010
1011 - 11011
1100 - 01101
1101 - 11101
1110 - 11110
1111 - 10101
This scheme has two important properties:
- No combination of two nibbles results in more than eight "1"s in a row (the worst case is $5E = 0111111110). With a maximum of eight "1"s in a row, data can never reset the counters.
- No combination of two nibbles results in more than two "0"s in a row.
The second property matters because a run of "0"s contains no reversals to keep the reading in step. If you wrote 100 "0"s in a row followed by a "1" to a disk and then read these bits back with a drive that is 1% faster, you can imagine that you would probably read 99 "0"s and a "1" instead of the original 100 "0"s. I have no idea how much drive speeds vary, but I can imagine that it is more than 1%. More important is that this trick works.
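The conversion table and its two properties can be checked with a short Python sketch. The names `GCR_CODES`, `gcr_bits`, `gcr_encode` and `longest_run` are made up for this illustration. It is not the drive's own routine, only the table applied nibble by nibble, high nibble first (which is what the $5E example implies).

```python
# 5-bit GCR code for every 4-bit nibble, indexed by the nibble value.
GCR_CODES = [
    "01010", "01011", "10010", "10011",
    "01110", "01111", "10110", "10111",
    "01001", "11001", "11010", "11011",
    "01101", "11101", "11110", "10101",
]


def gcr_bits(data: bytes) -> str:
    """Return the GCR bit pattern of data as a string of "0"s and "1"s."""
    return "".join(GCR_CODES[b >> 4] + GCR_CODES[b & 0x0F] for b in data)


def gcr_encode(data: bytes) -> bytes:
    """Turn every 4 bytes of data into 5 GCR bytes."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of four bytes")
    bits = gcr_bits(data)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def longest_run(bits: str, symbol: str) -> int:
    """Return the length of the longest run of symbol in bits."""
    longest = current = 0
    for bit in bits:
        current = current + 1 if bit == symbol else 0
        longest = max(longest, current)
    return longest
```

For example, `gcr_bits(b"\x5e")` gives `"0111111110"`, the worst case mentioned in the text, and five $FF bytes never produce more than a handful of "1"s in a row once encoded.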
Bytes make sectors.
From now on it looks simple: take some bytes together and call this group a sector. Unfortunately it is not that simple. For a start: how does the drive know which sector it is dealing with? So the C= designers developed a sector made out of two blocks: the header block and the data block.
The header block
The header block actually contains eight bytes of data:
- Header Block ID
  This is always $08.
- Header Block Checksum
  This byte is found by EORing the next four bytes.
- Sector Number
- Track Number
  At first it looks illogical to store the number of the track. But remember that the 1541 has no "track 0" detector (exception: 1541C). The first thing a drive does after powering on and getting a command concerning the floppy is read a sector header to find out where the head is positioned.
- ID Character #2
  This is the second character of the ID that you specify when creating a new disk. The drive compares this and the next byte with the bytes in memory to make sure the disk has not been swapped in the meantime.
- ID Character #1
- Two $0F bytes
  These bytes complete the eight bytes, as we need (a multiple of) four bytes to create (a multiple of) five GCR bytes.
NOTE: The header block is written ONLY during the formatting process.
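As a sketch of the layout listed above, the eight header bytes (before GCR conversion) could be assembled like this in Python. The function name `build_header_block` and its parameters are hypothetical; the checksum is taken to be the EOR of the sector, track and two ID bytes, which is how "the next four bytes" reads.

```python
def build_header_block(track: int, sector: int, id1: int, id2: int) -> bytes:
    """Return the eight bytes of a header block, before GCR conversion.

    Hypothetical helper: id1 and id2 are the two characters of the disk ID.
    """
    # EOR (exclusive OR) of the four bytes that follow the checksum.
    checksum = sector ^ track ^ id2 ^ id1
    return bytes([
        0x08,        # Header Block ID
        checksum,    # Header Block Checksum
        sector,      # Sector Number
        track,       # Track Number
        id2,         # ID Character #2
        id1,         # ID Character #1
        0x0F, 0x0F,  # padding up to a multiple of four bytes
    ])
```

For track 18, sector 0 and the ID "AB" this gives the bytes $08 $11 $00 $12 $42 $41 $0F $0F.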
The data block
The data block contains 260 bytes of data:
- Data Block ID
  This is always $07.
- 256 Data Bytes
- Data Block Checksum
  This byte is found by EORing the above 256 bytes.
- Two $00 bytes
  These bytes are needed to complete the required multiple of four bytes.
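The data block can be sketched in Python in the same way. The name `build_data_block` is an assumption made for this example, and the result is the 260 bytes before GCR conversion.

```python
from functools import reduce
from operator import xor


def build_data_block(data: bytes) -> bytes:
    """Return the 260 bytes of a data block, before GCR conversion."""
    if len(data) != 256:
        raise ValueError("a data block holds exactly 256 data bytes")
    checksum = reduce(xor, data, 0)  # EOR of all 256 data bytes
    # Data Block ID, the data itself, the checksum and two padding bytes.
    return bytes([0x07]) + data + bytes([checksum, 0x00, 0x00])
```

Since 260 is a multiple of four, the block converts cleanly into 325 GCR bytes.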
Between the header and the data block we find five $FF bytes as synchronisation markers. You will find these synchronisation markers between sectors as well, but their number depends on the number of the track and the speed of the drive.
Organising the sectors
Having created the sectors does not mean that the floppy disk is now ready to be used. If we write data to the floppy, we must keep track, somewhere and somehow, of the fact that there is data on the floppy at all. For this reason some sectors are reserved for special purposes. All others can be used to store data.
The BAM
BAM means "Block Availability Map" and is found at track 18, sector 0. In general it records which sectors are still free and which are not. The following list shows the meaning of every byte:
Bytes    Content  Meaning
-------  -------  ---------------------------------------------------
0        $12      Track where first directory sector can be found
1        $01      Sector where first directory entry can be found
2        "A"      Indication of drive format; 1541/4040 in this case
3        0        Unused
4-143             Block Availability Map
144-159           Diskette name padded with shifted spaces (= $A0)
160-161  $A0      Shifted spaces
162-163           Diskette ID
164      $A0      Shifted space
165      $32      DOS version: 2
166      "A"      Format type
167-170  $A0      Shifted spaces
171-255  ?        Unused
Block Availability Map - entries
For every track four bytes are reserved. The first byte gives the number of free sectors on this track. Bits 0 to 7 of the second byte represent the state of sectors 0 to 7 of the track. Bits 0 to 7 of the third byte represent the state of sectors 8 to 15. Finally, the low bits of the fourth byte represent the state of the remaining 1 to 5 sectors of the track (sector 16 and up). If a bit is "1" the sector is still free. A "0" means allocated or nonexistent.
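A BAM entry can be decoded with a few lines of Python. This is a minimal sketch: the name `bam_entry` is made up, and it assumes that the entry for track 1 starts at byte 4 of the BAM sector, so that track t starts at byte 4 * t (which fits the 4-143 range for 35 tracks).

```python
from typing import List, Tuple


def bam_entry(bam: bytes, track: int) -> Tuple[int, List[int]]:
    """Return (free sector count, list of free sector numbers) for a track.

    bam is the 256-byte content of track 18, sector 0.
    """
    if not 1 <= track <= 35:
        raise ValueError("track must be between 1 and 35")
    offset = 4 * track  # assumption: track 1 sits at bytes 4 to 7
    free_count = bam[offset]
    # Combine the three bitmap bytes so that bit n stands for sector n.
    bitmap = bam[offset + 1] | (bam[offset + 2] << 8) | (bam[offset + 3] << 16)
    free_sectors = [s for s in range(24) if (bitmap >> s) & 1]
    return free_count, free_sectors
```

On a healthy disk the first value equals the length of the list; a mismatch suggests the BAM is damaged.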
The directory
The general pattern for a directory sector is:
Bytes    Content  Meaning
-------  -------  ---------------------------------------------------
0        $12      Track where next sector can be found
1                 Next sector
2- 31             File entry #1
32- 33   0        Unused
...
224-225  0        Unused
226-255           File entry #8
As the directory must end somewhere, byte 0 of the last sector is filled with a 0. Byte 1 is filled with $FF. This byte informs the system how many bytes of this sector are actually used. In case of a directory sector all bytes are used.
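Following the pattern above, a directory sector can be cut into its eight file entries with a small Python sketch. The name `split_directory_sector` is an assumption made for this example.

```python
from typing import List, Tuple


def split_directory_sector(sector: bytes) -> Tuple[Tuple[int, int], List[bytes]]:
    """Return ((next track, next sector), list of eight 30-byte entries)."""
    if len(sector) != 256:
        raise ValueError("a sector holds exactly 256 bytes")
    link = (sector[0], sector[1])
    # Entries start at bytes 2, 34, ..., 226 and are 30 bytes long each;
    # the two bytes in front of entries #2 to #8 are unused.
    entries = [sector[2 + 32 * i: 32 + 32 * i] for i in range(8)]
    return link, entries
```

A next track of 0 in the returned link marks the last sector of the directory.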
File entry
The entry consists of 30 bytes. The first one is the file type byte. Bits 0 to 2 determine the type of file. The use of bits 3 and 4 is unknown to me. Bit 5 indicates whether it is a replacement file (default 0). Bit 6 indicates whether the file is locked (= write protected) or not (default 0 = unlocked). Bit 7 indicates whether the file is closed (= free to use) or not (default 1 = closed).
HEX  File type                 Directory shows
---  ------------------------  ----------------------
$00  Scratched                 Does not show
$01  Unclosed sequential       *SEQ
$02  Unclosed program          *PRG
$03  Unclosed user             *USR
$04  Unclosed relative         Cannot occur
$80  Deleted                   DEL
$81  Sequential                SEQ
$82  Program                   PRG
$83  User                      USR
$84  Relative                  REL
$A0  Deleted @ replacement     DEL
$A1  Sequential @ replacement  SEQ
$A2  Program @ replacement     PRG
$A3  User @ replacement        USR
$A4  Relative @ replacement    Cannot occur
$C0  Locked deleted            DEL<
$C1  Locked sequential         SEQ<
$C2  Locked program            PRG<
$C3  Locked user               USR<
$C4  Locked relative           REL<
The next two bytes show the track and sector of the first sector of the file.
The next 16 bytes are used for storing the filename. If the name is shorter than 16 bytes, the rest is filled with shifted spaces (= $A0).
The next three bytes only have a meaning for relative files. The first two give the track and sector of the "side sector information". The third byte is the record size of each entry.
The next four bytes are unused and filled with $00.
The next two bytes are only used when saving the file using the replace option (@). During the actual saving these two bytes point to the track/sector of the first sector of the replacement. When the saving is finished, they are copied over the track/sector bytes that follow the file type byte, and are then set back to zero.
The last two bytes represent the size of the file in sectors, low byte first.
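Putting the fields of a file entry together, a 30-byte entry could be decoded as in the Python sketch below. The names `FILE_TYPES` and `parse_file_entry` are hypothetical, the offsets are counted from 0 and follow the order in which the fields are described above, and the size is assumed to be a sector count stored low byte first. The filename is left as raw bytes instead of being converted from the Commodore character set.

```python
FILE_TYPES = {0: "DEL", 1: "SEQ", 2: "PRG", 3: "USR", 4: "REL"}


def parse_file_entry(entry: bytes) -> dict:
    """Decode one 30-byte directory entry into a dictionary."""
    if len(entry) != 30:
        raise ValueError("a file entry holds exactly 30 bytes")
    type_byte = entry[0]
    return {
        "type": FILE_TYPES.get(type_byte & 0x07, "?"),  # bits 0 to 2
        "replacement": bool(type_byte & 0x20),          # bit 5
        "locked": bool(type_byte & 0x40),               # bit 6
        "closed": bool(type_byte & 0x80),               # bit 7
        "first_block": (entry[1], entry[2]),            # track, sector
        "name": entry[3:19].rstrip(b"\xa0"),            # drop $A0 padding
        "side_sector": (entry[19], entry[20]),          # REL files only
        "record_size": entry[21],                       # REL files only
        "replacement_block": (entry[26], entry[27]),    # used during @ save
        "size": entry[28] | (entry[29] << 8),           # assumed: sectors
    }
```

A type byte of $C2, for instance, decodes as a closed, locked PRG file, which the directory shows as "PRG<".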
The files and their file types
A normal file is made out of one or more sectors. Every sector has, as you already know, 256 bytes. The first two bytes of every sector point to the track/sector of the next sector. In the last sector of the file (which is also the first one if the file is just one sector long), the first byte is $00. As with a directory sector, the second byte then tells the system how many bytes of this last sector are actually used by the file.
Nothing stops you from setting up a completely different scheme for linking sectors together to make a file, so that you can use all 256 bytes. But then do NOT validate such a disk: the drive operating system relies on this two-byte linking system. Without valid link bytes, the DOS will probably free and allocate the wrong sectors.
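The two-byte link system can be followed with a short Python sketch. The name `read_file` and the `read_sector` callback (which is expected to return the 256 bytes of a given track and sector) are hypothetical. The sketch also assumes that the second byte of the last sector holds the position of the last used byte, which agrees with the value $FF for a completely used sector.

```python
from typing import Callable


def read_file(read_sector: Callable[[int, int], bytes],
              track: int, sector: int) -> bytes:
    """Collect the contents of a file by following its sector links."""
    data = bytearray()
    visited = set()
    while True:
        if (track, sector) in visited:
            raise ValueError("sector chain loops back on itself")
        visited.add((track, sector))
        block = read_sector(track, sector)
        next_track, next_sector = block[0], block[1]
        if next_track == 0:
            # Last sector: byte 1 is taken as the index of the last used byte.
            data += block[2:next_sector + 1]
            return bytes(data)
        data += block[2:]  # 254 bytes of payload in every other sector
        track, sector = next_track, next_sector
```

The loop check is there because a damaged disk can contain a chain that points back to an earlier sector.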
SEQ - sequential file
A SEQ file can contain all kinds of information, varying from text to database records to binary data. The normal way to read this file is to start at the very first byte. The normal way to alter data is to rewrite the file completely.
PRG - program
You could say that a PRG file is a SEQ file containing machine code and/or BASIC statements. The big difference is that the third and fourth bytes of the first sector store the memory address where the rest of the data has to be loaded. This is the address used when performing a LOAD"
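Reading the load address from the first sector of a PRG file takes a single line of Python. The name `prg_load_address` is made up for this sketch, and the address is assumed to be stored low byte first, as is usual on the 6502.

```python
def prg_load_address(first_sector: bytes) -> int:
    """Return the load address stored in the first sector of a PRG file."""
    # Bytes 0 and 1 are the track/sector link; the address follows them.
    return first_sector[2] | (first_sector[3] << 8)
```

A BASIC program normally starts with the bytes $01 $08 here, which gives the address $0801.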
USR - user file
The only program I know of that uses USR files is GEOS. And that is the only info I can give you. The books I read about this subject only said that the user is free to do whatever he likes with the contents of this file as long as he uses the two-byte link system.
DEL - deleted file
Even less is mentioned about this file type. My guess is that it is meant to be used as an extra phase before deleting the file when using the "replace option". I have only seen people use this file type for inserting headers, footers, separators and other nice features in their directory structure.
REL - Relative file
This is the toughest one, so I kept it for last.
Technically, a REL file consists of two files: a sequential file containing the records, and a "side sector" file containing the pointers to the sectors that hold the records. As already said, bytes 20 and 21 of a directory entry point to this "side sector" file. The track/sector bytes that follow the file type byte point to the sequential part.
A side sector is built according to the following scheme:
Bytes    Meaning
-------  -----------------------------------------------------------
0        Track where next sector can be found
1        Next sector
4- 15    T/S bytes of at most 6 side sectors (both 0 when unused)
16-255   T/S bytes of 120 data blocks (both 0 when unused)
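A side sector laid out as above can be decoded with the following Python sketch. The name `parse_side_sector` is an assumption made for this example, and bytes 2 and 3, which the scheme above does not describe, are simply ignored.

```python
from typing import List, Tuple


def parse_side_sector(sector: bytes) -> dict:
    """Split a 256-byte side sector into its track/sector pointers."""
    if len(sector) != 256:
        raise ValueError("a sector holds exactly 256 bytes")

    def pairs(start: int, end: int) -> List[Tuple[int, int]]:
        # Collect (track, sector) pairs, skipping unused ones (both 0).
        result = []
        for i in range(start, end, 2):
            pair = (sector[i], sector[i + 1])
            if pair != (0, 0):
                result.append(pair)
        return result

    return {
        "next": (sector[0], sector[1]),
        "side_sectors": pairs(4, 16),   # at most 6 entries
        "data_blocks": pairs(16, 256),  # at most 120 entries
    }
```

With 120 data blocks per side sector and at most 6 side sectors, a REL file can point to at most 720 data blocks.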