Disk Volumes and File Management

In this chapter, we familiarize you with the concept of a file and explain how the ProDOS file system organizes files on the disk drive medium. You need to know the details of the ProDOS file system if you want to better comprehend the internal GS/OS and ProDOS 8 file-handling commands described in Chapter 4. (GS/OS works with non-ProDOS file systems as well, but most users will be using it with disks formatted for the ProDOS file system.)

The concept of a file is fundamental to all disk operating systems. A file is just a collection of data that can define an executable program, a letter to the editor, a spreadsheet template, or any other document a program can deal with. The general structure of a file is defined by the operating system itself; the operating system also provides the various commands for accessing the file in different ways: create, open, read, write, close, destroy, rename, and so on.

NAMING FILES

When you first save a file to disk, you must assign it a unique filename that a program can use to identify it thereafter. A ProDOS filename can be up to 15 characters long. It must begin with a letter (A to Z), but the other characters may be any combination of letters, digits (0 to 9), and periods (.). You can use lowercase letters, too, but ProDOS 8 and GS/OS automatically convert them to uppercase when dealing with the ProDOS file system.

Here are some examples of valid ProDOS filenames:

FORM.LETTER
CONTRACT.3
CHAPTER.FOUR

Here are some examples of invalid filenames and the reasons they are invalid:

5.EASY.PIECES       starts with a number
EXPLORING MARS      contains an illegal space
THIS&THAT           contains an illegal &
THIRD.AND.TWELVE    too long

A common mistake that arises in naming files is the use of the space as a word separator (as in the second example). This is permitted with DOS 3.3 but not ProDOS. Periods, not spaces, must be used to separate words in a filename to improve readability.
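The naming rules above can be expressed as a short check. The following Python sketch is only an illustration of those rules, not part of ProDOS or GS/OS; the function names `normalize_prodos_name` and `is_valid_prodos_name` are hypothetical.

```python
import re

# Rules from the text: 1 to 15 characters, the first a letter (A to Z),
# the rest any mix of letters, digits (0 to 9), and periods.
_PRODOS_NAME = re.compile(r"[A-Z][A-Z0-9.]{0,14}")


def normalize_prodos_name(name: str) -> str:
    """Return the uppercase form of a valid name, or raise ValueError.

    Hypothetical helper: it mimics the automatic lowercase-to-uppercase
    conversion the text attributes to ProDOS 8 and GS/OS.
    """
    upper = name.upper() if name.isascii() else name
    if not _PRODOS_NAME.fullmatch(upper):
        raise ValueError(f"invalid ProDOS filename: {name!r}")
    return upper


def is_valid_prodos_name(name: str) -> bool:
    """Return True if the name satisfies the ProDOS naming rules."""
    try:
        normalize_prodos_name(name)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("FORM.LETTER", "5.EASY.PIECES", "EXPLORING MARS",
                      "THIS&THAT", "THIRD.AND.TWELVE"):
        print(candidate, is_valid_prodos_name(candidate))
```

A program that accepts names typed by a user could run a check like this before passing the name to an operating system command.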
Some programs, like AppleWorks, allow users to enter spaces in filenames, but they internally convert the spaces to periods before using the filenames with operating system commands.

GS/OS, of course, can work with disk volumes that have been formatted for foreign operating systems (such as Macintosh HFS, MS-DOS, and High Sierra) if the appropriate file system translator files are on the boot disk. The naming rules for these file systems are different from those for the ProDOS file system. Macintosh HFS, for example, allows names up to 31 characters long; these names can contain any printable ASCII character except the colon. Refer to the appropriate operating system reference manuals for the naming rules for other operating systems.

DIRECTORIES AND SUBDIRECTORIES

When you save a ProDOS file to disk, you can store it in any one of several directories that may have been created on the disk. These directories are analogous to file folders in that they are often used to hold groups of related files. (In fact, they are often referred to as folders instead of directories.) For example, you may create one directory to hold word processing documents, and another to hold Applesoft programs. The ability to create separate directories on the same disk makes it much easier to efficiently organize large numbers of files.

When you first format a disk, only one directory, the volume directory or root directory, exists; you name it as part of the formatting procedure. (The rules for naming directories are the same as for naming standard files.) The volume directory for a ProDOS-formatted disk can hold the names of up to 51 files (whereas a DOS 3.3 directory can hold 105 files).

You can create additional directories (called subdirectories) within the volume directory using the GS/OS or ProDOS 8 Create command. Indeed, you can even create subdirectories within subdirectories.
A subdirectory can hold the names of as many files as you wish to store in it, although at some point the disk will become full. This system of nested directories is called a hierarchical directory structure. Most modern file systems, including Macintosh HFS, MS-DOS (version 2.x and higher), and CD-ROM's High Sierra, use similar hierarchical directory structures.

To specify the directory a file is to be saved in, you normally add a special prefix to the filename to create a unique identifier called a pathname. A pathname comprises the names of a series of directories, beginning with the name of the volume directory and continuing with the names of all the directories you must pass through to reach the target directory, followed by the filename itself. Each directory name is separated from the next by a special separator character, and a separator must precede the name of the volume directory. Under GS/OS, the separator character can be either a slash (/) or a colon (:). Under ProDOS 8, it must be a slash. We use the slash as the separator character in the following discussion.

The directory names in a pathname chain must define a continuous path; that is, each directory specified must be contained within the preceding directory. For example, suppose a disk has a volume directory called BASEBALL and two subdirectories within BASEBALL called AMERICAN and NATIONAL. (Figure 2-1 shows such a directory arrangement.) If you want to save a file called NY.YANKEES in the AMERICAN subdirectory, you would specify the following pathname:

/BASEBALL/AMERICAN/NY.YANKEES

If you had specified the name NY.YANKEES itself, the file would have been saved in the current directory, which is usually the volume directory (unless it has been changed using the SetPrefix command described next).

Under GS/OS, you can specify a device name, instead of a volume directory name, when forming a pathname. Device names begin with a period (.)
and can be between 2 and 31 characters long. Examples of device names are .SCSI1, .DEV4, and .APPLEDISK3.5A. If the NY.YANKEES file in the above example is on the disk in the drive whose device name is .SCSI1, you could identify it with the following pathname instead:

.SCSI1/AMERICAN/NY.YANKEES

This technique cannot be used with ProDOS 8 because ProDOS 8 does not use device names.

As we saw above, the separator for a GS/OS pathname can be a slash or a colon, but you can't use both as separators in a single pathname. GS/OS determines what the separator is by scanning the pathname from left to right until it finds a slash or colon; the character it finds is the separator.

If the GS/OS separator is a colon, you can use slashes in GS/OS filenames, which is important if you're accessing files on a non-ProDOS disk volume through a GS/OS file system translator. (Macintosh files, for example, can include slashes.) The reverse is not true, however: If the separator is a slash, you cannot use a colon in a filename. Thus it's best to always use the colon as a pathname separator in GS/OS applications.

[Figure 2-1. The ProDOS hierarchical directory structure. The example volume BASEBALL contains the subdirectories AMERICAN and NATIONAL, each with a CHAMPS subdirectory (/BASEBALL/AMERICAN/CHAMPS and /BASEBALL/NATIONAL/CHAMPS).]

Prefixes

If most of the files you are using are in the same subdirectory, it becomes annoying to have to specify the same chain of directory names every time you want to access a file. To abate this annoyance, GS/OS and ProDOS 8 have a SetPrefix command you can use to set the chain of directory names to which any filename specified in a command will be automatically appended. The chain is the default prefix and cannot be more than 64 characters long under ProDOS 8 or 8K characters long under GS/OS.

For example, if you set the default prefix to /BASEBALL/AMERICAN/, you can refer to any file in the directory at the end of this path (such as NY.YANKEES) by filename only.
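The effect of a default prefix can be sketched in a few lines of Python. This is a simplified model of the ProDOS 8 behavior described above, using the slash separator only; the function name `resolve_pathname` is hypothetical, and the sketch ignores GS/OS device names and colon separators.

```python
MAX_PRODOS8_PREFIX = 64  # ProDOS 8 prefix limit given in the text


def resolve_pathname(default_prefix: str, name: str) -> str:
    """Return the full pathname for a name given in a command.

    Hypothetical helper: a name that begins with a slash is treated as a
    full pathname; anything else is appended to the default prefix.
    """
    if name.startswith("/"):
        # A leading separator marks a full pathname; the prefix is not used.
        return name
    if not default_prefix.endswith("/"):
        default_prefix += "/"
    if len(default_prefix) > MAX_PRODOS8_PREFIX:
        raise ValueError("default prefix exceeds 64 characters")
    return default_prefix + name


if __name__ == "__main__":
    print(resolve_pathname("/BASEBALL/AMERICAN/", "NY.YANKEES"))
```

With the prefix set to /BASEBALL/AMERICAN/, the name NY.YANKEES resolves to /BASEBALL/AMERICAN/NY.YANKEES, while a name that already starts with a slash is left unchanged.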
A name that is a continuation of the default prefix could also be specified to access files in lower-level subdirectories; such a name is called a partial pathname. If the default prefix has the value just described, and if AMERICAN contains a subdirectory called CHAMPS that contains a file called TWINS.1987, you could access the file by specifying a partial pathname of CHAMPS/TWINS.1987. Here the pathname is not preceded by a slash.

Under GS/OS (but not ProDOS 8), the default prefix also goes by the shorthand name of 0/. This means 0/ is equivalent to /BASEBALL/AMERICAN/ if you've used SetPrefix to assign /BASEBALL/AMERICAN/ to the 0/ prefix.

As Table 2-1 shows, GS/OS supports 32 different prefixes you can refer to by a number followed by a slash (0/ through 31/) and a boot prefix called */. GS/OS sets */ to the name of the disk you booted from; you cannot change */. Prefixes 1/ and 9/ identify the directory in which the current application resides, and 2/ identifies the directory containing system library files. You can change 1/, 2/, and 9/ with the GS/OS SetPrefix command, but it's probably best to leave them alone. Use the user-definable prefixes if your application needs to identify a particular directory using the convenient GS/OS shorthand notation.

ProDOS 8 prefixes can be up to 64 characters long, including the preceding slash. Partial pathnames can be up to 64 characters long as well. GS/OS has both short and long prefixes. Short prefixes (*/ and 0/ through 7/) can be up to 64 characters long and long prefixes (8/ through 31/) can be up to about 8192 characters long.

A good feature of GS/OS and ProDOS 8 is that whenever a command must locate a file described by a pathname, it searches every disk available to the system. Contrast this with the DOS 3.3 environment where you must explicitly specify the drive and slot number for the file before you can access it (using the ,S# and ,D# parameters).
BASIC.SYSTEM, for reasons of compatibility, also permits the use of the ,S# and ,D# parameters. If you specify a filename or partial pathname in a command line, and no default prefix has yet been defined, or if either the slot or drive parameter is used, BASIC.SYSTEM automatically uses the name of the volume directory for the disk in the specified slot and drive (or their defaults) to create the full pathname.

The advantages of using subdirectories are often not readily apparent to users of floppy disks but are obvious to hard disk users. Hard disks have enough room for hundreds of files. If all the files were held in one directory, you might have to wait a long time to spot your file when the disk was cataloged, and even then you could well miss it among the other files. Fortunately, the hierarchical directory structure ProDOS uses allows related files to be grouped within the same subdirectory for easy access.

FUNDAMENTAL FILE-HANDLING CONCEPTS

As we see in Chapter 4, GS/OS and ProDOS 8 both include a command interpreter that understands a variety of file-handling commands. The most common commands used with existing files are

Open     open a file for I/O operations
Read     read data from the file
Write    write data to the file
Close    close the file to I/O operations

(Four similar commands are also available from Applesoft when you are using the BASIC.SYSTEM interpreter in a ProDOS 8 environment.) Let's review each of these fundamental file-handling operations.

Table 2-1 Standard prefix numbers for GS/OS

*/            The boot prefix. This is the name of the volume GS/OS was booted from. This prefix cannot be changed by the user.
0/            The default prefix. GS/OS automatically attaches it to any filename or partial (rather than full) pathname you specify.
1/            The application prefix. The pathname of the directory containing the current application program.
2/            The system library prefix. The pathname of the directory containing library modules used by the current application. For a standard GS/OS boot disk, this is /MYDISK/SYSTEM/LIBS.
3/ to 8/      User-definable.
9/            Same as for 1/.
10/ to 31/    User-definable.

Opening a File

You must open a file before you can access it. Do this by using the Open command and specifying the name of the file you wish to open.

The operating system opens a file by first locating it on the disk and then setting up a special buffer area for it in memory. Part of the file buffer holds information that tells the operating system where the file data is located on disk; another part holds the most recently accessed portion of the file.

Whenever you request a file I/O operation, the operating system determines whether the portion of the file to be accessed is already sitting in the file buffer. If it is, the operating system does not need, nor does it bother, to access that portion of the file from the disk. Instead, it simply stores the data in the buffer (a write operation) or reads the data from the buffer (a read operation). As a result, file operations occur much more quickly than if unbuffered disk I/O techniques were used.

ProDOS 8 can open a file at one of sixteen different system file levels (numbered from 0 to 15); GS/OS supports 256 different system file levels (0 to 255). Under ProDOS 8, an application can specify the system file level by storing the level number at a particular memory location ($BF94) just before opening the file. Under GS/OS, the application must use the SetLevel command instead. The default system file level is 0.

The main advantage of having different file levels available is to make it easier to write supervisory or executive programs. These types of programs typically open their own work files, pass control to user programs, and regain control when the user programs end.
If a supervisory program bumps the file level by one before a user program takes over, its work files can't be inadvertently closed by the user program, even if the program tries to close all open files (unless the user program breaks a rule and decrements the file level).

Reading and Writing a File

When the operating system opens a file, it initializes two important internal pointers it uses for keeping track of the size of the file and the last position in the file that an application accessed. These are called the EOF and Mark pointers. See Figure 2-2.

EOF is the end-of-file pointer, and it always points to the byte after the last byte in the file. If you try to read data from the file past this position, an error occurs (the "end of data" error). EOF normally changes only if an application writes information to the end of a file; when this happens, EOF automatically increases by the appropriate number of bytes, and if necessary, the operating system allocates more blocks on the disk. But as we see in Chapter 4, GS/OS and ProDOS 8 also have a SetEOF command you can use to set EOF to any specific value.

Mark is the position-in-the-file pointer, and it always contains the position at which the next read or write operation will take place. It is set to 0 (the beginning of the file) when you first open a file, but it automatically increases as information is read from or written to the file. For example, if Mark is currently 10 (that is, it is pointing to the 11th byte in the file), and you read or write 14 more bytes of information, Mark advances to 24.

It is also possible to explicitly set Mark to any position in the file so that you can access the file randomly. This means a program can retrieve a record from a file containing fixed-length records very quickly because it is not necessary to read through all preceding records first.

Closing a File

You must close a file when you're finished dealing with it.
This ensures that any data written to the file buffer, but not yet stored on the disk itself, is actually stored on the disk. It also updates file information, such as size, in the directory.

Although it is not necessary to close a file immediately after you're finished with it (you could wait until the program is about to end), it makes good sense to do so to reduce the risk of data loss in the event of an unexpected power loss or a system reset. Moreover, ProDOS 8 allows only so many files to be open simultaneously; if you have a lot of inactive, but open, files lingering around, you could be faced with a surprising error message the next time you open a file.

Another compelling reason to close unused files is to free up memory space; each open file reserves a buffer area that is made available to the system when you close the file.

[Figure 2-2. The ProDOS 8 and GS/OS EOF and Mark pointers. Panel (b): EOF and Mark after 10 bytes of the file have been read. Panel (c): EOF and Mark after 12 bytes have been written past the end of the file (an append operation). NOTE: EOF is automatically extended.]

GS/OS DISK CACHING

To speed up disk operations like the ones described above, GS/OS supports the caching of disk blocks. The cache is an area of memory where GS/OS saves copies of disk blocks when it first reads them from disk. GS/OS also puts in the cache copies of blocks it writes to disk. Once a block is in the cache, GS/OS can quickly get it from memory whenever it needs to read the block again; GS/OS doesn't have to access the relatively slow disk drive to get it.

The user usually sets the size of the disk cache with the Disk Cache desk accessory. Like any desk accessory, Disk Cache appears in the Apple menu of most applications that use the Apple IIGS Menu Manager, including the Finder.
An application can also set the cache size by calling the GS/OS ResetCache command after saving the new cache size to Battery RAM with the WriteBParam function (see Chapter 4). Generally speaking, the larger the cache, the better GS/OS will perform, but less memory will be available to applications.

In most cases, the block cache is not large enough to hold all the blocks that GS/OS may want to cache. When the cache is full, GS/OS throws out the least recently used block to make room for the next block.

The GS/OS Read and Write commands (see Chapter 4) let you specify whether specific disk blocks are to be cached or not. Applications should try to cache blocks they expect to frequently access.

PRODOS FILE MANAGEMENT

Disk operating systems use different methods to organize files on disk and keep track of what parts of the disk are being used for data storage so that files can be easily and efficiently created, deleted, and accessed. In this section, we investigate the following topics:

• The structure of a ProDOS-formatted disk
• The structure of the ProDOS volume bit map
• The structure of ProDOS directories and subdirectories
• The structure of a ProDOS directory entry
• The indexing schemes ProDOS uses to locate files

ProDOS uses the same general method to organize files on every block-structured, mass-storage device it works with (such as an Apple 5.25 Drive, an Apple 3.5 Drive, an HD20SC, and the /RAM volume). Specific differences arise because the storage capacities of these different devices vary. Furthermore, the sizes of two important data structures stored on the media, the volume directory and the volume bit map, might be different. We generally focus on the Apple 5.25 Drive (and its 5.25-inch floppy disks) in this section; any specific differences for other devices that are not obvious will be mentioned.
FORMATTING THE DISK MEDIUM

Before you can use a floppy disk (or any other disk medium) with GS/OS or ProDOS 8, it must be formatted into a state that GS/OS or ProDOS 8 recognizes. You can format a disk with the Filer or System Utilities program on Apple's ProDOS 8 master disk or the Apple IIGS Finder. GS/OS also has a Format command that applications can use to format a disk.

The method used to format a disk depends on the nature of the disk device. When you format a 5.25-inch floppy disk, for example, templates for 35 tracks on the disk are created (numbered from 0 to 34), each of which can hold 4096 bytes of information. These tracks are arranged in concentric rings around the central hub of the disk, with track 0 at the outside edge and track 34 at the inside edge.

The operating system can access any track by causing a read/write head (located inside the disk drive) to move to the desired track. This is done using I/O locations that activate a stepping motor that controls the motion of a metal arm the read/write head is connected to. This arm moves along a radial path beginning at the outside edge of the disk (track 0) and ending at the inside edge (track 34).

Each of the 35 tracks formatted on a disk is subdivided into 16 smaller units, or sectors. A sector is the smallest unit of data that can be written to or read from the disk at one time. The sectors that make up a track are numbered from 0 to 15, and each can hold 256 bytes of information. If you do the mathematics, you will quickly determine that a disk can hold 560 sectors (140K) of information.

This is the last you'll hear about sectors, however, since ProDOS uses the 512-byte block as the basic unit of file storage; each block is made up of two disk sectors. An initialized disk is made up of 280 such blocks (numbered from 0 to 279).
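The arithmetic behind these figures is easy to check. The short Python sketch below simply multiplies out the geometry quoted in the text for a 5.25-inch disk; the constant names are illustrative only.

```python
# Geometry of a formatted 5.25-inch disk, as given in the text.
TRACKS = 35
SECTORS_PER_TRACK = 16
BYTES_PER_SECTOR = 256
BYTES_PER_BLOCK = 512  # one ProDOS block is two sectors

bytes_per_track = SECTORS_PER_TRACK * BYTES_PER_SECTOR  # 4096 bytes
total_sectors = TRACKS * SECTORS_PER_TRACK               # 560 sectors
total_bytes = total_sectors * BYTES_PER_SECTOR           # 143360 bytes
total_kilobytes = total_bytes // 1024                    # 140K
total_blocks = total_bytes // BYTES_PER_BLOCK            # 280 blocks

print(bytes_per_track, total_sectors, total_kilobytes, total_blocks)
```

The same calculation with other track and sector counts gives the capacity of any other sector-based device.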
Fortunately, it is rarely necessary to know where these blocks are actually located on the disk since the operating system disk driver subroutine automatically maps block numbers to actual physical locations on the disk.

DISK VOLUMES AND DISK DRIVES

A formatted floppy disk that is on line (placed in a system disk drive and ready to be accessed) is often called a disk volume. ProDOS-formatted volumes have names that follow the same naming rules as files, but they are often preceded with a slash (/) to make them more recognizable as volume names.

Disk drives themselves also have unique identifiers. ProDOS 8 assigns a unit number to each disk device it finds in the system. The value of the unit number is formed from the slot number of the disk drive controller card and the drive number. Figure 2-3 shows the format of the unit number byte.

In Figure 2-3, SLOT may actually be the number of a phantom, or logical, slot if the system contains nonstandard disk devices like RAMdisks. The unit number for the /RAM volume on a IIe, IIc, or IIGS is $B0, for example; in other words, /RAM is the logical slot 3, drive 2 device. DR indicates the drive number: It is 0 for drive 1 and 1 for drive 2.

More than two drives may be connected to the port 5 SmartPort. In this case, ProDOS 8 logically assigns the next two drives to slot 2, drive 1 and slot 2, drive 2. ProDOS 8 ignores all SmartPort drives after the first four.

GS/OS assigns unique device reference numbers to the disk devices (and character devices) it finds; these numbers are consecutive integers beginning with 1. It also assigns device names to each device; examples are .APPLEDISK3.5A, .SCSI1, and .DEV3. (These names can be from 2 to 31 characters long.) GS/OS does not use the unit number scheme that ProDOS 8 uses. (See Chapter 7 for more detailed information on disk devices and naming conventions.)

DISK VOLUME BLOCK USAGE

We are now ready to examine the method ProDOS uses to manage files on a disk.
Our discussion includes an analysis of the structures of the directories that hold information about files, of the volume bit map that keeps track of block usage on the disk, and of the index blocks that contain the locations of the data blocks each file uses. But before we continue, keep in mind that the following descriptions relate only to the ProDOS file system and not to its predecessor, DOS 3.3, the Apple Pascal file system, or any other foreign operating system.

Figure 2-3 The format of a ProDOS 8 unit number byte

Bit:     7     6 5 4     3 2 1 0
         DR    SLOT      [Unused]

As we have seen, a total of 280 blocks, holding 140K of data, are available on a ProDOS-formatted 5.25-inch disk. If a standard disk-formatting program is used, however, seven of these blocks (0-6) are not available for use by files because ProDOS reserves them for special purposes. Figure 2-4 shows the usage of blocks on freshly formatted 5.25- and 3.5-inch disks.

Blocks 0 and 1 contain a short assembly-language program that the firmware on the drive controller card loads into memory and executes whenever it boots a disk. This program is called the boot record, and it locates, loads, and executes a special system file called PRODOS if it finds it on the disk. (A system file has a file type code of $FF and a CATALOG mnemonic of SYS. We discuss file type codes later in this chapter.) PRODOS is the program ultimately responsible for installing and activating the operating system. (See Chapter 3.)

Blocks 2 through 5 are the blocks containing the volume directory for the disk. We describe the structure of this directory later in this chapter.

Block 6 is the first volume bit map block for the disk. Each bit in the map indicates whether the block it corresponds to is free or in use. ProDOS reserves one bit map block for each 2MB (4096 blocks) of storage space.
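The one-bit-map-block-per-4096-blocks rule can be turned into a small calculation. The Python sketch below assumes the standard layout described above (boot record in blocks 0 and 1, volume directory in blocks 2 through 5, bit map starting at block 6); the function names are hypothetical and the code is an illustration, not a ProDOS routine.

```python
BLOCK_SIZE = 512
BLOCKS_PER_BITMAP_BLOCK = BLOCK_SIZE * 8  # one bit per block: 4096 blocks
FIRST_BITMAP_BLOCK = 6                    # conventional location after formatting


def bitmap_blocks_needed(total_blocks: int) -> int:
    """Number of bit map blocks needed to track a volume of this size."""
    # Ceiling division: any partly used bit map block still takes a block.
    return -(-total_blocks // BLOCKS_PER_BITMAP_BLOCK)


def blocks_free_for_files(total_blocks: int) -> int:
    """Blocks left for files on a freshly formatted volume (standard layout)."""
    reserved = FIRST_BITMAP_BLOCK + bitmap_blocks_needed(total_blocks)
    return total_blocks - reserved


if __name__ == "__main__":
    for size in (280, 1600):
        print(size, bitmap_blocks_needed(size), blocks_free_for_files(size))
```

For the 280-block and 1600-block disks, a single bit map block is enough, so seven blocks in all are reserved on each.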
The blocks past the end of the bit map block (or blocks), a total of 273 for a 5.25-inch disk or 1593 for a 3.5-inch disk, are free for use by files stored on the disk.

THE VOLUME BIT MAP

The operating system accesses the volume bit map to determine the status of each block on the disk. It reads the bit map whenever it allocates new space to a file so that it can quickly locate free blocks on the disk. It writes to the bit map to reserve new file blocks (this occurs when an existing file grows or a new one is saved) or to free up blocks (this occurs when a file shrinks or is deleted).

Standard formatting routines use block 6 as the first block for a disk's volume bit map. But block 6 is only the conventional location for the bit map; it is permissible to store the map in any free block on the disk. For example, the volume bit map for the /RAM volume is in block 3. As we see in the next section, the block number for the first bit map block appears in the directory header that describes the characteristics of the disk volume.

For a 5.25-inch disk, only the first 35 bytes (280 bits) in the volume bit map block are used, and each bit in each byte corresponds to a unique block number. A one-block bit map such as this can handle volumes of up to 4096 blocks. For larger volumes, like a hard disk, a continuation of the bit map can be found in the blocks on the disk immediately following the first one used. For example, the old 9728-block Apple ProFile hard disk requires three blocks for its bit map; the standard formatting program stores the first part of the map in block 6 and the continuation in blocks 7 and 8. (The operating system determines the size of the volume bit map by examining 2 bytes in the volume directory header that hold the size of the disk; the program used to format the disk places them there. We look at volume directory headers later in this chapter.)

[Figure 2-4. Map of block usage on a 5.25-inch disk and a 3.5-inch disk. Each block holds 512 bytes. Regions shown: boot record; volume directory; start of the volume bit map; continuation of the volume bit map (one block for each 2MB of storage). Total storage capacity is 280 blocks (140K) for a 5.25-inch disk and 1600 blocks (800K) for a 3.5-inch disk.]

Figure 2-5 shows the structure of the volume bit map for 5.25-inch disks. As you can see, the bits in each byte in the bit map block reflect the states of eight contiguous blocks; bit 0 corresponds to the highest-numbered block in the octet and bit 7 to the