Marvin Miller wrote:

> Are the TAG and Data rams the same chips? I understand the needs for
> different chip speeds between the two, but other than that, does the
> fact that one is called TAG and one is called DATA mean that the chips
> themselves are different?

No - apart from possible timing issues, both chips are standard 8Kx8 or 32Kx8 static RAMs. As far as functionality goes, the DATA RAM holds the actual data read from the computer's memory, while the TAG RAM holds information about the validity of the cached data.

My earlier theory was that the tag holds a bank number, with one value reserved to mean "invalid", and that the ASIC may use some additional internal RAM to store the extra address bits if you have less than 64 KB of cache installed. Another possibility is that with less than 64 KB of cache, it simply doesn't cache as many banks:

With 32 KB of cache, it would need to store one extra address bit in the tag RAM, leaving 7 bits for the bank number. Assuming it can cache fast RAM and ROM (banks $FC to $FF on a ROM 3) but not slow RAM (banks $E0 and $E1), and needs to reserve one value to mean "invalid", it would not be able to cache banks $7B through $7F (i.e. only 7.6875 MB of fast RAM would be cached).

With 16 KB of cache, it would need to store two extra address bits, leaving 6 bits for the bank number, and would not be able to cache banks $3B through $7F (i.e. only 3.6875 MB of fast RAM would be cached).

With 8 KB of cache, it would need to store three extra address bits, leaving 5 bits for the bank number, and would not be able to cache banks $1B through $7F (i.e. only 1.6875 MB of fast RAM would be cached).

It should be relatively easy to test this theory on a ZipGS with different cache sizes: write a machine code program which reads the same memory location a predetermined number of times (large enough to get a reasonably accurate figure from a stopwatch, or use the horizontal or vertical screen position to measure the period), repeat it for each bank, and note the time taken in each bank. If the reads slow down beyond a certain bank, and that boundary changes roughly in line with the numbers above, then we have a very likely theory of implementation.

The following program does the trick. It runs at $0300 under BASIC.SYSTEM (or even if you Ctrl-Reset from the "Check startup device" screen). I call it CACHETEST.
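(A note on the timebase before the listing: the program uses the horizontal video counter at $C02F as its stopwatch, since on real hardware it ticks at 1.023 MHz regardless of CPU speed. The counter reads $00 and then $40 through $7F - 65 states per scan line - which is why the code remaps $00 to $3F and adds $41 when the count wraps onto the next line. A minimal model of that conversion, my own sketch in C rather than anything in CACHETEST itself:

    /* Convert two reads of the $C02F horizontal counter into elapsed
       1 MHz cycles.  Assumes the counter reads $00 then $40-$7F (65
       states per scan line), with a vertical count bit in bit 7. */
    unsigned elapsed_cycles(unsigned first, unsigned second)
    {
        first  &= 0x7F;                  /* drop the vertical count bit  */
        second &= 0x7F;
        if (first  == 0) first  = 0x3F;  /* make the values consecutive, */
        if (second == 0) second = 0x3F;  /* running $3F through $7F      */
        if (second < first)
            second += 0x41;              /* wrapped onto the next line: +65 */
        return second - first;
    }

This only copes with a single wrap, so the stretch being measured must take under 65 cycles - which it does here even at normal speed.)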
       clc
       xce              ; Native mode
       rep  #$30        ; Go to 16-bit mode
       pha
       pha              ; Space for the result
       ldx  #$1D02
       jsl  $E10000     ; Call _TotalMem
       pla              ; Discard the low order word
       pla              ; Keep the high order word
       sep  #$30        ; Back to 8-bit mode (still native)
       tax              ; X has number of banks of RAM, including slow
       dex
       dex              ; Don't count the slow banks
       phx
clear: stz  $1000,x     ; Clear the rest of the table
       inx
       bne  clear
       sei              ; Don't interrupt me
       ldy  #$01        ; Flag: first pass to load code into cache
       ldx  #$01        ; Use bank 0 for this pass
banklp: dex             ; Move down one bank
       phx
       plb              ; Select the target data bank
       lda  $1234       ; Read an arbitrary location once - load cache
       lda  $E0C02F     ; Read the horizontal counter now
       xba              ; Hold it in B
       cmp  $1234       ; Read the same location several times
       cmp  $1234
       cmp  $1234
       cmp  $1234
       cmp  $1234
       cmp  $1234
       cmp  $1234
       cmp  $1234
       cmp  $1234
       cmp  $1234
       lda  $E0C02F     ; Read the horizontal counter again
       xba              ; Get back the initial value, hold final
       and  #$7F        ; Ignore the vertical count bit
       bne  nz1
       lda  #$3F        ; Adjust count value to be consecutive $3F-$7F
nz1:   pha
       xba              ; Get back the final value
       and  #$7F        ; Ignore the vertical count bit
       bne  nz2
       lda  #$3F        ; Repeat for the end value
nz2:   cmp  1,s         ; Did the value wrap around?
       bcs  nowrap
       adc  #$41        ; Yes - compensate
       sec              ; and set the carry for the next subtraction
nowrap: sbc  1,s        ; Get number of 1 MHz cycles
       sta  $001000,x   ; Store it in the table
       pla              ; Clean up the stack
       cpy  #$01        ; Was this the first pass?
       bne  notp1
       dey              ; Yes: go back and do it properly this time
       plx
       bra  banklp
notp1: cpx  #$00
       bne  banklp
       cli              ; Allow interrupts again
       sec
       xce              ; Emulation mode
       rts

Here it is in machine code. (Transcribed back after using the mini-assembler, and subsequently entered again by hand, so it should be right.)

300:18 FB C2 30 48 48 A2 02 1D 22 00 00 E1 68 68 E2
310:30 AA CA CA DA 9E 00 10 E8 D0 FA 78 A0 01 A2 01
320:CA DA AB AD 34 12 AF 2F C0 E0 EB CD 34 12 CD 34
330:12 CD 34 12 CD 34 12 CD 34 12 CD 34 12 CD 34 12
340:CD 34 12 CD 34 12 CD 34 12 AF 2F C0 E0 EB 29 7F
350:D0 02 A9 3F 48 EB 29 7F D0 02 A9 3F C3 01 B0 03
360:69 41 38 E3 01 9F 00 10 00 68 C0 01 D0 04 88 FA
370:80 AE E0 00 D0 AA 58 38 FB 60

The end result is a table at memory locations $1000 to $107F containing the number of 1 MHz cycles required to do the ten CMP instructions (plus a little overhead) for each bank, or $00 if that bank doesn't exist. The counts should be roughly the same for each bank, except where a cache/non-cache boundary is crossed, at which point they will jump to significantly larger values. If the IIgs were running at normal speed (1.023 MHz), I'd expect 48 cycles to elapse between the two references to the horizontal counter (ten 4-cycle absolute CMPs, the 3-cycle XBA, and the 5-cycle long LDA itself).

Now for some real tests. My system is a ROM 3 IIgs with a 4 MB memory card (5 MB total) and an 8 MHz ZipGS with 16K of cache.

At normal speed: all banks take 48 ($30) cycles.
At fast speed, Zip disabled: all banks take 20 ($14) cycles.
Zip enabled (8 MHz): banks $00 to $2F take 8 cycles; banks $30 to $4F take 14 or 15 ($0E or $0F) cycles.

I like it when my theories turn out to be more or less right. :-)

Just for a laugh, I tried this on Bernie to the Rescue. It appears to emulate the horizontal counter correctly if the Control Panel is set to "normal" speed (1 MHz): 48 cycles per loop. If the Control Panel is set to fast, it looks like the horizontal counter is tied to the emulated CPU frequency rather than 1 MHz: I get 18 or 19 cycles no matter what speed I tell Bernie to run at.

I have another IIgs with a 9 MHz/64K Zip, but it is packed away, so I can't do any further testing.
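For anyone who wants to try it without an assembler, the standard Monitor commands should do (from Applesoft with BASIC.SYSTEM running):

    CALL -151          enter the Monitor
    300:18 FB ...      key in the eight lines of the dump above
    300G               run CACHETEST
    1000.107F          display the table of cycle counts (in hex)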
Would some other people like to try this little program and provide some results? It might also be interesting to run it on TransWarp GS systems (it doesn't do anything specific to the Zip).

Now, to revise my theory slightly. It looks like a 16 K cache allows the Zip to cache 3 MB of RAM (48 banks), not 3.6875 MB (59 banks) as I expected. Further testing reveals that the Zip _is_ caching banks $E0 and $E1, as well as $FC through $FF. This leaves room for 10 unused values, one of which could be "not valid". There might be some extra details for caching the bank-switched memory areas in banks 0, 1, $E0 and $E1.

To summarise my revised theory:

With  8 KB cache, the ZipGS will only cache 1 MB of fast RAM.
With 16 KB cache, the ZipGS will cache 3 MB of fast RAM. [Confirmed]
With 32 KB cache, the ZipGS will cache 7 MB of fast RAM.
With 64 KB cache, the ZipGS should be able to cache all 8 MB.

(Mitch: this probably explains the slowdown you told me about, if you only have 32 KB of cache in your ZipGS.)
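As a sanity check on those numbers, here is a small sketch (in C; my own extrapolation, not anything from the Zip documentation) assuming the tag loses one bank bit for each halving of the cache below 64 KB, and that every configuration spends 2 tag values on slow RAM, 4 on ROM, and the same 10 spare values observed in the 16 KB case:

    #include <stdio.h>

    int main(void)
    {
        for (int cache_kb = 8; cache_kb <= 64; cache_kb *= 2) {
            int bank_bits = 8;                    /* full bank number at 64 KB */
            for (int kb = cache_kb; kb < 64; kb *= 2)
                bank_bits--;                      /* extra address bit in the tag */
            int values     = 1 << bank_bits;
            int fast_banks = values - 2 - 4 - 10; /* slow, ROM, spare values */
            if (fast_banks > 128)
                fast_banks = 128;                 /* only 8 MB of fast RAM exists */
            printf("%2d KB cache: %3d fast banks cached = %d MB\n",
                   cache_kb, fast_banks, fast_banks / 16);
        }
        return 0;
    }

This prints 1, 3, 7 and 8 MB for the four cache sizes, so the revised figures are at least self-consistent - assuming the 10 spare values hold for the other sizes.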