Matthew Montchalin asked:

>On 31 Oct 2002, Michael J. Mahon wrote:
>|>Can you be more explicit?  If ror <address> involves a read of a
>|>specified memory address, and then rotate some internal register
>|>+carry, before write-back, are you suggesting that it makes no
>|>difference how far into the execution we go, if the abort occurs
>|>prior to the write-back, the whole instruction is executed over
>|>again, after we get back from the Abort handler?
>|
>|I said on the _first_ reference to the memory address.  So the
>|abort would be asserted when the location was read, and before
>|any data was returned.
>
>What about memory mapped I/O?
>
>|If the memory is not present, there can be no valid data.  The
>|important part is that no internal register is updated by the
>|aborted instruction, so that it can be re-executed later.

For a virtual memory implementation, memory-mapped I/O
would normally always be mapped, and so not subject to abort.

However, since a virtual memory system maps all process
addresses, including those associated with I/O, it would be
possible to implement a virtual memory-mapped I/O system,
in which the actual mapping to a device could be done only
if the process has the "rights" to access the device--just as
with data pages.

Again, since the unmapped condition is detected without
any access to real address space, the instruction can be
aborted and transparently re-executed when the mapping
has been made.
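
A minimal sketch of that abort-and-restart behavior, written as a
Python model rather than real 65816 code. The names Machine, PageFault,
run_with_restart and the 256-byte page size are made up for
illustration; the point is only that the fault is detected on the first
memory reference, before carry or memory has changed, so the whole
instruction can simply be executed again.

```python
PAGE_SIZE = 256  # assumed page size, for illustration only


class PageFault(Exception):
    """Stands in for the Abort signal raised by hypothetical mapping hardware."""

    def __init__(self, page):
        super().__init__(f"page {page} not mapped")
        self.page = page


class Machine:
    def __init__(self):
        self.carry = 0
        self.mapped = {}  # page number -> bytearray holding that page

    def _page(self, addr):
        page = addr // PAGE_SIZE
        if page not in self.mapped:
            # Detected before any data is returned or any state is updated.
            raise PageFault(page)
        return self.mapped[page]

    def ror(self, addr):
        """Rotate the byte at addr right through carry."""
        mem = self._page(addr)  # first memory reference; may abort here
        offset = addr % PAGE_SIZE
        value = mem[offset]
        mem[offset] = (value >> 1) | (self.carry << 7)
        self.carry = value & 1


def run_with_restart(machine, addr, backing_store):
    """Execute ROR; on a fault, map the page and re-execute from the start."""
    while True:
        try:
            machine.ror(addr)
            return
        except PageFault as fault:
            # The "Abort handler": make the page present, then retry.
            machine.mapped[fault.page] = bytearray(backing_store[fault.page])
```

Because ror() touches neither carry nor memory until the page lookup
has succeeded, the retry sees exactly the state the first attempt saw.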

There are a number of systems that use virtual memory-
mapped I/O, because the OS's protection mechanism can
be enforced by the normal page table lookup.
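
As a rough illustration of protection falling out of the page table
lookup, here is a small Python sketch. The table layout, the rights
strings, and the names check_access and AccessFault are assumptions
made for this example, not any particular system's design.

```python
class AccessFault(Exception):
    """Stands in for the abort/trap raised by the mapping hardware."""


def check_access(page_table, vpage, want):
    """Return the mapping target if the process may access vpage as `want`."""
    entry = page_table.get(vpage)
    if entry is None:
        raise AccessFault(f"virtual page {vpage:#x} is not mapped")
    target, rights = entry
    if want not in rights:
        raise AccessFault(f"no '{want}' right on virtual page {vpage:#x}")
    return target


# Hypothetical per-process tables: virtual page -> (target, rights).
# Only the spooler has the printer's I/O page mapped at all.
spooler = {0x10: (("ram", 5), "rw"), 0xC0: (("io", "printer"), "w")}
editor = {0x10: (("ram", 9), "rw")}
```

Here check_access(spooler, 0xC0, "w") returns ("io", "printer"), while
the same call with the editor's table raises AccessFault, just as it
would for an unmapped data page.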

-michael

Check out 8-bit Apple sound that will amaze you on my
Home page:  http://members.aol.com/MJMahon/


Roy replied:

>Michael J. Mahon wrote:
>> Matthew Montchalin replied:
>> 
>> 
>>>On 31 Oct 2002, Michael J. Mahon wrote:
>>>|>I've heard conflicting information about just what happens when the
>>>|>Abort pin is pulled low.  Is it true that it wipes out the instruction
>>>|>currently being executed?  Someone told me that it tries to 'resume'
>>>|>the instruction when you finally RTI and get out of the Abort handler.
>>>|>Obviously a ROR <memory address> instruction, if interrupted in the
>>>|>middle of the rotation, is going to result in something odd after
>>>|>the processor resumes from the Abort.
>>>|
>>>|The intent is that it cancels the effect of the in-process instruction,
>>>|and it should work if asserted during the first memory-access
>>>|cycle of an instruction.
>>>
>>>Can you be more explicit?  If ror <address> involves a read of a
>>>specified memory address, and then rotate some internal register
>>>+carry, before write-back, are you suggesting that it makes no
>>>difference how far into the execution we go, if the abort occurs
>>>prior to the write-back, the whole instruction is executed over
>>>again, after we get back from the Abort handler?
>> 
>> 
>> I said on the _first_ reference to the memory address.  So the
>> abort would be asserted when the location was read, and before
>> any data was returned.
>> 
>> If the memory is not present, there can be no valid data.  The
>> important part is that no internal register is updated by the
>> aborted instruction, so that it can be re-executed later.
>> 
>> 
>>>|This is consistent with its intended use for a virtual memory
>>>|implementation,
>>>
>>>I was just thinking it would be more useful for implementing
>>>a copy-protecting mechanism to be used in conjunction with
>>>copy-protected software.
>> 
>> 
>> I don't know what you're thinking about here.  NMI should do
>> a fine job of interrupting a program (after the conclusion of the
>> currently-executing instruction).
>> 
>> 
>>>|where it would be used to interrupt & cancel the current instruction
>>>|if it referred to un-mapped memory--a page fault.
>>>|
>>>|After the page has been made present by an interrupt service
>>>|routine,
>>>
>>>Sure, the Abort handler itself would have to save stuff to disk,
>>>bring something else in from disk, and then restore registers
>>>and RTI.
>>>
>>>|the program would be resumed at the aborted instruction
>>>|so that it could re-execute to completion.
>>>
>>>Hmmmmmmm....  Okay.
>> 
>> 
>> This is the way that all modern data page-fault handling works.
>> 
>> Some older machines, because of their instruction semantics,
>> needed to pre-check data presence before starting execution of
>> an instruction, because changes were made to memory prior to
>> a possible data page fault--for example, a page-crossing move
>> that could not be re-started in the middle of the move.
>
>This is a bit over my head, so if I'm off, don't hesitate to correct me, 
>but, it seems to me that what you are saying is that, via a virtual 
>memory scheme, a IIgs could have much more than 16 megs of RAM 
>installed, switching banks in and out of the memory space, not too 
>dissimilar to the old LIM EMS scheme on PCs. Have I got that right?

Virtual memory is more than bank-switched physical memory.

The usual implementation of virtual memory is to allow a processor
with a large virtual address space to operate with significantly less
physical memory, with the non-resident "memory" located on disk.
This is most useful when there are multiple processes running, so
that another active process can be running while a page fault for the
faulting task is being serviced.  In this implementation, the purpose
of virtual memory is to permit a large address space to be simulated
by a smaller physical memory.  It can only work if the "active"
pages of the program are substantially fewer than the "total" pages
in its address space, so that the "working set" will fit into the
available physical memory.
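
The working-set point can be made concrete with a toy simulation. This
is only a sketch: it assumes least-recently-used replacement (the text
above does not name a policy), and count_page_faults is a made-up
helper.

```python
from collections import OrderedDict


def count_page_faults(reference_string, frames):
    """Count faults for a page reference string under LRU replacement."""
    resident = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in reference_string:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict least recently used
            resident[page] = True
    return faults


refs = [0, 1, 2] * 20  # a program cycling through three "active" pages
print(count_page_faults(refs, 3))  # 3: the working set fits
print(count_page_faults(refs, 2))  # 60: every reference faults
```

With one frame too few for the working set, every reference goes to
"disk", which is why the scheme depends on the active pages fitting in
physical memory.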

A virtual memory scheme can also support more total memory
space than the processor can simultaneously address.  In such
an approach, the processor's effective addresses are extended by,
for example, a process id, which is then mapped into the physical
memory through page tables.  This is a "segmented" virtual
memory, since the high order virtual address bits do not participate
in the processor's effective address calculations.
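
A minimal sketch of this second scheme, assuming 4 KB pages and a
24-bit processor address. The process ids, table contents, and the
function name to_physical are hypothetical.

```python
PAGE_BITS = 12  # assumed 4 KB pages
ADDR_BITS = 24  # the processor's own effective address width


def to_physical(page_tables, pid, vaddr):
    """Map (process id, 24-bit address) to a possibly wider physical address."""
    if not 0 <= vaddr < (1 << ADDR_BITS):
        raise ValueError("address outside the processor's address space")
    vpage = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    frame = page_tables[pid][vpage]  # a KeyError here stands in for a page fault
    return (frame << PAGE_BITS) | offset


# Two processes use the same 24-bit address but land in different frames.
page_tables = {
    1: {0x010: 0x0010},
    2: {0x010: 0x2000},  # frame number beyond the 24-bit physical range
}
print(hex(to_physical(page_tables, 1, 0x010ABC)))  # 0x10abc
print(hex(to_physical(page_tables, 2, 0x010ABC)))  # 0x2000abc
```

The process id never enters the processor's address arithmetic; it only
selects which page table is consulted.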

In both schemes, it is common to associate interprocess access
permissions with the page table, allowing "protected"
multiprogramming and controlled interprocess communication.

It sounds like you are thinking more of the second scheme, to
permit more real memory to be used than the processor can
directly address.  In this scheme, the physical memory may be
much greater than the processor's address space (16MB), and
paging to disk may be a non-issue.

Unfortunately, no individual program could see more memory
space than what the processor can currently address, since
all existing applications operate within the processor's address
space limits.  To use more address space than this would
require application changes, akin to explicit bank switching,
since the processor's effective address calculations extend
only to its native address space.  The primary value of applying
this approach to the IIgs would be to permit it to multiprogram
multiple applications.

Although this would be interesting, the relatively low performance
of the IIgs processor generally requires dedicating it to a single
task at a time, and there are no popular multiprogramming 
environments for the IIgs (GNO is pretty much a niche OS).

A virtual memory system involves some pretty specialized
hardware support (the address translation hardware), plus an
OS which handles the setup of that hardware and the page
and protection faults which it detects.  It seems unlikely that
anyone would invest the design and implementation time to
create these for the IIgs at this point.

-michael



Roy wrote:

>Michael J. Mahon wrote:
>> Roy replied:
<snip>
>> A virtual memory scheme can also support more total memory
>> space than the processor can simultaneously address.  In such
>> an approach, the processor's effective addresses are extended by,
>> for example, a process id, which is then mapped into the physical
>> memory through page tables.  This is a "segmented" virtual
>> memory, since the high order virtual address bits do not participate
>> in the processor's effective address calculations.
>> 
>> In both schemes, it is common to associate interprocess access
>> permissions with the page table, allowing "protected"
>> multiprogramming and controlled interprocess communication.
>> 
>> It sounds like you are thinking more of the second scheme, 
>
>Exactly.
>
>to
>> permit more real memory to be used than the processor can
>> directly address.  In this scheme, the physical memory may be
>> much greater than the processor's address space (16MB), and
>> paging to disk may be a non-issue.
>
>The LIM EMS specs 3.2 and 4.0 did this, allowing (under 3.2) an 8086 to 
>address 8 megs of RAM (4.0 was more, but I can't remember how much more 

Strictly speaking, it didn't increase the amount of memory that
the 8086 could _simultaneously_ address, but simply standardized
a bank switching scheme--much like RamWorks did for the Apple II.

At an application level, if data memory was always accessed using
certain standardized library routines, then it created the software
abstraction of a larger virtual address space (like Lissner's storage
access routines in AppleWorks).  But such a scheme does not
support all the standard memory reference instructions of the host
processor--usually only load byte/word and store byte/word.
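
A sketch of the library-routine approach, with a made-up BankedStore
class and an assumed 64K bank size. It is not how AppleWorks or any EMS
driver is actually written; it only shows why the application sees one
large address space while the hardware sees one bank at a time.

```python
class BankedStore:
    """Toy storage manager hiding bank switching behind load/store calls."""

    BANK_SIZE = 64 * 1024  # assumed bank size

    def __init__(self, banks):
        self.banks = [bytearray(self.BANK_SIZE) for _ in range(banks)]
        self.current = 0  # models the hardware bank-select register

    def _select(self, addr):
        bank, offset = divmod(addr, self.BANK_SIZE)
        self.current = bank  # the explicit bank switch
        return self.banks[bank], offset

    def load_byte(self, addr):
        mem, offset = self._select(addr)
        return mem[offset]

    def store_byte(self, addr, value):
        mem, offset = self._select(addr)
        mem[offset] = value & 0xFF
```

Only load_byte and store_byte understand the large addresses; none of
the processor's ordinary addressing modes can be applied to them
directly.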

Alternatively, it may support the idea of copying or bank-switching
a software selected region of memory into one or more "windows",
where up to a "window-full" can be accessed with full generality
(but any pointers outside the "window" will need to be handled
interpretively by the storage manager).
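
The window variant can be sketched the same way. The 16K page size,
the window base address, and the Window class are assumptions chosen
for the example.

```python
PAGE = 16 * 1024  # assumed size of one mappable page


class Window:
    """Toy model of a window whose slots are mapped to extended-memory pages."""

    def __init__(self, base, slots, extended_pages):
        self.base = base  # processor address where the window starts
        self.slots = [None] * slots  # slot -> extended page number, or None
        self.pages = [bytearray(PAGE) for _ in range(extended_pages)]

    def map(self, slot, page_number):
        """Software-selected mapping of one extended page into one slot."""
        self.slots[slot] = page_number

    def _locate(self, addr):
        slot, offset = divmod(addr - self.base, PAGE)
        if not 0 <= slot < len(self.slots) or self.slots[slot] is None:
            raise LookupError("address is not inside a mapped window slot")
        return self.pages[self.slots[slot]], offset

    def read(self, addr):
        mem, offset = self._locate(addr)
        return mem[offset]

    def write(self, addr, value):
        mem, offset = self._locate(addr)
        mem[offset] = value & 0xFF
```

Inside a mapped slot any ordinary address works, but a pointer to data
that is not currently mapped is meaningless until the storage manager
calls map() again.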

>- it also allowed multitasking with the programs and data in the EMS 
>space.) Both specs used a 64K bank in the upper one third of the 8086 
>address space as the "window" to the EMS space, allowing a program to 
>move data into and out of the EMS space (the "L" was Lotus, as some 123 
>spreadsheets were so large they couldn't fit into the 640K of a PC's 
>user's space, and, since "A-DOS" wasn't around yet, Lotus' customers 
>needed a way to have the future then), as well as moving not just data 
>but program code: overlays, and sometimes whole programs into and out of 
>the DOS space (task switching - WordPerfect's "Library" [later named 
>"Office" before any "Suites" were ever sold] would move WP's WP, 
>spreadsheet, database, and accessory programs like calculator and 
>calendar in and out with their data in a task switching mode.)

Yes, these were all handled very much like AppleWorks handles
its extended memory through its storage manager.

>> Unfortunately, no individual program could see more memory
>> space than what the processor can currently address, since
>> all existing applications operate within the processor's address
>> space limits.  To use more address space than this would
>> require application changes, akin to explicit bank switching,
>> since the processor's effective address calculations extend
>> only to its native address space.  The primary value of applying
>> this approach to the IIgs would be to permit it to multiprogram
>> multiple applications.
>> 
>> Although this would be interesting, the relatively low performance
>> of the IIgs processor generally requires dedicating it to a single
>> task at a time, 
>
>Which is why a task switching scheme would make more sense than a 
>multitasking one. Of course, currently, how many IIgs programs can one 
>fit into the 8 megs of memory space? If I had the entire Spectrum 
>Internet Suite loaded, could I also have AppleWorks GS fit in memory at 
>the same time? Can anyone conceive of loading all the programs they 
>might really want to be using, and have 8 megs not be enough RAM?

And since each of these applications could use more memory in some
circumstances, do they contain the logic to request more memory after
they have been started?  I suspect not, since the concept of using only
as much memory as you need, and requesting more as the need grows,
is a concept that comes from multiprogramming, where it is assumed
that some other task can make good use of what you are not using.

>and there are no popular multiprogramming
>> environments for the IIgs (GNO is pretty much a niche OS).
>> 
>> A virtual memory system involves some pretty specialized
>> hardware support (the address translation hardware), 
>
>That's why EMS cards weren't cheap. If done on the IIgs, I would imagine 
>that the "window" would need to be loaded in the 8 meg ROM space.

I expect that the IIgs wouldn't work if writes were done to this space.

The windows would have to come out of normal RAM space.

EMS cards were software-controlled explicit bank-switching
devices, so they only required selective re-mapping of a few high-
order address bits.  This would not have been very expensive.  On the
other hand, the large number of DRAMs on the card would account for
most of its cost.

The virtual memory support on a processor chip (TLB, etc.) is
a small, specialized associative memory.  Though it is specialized,
it does not occupy a large fraction of the chip (5% might be typical),
and so is not a primary cost determinant.  For many years, the
80386 had an on-chip TLB that went virtually (!) unused by any
Microsoft OS.

It's a lot cheaper (and faster) to build it in than to add it on externally,
as was done with the 680x0 Macs when virtual memory was introduced.

>plus an
>> OS which handles the setup of that hardware and the page
>> and protection faults which it detects.  
>
>For the PC, the EMS drivers took care of that. I would think that such 
>would also be the case with GS/OS. But it certainly wouldn't be a 
>trivial programming job.

The only way to make it truly like virtual memory would be to find a
way to reference all the memory you could ever wish for (probably more
than the 8MB supported by the IIgs hardware) and have the processor
be interrupted on any access to non-mapped memory, but allowed
to proceed at full speed if the memory were already mapped.

This would necessitate a larger-than-24-bit effective address in the
65816 and (since we're postulating a processor which does not
currently exist) a built-in TLB to efficiently handle the mapping and
interrupt when mapping fails.

Alternatively, the 65816 and IIgs could be left alone, and external
mapping hardware could support up to 8MB per application, for
several concurrently loaded applications.  This would also require
that the OS manage the mapping hardware upon task switch.

The latter is more easily done, but would offer a smaller set of
advantages.

>It seems unlikely that
>> anyone would invest the design and implementation time to
>> create these for the IIgs at this point.
>
>I think that's a realistic assessment. I was just asking "in theory."
>Thanks Michael.

It's interesting to consider "alternate realities".  ;-)

-michael
