Discussion:
virtual memory
Anne & Lynn Wheeler
2006-05-12 04:00:00 UTC
There are (at least) two overlapping meanings of the phrase "virtual
memory" here: a virtual (i.e., non-real) memory address and a
virtual eXtension ("X" as in VAX) of memory out to disk. Most
people seem to use the latter meaning.
The first, virtual memory addressing (dividing up of RAM into fixed
sized pages) is in most cases a big win: drastic decrease in memory
fragmentation.
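A toy sketch (in Python, with invented names) of why fixed-size paging removes external fragmentation: any free frame, wherever it happens to sit in real storage, can back any virtual page, so a "contiguous" virtual region never needs contiguous real memory.

```python
# Toy illustration: with fixed-size pages, any free frame can back any
# virtual page, so external fragmentation disappears (at the cost of a
# little internal fragmentation in the last page of each region).

PAGE_SIZE = 4096

class AddressSpace:
    def __init__(self):
        self.page_table = {}          # virtual page number -> physical frame

    def map(self, vpn, frame):
        self.page_table[vpn] = frame

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.page_table[vpn] * PAGE_SIZE + offset

free_frames = [7, 2, 9]               # scattered free frames, non-contiguous
asid = AddressSpace()
for vpn in range(3):                  # a 3-page "contiguous" virtual region
    asid.map(vpn, free_frames.pop())

# virtual addresses are contiguous even though frames 9, 2, 7 are not
print(asid.translate(0), asid.translate(4096), asid.translate(8192))
```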
Extending RAM out to disk pages adds all the cute PhD thesis project
benchmarkable optimizations: page size, replacement algorithms,
thrash minimization, etc. In an early attempt at compiling TeX on a Sun,
the person finally shut down the machine after 36 hours, noting that
the paging daemon was taking up more and more of the cpu time and
eventually might take all of it.
the post did describe a case where virtual memory was used
to address a fragmentation problem
http://www.garlic.com/~lynn/2006i.html#22 virtual memory

some subsequent drift on this thread in a.f.c.
http://www.garlic.com/~lynn/2006i.html#23 virtual memory implementation in S/370
http://www.garlic.com/~lynn/2006i.html#24 virtual memory implementation in S/370

i had done a bunch of the paging stuff as an undergraduate in the 60s
... which was picked up and shipped in the cp67 product.
http://www.garlic.com/~lynn/subtopic.html#wsclock

decade plus later there was some conflict around somebody's stanford
phd on global lru replacement (work that i had done as an undergraduate in
the 60s) vis-a-vis local lru replacements. some past posts mentioning
the conflict.
http://www.garlic.com/~lynn/93.html#4 360/67, was Re: IBM's Project F/S ?
http://www.garlic.com/~lynn/94.html#49 Rethinking Virtual Memory
http://www.garlic.com/~lynn/2001c.html#10 Memory management - Page replacement
http://www.garlic.com/~lynn/2002c.html#49 Swapper was Re: History of Login Names
http://www.garlic.com/~lynn/2003k.html#8 z VM 4.3
http://www.garlic.com/~lynn/2003k.html#9 What is timesharing, anyway?
http://www.garlic.com/~lynn/2004g.html#13 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004q.html#73 Athlon cache question
http://www.garlic.com/~lynn/2005d.html#37 Thou shalt have no other gods before the ANSI C standard
http://www.garlic.com/~lynn/2005d.html#48 Secure design
http://www.garlic.com/~lynn/2005f.html#47 Moving assembler programs above the line
http://www.garlic.com/~lynn/2005n.html#23 Code density and performance?
http://www.garlic.com/~lynn/2006f.html#0 using 3390 mod-9s

i had also done this stuff with dynamic adaptive scheduling
(scheduling policies included fair share) ... and scheduling to the
bottleneck
http://www.garlic.com/~lynn/subtopic.html#fairshare

much of it was dropped in the morph from cp67 to vm370 ... but
i was allowed to reintroduce it as the "resource manager"
which became available 11may76
http://www.garlic.com/~lynn/2006i.html#26 11may76, 30 years, (re-)release of resource manager

around this time, i was starting to notice the decline of relative
disk system performance ... and significant increase in amount of
available real storage ... and being able to start to use real storage
caching to compensate for the decline in relative disk system
performance

i started making some comments about it and the disk division
eventually assigned their performance group to refute the comments.
the performance group came back and observed that i had slightly
understated the problem.

misc. past posts on the subject:
http://www.garlic.com/~lynn/93.html#31 Big I/O or Kicking the Mainframe out the Door
http://www.garlic.com/~lynn/94.html#43 Bloat, elegance, simplicity and other irrelevant concepts
http://www.garlic.com/~lynn/94.html#55 How Do the Old Mainframes Compare to Today's Micros?
http://www.garlic.com/~lynn/95.html#10 Virtual Memory (A return to the past?)
http://www.garlic.com/~lynn/98.html#46 The god old days(???)
http://www.garlic.com/~lynn/99.html#4 IBM S/360
http://www.garlic.com/~lynn/99.html#112 OS/360 names and error codes (was: Humorous and/or Interesting Opcodes)
http://www.garlic.com/~lynn/2001d.html#66 Pentium 4 Prefetch engine?
http://www.garlic.com/~lynn/2001f.html#62 any 70's era supercomputers that ran as slow as today's supercomputers?
http://www.garlic.com/~lynn/2001f.html#68 Q: Merced a flop or not?
http://www.garlic.com/~lynn/2001l.html#40 MVS History (all parts)
http://www.garlic.com/~lynn/2001l.html#61 MVS History (all parts)
http://www.garlic.com/~lynn/2001m.html#23 Smallest Storage Capacity Hard Disk?
http://www.garlic.com/~lynn/2002.html#5 index searching
http://www.garlic.com/~lynn/2002b.html#11 Microcode? (& index searching)
http://www.garlic.com/~lynn/2002b.html#20 index searching
http://www.garlic.com/~lynn/2002e.html#8 What are some impressive page rates?
http://www.garlic.com/~lynn/2002e.html#9 What are some impressive page rates?
http://www.garlic.com/~lynn/2002i.html#16 AS/400 and MVS - clarification please
http://www.garlic.com/~lynn/2003i.html#33 Fix the shuttle or fly it unmanned
http://www.garlic.com/~lynn/2004n.html#22 Shipwrecks
http://www.garlic.com/~lynn/2004p.html#39 100% CPU is not always bad
http://www.garlic.com/~lynn/2005h.html#13 Today's mainframe--anything to new?
http://www.garlic.com/~lynn/2005k.html#53 Performance and Capacity Planning
http://www.garlic.com/~lynn/2005n.html#29 Data communications over telegraph circuits
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
Anne & Lynn Wheeler
2006-05-12 22:04:44 UTC
Don't forget the single-address-space systems, of which the iSeries
(nee AS/400) is perhaps the most commercially successful example. Just
because process-per-address space is fashionable, doesn't mean it's
the only viable approach.
precursor to as/400 was the s/38 ... which folklore has as having
been built by a bunch of future system people taking refuge in rochester
after FS was killed. reference to the future system effort earlier
in this thread.
http://www.garlic.com/~lynn/2006i.html#32 virtual memory

misc. collected postings referencing FS. I didn't make
myself particularly popular with the FS crowd at the time,
drawing some parallel between their effort and a cult film
that had been playing non-stop for several years down in
central sq.
http://www.garlic.com/~lynn/subtopic.html#futuresys

the transition of as/400 from cisc architecture to power/pc ...
involved a lot of haggling during the 620 chip architecture days
... with rochester demanding a 65th bit to be added to the 64bit
virtual addressing architecture (they eventually went their own way).

a few past posts mentioning 620, 630 and some of the other power/pc
activities:
http://www.garlic.com/~lynn/2000d.html#60 "all-out" vs less aggressive designs (was: Re: 36 to 32 bit transition)
http://www.garlic.com/~lynn/2001i.html#24 Proper ISA lifespan?
http://www.garlic.com/~lynn/2001i.html#28 Proper ISA lifespan?
http://www.garlic.com/~lynn/2001j.html#36 Proper ISA lifespan?
http://www.garlic.com/~lynn/2004q.html#40 Tru64 and the DECSYSTEM 20
http://www.garlic.com/~lynn/2005q.html#40 Intel strikes back with a parallel x86 design
http://www.garlic.com/~lynn/2005r.html#11 Intel strikes back with a parallel x86 design

i've periodically mused that the migration of as/400 to power/pc was
somewhat fort knox reborn. circa 1980 there was an effort to migrate a
large number of the internal microprocessors to 801 chips. one of
these was to have been the 370 4341 follow-on. I managed to contribute to
getting that effort killed ... at least so far as the 4381 was
concerned. misc. collected posts mentioning 801, fort knox, romp,
rios, somerset, power/pc, etc
http://www.garlic.com/~lynn/subtopic.html#801

for misc. other lore, the executive we had been reporting to when we
started the ha/cmp product effort ... moved over to head up somerset
... when somerset was started (i.e. the apple, motorola, ibm, et al
effort for power/pc).
http://www.garlic.com/~lynn/subpubtopic.html#hacmp

the initial port of os/360 (real memory) mvt to 370 virtual memory was
referred to as os/vs2 SVS (single virtual storage).

the original implementation was an mvt kernel laid out in a 16mbyte
virtual memory (somewhat akin to mvt running in 16mbyte virtual
machine) with virtual memory and page handler crafted onto the side
... and CCWTRANS borrowed from cp67.

the os/360 genre had a real memory orientation with heavy dependency on
pointer passing in the majority of the APIs ... being able to process
any kind of service request required directly addressing the parameter
list pointed to by the passed pointer. this was, in part, a big part of
having a single address space for os/360 operation. The application
paradigm involving I/O was largely dependent on direct transfer from/to
application allocated storage. Application and library code would
build I/O programs with the "real" address locations that were
assigned to the application. In the transition to a virtual memory
environment, the majority of application I/O still involved passing
pointers to these application-built I/O programs containing
"application" allocated storage addresses. In the real address world,
the kernel would set up some I/O permission restrictions and then
transfer control directly to the application I/O program. In the
virtual address space world ... all of these application I/O programs
were now specifying virtual addresses ... not real addresses. CP67's
kernel "CCWTRANS" handled the building of "shadow" I/O program copies
... fixing the required virtual pages in real storage and translating
all of the supplied virtual addresses into real addresses for execution of
the "shadow" I/O programs.
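A minimal sketch of what a CCWTRANS-style translation has to do, in Python with invented names and structures (the real thing operated on S/370 channel command words): copy the guest's channel program, pin the referenced virtual pages, and rewrite each buffer address as a real address, since the channel hardware bypasses address translation.

```python
# Hypothetical sketch of shadow channel-program building: a guest-built
# I/O "channel program" carries virtual buffer addresses; the kernel
# builds a shadow copy with the pages pinned and addresses translated
# to real. All names here are invented for illustration; for simplicity
# each buffer is assumed page-aligned and within a single page.

PAGE = 4096

def translate_ccws(ccws, page_table, pin):
    """ccws: list of (opcode, virtual_addr, length) tuples.
    page_table: virtual page number -> real frame number.
    pin: set collecting pages that must stay resident during the I/O."""
    shadow = []
    for op, vaddr, length in ccws:
        vpn, off = divmod(vaddr, PAGE)
        frame = page_table[vpn]       # (would fault the page in if needed)
        pin.add(vpn)                  # keep it resident while I/O runs
        shadow.append((op, frame * PAGE + off, length))
    return shadow

page_table = {0: 12, 1: 3}
pinned = set()
guest_program = [("READ", 0, 4096), ("READ", 4096, 4096)]
shadow = translate_ccws(guest_program, page_table, pinned)
print(shadow)    # virtual 0 and 4096 become real 49152 and 12288
print(pinned)    # pages 0 and 1 are pinned for the duration
```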

recent posts about CCWTRANS and shadow I/O programs
http://www.garlic.com/~lynn/2006b.html#25 Multiple address spaces
http://www.garlic.com/~lynn/2006f.html#5 3380-3390 Conversion - DISAPPOINTMENT

SVS evolved into MVS ... there was a separate address space for every
application. However, because of the heavy pointer passing paradigm,
the "MVS" kernel actually occupied 8mbytes of every application
16mbyte virtual address space. There were some additional hacks
required. There were some number of things called subsystems that were
part of most operational environments. They existed outside of the
kernel (in their own virtual address space) ... but in the MVT & SVS
worlds, other applications were in the habit of directly calling these
subsystem functions using pointer passing paradigm ... which required
the subsystems (which now were in separate address space) to directly
access the calling application's parameters in the application's
virtual address space.

The initial solution was something called a "COMMON" segment, a (again
initially) 1mbyte area of every virtual address space where
applications could stuff parameter values that they needed to be
accessed by called subsystems, resident in other address spaces. Over
time, as customer installations added a large variety of subsystems,
it was not unusual to find the COMMON segment taking up five megabytes.
While these were MVS systems, with a unique 16mbyte virtual address
space for every application, the kernel image was taking 8mbytes out
of every virtual address space, and with a five megabyte COMMON area,
that would leave a maximum of 3mbytes for application use (out of a
16mbyte virtual address space).

Dual-address space mode was introduced in the late 70s with 3033
processor (to start to alleviate this problem caused by the extensive
use of pointer passing paradigm). This provided virtual address space
modes ... a subsystem (in its own virtual address space) could be
called with a pointer to parameters in the application address
space. The subsystem had facilities that allowed it to "reach" into
other virtual address spaces. A call to one of these subsystems still
required passing through the kernel to swap virtual address space
pointers ... and some other gorp.

recent mention of some connection between dual-address space and
itanium
http://www.garlic.com/~lynn/2006.html#39 What happens if CR's are directly changed?
http://www.garlic.com/~lynn/2006b.html#28 Multiple address spaces

Along the way there was a desire to move more of the operating system
library stuff (that resided as part of the application code) into its
own address spaces. So dual-address space was generalized to multiple
address space and a new hardware facility was created called "program
call". It was attempting to achieve the efficiency of a branch-and-link
instruction calling some library code, with the structured protection
mechanisms required to switch virtual address spaces by passing through
privileged kernel code. the privileged "program call" hardware table had a bunch of
permission specification controls ... including which collection of
virtual address space pointers could be moved into the various access
registers. 31-bit virtual addressing was also introduced.

today there are all sorts of combinations of 24-bit virtual
addressing, 31-bit virtual addressing, 64-bit virtual addressing
... as well as possibly several such virtual address spaces being
accessible concurrently.

3.8 Address spaces ... some overview including discussion about
multiple virtual address spaces:
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/3.8?SHELF=EZ2HW125&DT=19970613131822

2.3.5 Access Registers ... discussion of how access registers 1-15
can designate any address space
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/2.3.5?SHELF=EZ2HW125&DT=19970613131822

10.26 Program Call
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/10.26?SHELF=EZ2HW125&DT=19970613131822

10.27 Program Return
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/10.27?SHELF=EZ2HW125&DT=19970613131822

10.28 Program Transfer
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/10.28?SHELF=EZ2HW125&DT=19970613131822
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
b***@myrealbox.com
2006-05-23 18:29:33 UTC
In article <***@lhwlinux.garlic.com>,
Anne & Lynn Wheeler <***@garlic.com> wrote:

[ snip ]
Post by Anne & Lynn Wheeler
SVS evolved into MVS ... there was a separate address space for every
application. However, because of the heavy pointer passing paradigm,
the "MVS" kernel actually occupied 8mbytes of every application
16mbyte virtual address space. There were some additional hacks
required. There were some number of things called subsystems that were
part of most operational environments. They existed outside of the
kernel (in their own virtual address space) ... but in the MVT & SVS
worlds, other applications were in the habit of directly calling these
subsystem functions using pointer passing paradigm ... which required
the subsystems (which now were in separate address space) to directly
access the calling application's parameters in the application's
virtual address space.
The initial solution was something called a "COMMON" segment, a (again
initially) 1mbyte area of every virtual address space where
applications could stuff parameter values that they needed to be
accessed by called subsystems, resident in other address spaces. Over
time, as customer installations added a large variety of subsystems,
it was not unusual to find the COMMON segment taking up five megabytes.
While these were MVS systems, with a unique 16mbyte virtual address
space for every application, the kernel image was taking 8mbytes out
of every virtual address space, and with a five megabyte COMMON area,
that would leave a maximum of 3mbytes for application use (out of a
16mbyte virtual address space).
(Just now working my way through this thread; great stuff .... )

Was there an acronym/initialism for this COMMON area? My memory of
doing junior-level systems work on MVS systems is telling me that
there was one, but not what. ?

[ snip ]
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.
Anne & Lynn Wheeler
2006-05-23 19:17:43 UTC
Post by b***@myrealbox.com
Was there an acronym/initialism for this COMMON area? My memory of
doing junior-level systems work on MVS systems is telling me that
there was one, but not what. ?
"common segment" ... having started out as a 1mbyte "shared" segment
in every address space.

the hardware table look-aside buffers (TLBs) were STO (virtual address
space) associative. the result was that there were unique entries for
the same common segment entries from different virtual address spaces.

in the early 80s, some of the high-end hardware started adding special
TLB treatment for the MVS "common segment" ... so that there would
only be one set of TLB entries for MVS common segment areas across all
MVS address spaces. However, this references a "bug" when MVS was
running as a virtual guest ... and a temporary fix ... pending
availability of MVS APAR.
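A toy model (Python, invented names) of why an STO-associative TLB holds duplicate entries for the common segment: each entry is tagged with the address-space (STO) identifier, so the same shared virtual page costs one entry per address space that touches it, while the special common-segment treatment tags those pages globally.

```python
# Toy model of STO-associative TLB behavior. Entries are tagged
# (address-space id, virtual page number), so a page shared by N address
# spaces occupies N entries. The "common segment" hardware hack instead
# gives such pages a single global tag. Purely illustrative.

def fill_tlb(accesses, common_vpns=frozenset()):
    tlb = set()
    for sto, vpn in accesses:
        if vpn in common_vpns:
            tlb.add(("GLOBAL", vpn))   # one shared entry for all spaces
        else:
            tlb.add((sto, vpn))        # one entry per address space
    return tlb

# three address spaces all touching the same common-segment page 0x700
accesses = [(sto, 0x700) for sto in ("A", "B", "C")]
print(len(fill_tlb(accesses)))                        # 3 duplicate entries
print(len(fill_tlb(accesses, common_vpns={0x700})))   # 1 shared entry
```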

Date: 18 February 1983, 12:30:13 EST
From: xxxx

Hi - the 'official' fix to the STBVR 0Cx abend problem is MVS APAR/
PTF OZ67587. I don't know if it's available yet. In the meantime,
the zap works just fine.

I'll send the zap following this note.

//xxxxx JOB (6007,X003),xxxxxxxx,MSGLEVEL=1,MSGCLASS=O,CLASS=B,
// REGION=1024K,NOTIFY=PREISM
//ZAP EXEC PGM=AMASPZAP
//SYSPRINT DD SYSOUT=*
//SYSLIB DD DSN=SYS1.NUCLEUS,DISP=SHR,UNIT=3330,VOL=SER=D00126
**
** FOR VM MVS GUEST, STBVR, TURN OFF USE OF COMMON SEGMENTS
**
NAME IEAVNPX1 IEAVNPX1
VER 0DA2 96026003 OI SGTCB=1
REP 0DA2 47000000 NOP
VER 0DCE 96026003 OI SGTCB=1
REP 0DCE 47000000 NOP
//
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
b***@myrealbox.com
2006-05-23 21:03:06 UTC
Post by Anne & Lynn Wheeler
Post by b***@myrealbox.com
Was there an acronym/initialism for this COMMON area? My memory of
doing junior-level systems work on MVS systems is telling me that
there was one, but not what. ?
"common segment" ... having started out as a 1mbyte "shared" segment
in every address space.
the hardware table look-aside buffers (TLBs) were STO (virtual address
space) associative. the result was that there were unique entries for
the same common segment entries from different virtual address spaces.
in the early 80s, some of the high-end hardware started adding special
TLB treatment for the MVS "common segment" ... so that there would
only be one set of TLB entries for MVS common segment areas across all
common segment area, common segment area ....

Aha! "CSA" is the acronym/initialism I was trying to remember.
(A quick Google search turns up several other alleged expansions --
"common system area", "common storage area", "common service area" --
but maybe we shouldn't worry about that.)

Thanks.
Post by Anne & Lynn Wheeler
MVS address spaces. However, this references a "bug" when MVS was
running as a virtual guest ... and a temporary fix ... pending
availability of MVS APAR.
Date: 18 February 1983, 12:30:13 EST
From: xxxx
Hi - the 'official' fix to the STBVR 0Cx abend problem is MVS APAR/
PTF OZ67587. I don't know if it's available yet. In the meantime,
the zap works just fine.
I'll send the zap following this note.
//xxxxx JOB (6007,X003),xxxxxxxx,MSGLEVEL=1,MSGCLASS=O,CLASS=B,
// REGION=1024K,NOTIFY=PREISM
//ZAP EXEC PGM=AMASPZAP
//SYSPRINT DD SYSOUT=*
//SYSLIB DD DSN=SYS1.NUCLEUS,DISP=SHR,UNIT=3330,VOL=SER=D00126
**
** FOR VM MVS GUEST, STBVR, TURN OFF USE OF COMMON SEGMENTS
**
NAME IEAVNPX1 IEAVNPX1
VER 0DA2 96026003 OI SGTCB=1
REP 0DA2 47000000 NOP
VER 0DCE 96026003 OI SGTCB=1
REP 0DCE 47000000 NOP
//
--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.
Anne & Lynn Wheeler
2006-05-13 16:15:09 UTC
VMS and WNT use Second Chance Fifo, which has very different behavior
to strict Fifo, and is reputed to have the same behavior as WSClock.
VMS also has an option for a third chance - I don't know if WNT
also has that. This gives them all the control advantages that
local working sets allow with the same paging statistics as global.
In second chance fifo, pages removed from a local working set are
tossed into a global Valid list to become a candidate for recycling.
If referenced again quickly the page is pulled back into the local
working set for almost no cost. This is essentially the same as
the WSClock and its referenced bits.
In 3rd chance, VMS allows a page to make 2 trips through the working
set list. After the first trip a flag is set on the working set entry,
it goes to the tail of the list, and the PTE's valid flag is cleared.
If it gets touched again then the handler just enables the PTE.
When it gets to the head of the list again the PTE is checked to
see if it was referenced. If it was, it cycles again, otherwise
it goes into the global Valid list. [1]
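The second-chance scan described above can be sketched in a few lines of Python (data structures invented for illustration; the real VMS lists live in the kernel): a page leaving the FIFO with its referenced bit set gets the bit cleared and goes back to the tail, and only an unreferenced page becomes an eviction candidate.

```python
from collections import deque

# Minimal second-chance FIFO sketch, per the description above.

def second_chance_evict(fifo, referenced):
    """fifo: deque of page ids, head = oldest. referenced: set of page
    ids touched since their bit was last cleared. Returns the evicted
    page (in VMS terms, the one moved to the global Valid list)."""
    while True:
        page = fifo.popleft()
        if page in referenced:
            referenced.discard(page)   # clear the bit: second chance
            fifo.append(page)          # recycle to the tail
        else:
            return page                # candidate for replacement

fifo = deque([1, 2, 3])
referenced = {1, 2}
print(second_chance_evict(fifo, referenced))  # 3: oldest unreferenced page
print(list(fifo))                             # 1 and 2 remain, bits cleared
```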
as mentioned in the previous post
http://www.garlic.com/~lynn/2006i.html#22 virtual memory
http://www.garlic.com/~lynn/2006i.html#23 Virtual memory implementation in S/370
http://www.garlic.com/~lynn/2006i.html#24 Virtual memory implementation in S/370
http://www.garlic.com/~lynn/2006i.html#28 virtual memory
http://www.garlic.com/~lynn/2006i.html#30 virtual memory
http://www.garlic.com/~lynn/2006i.html#31 virtual memory
http://www.garlic.com/~lynn/2006i.html#32 virtual memory
http://www.garlic.com/~lynn/2006i.html#33 virtual memory
http://www.garlic.com/~lynn/2006i.html#36 virtual memory
http://www.garlic.com/~lynn/2006i.html#37 virtual memory
http://www.garlic.com/~lynn/2006i.html#38 virtual memory
http://www.garlic.com/~lynn/2006i.html#39 virtual memory
http://www.garlic.com/~lynn/2006i.html#40 virtual memory
http://www.garlic.com/~lynn/2006i.html#41 virtual memory
http://www.garlic.com/~lynn/2006i.html#42 virtual memory

some of the work that i had done in the 60s as an undergraduate for
cp67 ... had been dropped in the morph from cp67 to vm370. i was able
to rectify that with the resource manager released 11may1976.

the cp67 "clock" scanned pages by their real storage address.
basically the idea behind "clock" resetting and testing the reference
bit is that the time it takes to cycle completely around all virtual
pages represents approximately the same reset-to-test interval for
all pages.

one of the advantages of clock type implementation that i had done in
the 60s was that it had some interesting dynamic adaptive stuff. if
there weren't enuf pages, the replacement algorithm would be called
more frequently ... causing it to cycle through more pages faster. as
it did the cycle quicker ... there was shortened time between the time
a page had its reference reset and then tested again. with the
shortened cycle time, there tended to be more pages that hadn't had a
chance to be used and therefore have their reference bit turned on
again. as a result, each time the selection was called on a page
fault, fewer pages had to be examined before finding one w/o its
reference set. if the avg. number of pages examined per page fault
was reduced ... then it increased the total time to cycle through
all pages (allowing more pages to have a chance to be used and
have their reference bit set).
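The clock mechanism described above can be sketched as follows (Python, invented names; the cp67 original scanned real storage in hardware address order). The hand sweeps frames, clearing reference bits; a frame still clear on the next visit has gone roughly a full revolution untouched and is replaced. The adaptive feedback in the text falls out naturally: under pressure the hand moves faster, fewer bits get set again, and per-fault scans shorten.

```python
# Sketch of the clock approximation to LRU: frames are scanned in a
# fixed circular order; a set reference bit buys the page one more
# revolution, a clear bit makes it the victim.

class Clock:
    def __init__(self, nframes):
        self.ref = [False] * nframes   # stand-in for hardware reference bits
        self.hand = 0

    def touch(self, frame):
        self.ref[frame] = True         # page referenced since last sweep

    def select_victim(self):
        while True:
            if self.ref[self.hand]:
                self.ref[self.hand] = False   # reset: one more revolution
                self.hand = (self.hand + 1) % len(self.ref)
            else:
                victim = self.hand            # unreferenced for a full sweep
                self.hand = (self.hand + 1) % len(self.ref)
                return victim

clk = Clock(4)
for f in (0, 1, 3):
    clk.touch(f)
print(clk.select_victim())   # frame 2: the only one not referenced
```

Note the premise this depends on: the scan order is fixed, so the reset-to-test interval is about the same for every frame. Constantly reordering the list of pages being scanned, as the passage below describes, breaks exactly that property.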

part of the vm370 morph was that it changed the page scanning from real
storage address order (which basically distributed virtual pages fairly
randomly) to a linked list. one of the side-effects of the linked list
management was that it drastically disturbed the basic premise under
which clock operated. with the virtual pages' position in the list
constantly being perturbed ... it was no longer possible to assert
that the time between when a page had its reference reset (and was not
taken) and the time it was examined again ... was in any way related to
the avg. time it took clock to cycle through all pages.

basically a single reference bit represented some amount of activity
history related to the use of that specific page. in clock the
avg. amount of activity history that a reference bit represents is the
interval between the time the bit was reset and the time it was
examined again. on the avg. that is the interval that it takes clock
to cycle through all pages ... and is approximately the same for all
pages. if pages were being constantly re-ordered on the list (that is
being used by clock to examine pages) ... there is much less assurance
that the interval between times that a specific page was examined in
any way relates to the avg. elapsed time it takes clock to make one
complete cycle through all pages. this perturbs and biases how pages
are selected in ways totally unrelated to the overall system avg. of
the interval between one reset/examination and the next ... basically
violating any claim as to approximating a least recently used
replacement strategy.

because of the constant list re-order in the initial vm370
implementation ... it was no longer possible to claim that it actually
approached a real approximation of a least recently used replacement
strategy. on the "micro" level ... they claimed that the code made
complete cycles through the list ... just like the implementation that
cycled through real storage. however, at the "macro" level, they
didn't see that the constant list re-ordering invalidated basic
assumptions about approximating least recently used replacement
strategy.

the other thing about the initial morph to vm370 was that "shared"
virtual memory pages were not included in the standard list for
selection ... so they were not subject to the same
examine/reset/examine replacement cycle as non-shared pages. this was
downplayed by saying that it only amounted to, at most, 16 shared pages.

well a couple releases came and went ... and they then decided
to release a small subset of my memory mapping stuff as
something called discontiguous shared segments. recent post
on the subject in this thread
http://www.garlic.com/~lynn/2006i.html#23 Virtual memory implementation in S/370
http://www.garlic.com/~lynn/2006i.html#24 Virtual memory implementation in S/370

basically the support in vm370 for having more than a single shared
segment ... and some number of my changes to cms code to make it r/o
(i.e. be able to operate in a r/o protected shared segment)
... various collected posts
http://www.garlic.com/~lynn/subtopic.html#mmap
http://www.garlic.com/~lynn/subtopic.html#adcon

in any case, this drastically increased the possible amount of shared
virtual pages ... that were being treated specially by the list-based
replacement algorithm ... and not subject to the same least recently
used replacement strategies as normal virtual pages ... some shared
virtual page at any specific moment might only be relatively lightly
used by a single virtual address space ... even tho it appeared in
multiple different virtual address spaces; aka its "sharing"
characteristic might have little to do with its "used" characteristic
(but the "sharing" characteristic was somewhat being used in place of
its actual "use" characteristic for determining replacement selection).

however, i was able to rectify that when they allowed me to ship
resource manager several months later on 11may76 ... and put the
replacement strategy back to the way I had implemented it for
cp67 as an undergraduate in the 60s.
http://www.garlic.com/~lynn/subtopic.html#fairshare
http://www.garlic.com/~lynn/subtopic.html#wsclock

so i had a similar but different argument with the group doing os/vs2
... the morph of real-memory os/360 mvt with support for virtual
memory. recent post in this thread about other aspects of that
effort
http://www.garlic.com/~lynn/2006i.html#33 virtual memory

they were also claiming that they were doing a least recently used
replacement strategy. however, their performance group did some simple
modeling and found that if they chose non-changed least recently used
pages ... before choosing changed least recently used pages ... then
the service time to handle the replacement was drastically reduced. a
non-changed page already had an exact duplicate out on disk ... and
therefore replacement processing could simply discard the copy in
virtual memory and make the real memory location available. a
"changed" page selected for replacement first had to be written to
disk before the real memory location was available. first attempting
to select non-changed pages for replacement significantly reduced the
service time and processing. I argued that such an approach basically
perturbed and violated any claim as to approximating a least recently
used replacement strategy. they did it anyway.
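The clean-first bias described above can be sketched like this (Python, invented names): among candidate pages, a non-changed ("clean") page can be dropped immediately, while a changed ("dirty") page must first be written back. Per-fault service time drops, but a clean page may be stolen ahead of a dirty page that is in fact less recently used, which is exactly the distortion of the LRU ordering being argued about.

```python
# Sketch of the os/vs2 clean-first replacement bias.

def select_victim(lru_list, dirty):
    """lru_list: page ids ordered least-recently-used first.
    dirty: set of changed pages. Returns (page, needs_writeback)."""
    for page in lru_list:              # scan from the LRU end
        if page not in dirty:
            return page, False         # clean: discard, no disk write
    return lru_list[0], True           # all dirty: pay for the writeback

# page 10 is the true LRU page but dirty; clean page 11 is stolen
# instead, violating strict LRU ordering
print(select_victim([10, 11, 12], dirty={10, 12}))   # (11, False)
```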

so os/vs2 svs eventually morphed into os/vs2 mvs ... and then they
shortened the name to just calling it mvs. customers had been using it
for some number of years ... it was coming up on 1980 ... and somebody
discovered that high-usage, shared executable images (i.e. the same
executable image appearing in lots of different virtual address spaces
and being executed by lots of different applications) were being
selected for replacement before low-usage application data pages. The
high-usage, shared executable images were "read-only" ... aka they
were never modified and/or changed. The low-usage application data
areas were constantly being changed. As a result, the high-usage
(executable, shared) pages were being selected for replacement before
low-usage application data pages.

in much the same way that the vm370 page list management was
constantly and significantly changing the order that pages were
examined for replacement ... invalidating basic premise of least
recently used replacement strategies ... the os/vs2 (svs and mvs) was
also creating an ordering different than based on purely use ... also
invalidating basic premise of least recently used replacement
strategies.

some past posts mentioning the os/vs2 early foray into least
recently used replacement strategy:
http://www.garlic.com/~lynn/94.html#4 Schedulers
http://www.garlic.com/~lynn/94.html#49 Rethinking Virtual Memory
http://www.garlic.com/~lynn/2000c.html#35 What level of computer is needed for a computer to Love?
http://www.garlic.com/~lynn/2001b.html#61 Disks size growing while disk count shrinking = bad performance
http://www.garlic.com/~lynn/2002.html#6 index searching
http://www.garlic.com/~lynn/2002c.html#52 Swapper was Re: History of Login Names
http://www.garlic.com/~lynn/2003b.html#44 filesystem structure, was tape format (long post)
http://www.garlic.com/~lynn/2004.html#13 Holee shit! 30 years ago!
http://www.garlic.com/~lynn/2004o.html#57 Integer types for 128-bit addressing
http://www.garlic.com/~lynn/2005f.html#47 Moving assembler programs above the line
http://www.garlic.com/~lynn/2005n.html#19 Code density and performance?
http://www.garlic.com/~lynn/2005n.html#21 Code density and performance?
http://www.garlic.com/~lynn/2006b.html#15 {SPAM?} Re: Expanded Storage
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
j***@aol.com
2006-05-14 11:51:15 UTC
In article <***@lhwlinux.garlic.com>,
Anne & Lynn Wheeler <***@garlic.com> wrote:
<snip>
Post by Anne & Lynn Wheeler
they were also claiming that they were doing a least recently used
replacement strategy. however, their performance group did some simple
modeling and found that if they chose non-changed least recently used
pages ... before choosing changed least recently used pages ... that
the service time to handle the replacement was drastically reduced. a
non-changed page already had an exact duplicate out on disk ... and
therefore replacement processing could simply discard the copy in
virtual memory and make the real memory location available.
Right. All you had to do was zero the word, or half word, or bit,
that pointed at the index of the page table.

But, now you have the decision of when does the data within
that page get zeroed when it is placed into the next
address space usage list. I don't think it matters as
long as they're all done on the same side of usage.
Post by Anne & Lynn Wheeler
a
"changed" page selected for replacement first had to be written to
disk before the real memory location was available. first attempting
to select non-changed pages for replacement significantly reduced the
service time and processing. I argued that such an approach basically
perturbed and violated any claim to approximating a least recently
used replacement strategy. they did it anyway.
so os/vs2 svs eventually morphed into os/vs2 mvs ... and then they
shortened the name to just calling it mvs. customers had been using it
for some number of years ... it was coming up on 1980 ... and somebody
discovered that high-usage, shared executable images (i.e. same
executable image appearing in lots of different virtual address spaces
and being executed by lots of different applications) were being
selected for replacement before low-usage application data pages. The
high-usage, shared executable images were "read-only" ... aka they
were never modified and/or changed. The low-usage application data
areas were constantly being changed. As a result, the high-usage
(execution, shared) pages were being selected for replacement before
low-usage application data pages.
They didn't keep a count of how many were sharing the code? This
means that user data pages had the same priority as code? One
would assume that all user data pages would have to be written out.
Post by Anne & Lynn Wheeler
in much the same way that the vm370 page list management was
constantly and significantly changing the order that pages were
examined for replacement ... invalidating the basic premise of least
recently used replacement strategies ... the os/vs2 (svs and mvs) was
also creating an ordering different than one based purely on use ... also
invalidating the basic premise of least recently used replacement
strategies.
Was there a difference between exec pages and user pages?
Then a subset of those categories would have to be code
and data, with the rare exception where code is data
(writable code segment which god never meant to happen).

I suppose there would also have to be special handling
of data pages that were suddenly changed to code.

Comments on the discussion:

1. An OS did not need VM to do relocation. Example: KA10.
2. Do not confuse paging hardware with virtual memory.
They are different. The reason this confusion happens
is because both were usually done during the same
major version OS release. If your new CPU has paging
hardware, you might as well schedule your VM project
at the same time. You might as well subject the customer
to both pains all at the same time. It was like
pulling a tooth: yank it and get it over with or tweak
it and have years of long drawn annoying pain in the
nethers.

/BAH
Anne & Lynn Wheeler
2006-05-14 14:35:29 UTC
Permalink
Post by j***@aol.com
Then a subset of those categories would have to be code
and data, with the rare exception where code is data
(writable code segment which god never meant to happen).
I suppose there would also have to be special handling
of data pages that were suddenly changed to code.
1. An OS did not need VM to do relocation. Example: KA10.
2. Do not confuse paging hardware with virtual memory.
They are different. The reason this confusion happens
is because both were usually done during the same
major version OS release. If your new CPU has paging
hardware, you might as well schedule your VM project
at the same time. You might as well subject the customer
to both pains all at the same time. It was like
pulling a tooth: yank it and get it over with or tweak
it and have years of long drawn annoying pain in the
nethers.
i've described two instances where there was special case processing
... and both instances resulted in non-optimal implementations ...

* one was the initial morph from cp67 to vm370 where they had actual
lists of pages for the scanning/reset/replacement selection and
"shared" pages were treated specially ... not based on reference
bits

* the other was the initial morph from mvt to os/vs2 where they would
bias the replacement selection for non-changed pages before changed
pages

post including description of the above two scenarios
http://www.garlic.com/~lynn/2006i.html#43 virtual memory
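the non-changed-first bias can be shown with a tiny simulation
(hypothetical python, purely illustrative ... nothing like the actual
os/vs2 code): preferring pages that don't need a write-back means the
true least recently used page can be passed over in favor of a
younger, non-changed page.

```python
def select_victim(frames):
    # frames are ordered least-recently-used first; each entry is
    # (page, changed). pure LRU would always evict frames[0]; the
    # biased policy first looks for the LRU *non-changed* page, since
    # its exact duplicate is already out on disk and it can simply be
    # discarded with no write-back i/o.
    for page, changed in frames:
        if not changed:
            return page
    return frames[0][0]          # all changed: fall back to true LRU

# oldest ... newest; page "a" is the true LRU page but has been changed
frames = [("a", True), ("b", False), ("c", True)]
print(select_victim(frames))     # -> "b" ... violating strict LRU ordering
```

as long as some younger non-changed page exists, the oldest changed
page is never selected ... which is exactly the perturbation of the
LRU ordering argued about above.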

i had arguments with both groups over the details and they went ahead
and did it anyway (which, in both cases, got corrected much later
... the os/vs2 they had to do the correction themselves, the vm370
shared pages, i was able to correct in the release of the resource
manager).

i had also done a paging, non-virtual memory thing originally as an
undergraduate in the 60s for cp67 ... but it was never picked up in
the product until the morph of cp67 to vm370 where it was used. the
issue was the kernel code ran real mode, w/o hardware translate turned
on. all its addressing was based on real addresses. when dealing with
addresses from virtual address space ... it used the LRA (load real
address) instruction that translated from virtual to real.

the issue was that the real kernel size was continuing to grow as more
and more features were being added. this was starting to impact the
number of pages left over for virtual address paging. recent post
in this thread making mention of measure of "pageable pages"
(after fixed kernel requirements):
http://www.garlic.com/~lynn/2006i.html#36 virtual memory
http://www.garlic.com/~lynn/2006j.html#2 virtual memory

so i created a dummy set of tables for the "kernel" address space
... and partitioned some low-useage kernel routines (various kinds of
commands, etc) into real, 4k "chunks". I positioned all these real
chunks at the high end of the kernel. When there were calls to
addresses above the "pageable" line ... the call processing
... instead of directly transfering would run the address thru the
dummy table to see if the chunk was actually resident. if it was
resident, then the call would be made to the translated address
location ... running w/o virtual memory turned on. if the 4k chunk
was indicated as not resident, the paging routine was called to bring
it into storage before transferring to the "real" address. during the
call, the page fixing and lock ... that CCWTRANS for performing
virtual i/o ... was used for preventing the page for be selected from
removal from real storage. the count was decremented at the
return. otherwise these "pageable" kernel pages could be selected for
replacement just like any other page. some recent mentions of CCWTRANS
http://www.garlic.com/~lynn/2006.html#31 Is VIO mandatory?
http://www.garlic.com/~lynn/2006.html#38 Is VIO mandatory?
http://www.garlic.com/~lynn/2006b.html#25 Multiple address spaces
http://www.garlic.com/~lynn/2006f.html#5 3380-3390 Conversion - DISAPPOINTMENT
http://www.garlic.com/~lynn/2006i.html#33 virtual memory

this feature never shipped as part of the cp67 kernel, but was picked
up as part of the initial morph of cp67 to vm370.
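a rough sketch of that call-processing path (hypothetical python; the
PAGEABLE_LINE value, the dummy table layout, and page_in are all
made-up names, just to show the shape of the check/translate/fix
logic described above):

```python
PAGEABLE_LINE = 0x40000            # made-up address ... kernel chunks at
                                   # or above this line are the pageable ones

def kernel_call(target, dummy_table, page_in, fix_count):
    # below the line: fixed kernel, transfer directly to the real address
    if target < PAGEABLE_LINE:
        return target
    chunk = target & ~0xFFF        # the 4k "chunk" containing the target
    real = dummy_table.get(chunk)  # run the address thru the dummy table
    if real is None:               # not resident: page it in first
        real = page_in(chunk)
        dummy_table[chunk] = real
    # bump the fix/lock count so the chunk can't be selected for
    # replacement during the call (decremented again at return)
    fix_count[chunk] = fix_count.get(chunk, 0) + 1
    return real + (target & 0xFFF) # transfer to the translated location
```

the point of the dummy table is that the kernel keeps running real
mode, w/o translate turned on ... the table lookup substitutes for
what LRA would do against a real set of page tables.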

Later for the resource manager, i also created a (2nd) small dummy set
of tables for every virtual address space that was used for
administrative writing of tables to disk. tables were collected into
page-aligned 4k real chunks and the page I/O infrastructure was used
for moving the tables to/from disk (in a manner similar to how i had
done the original pageable kernel implementation). previous description
of "paging" SWAPTABLEs:
http://www.garlic.com/~lynn/2006i.html#24 Virtual memory implementation in S/370

in the initial morph of cp67->vm370, some of cms was re-orged to take
advantage of the 370 shared segment protection. however, before 370
virtual memory was announced and shipped, the feature was dropped from
the product line (because the engineers doing the hardware retrofit of
virtual memory to 370/165 said that shared segment protect and a
couple other features would cause an extra six month delay). a few
past posts mentioning virtual memory retrofit to 370/165:
http://www.garlic.com/~lynn/2006i.html#4 Mainframe vs. xSeries
http://www.garlic.com/~lynn/2006i.html#9 Hadware Support for Protection Bits: what does it really mean?
http://www.garlic.com/~lynn/2006i.html#23 Virtual memory implementation in S/370

as a result, the shared page protection had to be redone as a hack to
utilize the storage protect keys that had been carried over from 360.
this required behind the scenes fiddling of the virtual machine
architecture ... which prevented running cms with the virtual machine
assist microcode activated (hardware directly implemented virtual
machine execution of privileged instructions). later, in order to run
cms virtual machines with the VMA microcode assist, protection was
turned off. instead a scan of all shared pages was substituted that
occurred on every task switch. an application running in a virtual
address space could modify shared pages ... but the effect would be
caught and discarded before the task switch occurred (so any modification
wouldn't be apparent in other address spaces). this sort of worked
running single processor configurations ... but got much worse in
multi-processor configurations. now you had to have a unique set of
shared pages specific to each real processor. past post mentioning
the changed protection hack for cms
http://www.garlic.com/~lynn/2006i.html#9 Hadware Support for Protection Bits: what does it really mean?
http://www.garlic.com/~lynn/2006i.html#23 Virtual memory implementation in S/370
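the scan substitution can be sketched roughly like this (hypothetical
python ... shared_pages/pristine are invented names; the real thing
compared and reset pages in the shared segments before dispatching the
next address space):

```python
def task_switch_scan(shared_pages, pristine):
    # before dispatching the next virtual address space, compare each
    # shared page against its pristine copy; any page the outgoing task
    # modified is restored, so the modification is never visible from
    # any other address space. returns the pages that were discarded.
    discarded = []
    for page, contents in shared_pages.items():
        if contents != pristine[page]:
            shared_pages[page] = pristine[page]   # throw away the change
            discarded.append(page)
    return discarded

pristine = {1: b"code", 2: b"more code"}
shared = {1: b"code", 2: b"HACKED!!!"}     # task scribbled on shared page 2
print(task_switch_scan(shared, pristine))  # -> [2]; page 2 restored
```

the cost is obvious from the sketch: every shared page gets touched on
every task switch ... and with multiple real processors each needing
its own set of shared pages, the cost multiplies.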

past posts mention pageable kernel work:
http://www.garlic.com/~lynn/2001b.html#23 Linux IA-64 interrupts [was Re: Itanium benchmarks ...]
http://www.garlic.com/~lynn/2001l.html#32 mainframe question
http://www.garlic.com/~lynn/2002b.html#44 PDP-10 Archive migration plan
http://www.garlic.com/~lynn/2002n.html#71 bps loader, was PLX
http://www.garlic.com/~lynn/2002p.html#56 cost of crossing kernel/user boundary
http://www.garlic.com/~lynn/2002p.html#64 cost of crossing kernel/user boundary
http://www.garlic.com/~lynn/2003f.html#12 Alpha performance, why?
http://www.garlic.com/~lynn/2003f.html#14 Alpha performance, why?
http://www.garlic.com/~lynn/2003f.html#20 Alpha performance, why?
http://www.garlic.com/~lynn/2003f.html#23 Alpha performance, why?
http://www.garlic.com/~lynn/2003f.html#26 Alpha performance, why?
http://www.garlic.com/~lynn/2003f.html#30 Alpha performance, why?
http://www.garlic.com/~lynn/2003n.html#45 hung/zombie users ... long boring, wandering story
http://www.garlic.com/~lynn/2004b.html#26 determining memory size
http://www.garlic.com/~lynn/2004g.html#45 command line switches [Re: [REALLY OT!] Overuse of symbolic constants]
http://www.garlic.com/~lynn/2004o.html#9 Integer types for 128-bit addressing
http://www.garlic.com/~lynn/2005f.html#10 Where should the type information be: in tags and descriptors
http://www.garlic.com/~lynn/2005f.html#16 Where should the type information be: in tags and descriptors
http://www.garlic.com/~lynn/2006.html#35 Charging Time
http://www.garlic.com/~lynn/2006.html#40 All Good Things
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
j***@aol.com
2006-05-16 11:52:13 UTC
Permalink
<snip implementation descriptions reluctantly>

Just for clarification: I don't think I ever knew
how TOPS-10 did the stuff at this level. I do
remember discussions about it all but I understood 0%
of what the guys talked about. I no longer can
recall what was said, let alone the order.

ISTM that DECUS had some sessions about all this stuff.
Are any of those session writeups available?


/BAH
Anne & Lynn Wheeler
2006-05-17 14:39:16 UTC
Permalink
In virtual machine environments, software versions of hardware
approaches tend to be used instead of "creating beautiful new
impediments to understanding" (Henry Spencer -- from #8 in "The Ten
Commandments for C Programmers").
old post discussing how hardware address translation worked
on trout 1.5 (3090), including email from fall of '83
http://www.garlic.com/~lynn/2003j.html#42 Flash 10208

there is a reference to Birnbaum starting in early '75 on 801 project
(including split caches), i.e. 801 turned into romp, rios, power,
power/pc, etc. misc. past posts mentioning 801
http://www.garlic.com/~lynn/subtopic.html#801

and old discussion about SIE (virtual machine assist) from long
ago and far away (couple weeks short of 25years ago):

From: zzzzz
Date: 6/30/81 15:33:04

I would like to add a bit more to the discussion of SIE. I seem to
have hit a sensitive area with my original comments. I would have
preferred to contain this to an offline discussion, but I feel that
some of the things that have appeared in the memos require a reply.

First, let me say that all of the comments I have made have been
accurate to the best of my knowledge. The performance data I quoted
came directly from a presentation I attended given by the VM/811
group. The purpose of the presentation was to justify extensions to
the SIE architecture. Since I my last writing, I have been told by
XXXXX that the numbers quoted were for MVS running under VMTOOL on the
3081. XXXXXX mentioned that VMTOOL has some significant software
problems which are partially responsible for the poor performance.
Presumably, VM/811 would not have these problems. This was not
pointed out at the meeting.

For many years the software and hardware groups have misunderstood
each other. Engineers who knew nothing about software could not
understand why it was necessary to make their hardware do certain
things. Likewise, programmers who knew nothing about hardware could
not understand why the engineers could not make the hardware do the
things they wanted. Traditionally, microcode has been done by
engineers because a thorough understanding of the hardware is
necessary in order to write microcode. In recent years, this has
become a problem as more complex software functions have been placed
into microcode. In my department, we have tried to remedy this
problem by hiring people with programming experience as
microprogrammers.

The statement that millions of dollars have been spent writing
microcode in order to avoid putting a few ten cent latches into the
hardware is completely false. The truth is that changes have often
been made to the microcode to AVOID SPENDING MILLIONS OF DOLLARS by
putting a change in hardware. In the world of LSI and VLSI, there
is no longer such a thing as a "ten cent latch." Once a chip has
been designed, it is very expensive to make even the smallest
change to it.

Microcode offers a high degree of flexibility in an environment that
is subject to change, especially if one has a writable control store.
When a change is necessary, it can often be had for "free" by making a
change to the microcode and sending out a new floppy disk, whereas it
might cost millions of dollars to make an equivalent change to the
hardware. While the performance of the microcode may not be as good
as the hardware implementation, the overall cost/performance has
dictated that it is the method of choice.

As I pointed out in a previous writing, what works well or does not
work well on one machine says absolutely nothing about the performance
of that item on another machine. XXXXX seems to have completely
missed this critical point, since he expects a 158-like performance
boost from SIE on machines which are nothing like the 158 in their
design.

SIE is a poor performer on the 3081 for several reasons. One reason
is that the 3081 pages its microcode. Each time it is necessary to
enter or exit SIE, a large piece of microcode must be paged in to
carry out this function. This is rather costly in terms of
performance. A performance gain could be realized if the number of
exit/entry trips could be reduced. One way of doing this would be to
emulate more instructions on the assumption that it takes less to
emulate them than it does to exit and re-enter emulation mode. This
thinking is completely valid for the 3081, but is not necessarily
relevant when it comes to other machines, such as TROUT.

TROUT does not page its microcode, and therefore the cost of exiting
and re-entering emulation mode is less. The thinking behind the
changes to the SIE architecture should be re-examined when it comes to
TROUT because the data upon which the changes were based is not
necessarily valid. This is why I have asked that the extensions to
SIE be made optional. This would allow machines that do have
performance problems to implement the extensions, while machines that
do not have problems could leave them out and use their control store
for more valuable functions.

The extensions that are being proposed are not at all trivial. It may
seem like a simple matter to emulate an I/O instruction, but such is
not the case. To appreciate what is involved, one must have a
detailed understanding of just how the CPU, SCE, and Channels work.

Other machines do indeed have an easier time when it comes to
implementing some of these assists. That is because they are rather
simple machines internally, not because their designers had more
foresight when they designed the machines. The cycle time of TROUT is
only slightly faster than the 3081, yet TROUT is much faster in terms
of MIPS. This performance comes from the highly overlapped design of
the processor. This makes for a much more complex design. Sometimes
you pay dearly for this, like when it comes to putting in complex
microcode functions.

TROUT has never treated SIE as "just another assist." SIE has been a
basic part of our machine's design since the beginning. In fact, we
have chosen to put many functions into hardware instead of microcode
to pick up significant performance gains. For example, the 3081 takes
a significant amount of time to do certain types of guest-to-host
address translation because it does them in microcode, while TROUT
does them completely in hardware.

... snip ...

nomenclature in the above with "guest" refers to an operating system
running in a virtual machine.

...

with regard to the above comment about virtual machines and I/O
instructions ... part of the issue is translating the I/O channel
program and fixing the related virtual pages in real memory ... since
the real channels run using real addresses from the channel programs.
the channel programs from the virtual address space have all been
created using the addresses of the virtual address space. this
wasn't just an issue for virtual machine emulation ... OS/VS2
also had the issue with channel programs created by applications
running in a virtual address space.
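the shape of that translation (hypothetical python, obviously nothing
like real 360 channel-program code ... translate/pin are invented
stand-ins): each CCW from the virtual address space is copied into a
"shadow" program with its virtual data address replaced by the real
address, and the containing page fixed so it can't be replaced while
the i/o is in flight.

```python
def ccwtrans(virtual_ccws, translate, pin):
    # build the shadow channel program that the real channel will
    # execute: copy each CCW, replacing its virtual data address with
    # the real address (faulting the page in if necessary) and pinning
    # the 4k frame for the duration of the i/o.
    shadow = []
    for opcode, vaddr, count in virtual_ccws:
        raddr = translate(vaddr)   # virtual -> real for this address space
        pin(raddr & ~0xFFF)        # fix/lock the 4k frame against replacement
        shadow.append((opcode, raddr, count))
    return shadow
```

the real routine also had to handle CCWs whose data areas crossed page
boundaries (splitting them with data chaining) ... omitted here to
keep the sketch short.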

...

3090 responded to Amdahl's hypervisor support with PR/SM, misc. past
posts mentioning PR/SM (and LPARs)
http://www.garlic.com/~lynn/2000c.html#76 Is a VAX a mainframe?
http://www.garlic.com/~lynn/2002o.html#15 Home mainframes
http://www.garlic.com/~lynn/2002o.html#18 Everything you wanted to know about z900 from IBM
http://www.garlic.com/~lynn/2002p.html#44 Linux paging
http://www.garlic.com/~lynn/2002p.html#46 Linux paging
http://www.garlic.com/~lynn/2002p.html#48 Linux paging
http://www.garlic.com/~lynn/2003.html#56 Wild hardware idea
http://www.garlic.com/~lynn/2003n.html#13 CPUs with microcode ?
http://www.garlic.com/~lynn/2003o.html#52 Virtual Machine Concept
http://www.garlic.com/~lynn/2004c.html#4 OS Partitioning and security
http://www.garlic.com/~lynn/2004c.html#5 PSW Sampling
http://www.garlic.com/~lynn/2004m.html#41 EAL5
http://www.garlic.com/~lynn/2004m.html#49 EAL5
http://www.garlic.com/~lynn/2004n.html#10 RISCs too close to hardware?
http://www.garlic.com/~lynn/2004o.html#13 Integer types for 128-bit addressing
http://www.garlic.com/~lynn/2004p.html#37 IBM 3614 and 3624 ATM's
http://www.garlic.com/~lynn/2004q.html#18 PR/SM Dynamic Time Slice calculation
http://www.garlic.com/~lynn/2004q.html#76 Athlon cache question
http://www.garlic.com/~lynn/2005.html#6 [Lit.] Buffer overruns
http://www.garlic.com/~lynn/2005b.html#5 Relocating application architecture and compiler support
http://www.garlic.com/~lynn/2005c.html#56 intel's Vanderpool and virtualization in general
http://www.garlic.com/~lynn/2005d.html#59 Misuse of word "microcode"
http://www.garlic.com/~lynn/2005d.html#74 [Lit.] Buffer overruns
http://www.garlic.com/~lynn/2005h.html#13 Today's mainframe--anything to new?
http://www.garlic.com/~lynn/2005h.html#19 Blowing My Own Horn
http://www.garlic.com/~lynn/2005k.html#43 Determining processor status without IPIs
http://www.garlic.com/~lynn/2005m.html#16 CPU time and system load
http://www.garlic.com/~lynn/2005p.html#29 Documentation for the New Instructions for the z9 Processor
http://www.garlic.com/~lynn/2006e.html#15 About TLB in lower-level caches
http://www.garlic.com/~lynn/2006h.html#30 The Pankian Metaphor

...

misc. past posts mentioning CCWTRANS (cp/67 routine that created
"shadow" channel program copies of what was in the virtual address
space, replacing all virtual addresses with "real" addresses, for
example initial prototype of OS/VS2 was built by crafting hardware
translation into mvt and hacking a copy of CP67's CCWTRANS into mvt):
http://www.garlic.com/~lynn/2000.html#68 Mainframe operating systems
http://www.garlic.com/~lynn/2000c.html#34 What level of computer is needed for a computer to Love?
http://www.garlic.com/~lynn/2001b.html#18 Linux IA-64 interrupts [was Re: Itanium benchmarks ...]
http://www.garlic.com/~lynn/2001i.html#37 IBM OS Timeline?
http://www.garlic.com/~lynn/2001i.html#38 IBM OS Timeline?
http://www.garlic.com/~lynn/2001l.html#36 History
http://www.garlic.com/~lynn/2002c.html#39 VAX, M68K complex instructions (was Re: Did Intel Bite Off More Than It Can Chew?)
http://www.garlic.com/~lynn/2002g.html#61 GE 625/635 Reference + Smart Hardware
http://www.garlic.com/~lynn/2002j.html#70 hone acronym (cross post)
http://www.garlic.com/~lynn/2002l.html#65 The problem with installable operating systems
http://www.garlic.com/~lynn/2002l.html#67 The problem with installable operating systems
http://www.garlic.com/~lynn/2002n.html#62 PLX
http://www.garlic.com/~lynn/2003b.html#0 Disk drives as commodities. Was Re: Yamhill
http://www.garlic.com/~lynn/2003g.html#13 Page Table - per OS/Process
http://www.garlic.com/~lynn/2003k.html#27 Microkernels are not "all or nothing". Re: Multics Concepts For
http://www.garlic.com/~lynn/2004.html#18 virtual-machine theory
http://www.garlic.com/~lynn/2004c.html#59 real multi-tasking, multi-programming
http://www.garlic.com/~lynn/2004d.html#0 IBM 360 memory
http://www.garlic.com/~lynn/2004g.html#50 Chained I/O's
http://www.garlic.com/~lynn/2004m.html#16 computer industry scenairo before the invention of the PC?
http://www.garlic.com/~lynn/2004n.html#26 PCIe as a chip-to-chip interconnect
http://www.garlic.com/~lynn/2004n.html#54 CKD Disks?
http://www.garlic.com/~lynn/2004o.html#57 Integer types for 128-bit addressing
http://www.garlic.com/~lynn/2005b.html#23 360 DIAGNOSE
http://www.garlic.com/~lynn/2005b.html#49 The mid-seventies SHARE survey
http://www.garlic.com/~lynn/2005b.html#50 [Lit.] Buffer overruns
http://www.garlic.com/~lynn/2005f.html#45 Moving assembler programs above the line
http://www.garlic.com/~lynn/2005f.html#47 Moving assembler programs above the line
http://www.garlic.com/~lynn/2005p.html#18 address space
http://www.garlic.com/~lynn/2005q.html#41 Instruction Set Enhancement Idea
http://www.garlic.com/~lynn/2005s.html#25 MVCIN instruction
http://www.garlic.com/~lynn/2005t.html#7 2nd level install - duplicate volsers
http://www.garlic.com/~lynn/2006.html#31 Is VIO mandatory?
http://www.garlic.com/~lynn/2006.html#38 Is VIO mandatory?
http://www.garlic.com/~lynn/2006b.html#25 Multiple address spaces
http://www.garlic.com/~lynn/2006f.html#5 3380-3390 Conversion - DISAPPOINTMENT
http://www.garlic.com/~lynn/2006i.html#33 virtual memory
http://www.garlic.com/~lynn/2006j.html#5 virtual memory
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
Eugene Miya
2006-05-17 17:32:07 UTC
Permalink
"Memory is like an orgasm - it's better when you don't have to fake it."
"You can't fake what you don't have".
"In this business, you can't fake what you don't have"

"If you were plowing a field, which would you rather use? Two strong
oxen or 1024 chickens?"
-- Seymour Cray

Scene: 1979 Cray Research, Inc. Annual Meeting
Lutheran Brotherhood Building, Minneapolis, MN.
Q & A period, after the address by the Officers of the Company,

Q: "Mr. Cray, ... Since you seem to have implemented almost
all of the current schemes published in the scientific press
on improving performance in your systems, I was wondering
why you didn't also provide for virtual memory?"

A: From Mr. Cray: "Well as you know, over the years I have
provided the largest physical memories available for use.
The addition of a "virtual memory" scheme would have
added another level of hardware and hardware addressing
delays in accessing code and data.
I believe that it's better to spend the resource providing for
a larger overall memory system for the programmer. ...
Historically, this is what the programmers have preferred."

--
Anne & Lynn Wheeler
2006-05-17 19:18:38 UTC
Permalink
Post by Anne & Lynn Wheeler
TROUT has never treated SIE as "just another assist." SIE has been a
basic part of our machine's design since the beginning. In fact, we
have chosen to put many functions into hardware instead of microcode
to pick up significant performance gains. For example, the 3081 takes
a significant amount of time to do certain types of guest-to-host
address translation because it does them in microcode, while TROUT
does them completely in hardware.
re:
http://www.garlic.com/~lynn/2006j.html#27 virtual memory

"811" (named after 11/78 publication date on the architecture
documents) or 3081 was considered somewhat of a 155/158 follow-on
machine ... being much more of a m'coded machine.

"TROUT" or 3090 was considered somewhat of a 165/168 follow-on machine
... being much more of a hardwired machine.

these were the days of processors getting bigger and bigger with much
more effort being put into more processors in SMP configuration.

they had created two positions, one in charge of "tightly-coupled"
architecture (SMP) and one in charge of "loosely-coupled"
architecture (clusters). my wife got conned into taking the job in pok
in charge of loosely-coupled architecture.

she didn't last long ... while there, she did do peer-coupled
shared data architecture
http://www.garlic.com/~lynn/subtopic.html#shareddata

which didn't see much uptake until sysplex ... except for the ims
group doing ims hot-standby.

part of the problem was she was fighting frequently with the
communications group, who wanted SNA/VTAM to be in charge of any
signals leaving a processor complex (even those directly to another
processor).

one example was trotter/3088 ... she fought hard for hardware
enhancements for full-duplex operation. there had been a previous
"channel-to-channel" hardware which was half-duplex direct channel/bus
communication between two processor complexes. 3088 enhanced this to
provide connectivity to up to eight different processor complexes.

sna was essentially a dumb terminal controller infrastructure. their
reference to it as a "network" required other people in the
organization to migrate to using the term "peer-to-peer network" to
differentiate from the sna variety.

of course, earlier, in the time-frame that sna was just starting out
... she had also co-authored a peer-to-peer networking architecture
with Burt Moldow ... which was somewhat viewed as threatening to sna
... misc. past posts mentioning awp39:
http://www.garlic.com/~lynn/2004n.html#38 RS/6000 in Sysplex Environment
http://www.garlic.com/~lynn/2004p.html#31 IBM 3705 and UC.5
http://www.garlic.com/~lynn/2005p.html#8 EBCDIC to 6-bit and back
http://www.garlic.com/~lynn/2005p.html#15 DUMP Datasets and SMS
http://www.garlic.com/~lynn/2005p.html#17 DUMP Datasets and SMS
http://www.garlic.com/~lynn/2005q.html#27 What ever happened to Tandem and NonStop OS ?
http://www.garlic.com/~lynn/2005u.html#23 Channel Distances
http://www.garlic.com/~lynn/2006h.html#52 Need Help defining an AS400 with an IP address to the mainframe

anyway, in the trotter/3088 time-frame ... san jose had done a
prototype vm/cluster implementation using a modified trotter/3088 with
full-duplex protocols. however, before it was allowed to ship, they
had to convert it to sna operation. one of the cluster examples was to
fully "resynch" cluster operation of all the processors ... which took
under a second using full-duplex protocols on the 3088 ... but the
same operation took on the order of a minute using sna protocols
and a half-duplex paradigm.

we ran afoul again later with 3-tier architecture
http://www.garlic.com/~lynn/subtopic.html#3tier

this was in the time-frame that the communications group was
out pushing SAA ... a lot of which was an attempt to revert back
to terminal emulation paradigm
http://www.garlic.com/~lynn/subnetwork.html#emulation

from that of client/server. we had come up with 3-tier architecture
and were out pitching it to customer executives ... at the same time
they were trying to revert 2-tier architecture back to dumb terminal
emulation.

then we did ha/cmp product
http://www.garlic.com/~lynn/subtopic.html#hacmp

minor reference
http://www.garlic.com/~lynn/95.html#13

which didn't make a lot of them happy either.
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/