Discussion:
{SPAM?} DCSS as SWAP disk for z/Linux
Barton Robinson
2006-01-19 19:17:49 UTC
Yah, you might save 1% of a processor if you ever swap at
1000 per second or something like that - never bothered
to measure it, just know that the cost of swap to vdisk
is cheap, fast, and easy to set up, and everybody does it
that way for those reasons.
Just because you CAN swap to DCSS does not mean you should.
The only value claimed for DCSS has been conjecture, not proof.
I was reading a document that suggested using DCSS as the SWAP
disk for z/LINUX guests instead of V-DISK. This sounded
interesting for several reasons. Unfortunately, the document did
not describe how to implement this.
Has anyone done this or experimented with it?
My guess would be that this DCSS would have to be defined
Exclusive Write.
Can the DCSS be inside the guest's Virtual Storage or must it be
outside?
How would this be formatted? And, how would it be "mounted"?
Jim
"If you can't measure it, I'm Just NOT interested!"(tm)

/************************************************************/
Barton Robinson - CBW Internet: ***@VELOCITY-SOFTWARE.COM
Velocity Software, Inc Mailing Address:
196-D Castro Street P.O. Box 390640
Mountain View, CA 94041 Mountain View, CA 94039-0640

VM Performance Hotline: 650-964-8867
Fax: 650-964-9012 Web Page: WWW.VELOCITY-SOFTWARE.COM
/************************************************************/
Adam Thornton
2006-01-19 20:35:32 UTC
Post by Barton Robinson
Yah, you might save 1% of a processor if you ever swap at
1000 per second or something like that - never bothered
to measure it, just know that the cost of swap to vdisk
is cheap, fast, and easy to set up, and everybody does it
that way for those reasons.
And, ahem, let me plug SWAPGEN, which entirely automates the
construction of a Linux swap device from CMS prior to IPL. Just run
SWAPGEN in your PROFILE EXEC, stick /dev/dasdb1 (or whatever) in
/etc/fstab, make sure the DIAG driver is loaded at IPL, and let 'er rip.
(Even better: prioritized swap disks.)
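A sketch of the Linux side of that recipe (the device name, module
name, and priority value are assumptions to illustrate the idea;
SWAPGEN itself runs in CMS before the guest IPLs):

```shell
# /etc/fstab entry for the SWAPGEN-prepared VDISK -- 'pri=' gives the
# prioritized-swap behavior mentioned above (higher priority wins):
#   /dev/dasdb1   swap   swap   pri=10   0 0

modprobe dasd_diag_mod   # DIAG access method for DASD/VDISK on s390
swapon -a                # activate everything marked 'swap' in /etc/fstab
swapon -s                # list active swap devices with their priorities
```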
Post by Barton Robinson
Just because you CAN swap to dcss does not mean you should.
The only value to dcss has been conjecture, no proof.
I was under the impression that Rob van der Heij (I think) had in
fact measured DCSS to be slightly faster.

Me? I use VDISK and SWAPGEN. I understand exactly how it works
(for, er, obvious reasons) and it's always been plenty fast enough
for me. I haven't measured it either, but I suspect that Barton's
right: at the point at which the performance difference becomes
noticeable, you have much worse problems than which fast memory-based
virtual device you're swapping to.

Adam
Rob van der Heij
2006-01-19 21:11:38 UTC
Post by Adam Thornton
I was under the impression that Rob van der Heij (I think) had in
fact measured DCSS to be slightly faster.
Yes, and we're getting to the point where he's also interested in when
I can measure it...

Indeed, I showed that in a similar scenario we could swap 50% faster
while burning a full CPU on it. Let's assume the test program did not
use any cycles itself and only caused the swapping. Doing 40 MB/s to
VDISK means roughly 100 µs of CPU per page. With DCSS the rate was 50%
higher, so we would assume about 65 µs per page.

Now what would be an average swap rate you feel comfortable with when
the server is busy? 500 pages per second probably sounds like the high
end? That means 5% vs 3% of a CPU when the server is busy. And when
the server is busy only 5% of the time...
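The arithmetic above can be checked directly (a sketch assuming 4 KiB
pages and that one full CPU was consumed at the 40 MB/s VDISK rate):

```python
# Per-page CPU cost of swapping, derived from the measured throughput.
PAGE = 4096                                  # bytes per page (assumption)
vdisk_pages = 40 * 1024 * 1024 / PAGE        # pages/s at 40 MB/s = 10240
cost_vdisk = 1.0 / vdisk_pages               # CPU-seconds per page (one CPU burned)
cost_dcss = cost_vdisk / 1.5                 # DCSS rate was 50% higher

print(round(cost_vdisk * 1e6), round(cost_dcss * 1e6))   # 98 65 (microseconds)

busy_rate = 500                              # pages/s on a "busy" server
print(f"{busy_rate * cost_vdisk:.1%} vs {busy_rate * cost_dcss:.1%}")  # 4.9% vs 3.3%
```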

From a pure technical point of view, swapping to DCSS is much more
elegant because you copy a page under SIE and don't step out to CP to
interpret a channel program. But the drawback is that the DCSS is
relatively small and requires additional management structures in the
Linux virtual machine memory. I see some goodness for very small
virtual machines, I think.

Rob
--
Rob van der Heij rvdheij @ gmail.com
Velocity Software, Inc
Anne & Lynn Wheeler
2006-01-19 22:30:18 UTC
Post by Rob van der Heij
From a pure technical point of view, swapping to DCSS is much more
elegant because you copy a page under SIE and don't step out to CP to
interpret a channel program. But the drawback is that the DCSS is
relatively small and requires additional management structures in the
Linux virtual machine memory. I see some goodness for very small
virtual machines, I think.
in one sense it is like extended memory on 3090 ... fast memory move
operations. however, real extended memory was real storage. dcss is
just another part of virtual memory. in theory you could achieve
similar operational characteristics just by setting up linux to have
larger virtual memory by the amount that would have gone to dcss
... and having linux rope it off and treat that range of memory the
same way it might treat a range of dcss memory.

the original point of dcss was having some virtual memory semantics
that allowed definition of some stuff that appeared in multiple
virtual address spaces ... recent post discussing some of the dcss
history
http://www.garlic.com/~lynn/2006.html#10 How to restore VMFPLC dumped files on z/VM V5.1

if the virtual space range only occupies a single virtual address
space ... for most practical purposes, what is the difference between
that and just having equivalent virtual space range as non-DCSS (but
treated by linux in the same way that you would treat a DCSS space).

note that only a small subset of the original virtual memory
management implementation was picked up for the original DCSS
implementation. a virtual machine could arbitrarily change its
allocated segments (contiguous or non-contiguous) ... so long as it
didn't exceed its aggregate resource limit. however, the original
implementation also included support for an extremely simplified api
and very high performance page mapped disk access (on which a page
mapped filesystem was layered)
http://www.garlic.com/~lynn/subtopic.html#mmap

... and sharing across multiple virtual address spaces could be done
as part of the page mapped semantics (aka create a module on a page
mapped disk ... and then the cms loading of that module included
directives about shared segment semantics).

note that one of the issues in unix-based infrastructure ... is that
the unix-flavored kernels may already be using 1/3 to 1/2 of their
(supposedly) real storage for various kinds of caching (which
basically gets you very quickly into 3-level paging logic ... the
stuff linux is using currently, the stuff it has decided to save in
its own cache, and the total stuff that vm is deciding to keep in real
storage). for linux operation in constrained virtual machine memory
sizes, you might get as much or better improvement by tuning its own
internal cache operation.

one of the things i pointed out long ago and far away about running a
lru-algorithm under a lru-algorithm ... is that things can get into
pathological situations (back in the original days of adding virtual
memory to mft/mvt for the original vs1 & vs2). the cp kernel has
selected a page for replacement based on its not having been used
recently ... however, the virtual machine page manager also discovers
that it needs to replace a page and picks the very same page as the
next one to use (because both algorithms are using the same "use"
criteria). the issue is that both implementations use the least
recently used characteristic as the basis for the replacement
decision. the first level system is removing the virtual machine page
because it believes it is not going to be used in the near future.
however, the virtual machine is choosing its least recently used page
to be the next page that is used (as opposed to the next page not to
be used).

running a LRU page replacement algorithm under a LRU page replacement
algorithm is not just an issue of processing overhead ... there is
also the characteristic that a LRU algorithm doesn't recurse gracefully
(i.e. a virtual LRU algorithm starts to take on the characteristics of
an MRU algorithm to the 1st level algorithm ... i.e. the least recently
used page is the next most likely to be used, instead of the least
likely to be used). misc. past stuff about page replacement work ...
originally done as an undergraduate for cp67 in the 60s
http://www.garlic.com/~lynn/subtopic.html#wsclock
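the pathology is easy to demonstrate with a toy model (a sketch, not
CP's actual algorithm: one host-level LRU over the guest's pages, and
a guest that always reuses its own least recently used page):

```python
from collections import OrderedDict

def simulate(guest_pages=8, host_frames=6, steps=100):
    """Guest doing LRU replacement on top of a host doing LRU paging.

    Returns the number of host-level page faults. Once the host is
    short on frames, the guest's LRU victim is exactly the page the
    host already evicted, so every guest replacement faults.
    """
    host = OrderedDict()                  # host-resident pages, oldest first
    guest_lru = list(range(guest_pages))  # guest's pages, least recent first
    faults = 0
    for _ in range(steps):
        victim = guest_lru.pop(0)         # guest reuses ITS least recent page...
        if victim not in host:            # ...which the host evicted as ITS LRU page
            faults += 1
            if len(host) >= host_frames:
                host.popitem(last=False)  # host evicts the host-LRU page
        else:
            host.move_to_end(victim)
        host[victim] = True
        guest_lru.append(victim)          # the reused page is now most recent
    return faults

print(simulate())        # 100: all 100 guest replacements fault at host level
print(simulate(guest_pages=8, host_frames=8))  # 8: guest fits, only cold faults
```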

some specific past posts on LRU algorithm running under LRU algorithm
http://www.garlic.com/~lynn/2000g.html#3 virtualizable 360, was TSS ancient history
http://www.garlic.com/~lynn/2000g.html#30 Could CDR-coding be on the way back?
http://www.garlic.com/~lynn/2001f.html#54 any 70's era supercomputers that ran as slow as today's supercomputers?
http://www.garlic.com/~lynn/2001g.html#29 any 70's era supercomputers that ran as slow as today's supercomputers?
http://www.garlic.com/~lynn/2002p.html#4 Running z/VM 4.3 in LPAR & guest v-r or v=f
http://www.garlic.com/~lynn/2003c.html#13 Unused address bits
http://www.garlic.com/~lynn/2003j.html#25 Idea for secure login
http://www.garlic.com/~lynn/2004l.html#66 Lock-free algorithms
http://www.garlic.com/~lynn/2005c.html#27 [Lit.] Buffer overruns
http://www.garlic.com/~lynn/2005n.html#21 Code density and performance?
http://www.garlic.com/~lynn/94.html#01 Big I/O or Kicking the Mainframe out the Door
http://www.garlic.com/~lynn/94.html#46 Rethinking Virtual Memory
http://www.garlic.com/~lynn/94.html#49 Rethinking Virtual Memory
http://www.garlic.com/~lynn/94.html#51 Rethinking Virtual Memory
http://www.garlic.com/~lynn/95.html#2 Why is there only VM/370?
http://www.garlic.com/~lynn/96.html#10 Caches, (Random and LRU strategies)
http://www.garlic.com/~lynn/96.html#11 Caches, (Random and LRU strategies)

with respect to this particular scenario of 2nd level disk access
... one of the characteristics of this long ago and far away page
mapped semantics for high performance disk access (originally done on
a cp/67 base) was that it could be used by any virtual machine for all
of its disk accesses (at least those involving page size chunks),
whether filesystem or swapping area.
--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
Rob van der Heij
2006-01-19 23:29:48 UTC
Post by Anne & Lynn Wheeler
in one sense it is like extended memory on 3090 ... fast memory move
operations. however, real extended memory was real storage. dcss is
just another part of virtual memory. in theory you could achieve
similar operational characteristics just by setting up linux to have
larger virtual memory by the amount that would have gone to dcss
... and having linux rope it off and treat that range of memory the
same way it might treat a range of dcss memory.
The experiment that I did some time ago was to carve out part of Linux
"real" memory by defining a ram-disk in it, and then use that as a
swap device. As long as Linux does not need the memory badly enough to
swap for it, the ramdisk only reduces the footprint (good). When
demand is higher and Linux continues to swap, you would disable that
swap device and return the ramdisk to the kernel for normal use
(certainly not intuitive). Unfortunately Linux will eventually also
swap the ramdisk pages (ouch).
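A sketch of that experiment (the device name and priority are
assumptions; /dev/ram0 requires the kernel ramdisk driver, and the
ramdisk size is set by the driver's configuration):

```shell
mkswap /dev/ram0            # format a slice of guest memory as swap space
swapon -p 10 /dev/ram0      # prefer the ramdisk over any disk-backed swap
# when demand rises and Linux keeps swapping, hand the memory back:
swapoff /dev/ram0
```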

But swapping to DCSS really was a side-effect of the facility. The
real value should be in sharing (code and data).
Post by Anne & Lynn Wheeler
one of the things i pointed out long ago and far away about running a
lru-algorithm under a lru-algorithm ... is that things can get
Yes, this is where CMM plays a significant role.

Rob
--
Rob van der Heij rvdheij @ gmail.com
Velocity Software, Inc

David Boyes
2006-01-19 21:07:16 UTC
Post by Adam Thornton
I was under the impression that Rob van der Heij (I think)
had in fact measured DCSS to be slightly faster.
I'd also observe that DCSS swapping may avoid a SIE drop, but as others
have mentioned, if you're worrying about that size of overhead, you're
already so far up the creek that it's probably not going to matter.

DCSS swapping is also much more difficult to configure and manage than
VDISK, so it comes down to whether your time is more expensive than some
loss of efficiency.