Thread: Kernel flag to reduce swap usage of reserved (but unallocated) stack memory...




Kernel flag to reduce swap usage of reserved (but unallocated) stack memory...
2007-06-04 17:03:55
Hi!

----

Would it be useful to create a kernel flag to mark all process stack
memory with |MAP_NORESERVE|, i.e. don't allocate swap memory for stacks?
The idea is to reduce the swap requirements for platforms like set-top
boxes etc. with limited or no swap space by exploiting the detail that
almost all applications reserve memory for large stacks (8MB default)
but only use a tiny fraction (most shells and shell applications rarely
use more than 64k).

AFAIK targets like BeleniX's LiveDVD would greatly benefit from such an
optional switch in /etc/system.

----

Bye,
Roland

-- 
  __ .  . __
 (o. / /.o) roland.mainznrubsig.org
  __//__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /== O  TEL +49 641 7950090
 (;O/ / O;)
_______________________________________________
appliances-discuss mailing list
appliances-discussopensolaris.org
http://mail.opensolaris.org/mailman/listinfo/appliances-discuss

Kernel flag to reduce swap usage of reserved (but unallocated)
2007-06-04 17:19:11
> Would it be useful to create a kernel flag to mark all process stack
> memory with |MAP_NORESERVE|, i.e. don't allocate swap memory for stacks?
> The idea is to reduce the swap requirements for platforms like set-top
> boxes etc. with limited or no swap space by exploiting the detail that
> almost all applications reserve memory for large stacks (8MB default)
> but only use a tiny fraction (most shells and shell applications rarely
> use more than 64k).

When you say, "limited or no swap space," what are you really talking
about here?  Solaris can use both memory and disk as reserved memory
swap.  Check out anon_resvmem() in the kernel.  It can take pages from
either a swap device or memory.

If you have a finite amount of resource, using MAP_NORESERVE means
that your failure mode for overuse will be non-deterministic.  I.e.
you'll get a SIGBUS or SIGSEGV when you try to write to the stack and a
page cannot be allocated.  I'm in favor of the more deterministic method
of having the failure occur at allocation time because we're out of
memory.  (It's easier to debug too.)

-j


Kernel flag to reduce swap usage of reserved (but unallocated)
2007-06-04 17:42:37
johansen-osdevsun.com wrote:
> 
> > Would it be useful to create a kernel flag to mark all process stack
> > memory with |MAP_NORESERVE|, i.e. don't allocate swap memory for
> > stacks?  The idea is to reduce the swap requirements for platforms
> > like set-top boxes etc. with limited or no swap space by exploiting
> > the detail that almost all applications reserve memory for large
> > stacks (8MB default) but only use a tiny fraction (most shells and
> > shell applications rarely use more than 64k).
>
> When you say, "limited or no swap space," what are you really talking
> about here?  Solaris can use both memory and disk as reserved memory
> swap.  Check out anon_resvmem() in the kernel.  It can take pages from
> either a swap device or memory.

I was thinking about the traditional Unix view of a "swap slice", e.g. a
set-top box may only have flash memory and no swap disk/slice etc.

> If you have a finite amount of resource, using MAP_NORESERVE means
> that your failure mode for overuse will be non-deterministic.  I.e.
> you'll get a SIGBUS or SIGSEGV when you try to write to the stack and a
> page cannot be allocated.

... my point is that if the machine only has 128MB and no swap (disk) is
available (e.g. LiveDVD) then every single wasted memory page is one too
many. AFAIK (looking at pmap output of a few processes of a B48/SPARC
machine) many processes never use all the stack they reserve by default.
Even if one process uses its stack more extensively the others won't do
that, leaving lots of memory pages reserved but unused.

Some (very wrong and raw) calculations:
Imagine a system has 50 processes where each process has an 8MB stack:
50*8MB=400MB
If each process only uses 1MB of its 8MB reserved stack memory you
"waste" ~350MB (in theory)

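The arithmetic above can be spelled out as a small sketch. It only holds under the premise of this message, namely that each process has the full 8MB reserved up front; the variable names are illustrative and not taken from any Solaris tool.

```python
# Figures from the message: 50 processes, an 8MB stack each, 1MB actually used.
# Assumption (the premise of the message): the full stack size is reserved up front.
processes = 50
stack_size_mb = 8
used_per_process_mb = 1

reserved_if_upfront_mb = processes * stack_size_mb      # 50 * 8 = 400
actually_used_mb = processes * used_per_process_mb      # 50 * 1 = 50
unused_mb = reserved_if_upfront_mb - actually_used_mb   # 400 - 50 = 350

print(f"reserved up front: {reserved_if_upfront_mb}MB")
print(f"actually used:     {actually_used_mb}MB")
print(f"reserved, unused:  {unused_mb}MB")
```
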
One option would be to reduce the default stack size - but that causes
trouble for applications which have been optimized to use more stack
(which adds some nasty error/failure cases anyway) - therefore it may
(IMO) be better to keep the current stack size at its current value and
change the way the stack memory is reserved (since the average stack
usage of all processes is _far_ below the maximum stack size).

> I'm in favor of the more deterministic method of having the failure
> occur at allocation time because we're out of memory.  (It's easier to
> debug too)

Right... but in the case of a LiveDVD (which is mainly used for
demonstration) it (AFAIK) matters more that it _runs_ on small machines -
the correctness is AFAIK secondary in this case (e.g. the "correct"
approach for a permanent installation would be creating a swap slice and
turning the proposed flag "off").

Note: I am not proposing to turn such a flag "on" by default but I think
it may be valuable for cases like the BeleniX LiveDVD...

----

Bye,
Roland

-- 
  __ .  . __
 (o. / /.o) roland.mainznrubsig.org
  __//__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /== O  TEL +49 641 7950090
 (;O/ / O;)

Kernel flag to reduce swap usage of reserved (but unallocated)
2007-06-04 19:21:06
> I was thinking about the traditional Unix view of a "swap slice", e.g. a
> set-top box may only have flash memory and no swap disk/slice etc.

I think that in this case, as long as you don't configure the flash disk
to be part of a swap partition, the kernel will reserve physical memory
as swap instead.

> > If you have a finite amount of resource, using MAP_NORESERVE means
> > that your failure mode for overuse will be non-deterministic.  I.e.
> > you'll get a SIGBUS or SIGSEGV when you try to write to the stack and
> > a page cannot be allocated.
>
> ... my point is that if the machine only has 128MB and no swap (disk) is
> available (e.g. LiveDVD) then every single wasted memory page is one too
> many.

I don't understand this reasoning.  I agree that we don't want to waste
memory in this situation.  However, in the case of 128MB mem and no swap
device, a call to anon_resvmem() will simply reduce the amount of memory
that is available to the system by the size of the requested
reservation.

This is necessary because we've taken a pagefault in the stack at this
address and need pages to back the VA we've allocated.  As far as I
understand it, these pages aren't being wasted because we're _using_
them as part of the stack.

> AFAIK (looking at pmap output of a few processes of a B48/SPARC
> machine) many processes never use all the stack they reserve by default.
> Even if one process uses its stack more extensively the others won't do
> that, leaving lots of memory pages reserved but unused.

I disagree with this statement.  The anon reservation is made at the
time the stack pages are allocated.  This happens when we get a
pagefault in the process' stack.  The memory will only be reserved if it
is going to be used by the process.  This is because the pagefault
occurs when the unmapped area of the stack is accessed.
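
As a rough illustration of reserve-on-fault, here is a toy model in Python. It is not the kernel's code: the ToyStack class, the 8KB page size and the 8MB limit are all assumptions made for the sketch, and the only thing it shows is that the reservation follows the deepest stack address touched rather than the stack limit.

```python
import math

PAGE_SIZE = 8 * 1024            # assumption: 8KB pages
STACK_LIMIT = 8 * 1024 * 1024   # assumption: 8MB limit, the figure quoted in this thread


class ToyStack:
    """Hypothetical model: pages are reserved only when a fault grows the stack."""

    def __init__(self, limit=STACK_LIMIT):
        self.limit = limit
        self.reserved_pages = 0

    def fault_at_depth(self, depth_bytes):
        """Simulate a page fault depth_bytes below the top of the stack."""
        if depth_bytes > self.limit:
            raise MemoryError("stack limit exceeded")
        needed = math.ceil(depth_bytes / PAGE_SIZE)
        # The stack only grows; shallower faults hit pages that already exist.
        self.reserved_pages = max(self.reserved_pages, needed)

    def reserved_bytes(self):
        return self.reserved_pages * PAGE_SIZE


stack = ToyStack()
stack.fault_at_depth(4 * 1024)    # first page
stack.fault_at_depth(20 * 1024)   # grows the stack to three pages
print(stack.reserved_bytes() // 1024, "KB reserved of a", STACK_LIMIT // 1024, "KB limit")
```

In this model a process that never goes deeper than 20KB has 24KB reserved, not 8MB.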

> Some (very wrong and raw) calculations:
> Imagine a system has 50 processes where each process has an 8MB stack:
> 50*8MB=400MB
> If each process only uses 1MB of its 8MB reserved stack memory you
> "waste" ~350MB (in theory)

I'm not sure I understand where you're getting 8MB as a size for a
stack.  A random sample of processes that I pmap'd had a 16k-24k stack.

> One option would be to reduce the default stack size - but that causes
> trouble for applications which have been optimized to use more stack
> (which adds some nasty error/failure cases anyway) - therefore it may
> (IMO) be better to keep the current stack size at its current value and
> change the way the stack memory is reserved

The stack size is grown as needed.  The memory isn't reserved until we
grow the stack.

> (since the average stack usage of all processes is _far_ below the
> maximum stack size).

Yes, and therefore the amount of anonymous memory reserved is, on
average, much smaller than the maximum stack size.

> Right... but in the case of a LiveDVD (which is mainly used for
> demonstration) it (AFAIK) matters more that it _runs_ on small machines -
> the correctness is AFAIK secondary in this case (e.g. the "correct"
> approach for a permanent installation would be creating a swap slice and
> turning the proposed flag "off").

I think it needs to do both: run and be correct.  Is there an actual
problem you're seeing with a LiveDVD?

-j

Re: Kernel flag to reduce swap usage of reserved (but unallocated) stack memory
2007-06-05 03:08:29

>Would it be useful to create a kernel flag to mark all process stack
>memory with |MAP_NORESERVE|, i.e. don't allocate swap memory for stacks?
>The idea is to reduce the swap requirements for platforms like set-top
>boxes etc. with limited or no swap space by exploiting the detail that
>almost all applications reserve memory for large stacks (8MB default)
>but only use a tiny fraction (most shells and shell applications rarely
>use more than 64k).

No, they do not reserve 8MB, as by default the stack is MAP_NORESERVE
(it's somewhat different than that).  All thread stacks are MAP_NORESERVE
also.

The net gain would be 0.

You can easily check this:

$ swap -s
total: 502992k bytes allocated + 25504k reserved = 528496k used, 10645456k available
$ sh
$ swap -s
total: 503112k bytes allocated + 25520k reserved = 528632k used, 10645312k available


See how only 16K is added to reserved?  The whole system has 25MB
"reserved" but has 125 processes (which would yield around 1G of
reserved memory by your reckoning).
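
The comparison can be reproduced mechanically. Below is a minimal sketch that parses the two `swap -s` lines quoted in this message and computes the deltas. The regular expression assumes exactly the output layout shown here, and the helper name parse_swap_s is made up for the example.

```python
import re

# Assumption: `swap -s` prints one line in exactly the layout quoted in this message.
SWAP_RE = re.compile(
    r"total:\s*(\d+)k bytes allocated \+ (\d+)k reserved = (\d+)k used, (\d+)k available"
)


def parse_swap_s(line):
    """Return the four kilobyte figures from one `swap -s` line as a dict."""
    match = SWAP_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized swap -s line: {line!r}")
    allocated, reserved, used, available = map(int, match.groups())
    return {"allocated": allocated, "reserved": reserved,
            "used": used, "available": available}


before = parse_swap_s(
    "total: 502992k bytes allocated + 25504k reserved = 528496k used, 10645456k available")
after = parse_swap_s(
    "total: 503112k bytes allocated + 25520k reserved = 528632k used, 10645312k available")

reserved_delta_kb = after["reserved"] - before["reserved"]
allocated_delta_kb = after["allocated"] - before["allocated"]
# What 125 processes would cost if each one really reserved 8MB up front.
hypothetical_reserved_mb = 125 * 8

print(f"extra reserved after starting sh:  {reserved_delta_kb}k")
print(f"extra allocated after starting sh: {allocated_delta_kb}k")
print(f"125 processes at 8MB each:         {hypothetical_reserved_mb}MB")
```
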

Processes do get killed when swap space cannot be obtained at the time
of a page fault on an unmapped stack page.

Casper


Kernel flag to reduce swap usage of reserved (but unallocated)
2007-06-05 03:41:30

>I'm not sure I understand where you're getting 8MB as a size for a
>stack?  A random sample of processes that I pmap'd had a 16k-24k stack.


I think Roland was simply mistaken and the whole thread is therefore
pointless when we consider stacks.

What we could do, for small memory LiveCD environments, is allow
something like "unsafe mode": overcommit memory; do away with "reserved"
memory and just hand it out until the world collapses.

Just type "swap -s" on a LiveCD system and find out how much additional
headroom that would give you.

Casper
