Thread: misc. MMU: NUMA, big pages, idle zero, ring buffers, PAE, ...




user name
2006-04-28 23:33:01
Greetings all,


Note:  Some of these ramblings are ia32/aa64-focused, but the principles
are general.

While exploring PAE last November, I wound up browsing through uvm/pmap
code.  I've had a few additional ideas, and would like some [more]
feedback.


/* Big Pages */

Begin by allocating memory in 2M/4M strides (the former iff PAE, the
latter iff !PAE).  Track wasted 4K [sub]pages.  Split big pages into
smaller ones when needed, but avoid using page tables until then.
Coalesce smaller pages into bigger ones when free RAM permits.

Rationale:  Hopefully less MMU management overhead and fewer TLB misses
while memory is plentiful.  Fall back to standard behavior when needed.


/* Fractional/Checkpointed Zeroing of Big Pages */

I whipped up a crude program that performed 1000 bzero(3) iterations on
a 2M chunk.  Each iteration took about 9 ms on a PIII/500 notebook.
Should the idle-zero loop zero a fraction of a big page?  What about
dedicating a PDE slot (Intel terminology) to the zero code?

Rationale:  Several milliseconds -- although certainly less than 9 ms
on a faster CPU and with optimized zeroing code -- is an eternity.
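
To make the "zero a fraction" idea concrete, here is a minimal userland
sketch of checkpointed zeroing.  It is not uvm code:  struct zero_cursor
and zero_step() are made-up names, and plain memset(3) stands in for
whatever optimized zeroing routine the kernel would use.  The idle loop
would call zero_step() once per pass and pick up where it left off.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping for one big page being zeroed in slices. */
struct zero_cursor {
	unsigned char *page;	/* start of the big page */
	size_t size;		/* total bytes, e.g. 2M or 4M */
	size_t done;		/* bytes zeroed so far (the checkpoint) */
};

/*
 * Zero at most `slice` more bytes, starting at the checkpoint.
 * Returns 1 once the whole page has been zeroed, 0 otherwise.
 */
static int
zero_step(struct zero_cursor *c, size_t slice)
{
	size_t left, n;

	assert(c->done <= c->size);
	left = c->size - c->done;
	n = left < slice ? left : slice;

	memset(c->page + c->done, 0, n);
	c->done += n;
	return c->done == c->size;
}
```

With a 64K slice, a 2M page takes 32 calls.  If the 9 ms figure scales
linearly (an assumption), that is roughly 0.28 ms per call on the same
PIII/500.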


/* Per-CPU Management */

Both of the above, as well as free page lists, should be per-CPU.  Can a
CPU be forced to work with the memory closest to it?  (Consider NUMA
performance, such as multiprocessor Opteron systems.)

Rationale:  Reduced inter-CPU contention.  Assuming processes have
significant CPU affinity, using "nearby" memory would reduce both
interconnect bandwidth use and memory access time.


/* Ring Buffers */

A native mapping for ring buffers would be nice:

  	u_char *ringbuf = mmapringbuf(..., MAP_RINGBUF, ...) ;

would allocate a memory region from <base> to <base + 2 * size>.  i.e.,

  	base
  	base + size

would both be aliased to the same physical pages.  Voila!  Simple,
linear ringbuf where the MMU handles wraparound at the region's end.

Rationale:  It's just so much easier this way. 
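
For comparison, the aliasing can be approximated in userland today on
systems with POSIX mmap(2).  A minimal sketch, assuming a temp file in
/tmp as the backing object and MAP_FIXED over a reserved address range;
ringbuf_map() is a made-up helper, not the proposed MAP_RINGBUF
interface, and size must be a multiple of the page size:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `size` bytes twice, back to back, so that
 * base[i] and base[size + i] name the same physical byte.
 * Returns NULL on failure.
 */
static unsigned char *
ringbuf_map(size_t size)
{
	char path[] = "/tmp/ringbufXXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *base;
	void *lo, *hi;
	int fd;

	if (page <= 0 || size == 0 || size % (size_t)page != 0 ||
	    size > SIZE_MAX / 2)
		return NULL;

	/* Backing object; unlinked at once so it vanishes on close. */
	fd = mkstemp(path);
	if (fd < 0)
		return NULL;
	unlink(path);
	if (ftruncate(fd, (off_t)size) != 0) {
		close(fd);
		return NULL;
	}

	/* Reserve 2 * size of contiguous address space. */
	base = mmap(NULL, 2 * size, PROT_NONE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED) {
		close(fd);
		return NULL;
	}

	/* Map the same file offset over both halves of the reservation. */
	lo = mmap(base, size, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_FIXED, fd, 0);
	hi = mmap(base + size, size, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_FIXED, fd, 0);
	close(fd);
	if (lo == MAP_FAILED || hi == MAP_FAILED) {
		munmap(base, 2 * size);
		return NULL;
	}
	assert(lo == (void *)base && hi == (void *)(base + size));
	return base;
}
```

A memcpy() that starts near the end of the first half then runs into the
second half and lands at the start of the buffer, with no wraparound
test in the caller.  A native mapping would avoid the file descriptor
and the three mmap() calls.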


/* mremap() */

Zero-copy allocation-size changes are convenient.

Rationale:  Obvious.
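
As a reference point, here is a minimal sketch using Linux's mremap(2);
other systems may lack the call or give it a different signature.
region_alloc() and region_resize() are made-up wrapper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical wrapper: anonymous, zero-filled region of `size` bytes. */
static void *
region_alloc(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical wrapper: grow or shrink the region.  The kernel may
 * move it (MREMAP_MAYMOVE) by remapping pages rather than copying
 * their contents.  Returns NULL on failure, leaving `old` intact.
 */
static void *
region_resize(void *old, size_t old_size, size_t new_size)
{
	void *p = mremap(old, old_size, new_size, MREMAP_MAYMOVE);

	return p == MAP_FAILED ? NULL : p;
}
```

Contents survive the resize, and any newly added pages of an anonymous
mapping read as zero.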


Eddy
--
Everquick Internet - http://www.everquick.net/
A division of Brotsman & Dreger, Inc. - http://www.brotsman.com/
Bandwidth, consulting, e-commerce, hosting, and network building
Phone: +1 785 865 5885 Lawrence and [inter]national
Phone: +1 316 794 8922 Wichita
________________________________________________________________________
DO NOT send mail to the following addresses:
davidcbrics.com -*- jfconmaapaqintc.net -*- sameverquick.net
Sending mail to spambait addresses is a great way to get blocked.
Ditto for broken OOO autoresponders and foolish AV software backscatter.