This is turning into a bad month :-(

I'm running BOINC clients on this box, and the kernel seems unable to
schedule them properly. I'm subscribed to several projects, so I should
have one on each CPU all the time, running at nice 19 and therefore
mopping up all available CPU cycles. That's how it used to run. But
nowadays the kernel scheduler insists on allocating both of them to the
same CPU, thus limiting them to 50% load. Occasionally it will start up
correctly, but only if I've started the BOINC client interactively
rather than from a startup script, and even then it reverts to its bad
behaviour after a while. So far I haven't been able to spot any
particular influence that might cause this reversion, and the time
before it happens is apparently random.
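
For anyone wanting to check the same symptom, a minimal sketch follows,
assuming Python 3 on Linux. It reads the "processor" field (field 39)
of /proc/<pid>/stat to see where a task last ran, and asks the kernel
which CPUs the task is allowed to use. The helper names
last_cpu_from_stat and report are made up for illustration, and the
PIDs have to be supplied by hand.

```python
import os

def last_cpu_from_stat(stat_line):
    """Return the CPU a task last ran on, from a /proc/<pid>/stat line."""
    # The command name (field 2) may contain spaces or ')', so split
    # at the last ')' rather than on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); 'processor' is field 39, i.e. rest[36].
    return int(rest[36])

def report(pids):
    """Print the last-used CPU and the allowed CPU set for each pid."""
    for pid in pids:
        with open(f"/proc/{pid}/stat") as f:
            cpu = last_cpu_from_stat(f.read())
        allowed = sorted(os.sched_getaffinity(pid))
        print(f"pid {pid}: last ran on CPU {cpu}, allowed CPUs {allowed}")
```

Calling report() with the PIDs of the two science applications every
few seconds should show whether both keep landing on the same CPU, and
whether their allowed set has somehow been narrowed to one CPU.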

The box is a Supermicro H8DCE with 2 x Opteron 246 CPUs and 2 x 2GB RAM.
This board divides the DIMM slots into two banks of four, one bank next
to each CPU and associated with it. I've tried various kernels from
2.6.16-r13 to 2.6.21-r1. I've tried unsetting all the clever-looking
optimisations in the kernel, I've tried all three scheduling algorithms
and I've tried resetting the BIOS to "optimised" defaults. I've even
tried a genkernel kernel with default config, but that version couldn't
see the root disk /dev/sda for some reason, so of course it wouldn't
boot.

It's also odd that CPU1 runs 5 - 6 C hotter than CPU0, whether loaded
or not.

Sometimes I suspect a problem with APIC or perhaps the IOMMU, for which
I have mostly default or conservative settings in the kernel. Has anyone
here some experience they could offer?

I've also been to the BOINC project sites and changed my preferences to
the most conservative I can find, but still I can't get proper
allocation of BOINC clients to processors. I've tried the forums and got
some useful help, but not yet a solution.
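
A manual workaround would be to pin each client to its own CPU. The
following is only a sketch, assuming Python 3 on Linux; the helper names
assign_round_robin and pin and the PIDs are hypothetical, and it
sidesteps the scheduler rather than explaining its behaviour.

```python
import os

def assign_round_robin(pids, cpus):
    """Map each pid to a single CPU, cycling through the CPU list."""
    cpus = list(cpus)
    return {pid: {cpus[i % len(cpus)]} for i, pid in enumerate(pids)}

def pin(assignment):
    """Apply the mapping; needs permission to change each process."""
    for pid, cpu_set in assignment.items():
        os.sched_setaffinity(pid, cpu_set)

# Hypothetical PIDs of two BOINC science applications on a 2-CPU box.
plan = assign_round_robin([4321, 4322], [0, 1])
print(plan)  # {4321: {0}, 4322: {1}}
```

Calling pin(plan) would then apply the mapping. It would have to be
repeated whenever BOINC starts a new work unit, since each new process
gets a new PID.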

This all started some time ago, about the time when I had to replace the
motherboard, but as I wasn't following it very closely at the time I
haven't been able to pinpoint the factor that caused the change in
kernel scheduling behaviour.

--
Rgds
Peter Humphrey
Linux Counter 5290, Aug 93
--
gentoo-amd64 gentoo.org mailing list