Thread: Re: Identifying CPUs in the kernel

2007-06-23 04:52:33
Peter Humphrey <prhgotadsl.co.uk> posted
200706230916.07711.prhgotadsl.co.uk, excerpted below, on Sat, 23 Jun
2007 09:16:07 +0100:

> Here's what top showed then. Look at the /nice/ values on lines 3
> and 4, and compare those with the %CPU and Processor fields of
> processes 5279 and 5280 (sorry about the line wraps). This has me
> deeply puzzled:

Fixed the line wraps and removed a bit of extraneous information. =8^)
 
> top - 09:04:59 up 23 min, 5 users, load average: 3.60, 4.79, 3.91
> Tasks: 124 total, 2 running, 122 sleeping, 0 stopped, 0 zombie

> Cpu0: 0.3%us, 0.3%sy,  0.0%ni, 99.3%id, [zeroes]
> Cpu1: 0.0%us, 0.3%sy, 99.7%ni,  0.0%id, [zeroes]

> PID USER  PR NI S %CPU %MEM  TIME+   P COMMAND
> 5279 prh  34 19 S   50  1.0  6:53.97 1 setiathome-5.12
> 5280 prh  34 19 S   50  1.0  6:54.08 0 setiathome-5.12


> I don't think this is a scheduling problem; it goes deeper, so that
> the kernel doesn't have a consistent picture of which processor is
> which.

Critical question here: is that in SMP Irix or SMP Solaris mode?  (See
the top manpage if you don't know what I mean.)  Asked another way, is
%CPU showing percent of a single CPU's time (Irix mode, so the
percentages can total 200% if you have two CPUs), or that figure
divided by the number of CPUs (Solaris mode, so percent of total CPU
time across both CPUs)?

If it's Solaris mode (percent of total CPU time), then each 50% is one
full CPU, so it's reporting full usage of both CPUs, one process on
each.  The Cpu0 line would then be the one screwed up, since it's
reporting idle when it clearly has to be in use.

If it's Irix mode (percent of a single CPU's time), then the two 50%
figures add up to just one CPU.  The Cpu lines would seem to be
correct, both processes would appear to be running on CPU1, maxing it
out, and the P column of the 5280 line would have to be screwed up.
(That's assuming you let the figures stabilize after the last schedtool
call you made.)
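
To make the arithmetic behind those two readings concrete, here is a small Python sketch.  It is only an illustration: the function name and its `divided_by_ncpus` flag are made up for this example and are not anything top itself exposes.  It takes the two 50% figures from the quoted output and works out how many CPUs' worth of time they represent under each interpretation.

```python
def cpus_kept_busy(task_percents, ncpus, divided_by_ncpus):
    """Return how many whole CPUs' worth of time the listed tasks consume.

    divided_by_ncpus=True  : each %CPU figure is a share of the whole
                             machine (per-CPU usage divided by ncpus).
    divided_by_ncpus=False : each %CPU figure is a share of one CPU.
    """
    total = sum(task_percents)
    if divided_by_ncpus:
        # Scale back up: 50% of a two-CPU machine is one full CPU.
        total *= ncpus
    return total / 100.0


# The two setiathome tasks from the quoted top output, on a two-CPU box.
readings = [50, 50]
print(cpus_kept_busy(readings, 2, divided_by_ncpus=True))   # 2.0
print(cpus_kept_busy(readings, 2, divided_by_ncpus=False))  # 1.0
```

Two CPUs busy contradicts the idle Cpu0 line; one CPU busy matches the Cpu lines but contradicts the P column showing one task on each processor.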

In either case, I'm not sure where your bug is, but you are correct,
the problem appears to be way deeper than scheduling.  I'd guess it's
ultimately a kernel bug, possibly due to a hardware bug, possibly not,
but you might wish to file it against top initially, just to see if
they've seen similar and can tell you what's going on.  Unless you want
to double-check patching status yourself, you might as well file the
bug with Gentoo first, in case it's a Gentoo bug.  They'll probably end
up closing it "upstream", but at least then when you file it upstream,
you can say you've cleared it with Gentoo first.

As for top, note that there's a trick you can use with it.  You'll
likely want to trim the memory columns etc as I did for your bug
report, but you may not want to mess up your regular config to do so.
Not a problem! =8^)  Create a symlink to top called something else (say
topbug).  Then run it using the symlink, and you can change and save
your settings, and it'll save them in a different rc file (topbugrc
using my example).  That way, you can run it with the bug report
settings when you want to, without messing up your regular config.
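
For anyone who'd rather script the symlink trick, here is a rough Python sketch.  The helper names and the example paths are hypothetical.  The rc file naming rule it uses (a dot, then the name top was started under, then "rc", as with the usual ~/.toprc) is an assumption about top's behavior, so check the FILES section of your top manpage before relying on it.

```python
import os


def rc_file_for(invoked_as):
    """Guess the rc file name top would use when started under this name."""
    name = os.path.basename(invoked_as)
    # Assumption: rc file is "." + program name + "rc", as in ~/.toprc.
    return "." + name + "rc"


def make_alias(top_path, alias_path):
    """Create the symlink (same effect as: ln -s top_path alias_path)."""
    os.symlink(top_path, alias_path)


# Hypothetical paths, for illustration only.
print(rc_file_for("/usr/bin/top"))          # .toprc
print(rc_file_for("/home/prh/bin/topbug"))  # .topbugrc
```

Running top through the alias and saving the settings should then leave the regular rc file alone.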

Of course, don't forget to mention in your bug report whether you were
in Solaris or Irix SMP mode, because as I explained, it /does/ make a
difference.

Let me know how this goes, post the bug number when you file it or
whatever, as I'd like to follow it too.  You definitely have a strange
one here, and I'd /love/ to see what the real experts have to say about
it!  You are absolutely correct, it doesn't seem to make any sense at
all!

Good luck.  That's one /strange/ problem you have going there!
No /wonder/ you were expressing frustration earlier!

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." 
Richard Stallman

-- 
gentoo-amd64gentoo.org mailing list

