
Thread: Core2 Duo 1.8 NetBSD 4BETA SLOWER than Celeron M 1.3 NetBSD3 - Help!




Core2 Duo 1.8 NetBSD 4BETA SLOWER than Celeron M 1.3 NetBSD3 - Help!
user name
2006-10-18 20:41:58
Help!

This is really beyond me.

I have posted a few times about this "great fast" new Core2 Duo machine
I bought a while ago. After a problem with a defective 512 MB RAM
module, which I swapped for two 1 GB modules of a better-known brand, I
thought my problems were solved. The machine runs NetBSD 4.0BETA; from
dmesg (which I have posted before, so here are only some relevant
excerpts):
NetBSD 4.0_BETA (GENERIC.MPACPI) #0: Fri Sep 15 03:25:05 UTC 2006
        buildsb3.netbsd.org:/home/builds/ab/netbsd-4/i386/200609140000Z-obj/home/builds/ab/netbsd-4/src/sys/arch/i386/compile/GENERIC.MPACPI
total memory = 2039 MB
avail memory = 1994 MB

The machine is equipped with a Samsung 80 GB SATA II disk, and I added
an older Maxtor because I need to clean up a lot of old "garbage".

atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 1: <LITE-ON DVD SOHD-16P9S, , FS09> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
wd0 at atabus0 drive 0: <Maxtor 6Y080L0>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 78167 MB, 158816 cyl, 16 head, 63 sec, 512 bytes/sect x 160086528 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
cd0(piixide0:0:1): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
wd1 at atabus1 drive 0: <SAMSUNG HD080HJ>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 76319 MB, 155061 cyl, 16 head, 63 sec, 512 bytes/sect x 156301488 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 7
wd1(piixide1:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
boot device: wd1
root on wd1a dumps on wd1b
root file system type: ffs

I also have a slightly old, mediocre ThinkPad R50e, with a 1.3 GHz
Celeron M CPU and 1270 MB RAM. I have upgraded its disk to a 120 GB WD
Scorpio. It is this machine that has accumulated the "garbage" (music
CD images, ripped music from my CD collection, huge Maildir mail
archives with thousands of small mail files, disk dumps, source code,
etc.), 43722 MB in total.

Thinking that the new machine would be better suited for sorting all
this out, and because I want to clean the ThinkPad and reinstall it
differently, I have moved all this data to the new machine. As I intend
to buy bigger SATA disks when my finances allow, and as the two disks
are dissimilar, I have not configured RAID-1, so to be safe, I copied
the data twice.

Because I really am an insane, paranoid nut, I use this little script
(md5dir) to verify data integrity:
#! /bin/sh
cd $1
find . -type f |(IFS="" ; while read f ; do echo `md5 <"$f"`"##$f" ; done)

It's perhaps not as fast/efficient/smart as mtree, but it does
precisely what I need: it generates a simple list that can easily be
manipulated with sed, cut, uniq, sort, perl, diff etc. It's also great
for identifying duplicate files and so on.
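
As a minimal sketch of the duplicate-finding use, assuming the
"hash##path" lines that md5dir prints on stdin (the function name
find_dups is made up for illustration and is not part of the script
above):

```shell
# find_dups: read "hash##path" lines on stdin and print every line whose
# hash occurs more than once, grouped by hash.
# Lines with an empty hash (md5 could not open the file) are ignored.
find_dups() {
    LC_ALL=C sort | awk -F'##' '
        $1 == "" { next }                      # unreadable file, no hash
        seen && $1 == prev {                   # same hash as previous line
            if (!shown) print prevline         # print first member once
            print
            shown = 1
            next
        }
        { prev = $1; prevline = $0; shown = 0; seen = 1 }'
}
```

Usage would be something like "find_dups </tmp/lhp.md5sums". Only the
part before the first "##" is compared, so a path that itself contains
"##" is still printed whole.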

Here is the time from the ThinkPad (which is called "able". I use the
old phonetic alphabet to name my machines, but as you will see later,
this has turned ironic on me):

able:/home $ time sudo md5dir >/tmp/lhp.md5sums
/home/lhp/bin/md5dir: cannot open ./MUSIK/Modest_Musorgskij/Nina Kavtaradze/Piano Music (Disc 2)/2-12 Limoges. Le Marché (La Grande Nouvelle).m4a: no such file
/home/lhp/bin/md5dir: cannot open ./MUSIK/Modest_Musorgskij/Nina Kavtaradze/Piano Music (Disc 2)/2-17 Hopak De Jeunes Ukrainiens (De L'opéra) _La Foire De Sorotchintsy_.m4a: no such file
/home/lhp/bin/md5dir: cannot open ./MUSIK/Modest_Musorgskij/Nina Kavtaradze/Piano Music (Disc 2)/2-18 Scène De Foire (Fragment De L'opéra) _La Foire De Sorotchintsy_.m4a: no such file
/home/lhp/bin/md5dir: cannot open ./MAIL_NEWS/News/Archive/Re  Tintins "far" Hergé i  Horn: no such file
 4264.78s real  1336.45s user  1363.77s system

As you see, it took 71 minutes to completely hash everything. /bin/sh
has problems with some filenames, but that's unimportant.

Of course I ran the same on the two copies on the Core2Duo machine.
Now, I *have* had problems with /bin/sh giving segmentation faults now
and then, even after replacing the RAM, but no more memory faults. As a
temporary fix, I did "mv /bin/sh /bin/osh ; ln /bin/ksh /bin/sh", which
helped a bit when I built stuff from pkgsrc.

This had the added bonus of not giving errors with 8-bit characters in
filenames as seen above. When I ran my script on the copy on the Maxtor
disk, it ran OK. I let it run overnight, so I don't know the time it
took. (I just reran on the Maxtor with /bin/osh, and it crashed after
20 minutes. I then timed the Maxtor disk with ksh, and this time it
ran - again without any fault:
dog:/disk2/usr/ablehome $ time sudo md5dirKSH lhp >ksh.md5sums
 3909.83s real  1189.11s user  1362.91s system

It is worth noticing that this was barely faster than "able".
Presumably this just implies that I/O is the bottleneck in this
operation.)
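
A quick way to put a number on that, using only the "time" figures
reported above: user plus system time as a share of real time. This is
a rough sketch; cpu_share is a made-up helper name, and the result says
nothing about how the work was spread over the two cores.

```shell
# cpu_share REAL USER SYSTEM
# Prints (user + system) / real as a percentage with one decimal.
cpu_share() {
    awk -v r="$1" -v u="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (u + s) / r * 100 }'
}

cpu_share 4264.78 1336.45 1363.77   # able, /bin/sh          -> 63.3
cpu_share 3909.83 1189.11 1362.91   # dog, Maxtor disk, ksh  -> 65.3
```

On both machines roughly a third of the wall-clock time is not
accounted for by CPU time, which would fit a job that spends a good
part of its time waiting for the disk.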

But when I tried to do the same on the Samsung SATA disk, I got *memory
fault* errors after processing about 250000 of the 1.4 million files.
Sometimes sooner, sometimes later. I tried to switch to /bin/osh, and
to /rescue/sh and /rescue/ksh, but still I would get a memory fault
after some (fairly long) time. Also, it would take noticeably longer.
After hacking up a way to do a shorter list of files at a time, and
then cat together the complete list, I remembered that I had bash
installed in /usr/pkg/bin. I have now been running the script for more
than 6 hours - but at least bash didn't crash! (It just finished right
now.)
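
The "shorter list at a time" approach could look roughly like the
sketch below. It is not the exact hack used here; hash_in_batches and
the HASHCMD variable are names invented for illustration, and the
batches are simply the top-level subdirectories.

```shell
#!/bin/sh
# Sketch: hash one top-level subdirectory per batch, then join the lists.
# HASHCMD is an assumption: "md5" as in md5dir on NetBSD; set it to
# e.g. "md5sum" or "cksum" on systems without md5.
HASHCMD=${HASHCMD:-md5}

# hash_in_batches TOPDIR OUTDIR
# Writes OUTDIR/<subdir>.list for each subdirectory of TOPDIR and prints
# the concatenated result. Batches finished earlier are skipped on a
# rerun. Files directly in TOPDIR and dot-directories are not covered.
hash_in_batches() {
    top=$1
    out=$2
    mkdir -p "$out" || return 1
    for d in "$top"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        [ -f "$out/$name.list" ] && continue      # batch already done
        find "$d" -type f | while IFS= read -r f; do
            printf '%s##%s\n' "$($HASHCMD <"$f")" "$f"
        done >"$out/$name.list.tmp" &&
            mv "$out/$name.list.tmp" "$out/$name.list"
    done
    cat "$out"/*.list
}
```

Each list is written under a temporary name and renamed only when its
batch completes, so if the shell dies partway through, a rerun repeats
just the unfinished subdirectory.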

So, to sum things up:

I have a supposedly "wicked fast" machine, which turns out to live up
to the name I happened to bestow upon it: dog.
I get occasional segfaults with /bin/sh, whereas /bin/ksh works
slightly better, but in some situations, it also segfaults - at least
when doing stuff with the SATA disk. Bash seems to work better, but is
slow as hell.

The whole mess seems to be related to the system it's running:
i386-MPACPI, 4.0BETA build 200609140000Z, the size of its memory (?),
and the type of task I try to perform: a shell script going through
1,405,214 files of varying size, doing an MD5 sum on each. This I
suppose implies large pipes, lots of memory mapped file I/O, etc.

However, I don't really have the knowledge to even find out where to
begin debugging this mess. I can barely come up with a few questions,
which I hope some knowledgeable persons may have answers for:
* Am I correct in assuming that the RAM is not necessarily to blame
here, IOW, can memory faults occur due to other reasons than bad RAM?
* Is there a more suitable system/kernel than i386-GENERIC.MPACPI I
could use? Switch to amd64 perhaps? Others have been talking about Xen
in connection with Core2Duo machines?
* Why is there such a difference between the SATA disk and the PATA
disk? Running the same script on the same data on the PATA is fine, on
the other I eventually get memory faults. Consistently.
* Am I doing myself a disservice by running 4.0BETA rather than 3.x? I
had hoped I would gain support for the Realtek 8168B ethernet device on
the motherboard (ASRock ConRoe 945G-DVI), but I haven't had any luck
there either.
* What I am most concerned about is whether there is still a hardware
fault, which only shows up under heavy load. But after having replaced
the RAM, I feel this is rather unlikely? Am I being too optimistic?

Any suggestions as to what I should do with this machine (well,
obviously excluding suggestions to donate it, trash it etc) would be
most welcome! And if I have accidentally stumbled upon some rare -
maybe even subtle - bug, that only shows up under special circumstances
and loads, I sure would like to help get this fixed. I would file a PR -
if I wasn't so unsure as to what to write in it! If someone could
suggest some tests to run, I would be delighted to do so!

My plan was for this machine to replace the Pentium II 233 MHz with its
whining 40GB drive, which is my current home server. (This machine was
set up quickly to stand in for a 350 MHz machine, which didn't come up
after a power outage.) The high-pitched howling of "fox" is getting on
my nerves, but before I put the "dog" on its watch, I want to be sure
it can handle the job!

-Lasse
Core2 Duo 1.8 NetBSD 4BETA SLOWER than Celeron M 1.3 NetBSD3 - Help!
user name
2006-10-18 22:57:44
Lasse Hillerøe Petersen <lhptoft-hp.dk> wrote:
> Any suggestions as to what I should do with this machine (well,
> obviously excluding suggestions to donate it, trash it etc) would be
> most welcome! And if I have accidentally stumbled upon some rare -
> maybe even subtle - bug, that only shows up under special
> circumstances and loads, I sure would like to help get this fixed. I
> would file a PR - if I wasn't so unsure as to what to write in it! If
> someone could suggest some tests to run, I would be delighted to do
> so!

I would start by watching the output of iostat and vmstat while the
Core2 Duo machine is doing the work.

Something like:

iostat -x 10

or

iostat wd0 wd1 10

Keep in mind that the first line of output from both iostat and vmstat
is a summary since the performance counters were cleared, typically at
the last reboot. The subsequent lines show averages over each
interval.

See the man pages for explanations of the column headings.

If iostat or vmstat reports idle CPU while your job is running, then
often the bottleneck is I/O.


 -johan