Thread: Dual AMD MP unstable under heavy load when smp is active




Dual AMD MP unstable under heavy load when smp is active
Canada
2008-03-04 20:05:51
Hi guys,

I have been having quite some trouble tracking down a problem that
seems to be related to SMP on one of my production servers.

The problem is not easily reproducible, but the best way I have found
is to fire up "make buildworld" while other things are running (mysql,
apache, bind, jails, etc.). When SMP is active, the compile ends with a
segfault or, quite rarely, with a crash. I recently configured the
crash dump device but have still been unable to reproduce a full
system crash.
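
A minimal sketch of such a reproduction loop, in Python: it reruns one
command until it fails and reports how the run ended. The function
names and the example command in the trailing comment are illustrative
assumptions, not part of this setup, and the background load (mysql,
apache, and so on) is assumed to be running already. Note that when a
compiler child segfaults, make itself normally just exits non-zero, so
the signal branch only applies if the top-level command is killed.

```python
import signal
import subprocess


def describe(returncode):
    """Turn a subprocess return code into a short readable string."""
    if returncode < 0:
        # On POSIX, a negative code means the command itself died from a signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exit status %d" % returncode


def run_until_failure(cmd, max_runs=10):
    """Run cmd up to max_runs times.

    Returns (run_number, returncode) for the first failing run,
    or None if every run succeeded.
    """
    for run in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return run, result.returncode
    return None


# Hypothetical usage (the source path is an assumption):
#   failure = run_until_failure(["make", "-C", "/usr/src", "buildworld"])
#   if failure:
#       print("run %d failed: %s" % (failure[0], describe(failure[1])))
```

Logging the run number alongside the failure gives a rough idea of how
often the problem shows up under a given kernel configuration.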

At first I thought it was memory-related, so I ran some tests and
swapped most of the DIMMs, but the problem was still there. To
pinpoint the problem, I started from the GENERIC kernel, which I found
to be stable, and added options to it. That's how I found that it is
related to SMP. I then tried a few other things, such as reducing the
drivers in the kernel to the minimum I could, for different reasons.
One of them is that the motherboard is a "Tyan Thunder K7X"
(http://www.tyan.com/archive/products/html/thunderk7x.html) and it has
an onboard Adaptec SCSI controller which I don't use. Since the driver
for this adapter is not MP-safe, I tried disabling the controller in
the BIOS and/or removing the driver from the kernel, but it had no
effect. The SCSI adapter actually in use is the Dell PERC 4/DC
(LSILogic MegaRAID) that you can see in the dmesg.
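
As a rough aid for this kind of driver hunt, the following Python
sketch scans dmesg text for drivers marked [GIANT-LOCKED] and for
devices that report the same irq. It assumes one message per line, as
dmesg(8) prints them (not the re-wrapped form some archives show), and
the regular expressions and the function name are illustrative
assumptions rather than any official parser.

```python
import re
from collections import defaultdict

# Attach lines look like: "amr0: <description> ... irq 20 at device 8.0 on pci0"
IRQ_RE = re.compile(r"^(\w+): <[^>]*> .*\birq (\d+)\b")
# Giant-locked drivers are announced as: "ahc0: [GIANT-LOCKED]"
GIANT_RE = re.compile(r"^(\w+): \[GIANT-LOCKED\]")


def summarize(dmesg_text):
    """Return (giant_locked_devices, shared_irqs) found in dmesg_text.

    shared_irqs maps an irq number to the devices reporting it,
    and only contains irqs reported by more than one device.
    """
    by_irq = defaultdict(list)
    giant = []
    for line in dmesg_text.splitlines():
        m = GIANT_RE.match(line)
        if m:
            if m.group(1) not in giant:
                giant.append(m.group(1))
            continue
        m = IRQ_RE.match(line)
        if m:
            dev, irq = m.group(1), int(m.group(2))
            if dev not in by_irq[irq]:  # a device may be probed twice
                by_irq[irq].append(dev)
    shared = {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}
    return giant, shared
```

In the dmesg from this machine, amr0 and ahc0 both report irq 20 and
ahc0 is marked [GIANT-LOCKED]; whether that has anything to do with
the instability is not established.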

I now have no clue how to debug this problem further.

dmesg from generic kernel:

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.3-RELEASE-p1 #0: Wed Feb 27 07:56:51 EST 2008
    rootmegatron.mantor.org:/usr/obj/usr/src/sys/GENERIC
ACPI APIC Table: <PTLTD          APIC  >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) MP 2200+ (1800.07-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x680  Stepping = 0
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0480800<SYSCALL,MP,MMX+,3DNow!+,3DNow!>
real memory  = 3220701184 (3071 MB)
avail memory = 3150741504 (3004 MB)
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Feb 27 2008 07:56:28)
acpi0: <PTLTD   RSDT> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff,0x8000-0x807f,0x8080-0x80ff iomem 0xd8000-0xdbfff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <AMD 762 host to AGP bridge> port 0x1810-0x1813 mem 0xf8000000-0xfbffffff,0xf6210000-0xf6210fff at device 0.0 on pci0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD 768 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 7.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pci0: <bridge> at device 7.3 (no driver attached)
amr0: <LSILogic MegaRAID 1.53> mem 0xf6200000-0xf620ffff irq 20 at device 8.0 on pci0
amr0: delete logical drives supported by controller
amr0: <LSILogic PERC 4/DC> Firmware 350O, BIOS 1.09, 128MB RAM
ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0x1000-0x10ff mem 0xf4000000-0xf4000fff irq 20 at device 10.0 on pci0
ahc0: [GIANT-LOCKED]
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1: <Adaptec aic7899 Ultra160 SCSI adapter> port 0x1400-0x14ff mem 0xf4001000-0xf4001fff irq 21 at device 10.1 on pci0
ahc1: [GIANT-LOCKED]
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
pcib2: <ACPI PCI-PCI bridge> at device 16.0 on pci0
pci2: <ACPI PCI bus> on pcib2
ohci0: <OHCI (generic) USB controller> mem 0xf4100000-0xf4100fff irq 19 at device 0.0 on pci2
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pci2: <display, VGA> at device 7.0 (no driver attached)
xl0: <3Com 3c980C Fast Etherlink XL> port 0x2400-0x247f mem 0xf4102000-0xf410207f irq 18 at device 8.0 on pci2
miibus0: <MII bus> on xl0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:e0:81:22:2e:c4
xl1: <3Com 3c980C Fast Etherlink XL> port 0x2480-0x24ff mem 0xf4102400-0xf410247f irq 19 at device 9.0 on pci2
miibus1: <MII bus> on xl1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl1: Ethernet address: 00:e0:81:22:2e:c5
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc87ff,0xc8800-0xc8fff,0xe0000-0xe3fff on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 8250 or not responding
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 1800073530 Hz quality 800
Timecounters tick every 1.000 msec
hptrr: no controller detected.
Waiting 5 seconds for SCSI devices to settle
ad0: 476940MB <WDC WD5000AAKB-00UKA0 07.01N01> at ata0-master UDMA100
amr0: delete logical drives supported by controller
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 139900MB (286515200 sectors) RAID 1 (optimal)
Trying to mount root from ufs:/dev/amrd0s1a

kldstat:

Id Refs Address    Size     Name
 1   10 0xc0400000 7a05b0   kernel
 2    1 0xc0ba1000 5c304    acpi.ko
 3    1 0xc8093000 3000     fdescfs.ko
 4    1 0xc8106000 3000     pflog.ko
 5    1 0xc8109000 2d000    pf.ko
 6    1 0xc817b000 19000    linux.ko

If you have any ideas, or need more information to diagnose the
problem, please let me know.

regards,

---
Danny Fullerton
Mantor Organization
_______________________________________________
freebsd-smpfreebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-smp
To unsubscribe, send any mail to
"freebsd-smp-unsubscribefreebsd.org"

Re: Dual AMD MP unstable under heavy load when smp is active
United States
2008-03-04 21:00:47
Danny,

I don't know what the bug is, but it does exist.

I have an IBM x3455 with two dual-core Opteron processors. Under heavy
load it crashes. As a debugging step, I unplugged one of the
processors, and the problem went away. I then switched to CentOS 4,
and it runs perfectly.

Besides FreeBSD, the problem also exists in Fedora Core.

Of the OSes I tested, only Red Hat and CentOS worked correctly on the
x3455.

I didn't try Windows, so I can't say whether it runs properly on this
system.

Unfortunately, that is all I know about the issue.

Paul Missman



Re: Dual AMD MP unstable under heavy load when smp is active
Canada
2008-03-04 21:32:03
Hello Paul,

I would like to know whether you ran those tests with the recent
FreeBSD 7.0. I have seen a lot of work in the SMP area in this
release, and I'm wondering whether I would have a better chance with
that version.

thanks,

dmesg with SMP on (GENERIC + options SMP):

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.3-RELEASE-p1 #0: Wed Feb 27 21:11:40 EST 2008
    rootmegatron.mantor.org:/usr/obj/usr/src/sys/MEGATRONTEST
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) MP 2200+ (1800.07-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x680  Stepping = 0
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0480800<SYSCALL,MP,MMX+,3DNow!+,3DNow!>
real memory  = 3220701184 (3071 MB)
avail memory = 3146387456 (3000 MB)
ACPI APIC Table: <PTLTD          APIC  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  1
 cpu1 (AP): APIC ID:  0
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Feb 27 2008 21:11:16)
acpi0: <PTLTD   RSDT> on motherboard
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff,0x8000-0x807f,0x8080-0x80ff iomem 0xd8000-0xdbfff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <AMD 762 host to AGP bridge> port 0x1810-0x1813 mem 0xf8000000-0xfbffffff,0xf6210000-0xf6210fff at device 0.0 on pci0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
isab0: <PCI-ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <AMD 768 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 7.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pci0: <bridge> at device 7.3 (no driver attached)
amr0: <LSILogic MegaRAID 1.53> mem 0xf6200000-0xf620ffff irq 20 at device 8.0 on pci0
amr0: delete logical drives supported by controller
amr0: <LSILogic PERC 4/DC> Firmware 350O, BIOS 1.09, 128MB RAM
ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0x1000-0x10ff mem 0xf4000000-0xf4000fff irq 20 at device 10.0 on pci0
ahc0: [GIANT-LOCKED]
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1: <Adaptec aic7899 Ultra160 SCSI adapter> port 0x1400-0x14ff mem 0xf4001000-0xf4001fff irq 21 at device 10.1 on pci0
ahc1: [GIANT-LOCKED]
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
pcib2: <ACPI PCI-PCI bridge> at device 16.0 on pci0
pci2: <ACPI PCI bus> on pcib2
ohci0: <OHCI (generic) USB controller> mem 0xf4100000-0xf4100fff irq 19 at device 0.0 on pci2
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: <OHCI (generic) USB controller> on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pci2: <display, VGA> at device 7.0 (no driver attached)
xl0: <3Com 3c980C Fast Etherlink XL> port 0x2400-0x247f mem 0xf4102000-0xf410207f irq 18 at device 8.0 on pci2
miibus0: <MII bus> on xl0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:e0:81:22:2e:c4
xl1: <3Com 3c980C Fast Etherlink XL> port 0x2480-0x24ff mem 0xf4102400-0xf410247f irq 19 at device 9.0 on pci2
miibus1: <MII bus> on xl1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl1: Ethernet address: 00:e0:81:22:2e:c5
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc87ff,0xc8800-0xc8fff,0xe0000-0xe3fff on isa0
ppc0: parallel port not found.
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 8250 or not responding
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
hptrr: no controller detected.
Waiting 5 seconds for SCSI devices to settle
ad0: 476940MB <WDC WD5000AAKB-00UKA0 07.01N01> at ata0-master UDMA100
amr0: delete logical drives supported by controller
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 139900MB (286515200 sectors) RAID 1 (optimal)
SMP: AP CPU #1 Launched!
Trying to mount root from ufs:/dev/amrd0s1a

---
Danny Fullerton
Mantor Organization


Re: Dual AMD MP unstable under heavy load when smp is active
United States
2008-03-05 05:58:47
Danny,

From what I can reconstruct, it seems I was using the 64-bit version
of FreeBSD 6.4.

It looks like the last responder to the list says that version 7 is
free of this problem.

Best of luck,

Paul


