Thread: Re: Hard disk woes




Re: Hard disk woes
user name
2005-09-05 12:19:12
On Mon, Sep 05, 2005 at 03:16:13PM +0000, Michael Abbott wrote:
> I'm having some very odd behaviour from one of my hard disks and I wonder
> what anybody makes of it.
>
> In brief, the hard disk in question works just fine much of the time, but
> when high volume data transfers are requested I get the following in
> /var/log/messages:
> 
> Sep  3 15:21:02 saturn /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
> Sep  3 15:21:02 saturn /kernel: ata3: resetting devices .. done
> Sep  3 15:21:12 saturn /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
> Sep  3 15:21:12 saturn /kernel: ata3: resetting devices .. done
> Sep  3 15:21:23 saturn /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
> Sep  3 15:21:23 saturn /kernel: ata3: resetting devices .. done
> Sep  3 15:21:33 saturn /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
> Sep  3 15:21:33 saturn /kernel: ad6: trying fallback to PIO mode
> Sep  3 15:21:33 saturn /kernel: ata3: resetting devices .. done
> Sep  3 15:21:43 saturn /kernel: ad6: READ command timeout tag=0 serv=0 - resetting
> Sep  3 15:21:43 saturn /kernel: ata3: resetting devices .. ata3-slave: ATA identify retries exceeded
> Sep  3 15:21:43 saturn /kernel: done
> 
> After this point the hard disk in question is frozen until I reboot, and
> any process that tries to touch it is similarly frozen (doesn't even
> respond to kill -9).  `shutdown -r` is enough to restore operation, and
> the rest of the system seemed happy enough.
> 
> Another interesting effect.  I placed a replacement hard disk on the same
> ATA bus (as a slave, device ad7) and tried copying files from ad6 to ad7.
> This time when ad6 froze and the kernel decided to give up on ata3 (and so
> decided to disable ad7 at the same time, naturally enough) the entire
> system froze!  No response from the console, stone cold dead, hard reset
> needed.
> 
> 
> So some questions seem to me to arise from this.
> 
> 1.  Why does FreeBSD handle this so ungracefully?  If restarting is
> sufficient to bring ata3 back then can't the ata driver do a proper
> restart?
> 
> 2.  Goodness me, FreeBSD froze!  I know it's a hardware failure, but
> still: it's on an auxiliary ATA controller with no system files attached.
> Is this problem of general interest?  It's certainly a massive hint to me
> not to consider (parallel) ATA for RAID!
> 
> 3.  Any thoughts on what is wrong with the hard disk in question?  I've
> changed ATA controllers, so it seems to be the disk, not the controller.
> The behaviour is very odd.  If I copy files off one at a time, e.g. using:
>  	find . -type f -exec cp {} "$TARGET/"{} \; -exec echo -n '.' \;
> the disk seems to hang in there, but if I just do
>  	cp -R . "$TARGET"
> then it freezes!  (This statement may not have been thoroughly tested:
> having to restart each time gets old quite quickly.)
> 
> 
> Ok, now for the boring bits.
> 
> $ uname -a
> FreeBSD saturn.araneidae.co.uk 4.11-RELEASE-p11 FreeBSD 4.11-RELEASE-p11 #6: Sat Aug 27 16:33:58 GMT 2005     root@saturn.araneidae.co.uk:/usr/obj/usr/src/sys/GENERIC  i386
> $ dmesg | grep ata
> atapci0: <HighPoint HPT370 ATA100 controller> port 0xa000-0xa0ff,0x9c00-0x9c03,0x9800-0x9807,0x9400-0x9403,0x9000-0x9007 irq 12 at device 11.0 on pci0
> ata2: at 0x9000 on atapci0
> ata3: at 0x9800 on atapci0
> atapci1: <VIA 8233 ATA133 controller> port 0xa800-0xa80f at device 17.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci1
> ata1: at 0x170 irq 15 on atapci1
> atapci2: <HighPoint HPT372 ATA133 controller> port 0xc400-0xc4ff,0xc000-0xc003,0xbc00-0xbc07,0xb800-0xb803,0xb400-0xb407 irq 10 at device 19.0 on pci0
> ata4: at 0xb400 on atapci2
> ata5: at 0xbc00 on atapci2
> ad0: 39083MB <Maxtor 4D040H2> [79408/16/63] at ata0-master UDMA100
> ad1: 190782MB <SAMSUNG SP2014N> [387621/16/63] at ata0-slave UDMA133
> ad4: 76319MB <ST380021A> [155061/16/63] at ata2-master UDMA100
> ad6: 76319MB <ST380021A> [155061/16/63] at ata3-master UDMA100
> acd0: DVD-ROM <CREATIVEDVD-ROM DVD2240E 12/24/97> at ata1-master PIO4
> $ sudo atacontrol cap ata3 0
> ATA channel 3, Master, device ad6:
> 
> ATA/ATAPI revision    5
> device model          ST380021A
> serial number         3HV0MYL9
> firmware revision     3.10
> cylinders             16383
> heads                 16
> sectors/track         63
> lba supported         156301488 sectors
> lba48 not supported
> dma supported
> overlap not supported
> 
> Feature                      Support  Enable    Value   Vendor
> write cache                    yes      yes
> read ahead                     yes      yes
> dma queued                     no       no      0/00
> SMART                          yes      no
> microcode download             yes      yes
> security                       yes      no
> power management               yes      yes
> advanced power management      no       no      65278/FEFE
> automatic acoustic management  yes      yes     128/80  128/80
> $
> 
> That's everything I can think of.
> 

Just a general comment:

I had a very similar problem a while back. After replacing the drive in
question, then replacing the motherboard, I discovered it was a power
issue. The power supply was freaking out at medium to high loads, which
was causing the device to continually reset.

Jason
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
Re: Hard disk woes
user name
2005-09-05 13:56:19
On Mon, 5 Sep 2005, Jason Morgan wrote:
> On Mon, Sep 05, 2005 at 03:16:13PM +0000, Michael Abbott wrote:
>> I'm having some very odd behaviour from one of my hard disks and I wonder
>> what anybody makes of it.
>>
>> In brief, the hard disk in question works just fine much of the time, but
>> when high volume data transfers are requested I get the following in
>> /var/log/messages:
>>
>> Sep  3 15:21:02 saturn /kernel: ad6: READ command timeout tag=0 serv=0 - resetting

> I had a very similar problem a while back. After replacing the drive in
> question, then replacing the motherboard, I discovered it was a power
> issue. The power supply was freaking out at medium to high loads, which
> was causing the device to continually reset.

Well, I hope that's not it.  I'm encouraged to think not:
 	- the problem seems to be tied to one particular hard disk and I
 	  presently run with four hard disks
 	- the system has operated trouble-free for three years
 	- my memory is that it was a good quality power supply.
I don't really see how I'd diagnose a power supply problem, but as I say,
the hard disk in question is the only part with problems.
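
One way to check that the timeouts really are confined to a single disk
is to tally them per device from the log. What follows is only a minimal
sketch, assuming every log line has the same shape as the ones quoted
above ("... /kernel: adN: READ command timeout ..."). The function name
count_timeouts is made up for illustration and is not an existing tool.

```sh
#!/bin/sh
# Tally "command timeout" kernel messages per ATA device.
# Reads syslog-style lines on stdin and prints "<device> <count>",
# one device per line, sorted by device name.
# count_timeouts is a hypothetical helper name, not a FreeBSD utility.
count_timeouts() {
    awk '
        /command timeout/ {
            # The device name is the field right after "/kernel:",
            # for example "ad6:" in the messages quoted in this thread.
            for (i = 1; i < NF; i++) {
                if ($i == "/kernel:") {
                    dev = $(i + 1)
                    sub(/:$/, "", dev)   # strip the trailing colon
                    count[dev]++
                    break
                }
            }
        }
        END {
            for (dev in count) print dev, count[dev]
        }
    ' | sort
}
```

It would be run as `count_timeouts < /var/log/messages`. If only ad6
shows up in the output, that supports the view that the fault follows
the disk rather than the whole machine, though it cannot by itself rule
out a marginal power supply.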