Thread: Replacing disks under 3ware-9550SX safely
|
|
| Replacing disks under 3ware-9550SX
safely |

|
2008-05-01 13:35:19 |
|
Hello, I'm working on a server with a 3w-9550SX controller, with 3x500G disks in a raid-5 and 1x500G hot spare. One night, a disk fails, and the server crashes! Working on the server, I see that many filesystems were destroyed beyond repair!! This was too bad to hear. Some LVM volumes were repaired, others were restored from backup. The bad disk was removed. I learnt that 3ware controllers aren't really high quality, and they probably corrupt the FSs.
Since all disks are same age, I thought I'd buy new disks to replace the old ones. I bought 4x500G barracuda-ES drives, which should be high quality. Here lies my problem. I need to replace the 3 running disks, with 3 new disks, and add an extra one as hot spare. I am scared to do that, because the standard way is to "fail" a disk, and rebuild on a new one, then repeat for the other 2 disks till all 3 are replaced. Now this puts me in a vulnerable situation, if I "fail" a disk, and while rebuilding another disk naturally fails, all data is gone! Is there any other "wise" way to do what I want safely ? I contacted 3w support, and they just insist I should fail/rebuild, but since I don't have much faith in their controllers or the old disks ... any smarter way to do this ?
Regards
|
| Re: Replacing disks under 3ware-9550SX
safely |
  Taiwan |
2008-05-01 16:37:51 |
Ahmed Kamal wrote:
> Hello,
> I'm working on a server with a 3w-9550SX controller, with 3x500G disks
> in a raid-5 and 1x500G hot spare. One night, a disk fails, and the
> server crashes! Working on the server, I see that many filesystems were
> destroyed beyond repair!! This was too bad to hear. Some LVM volumes
> were repaired, others were restored from backup. The bad disk was
> removed. I learnt that 3ware controllers aren't really high quality, and
> they probably corrupt the FSs.
>
> Since all disks are same age, I thought I'd buy new disks to replace the
> old ones. I bought 4x500G barracuda-ES drives, which should be high
> quality. Here lies my problem. I need to replace the 3 running disks,
> with 3 new disks, and add an extra one as hot spare. I am scared to do
> that, because the standard way is to "fail" a disk, and rebuild on a new
> one, then repeat for the other 2 disks till all 3 are replaced. Now this
> puts me in a vulnerable situation, if I "fail" a disk, and while
> rebuilding another disk naturally fails, all data is gone! Is there any
> other "wise" way to do what I want safely ? I contacted 3w support, and
> they just insist I should fail/rebuild, but since I don't have much
> faith in their controllers or the old disks ... any smarter way to do this ?
Yes....
Fully backup your file systems and then do a fail/rebuild.
--
"Once more the drama begins."
-- The Emperor Paul Muad'dib on his ascension to the Lion Throne
_______________________________________________
rhelv5-list mailing list
rhelv5-list redhat.com
https://www.redhat.com/mailman/listinfo/rhelv5-list
|
|
| RE: Replacing disks under 3ware-9550SX
safely |

|
2008-05-01 18:39:13 |
|
Certainly that is where the phrase 'backup' comes from. If you want integrity you can count on, stop transactions to the RAID subsystem, make a tape/dvd/etc, then fail/replace your drives. If it works, you smile. If it ever stops working, replace the rest of the drives and recover from tape/dvd/etc.
There is no way to guarantee that the RAID will save your tushy.
Just last week I spent a couple of hours building a new system with simple RAID 1 on the boot & root. I actually followed a manual step by step to make sure I did it right (not always my style =). I had two ugly experiences. First, when I 'tested' by unplugging a disk and booting, the system would not boot 'normally'; it wanted me to hand-type some disk node that I was supposedly meant to know about. OK, so I plugged the drive back in and all was normal. Second, I downloaded and installed mindi/mondo and made a bootable CD, since Microlite BackupEDGE does not handle Red Hat software RAID, and simply rebooted the system with the CD in place. The CD timed out and launched from the console message and destroyed my system with the same message as when I removed one disk.
Given this, I am abandoning the beauty of RAID on my root/boot disk. I will recover from DVD if (read: when) it fails.
Moral of the story: DON'T EVER COUNT ON RAID to save your hide.
Bill Watson
|
| Re: Replacing disks under 3ware-9550SX
safely |
  United Kingdom |
2008-05-03 06:22:48 |
On Thu, May 01, 2008 at 09:35:19PM +0300, Ahmed Kamal wrote:
> Hello,
> I'm working on a server with a 3w-9550SX controller, with 3x500G disks in a
> raid-5 and 1x500G hot spare. One night, a disk fails, and the server
> crashes! Working on the server, I see that many filesystems were destroyed
> beyond repair!! This was too bad to hear. Some LVM volumes were repaired,
> others were restored from backup. The bad disk was removed. I learnt that
> 3ware controllers aren't really high quality, and they probably corrupt the
> FSs.
>
> Since all disks are same age, I thought I'd buy new disks to replace the old
> ones. I bought 4x500G barracuda-ES drives, which should be high quality.
> Here lies my problem. I need to replace the 3 running disks, with 3 new
> disks, and add an extra one as hot spare. I am scared to do that, because
> the standard way is to "fail" a disk, and rebuild on a new one, then repeat
> for the other 2 disks till all 3 are replaced. Now this puts me in a
> vulnerable situation, if I "fail" a disk, and while rebuilding another disk
> naturally fails, all data is gone! Is there any other "wise" way to do what
> I want safely ? I contacted 3w support, and they just insist I should
> fail/rebuild, but since I don't have much faith in their controllers or the
> old disks ... any smarter way to do this ?
I have some 3ware controllers as well and while I can't say that they
are the best (performance is horrible in many cases) I never lost any
data unless I had two dead disks in RAID5.
The most common reason for a rebuild to fail is that one of the remaining
disks in the raid has a fault (bad blocks). The best way to deal with
this is to have the 3ware card run a verify task every few days to
catch problems like this. Also have smartd running to monitor the disks
so you get a warning.
So before your rebuilds, *backup* your data if you can. Run a verify
task so the controller/disks have a chance to correct any existing
errors, check your disks with smartctl and start with the one with the
most bad blocks (if any).
Sadly most SATA disks have an unrecoverable read error rate of 1/10^14
or so which means that statistically you'll get one error every ~13TB
that you read. So during every rebuild you'll have a 1/13 chance to
lose a block whatever you do :(
Cheers,
Kostas
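For anyone who wants to check the arithmetic behind the "one error every ~13TB" and "1/13 chance" figures above, here is a minimal sketch. It assumes the quoted rate of 1 error per 10^14 is per bit read (the usual spec-sheet convention), that errors are independent, and that a rebuild of this 3x500G raid-5 reads the two surviving disks in full (about 1 TB). The function names are made up for illustration.

```python
import math

# Assumption: spec-sheet rate of one unrecoverable read error per 10^14 bits read.
URE_RATE_PER_BIT = 1e-14


def bytes_per_expected_error(rate_per_bit: float = URE_RATE_PER_BIT) -> float:
    """Average number of bytes read per unrecoverable read error."""
    return 1.0 / rate_per_bit / 8.0


def rebuild_error_probability(bytes_read: float,
                              rate_per_bit: float = URE_RATE_PER_BIT) -> float:
    """Probability of at least one unrecoverable error while reading bytes_read.

    Computes 1 - (1 - p)^n with log1p/expm1 so the tiny per-bit rate
    does not get lost to floating-point rounding.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-rate_per_bit))


# Rebuilding one member of a 3-disk raid-5 reads the two surviving 500G disks.
surviving_disks = 2
disk_bytes = 500e9
p = rebuild_error_probability(surviving_disks * disk_bytes)

print(f"{bytes_per_expected_error() / 1e12:.1f} TB read per expected error")
print(f"P(at least one unrecoverable error per rebuild) = {p:.3f}, about 1 in {1 / p:.0f}")
```

Under these assumptions the numbers come out at roughly 12.5 TB per expected error and a bit under 8% per rebuild, which matches the rough 1/13 figure. Real drives often do better than their spec-sheet rate, so treat this as a pessimistic estimate rather than a measurement.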
|
|
| Re: Replacing disks under 3ware-9550SX
safely |

|
2008-05-03 06:50:20 |
|
Thanks guys, I did backup, and rebuild on new disks. The 3rd/Final disk is now rebuilding, so I guess I should be happy. About the one error every 13TB, I guess we'll have to wait for btrfs.
More seriously now, I should have the raid-5 array of new disks as my primary storage, and the older 3 disks just lying there. I would like to use the old ones as backup for the new ones. My plan is to assemble the old disks into a new raid-5 unit, then daily LVM-snapshot the primary unit and 'dump' the file systems onto the backup raid unit. Any better suggestions ?
Actually I don't know if I should be making the backup unit raid-5, knowing that raid-5 is slow for writing!
Kostas, about your performance problems, make sure you have the latest firmware, AFAIK it mentions performance improvements for ext3 specifically. I was too scared to re-flash my card though ;)
Regards
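The daily snapshot-then-dump plan above could look roughly like the sketch below. It only builds and prints the commands and does not run anything. The volume group name, logical volume name, backup mount point and snapshot size are all hypothetical placeholders, and it assumes the file systems are ext2/ext3 so that dump(8) can read them.

```python
from typing import List


def snapshot_backup_commands(vg: str, lv: str, backup_dir: str,
                             snap_size: str = "5G") -> List[List[str]]:
    """Return the command sequence for one snapshot-based backup of vg/lv.

    Nothing is executed here; the caller decides whether to run the
    commands (for example with subprocess.run) or just review them.
    """
    snap = f"{lv}_snap"
    return [
        # 1. Freeze a point-in-time view of the logical volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{lv}"],
        # 2. Level-0 (full) dump of the snapshot onto the backup unit.
        ["dump", "-0", "-f", f"{backup_dir}/{lv}.dump", f"/dev/{vg}/{snap}"],
        # 3. Drop the snapshot so it does not fill up and slow down writes.
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
    ]


# Hypothetical names: volume group "vg0", volume "home", backup unit at /mnt/backup.
for cmd in snapshot_backup_commands("vg0", "home", "/mnt/backup"):
    print(" ".join(cmd))
```

The snapshot needs enough space to hold whatever changes on the primary volume while the dump is running, so the 5G default here is only a guess.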
|
| Re: Replacing disks under 3ware-9550SX
safely |

|
2008-05-03 15:44:42 |
|
> I have some 3ware controllers as well and while I can't say that they
> are the best (performance is horrible in many cases) I never lost any
> data unless I had two dead disks in RAID5.
Are you really sure about this? Most benchmarks I saw indicated otherwise.
|
| Re: Replacing disks under 3ware-9550SX
safely |

|
2008-05-04 10:30:05 |
> > I have some 3ware controllers as well and while I can't say that they
> > are the best (performance is horrible in many cases) I never lost any
> > data unless I had two dead disks in RAID5.
>
> Are you really sure about this? Most benchmarks I saw indicated otherwise.
Seconded. I've always been impressed with 3ware performance, but in
most cases I've only had SCSI RAID to compare, which is apples and oranges.
It will be very interesting now that most of the RAID controllers support both
SAS and SATA; hopefully we will get some good price/performance
comparisons.