Thread: Re: HBase PerformanceEvaluation failing




Re: HBase PerformanceEvaluation failing
2007-11-24 18:21:47
I think that stack was suggesting an HDFS fsck, not a disk
level fsck.

Try [hadoop fsck /]




On 11/24/07 4:09 PM, "Kareem Dana" <kareem.danagmail.com> wrote:

> I do not have root access on the xen cluster I'm using. I will ask the
> admin to make sure the disk is working properly. Regarding the
> mismatch versions though, are you suggesting that different region
> servers might be running different versions of hbase/hadoop? They are
> all running the same code from the same shared storage. There isn't
> even another version of hadoop anywhere for the other nodes to run. I
> think I'll try dropping my cluster down to 2 nodes and working back
> up... maybe I can pin point a specific problem node. Thanks for taking
> a look at my logs.
>
> On Nov 24, 2007 5:49 PM, stack <stackduboce.net> wrote:
>> I took a quick look Kareem.   As with the last time, hbase keeps having
>> trouble w/ the hdfs.  Things start out fine around 16:00 then go bad
>> because can't write reliably to the hdfs -- a variety of reasons.  You
>> then seem to restart the cluster around 17:37 or so and things seem to
>> go along fine for a while until 19:05 when again, all regionservers
>> report trouble writing the hdfs.  Have you run an fsck?


Re: HBase PerformanceEvaluation failing
2007-11-24 20:12:52
I ran hadoop fsck and sure enough the DFS was corrupted. It seems that
the PerformanceEvaluation test is corrupting it. Before I ran the
test, I ran fsck and the DFS was reported as HEALTHY. Once the PE
fails, the DFS is reported as corrupt. I tried to simplify my setup
and run the PE again. My new config is as follows:

hadoop07 - DFS Master, Mapred master, hbase master
hadoop09-10 - 2 hbase region servers
hadoop11-12 - 2 datanodes, task trackers

mapred.map.tasks = 2
mapred.reduce.tasks = 1
dfs.replication = 1
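
For reference, the three settings above can be written as Hadoop-style property entries. The following is a minimal sketch, not the poster's actual file: it assumes the values live in a site configuration file (presumably hadoop-site.xml, which the thread does not state), and the names `SETTINGS` and `to_site_xml` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Values copied from the message above; the dict name is hypothetical.
SETTINGS = {
    "mapred.map.tasks": "2",
    "mapred.reduce.tasks": "1",
    "dfs.replication": "1",
}


def to_site_xml(settings):
    """Render name/value pairs as <configuration><property>... XML text."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_site_xml(SETTINGS))
```

With dfs.replication set to 1 each block has a single copy, so one unavailable block is enough for fsck to report a file as missing data.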

I ran the distributed PE in that configuration and it still failed
with similar errors. The output of the hadoop fsck for this run was:

..........
/tmp/hadoop-kcd/hbase/hregion_.META.,,1/info/mapfiles/6434881831082231493/data:
MISSING 1 blocks of total size 0 B.
......................................
/tmp/hadoop-kcd/hbase/hregion_TestTable,11566878,1227092681544002579/info/mapfiles/5263238643231358600/data:
MISSING 1 blocks of total size 0 B.
....
/tmp/hadoop-kcd/hbase/hregion_TestTable,12612310,1652062411016999689/info/mapfiles/2024298319068625138/data:
MISSING 1 blocks of total size 0 B.
....
/tmp/hadoop-kcd/hbase/hregion_TestTable,12612310,1652062411016999689/info/mapfiles/5071453667327337040/data:
MISSING 1 blocks of total size 0 B.
.........
/tmp/hadoop-kcd/hbase/hregion_TestTable,13932,4738192747521322482/info/mapfiles/4400784113695734765/data:
MISSING 1 blocks of total size 0 B.
...................................
............................................................
............
/tmp/hadoop-kcd/hbase/log_172.16.6.56_-1823376333333123807_60020/hlog.dat.027:
MISSING 1 blocks of total size 0 B.
.Status: CORRUPT
 Total size:    1890454330 B
 Total blocks:  180 (avg. block size 10502524 B)
 Total dirs:    190
 Total files:   173
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Target replication factor:     1
 Real replication factor:       1.0


The filesystem under path '/' is CORRUPT
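
To compare a run before the PE with a run after it, a report like the one above can be reduced to its status and the list of affected files. This is only a sketch under stated assumptions: it expects each path on a single line (wrapped path fragments re-joined), followed by a `MISSING <n> blocks of total size <s> B.` note either on the same line or on the next one, and a `Status:` line as shown in the thread. The function name `summarize_fsck` and the shortened `SAMPLE` are illustrative only.

```python
import re

# Assumed shape: "<path>:" then "MISSING <n> blocks of total size <s> B."
# \s+ lets the MISSING note sit on the same line or on the following line.
MISSING_RE = re.compile(
    r"^(/\S+):\s+MISSING (\d+) blocks of total size (\d+) B\.",
    re.MULTILINE,
)
STATUS_RE = re.compile(r"Status:\s*(\w+)")

# Shortened excerpt of the report in this thread, with wrapped paths re-joined.
SAMPLE = """..........
/tmp/hadoop-kcd/hbase/hregion_.META.,,1/info/mapfiles/6434881831082231493/data:
MISSING 1 blocks of total size 0 B.
....
/tmp/hadoop-kcd/hbase/log_172.16.6.56_-1823376333333123807_60020/hlog.dat.027:
MISSING 1 blocks of total size 0 B.
.Status: CORRUPT
 Total size:    1890454330 B
"""


def summarize_fsck(report):
    """Return (status, [(path, missing_block_count), ...]) for report text."""
    missing = [(path, int(count)) for path, count, _size in MISSING_RE.findall(report)]
    match = STATUS_RE.search(report)
    status = match.group(1) if match else None
    return status, missing


if __name__ == "__main__":
    status, missing = summarize_fsck(SAMPLE)
    print("status:", status)
    for path, count in missing:
        print(count, "missing block(s):", path)
```

Saving the fsck text from before and after a PE run and passing both through the same function would show which files became damaged during the test.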


On Nov 24, 2007 6:21 PM, Ted Dunning <tdunningveoh.com> wrote:
>
> I think that stack was suggesting an HDFS fsck, not a disk level fsck.
>
> Try [hadoop fsck /]
>
> On 11/24/07 4:09 PM, "Kareem Dana" <kareem.danagmail.com> wrote:
>
> > I do not have root access on the xen cluster I'm using. I will ask the
> > admin to make sure the disk is working properly. Regarding the
> > mismatch versions though, are you suggesting that different region
> > servers might be running different versions of hbase/hadoop? They are
> > all running the same code from the same shared storage. There isn't
> > even another version of hadoop anywhere for the other nodes to run. I
> > think I'll try dropping my cluster down to 2 nodes and working back
> > up... maybe I can pin point a specific problem node. Thanks for taking
> > a look at my logs.
> >
> > On Nov 24, 2007 5:49 PM, stack <stackduboce.net> wrote:
> >> I took a quick look Kareem.   As with the last time, hbase keeps having
> >> trouble w/ the hdfs.  Things start out fine around 16:00 then go bad
> >> because can't write reliably to the hdfs -- a variety of reasons.  You
> >> then seem to restart the cluster around 17:37 or so and things seem to
> >> go along fine for a while until 19:05 when again, all regionservers
> >> report trouble writing the hdfs.  Have you run an fsck?
