The reason I ask is that in S3 and in the P2P storage systems I have been
involved with, we had a replica synchronization algorithm that ran once
every night and relied on techniques like Merkle tree comparisons. Anyway,
understanding that would be beneficial. I don't mind reading through the
sources, but I would appreciate being pointed to the correct package.
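
For readers unfamiliar with the Merkle tree comparisons mentioned above,
here is a minimal sketch of the idea. It is not taken from S3, from any P2P
system, or from Hadoop; the names build_tree and differing_blocks are
hypothetical, and it assumes both replicas hold the same number of blocks.
Each replica hashes its blocks into a tree, and the two sides descend only
into subtrees whose hashes disagree.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Map each index range (lo, hi) to the hash of blocks[lo:hi]."""
    leaves = [_h(b) for b in blocks]
    tree = {}

    def node(lo, hi):
        if hi - lo == 1:
            tree[(lo, hi)] = leaves[lo]
        else:
            mid = (lo + hi) // 2
            # Interior hash covers the two child hashes.
            tree[(lo, hi)] = _h(node(lo, mid) + node(mid, hi))
        return tree[(lo, hi)]

    if leaves:
        node(0, len(leaves))
    return tree


def differing_blocks(tree_a, tree_b, lo, hi):
    """Return indices of blocks that differ (assumes equal block counts)."""
    if tree_a[(lo, hi)] == tree_b[(lo, hi)]:
        return []          # identical subtree, nothing to transfer
    if hi - lo == 1:
        return [lo]        # reached a single mismatching block
    mid = (lo + hi) // 2
    return (differing_blocks(tree_a, tree_b, lo, mid)
            + differing_blocks(tree_a, tree_b, mid, hi))
```

In a real deployment the two trees would live on different machines and
only the hashes along mismatching paths would cross the network; here both
are in memory to keep the sketch self-contained.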
Thanks
A
On 7/17/07, Phantom <ghostwhoowalks gmail.com> wrote:
>
> I am sure re-replication is not done on every heartbeat miss, since that
> would be very expensive and inefficient. At the same time, you cannot
> really tell if a node is partitioned away, crashed, or just slow. Is it
> threshold based, i.e. I missed N heartbeats, so re-replicate? Which
> package in the source code could I look at to glean this information?
>
> Thanks
> A
>
> On 7/17/07, Phantom <ghostwhoowalks gmail.com> wrote:
> >
> > That's awesome.
> >
> > Thanks
> > A
> >
> > On 7/17/07, Doug Cutting < cutting apache.org> wrote:
> > >
> > > Phantom wrote:
> > > > > Here is the scenario I was concerned about. Consider three nodes
> > > > > in the system, A, B and C, which are placed, say, in different
> > > > > racks. Let us say that the disk on A fries today. Now the blocks
> > > > > that were stored on A are not going to be re-replicated (this is
> > > > > my understanding, but I could be wrong in this assumption) to
> > > > > some other node or to the new disk with which you would bring
> > > > > back A.
> > >
> > > > That's incorrect. When a datanode fails to send a heartbeat to the
> > > > namenode in a timely manner then its data is assumed missing and is
> > > > re-replicated. And when block corruption is detected, corrupt
> > > > replicas are removed and non-corrupt replicas are re-replicated to
> > > > maintain the desired level of replication.
> > >
> > > Doug
> > >
> >
> >
>
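
The threshold question raised in this thread (declare a node dead only
after N missed heartbeats, then re-replicate its blocks) can be sketched as
follows. This is an illustration only, not Hadoop's code: HeartbeatMonitor,
under_replicated, the 3 second interval, and the threshold of 10 are all
assumptions chosen for the example.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per node (hypothetical helper)."""

    def __init__(self, interval_s=3.0, missed_threshold=10):
        # A node is considered dead only after missing this many intervals.
        self.expiry_s = interval_s * missed_threshold
        self.last_seen = {}

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def dead_nodes(self, now):
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.expiry_s)


def under_replicated(block_locations, dead, target):
    """Return {block: missing_replica_count} once dead nodes are excluded."""
    dead = set(dead)
    result = {}
    for block, nodes in block_locations.items():
        live = set(nodes) - dead
        if len(live) < target:
            result[block] = target - len(live)
    return result
```

A single late heartbeat leaves the node alive under this scheme; only
silence longer than interval_s * missed_threshold marks it dead, at which
point every block it held is checked against the target replica count.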