
Thread: RE: HBase num_versions




RE: HBase num_versions
United States
2007-11-07 13:45:39
Currently in the shell, num_versions = 0 is equivalent to
'all versions'.

I don't think that needs to be changed, unless someone can
imagine a clause on a query that wouldn't require a version
of a row to operate correctly.

Thanks,
Stu


-----Original Message-----
From: Jim Kellerman <jimpowerset.com>
Sent: Wednesday, November 7, 2007 1:03pm
To: hadoop-userlucene.apache.org <hadoop-userlucene.apache.org>
Subject: RE: HBase num_versions

num_versions=all ?

---
Jim Kellerman, Senior Engineer; Powerset
jimpowerset.com


> -----Original Message-----
> From: Michael Stack [mailto:stackduboce.net]
> Sent: Wednesday, November 07, 2007 9:59 AM
> To: hadoop-userlucene.apache.org
> Cc: stuhoodwebmail.us
> Subject: Re: HBase num_versions
>
> In the absence of a num_versions qualifier, shell makes
> presumption that you want ALL versions.  Changing the default
> to be 1 would mean that we would have to add some other means
> of specifying all versions ("num_versions=-1" or some such
> oddity).  What ye think?
> St.Ack
>
>
> Jim Kellerman wrote:
> > Yes, for num_versions > 1, HBase has to dig through the
> > memcache, and multiple HStore files until it has found the
> > requested number of versions or runs out of places to look.
> > This is especially apparent if there is only 1 version. It
> > has to do a lot of work for nothing.
> >
> > Please enter a Jira for the HBase shell to default the
> > number of versions to 1.
> >
> > ---
> > Jim Kellerman, Senior Engineer; Powerset jimpowerset.com
> >
> >
> >
> >> -----Original Message-----
> >> From: Stu Hood [mailto:stuhoodwebmail.us]
> >> Sent: Tuesday, November 06, 2007 11:23 PM
> >> To: hadoop-userlucene.apache.org
> >> Subject: HBase num_versions
> >>
> >> Hey guys,
> >>
> >> Just noticed some surprising behavior for select statements
> >> in HBase 0.15: a select command with a num_versions = 1
> >> clause runs 2 orders of magnitude faster than a bare select.
> >>
> >> Is this inconsistent implementation, or is it taking extra
> >> time to scan for additional versions? If this isn't a bug,
> >> then perhaps the default for num_versions should be 1 to keep
> >> things snappy by default.
> >>
> >>
> >> ============================================================
> >>
> >> Hbase> describe test;
> >> +-----------------------------------------------------------------------------+
> >> | Column Family Descriptor                                                    |
> >> +-----------------------------------------------------------------------------+
> >> | name: hex, max versions: 3, compression: NONE, in memory: false, max length:|
> >> |  2147483647, bloom filter: none                                             |
> >> +-----------------------------------------------------------------------------+
> >> 1 columnfamily(s) in set (0.310 sec)
> >> Hbase> select hex: from test where row = '3980000' num_versions = 1;
> >> 3cbae0
> >> 1 row(s) in set (0.016 sec)
> >> Hbase> select hex: from test where row = '3980000';
> >> 3cbae0
> >> 1 row(s) in set (1.882 sec)
> >>
> >> ============================================================
> >>
> >>
> >> Thanks,
> >>
> >>
> >> Stu Hood
> >> Webmail.us
> >> "You manage your business. We'll manage your email."(r)
> >>
> >>
> >>
>
>
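
The lookup cost discussed in this thread can be illustrated with a small sketch. It is not HBase code: the dictionaries below are hypothetical stand-ins for the memcache and the HStore files, and get_versions is an illustrative name. It only shows why a request for 1 version can stop at the first hit, while a request for all versions (num_versions = 0 in the shell, per the message above) has to read every store.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins: each store maps a row key to (timestamp, value)
# pairs sorted newest-first. The first store plays the role of the memcache,
# the others play the role of HStore files, ordered newest to oldest.
Store = Dict[str, List[Tuple[int, str]]]


def get_versions(
    stores: List[Store], row: str, num_versions: int = 0
) -> Tuple[List[str], int]:
    """Return up to num_versions values for row, plus the number of stores read.

    num_versions = 0 is treated as 'all versions' (assumed to mirror the
    shell behaviour described in the thread).
    """
    found: List[str] = []
    stores_read = 0
    for store in stores:
        stores_read += 1
        for _timestamp, value in store.get(row, []):
            found.append(value)
            if num_versions > 0 and len(found) >= num_versions:
                return found, stores_read  # enough versions: stop early
    return found, stores_read  # ran out of places to look


# One stored version of the row, as in the 'test' table example.
stores: List[Store] = [
    {"3980000": [(3, "3cbae0")]},  # memcache
    {},                            # store file that does not hold the row
    {},                            # another store file without the row
]

print(get_versions(stores, "3980000", num_versions=1))  # stops after 1 store
print(get_versions(stores, "3980000"))                  # reads all 3 stores
```

With a single stored version, both calls return the same value, but the second one still visits every store, which matches the "lot of work for nothing" case described by Jim Kellerman.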



RE: HBase num_versions
Korea, Republic of
2007-11-07 17:27:29
Yeah, thanks, Stu Hood.
The time dimension is one of the most important things. It is also
a matter of considerable complexity.
 
Lastly, HBase's time dimension is different from a date
field in a traditional DB (or in HBase).
I will clarify it.
 
For now, the "select statement" isn't perfect yet.
(A little imitation is a dangerous thing.)
 
Thanks,
Edward.
------------------------------
B. Regards,
Edward yoon  NHN, corp. 
Home : http://www.udanax.org