google.public.web-apis
http://groups.google.com/group/google.public.web-apis?hl=en
google.public.web-apis googlegroups.com
Today's topics:
* Inconsistent Results with SOAP API - 4 messages, 2 authors
http://groups.google.com/group/google.public.web-apis/browse_thread/thread/9a7a7260d32e51e4?hl=en
* Google SOAP Search API - 1 message, 1 author
http://groups.google.com/group/google.public.web-apis/browse_thread/thread/8c9a0937a47bf6c4?hl=en
* google soap search key ? - 1 message, 1 author
http://groups.google.com/group/google.public.web-apis/browse_thread/thread/a58578c00863face?hl=en
==============================================================================
TOPIC: Inconsistent Results with SOAP API
http://groups.google.com/group/google.public.web-apis/browse_thread/thread/9a7a7260d32e51e4?hl=en
==============================================================================
== 1 of 4 ==
Date: Tues, Feb 20 2007 7:31 am
From: "Manfred"
On Feb 20, 1:40 am, "dkerins" <bitbox... gmail.com> wrote:
> I am doing some research work for a thesis. I have a site which
> contains a few documents with fairly unique strings. When I do a
> regular Google search with the following term...
> site:pce.bitbox.ca A0ACA011 B76687DE
>
> I get the expected results (i.e. there are 4 pages with these two
> terms, not including the index file).
>
> When I do the same query through the SOAP API my results are not
> correct,
> i.e. I get two of the 4 files and the index file. I get this
> incorrect result using a Perl script and with the Java that came with
> the API too.
>
> Should I not expect consistency?
You should expect small variations in both the web
interface _and_ the API.
For example, when I use your query in the web interface
site:pce.bitbox.ca A0ACA011 B76687DE
I get (apart from the index) 3:
pce.bitbox.ca/a-and-b-and-d.html
pce.bitbox.ca/a-and-b.html
pce.bitbox.ca/a-and-b-and-e.html
It depends on which Google datacenters are involved in the response.
Hope this helps,
Manfred
== 2 of 4 ==
Date: Tues, Feb 20 2007 8:56 am
From: "dkerins"
Thanks Manfred.
Is it reasonable to expect that over time the various data centers
would synchronize and that their differing results would converge into
one, dare I say accurate, world view?
Thanks David
== 3 of 4 ==
Date: Tues, Feb 20 2007 11:16 am
From: "Manfred"
Hi David,
With the query "site:pce.bitbox.ca" I get 13 pages, of which 8 pages
are already in the supplemental index. This means that convergence is
not likely to occur.
I would also question the assumptions (or model) about the search
engine which form the basis for your experiments.
Hope this helps,
Manfred
== 4 of 4 ==
Date: Tues, Feb 20 2007 7:38 pm
From: "dkerins"
Thanks for the insight. What is the supplemental index? Why wouldn't
the different data centers eventually have the same info about a site
that is fairly static? Is there somewhere I could get more information
about this so that my expectations are in line with what Google would
provide?
On the pce.bitbox.ca site there are actually 15 pages including the
index file.
The work I am doing focuses on creating the notion of user preference
with search terms.
This allows the user to require some terms and prefer (to varying
degrees) but not require other terms. My software does the
preprocessing and sends a series of queries to simulate this
specified preference.
So a preference query would look like this...
site:pce.bitbox.ca A0ACA011 U1[B76687DE] U2[C9B136AF]
This means A0ACA011 is required and the other two are preferred but not
required (U1 is a more preferred term than U2).
My software would parse this and send a set of queries and then gather
and present the results to the user.
The google queries would be...
site:pce.bitbox.ca A0ACA011 B76687DE C9B136AF
site:pce.bitbox.ca A0ACA011 B76687DE -C9B136AF
site:pce.bitbox.ca A0ACA011 -B76687DE C9B136AF
site:pce.bitbox.ca A0ACA011 -B76687DE -C9B136AF
The most preferred/successful query is the first one; the last is the
least preferred/successful result.
As you can see the user can kill 4 birds with one stone, so to speak.
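For readers following along, the expansion described above can be
sketched in Python. This is only an illustration under stated
assumptions, not David's actual software: the function name
expand_preference_query and the regular expression are hypothetical,
search terms are assumed to contain no spaces, and every token that is
not written as U<n>[term] is treated as required.

```python
import re
from itertools import product

# Matches a preferred term written as U<rank>[<term>], e.g. U1[B76687DE].
# The exact syntax is an assumption based on the single example in the post.
PREFERRED = re.compile(r"^U(\d+)\[([^\]]+)\]$")


def expand_preference_query(query):
    """Expand a preference query into plain queries, most preferred first."""
    required = []
    preferred = []  # (rank, term) pairs; rank 1 is the most preferred
    for token in query.split():
        match = PREFERRED.match(token)
        if match:
            preferred.append((int(match.group(1)), match.group(2)))
        else:
            required.append(token)
    preferred.sort()  # lowest rank number (most preferred) first

    queries = []
    # product() varies the last position fastest, so the least preferred
    # term toggles first and the most preferred term is excluded last.
    for flags in product((True, False), repeat=len(preferred)):
        parts = list(required)
        for (_, term), include in zip(preferred, flags):
            parts.append(term if include else "-" + term)
        queries.append(" ".join(parts))
    return queries


for q in expand_preference_query(
    "site:pce.bitbox.ca A0ACA011 U1[B76687DE] U2[C9B136AF]"
):
    print(q)
```

Run against the example preference query, this prints the same four
queries in the same order as listed above. With n preferred terms it
produces 2^n queries, so the number of API calls grows quickly.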
The pce.bitbox.ca site is used simply to prove that my software works
correctly, so having Google return a consistent set of results would
be helpful. (Not that I am expecting you to "fix it" for me. I am
simply presenting this information to you so you might understand what
I am up to and why I have the expectations that I do.)
Thanks so much for your input.
David.
==============================================================================
TOPIC: Google SOAP Search API
http://groups.google.com/group/google.public.web-apis/browse_thread/thread/8c9a0937a47bf6c4?hl=en
==============================================================================
== 1 of 1 ==
Date: Tues, Feb 20 2007 4:17 pm
From: kaossesi gmail.com
http://www.umutdolu.com
==============================================================================
TOPIC: google soap search key ?
http://groups.google.com/group/google.public.web-apis/browse_thread/thread/a58578c00863face?hl=en
==============================================================================
== 1 of 1 ==
Date: Tues, Feb 20 2007 4:20 pm
From: kaossesi gmail.com
hi
where can I get a Google SOAP search key?
thanks
http://www.umutdolu.com