Thread: nutch fetch status codes




nutch fetch status codes
2007-09-18 09:30:28
hi,

Can someone explain the various status codes and their meanings?
fetched, unfetched - pretty obvious

db_gone - ?
db_redir_perm - ?
db_redir_temp - ?

Eyal Edri
Re: nutch fetch status codes
Poland
2007-09-18 10:57:02
eyal edri wrote:
> hi,
> 
> Can someone explain on the various status codes and their meaning?
> fetched, unfetched  - pretty obvious
> 
> db_gone - ?

We tried several times to retrieve this page (3 times by default), and
it was either forbidden by robots.txt, or we got HTTP 404.

> db_redir_perm - ?

This URL is redirected to a different URL using HTTP 301 (Moved
Permanently). The HTTP spec says that in this case the original URL
should not be used anymore.

> db_redir_temp - ?

This URL is redirected to a different URL using HTTP 302 (Moved
Temporarily).


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||/|  Information Retrieval, Semantic Web
___|||__||  |  ||  |  Embedded Unix, System Integration
http://www.sigram.com 
Contact: info at sigram dot com
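
To make the mapping described above concrete, here is a minimal sketch in Python. It is only an illustration of the explanation in this reply, not Nutch's actual implementation: the function name crawl_status, its parameters, and the simplified retry handling are all assumptions made for the example. The default of 3 attempts comes from the reply above.

```python
from typing import Optional

# The reply above says Nutch retries 3 times by default before giving up.
MAX_RETRIES = 3


def crawl_status(http_code: Optional[int],
                 robots_denied: bool = False,
                 failed_attempts: int = 0,
                 max_retries: int = MAX_RETRIES) -> str:
    """Map a fetch outcome to a CrawlDb-style status name.

    Hypothetical helper: a simplified model of the behaviour described
    in the thread, not code taken from Nutch.
    """
    if http_code == 200:
        return "db_fetched"
    if http_code == 301:
        # Permanent redirect: the original URL should not be used anymore.
        return "db_redir_perm"
    if http_code == 302:
        # Temporary redirect: the original URL remains valid.
        return "db_redir_temp"
    if robots_denied or http_code == 404:
        # Only give up once the retry budget is exhausted.
        if failed_attempts >= max_retries:
            return "db_gone"
        return "db_unfetched"
    # Anything else is treated as not yet fetched in this sketch.
    return "db_unfetched"


if __name__ == "__main__":
    print(crawl_status(301))
    print(crawl_status(404, failed_attempts=3))
```

In this sketch a 404 or a robots.txt denial only becomes db_gone after the retry limit is reached; before that the URL stays db_unfetched so it can be tried again.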


Re: nutch fetch status codes
United States
2007-09-18 14:36:24
Hello-

    I should point out that these correspond to HTTP status codes
(404, 301, 302) rather than anything Nutch-specific, so if you want
more information, the HTTP specification will give you more thorough
results.

                        see you
                            -J


----- Original Message ----- 
From: "Andrzej Bialecki" <abgetopt.org>
To: <nutch-userlucene.apache.org>
Sent: Tuesday, September 18, 2007 8:57 AM
Subject: Re: nutch fetch status codes


> eyal edri wrote:
>> hi,
>>
>> Can someone explain on the various status codes and their meaning?
>> fetched, unfetched  - pretty obvious
>>
>> db_gone - ?
>
> We tried several times to retrieve this page (3 times by default),
> and it was either forbidden by robots.txt, or we got HTTP 404.
>
>> db_redir_perm - ?
>
> This URL is redirected to a different URL using HTTP 301 (Moved
> Permanently). The HTTP spec says that in this case the original URL
> should not be used anymore.
>
>> db_redir_temp - ?
>
> This URL is redirected to a different URL using HTTP 302 (Moved
> Temporarily).
>
>
> -- 
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||/|  Information Retrieval, Semantic Web
> ___|||__||  |  ||  |  Embedded Unix, System Integration
> http://www.sigram.com 
> Contact: info at sigram dot com

