Sorry, I was a bit too fast there, the answer applies to the
RegexURLFilter, not the RegexUrlNormalizer. I don't think there is a
similar facility for the RegexUrlNormalizer, but let me know if you
find it.
Rgrds, Thomas
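
P.S. As a stopgap, a rule can be tried outside Nutch with a few lines
of plain java.util.regex. This is only a rough sketch and not Nutch
code: the class name RuleTester and the jsessionid rule below are made
up for illustration, and it assumes the rule is an ordinary Java regex
pattern plus a substitution string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: applies one regex rule
// (pattern + substitution) to every URL read from stdin.
public class RuleTester {

    // Replace every match of `pattern` in `url` with `substitution`.
    // java.util.regex syntax applies, so $1 is the first capture group.
    static String applyRule(String url, String pattern, String substitution) {
        return Pattern.compile(pattern).matcher(url).replaceAll(substitution);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: RuleTester <pattern> <substitution>");
            System.exit(1);
        }
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String url;
        while ((url = in.readLine()) != null) {
            String out = applyRule(url, args[0], args[1]);
            // Prefix changed URLs with "* " so matches and non-matches stand out.
            System.out.println((out.equals(url) ? "  " : "* ") + url + " -> " + out);
        }
    }
}
```

Feeding it a list of URLs that should and should not match, one per
line, shows at a glance which ones the rule touches, e.g.

echo "http://www.example.com/a;jsessionid=ABC123?x=1" | java RuleTester ';jsessionid=[0-9A-Za-z]+' ''
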
On 5/22/06, TDLN <diamond108 gmail.com> wrote:
> Hi Stefan
>
> try running bin/nutch org.apache.nutch.net.URLFilterChecker
>
> Rgrds, Thomas
>
> On 5/22/06, Stefan Neufeind <apache.org stefan-neufeind.de> wrote:
> > Hi,
> >
> > is there a way to debug rules for RegexUrlNormalizer, e.g. test the
> > substitution from commandline?
> >
> >
> > bin/nutch org.apache.nutch.net.RegexUrlNormalizer
> >
> > does print out the rules it uses. But afaik there is no such thing
> > possible as
> >
> > echo "http://www.example.com" | bin/nutch
> > org.apache.nutch.net.RegexUrlNormalizer
> >
> > is there? So how do you debug rules when writing new ones and testing
> > them against a set of URLs that should match / should not match?
> >
> >
> >
> > Regards,
> > Stefan
> >
>