[ HTTPS://ISSUES.APACHE.ORG/JIRA/BROWSE/NUTCH-505?PAGE=COM.ATLASSIAN.JIRA.PLUGIN.SYSTEM.ISSUETABPANELS:COMMENT-TABPANEL#ACTION_12512139 ]
ANDRZEJ BIALECKI COMMENTED ON NUTCH-505:
-----------------------------------------
PLEASE TEST JAVA 1.5 AND JAVA 1.6 - IIRC THERE ARE SOME
DIFFERENCES IN PERFORMANCE OF JAVA.UTIL.REGEX BETWEEN THESE
TWO VERSIONS.
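
One way to run that comparison is a small timing harness executed unchanged under each JVM. This is only a sketch: the class name RegexTiming, the pattern, and the sample inputs are illustrative assumptions and are not taken from any of the attached patches.

```java
import java.util.regex.Pattern;

public class RegexTiming {
    // Hypothetical URL pattern, used only to give java.util.regex some work.
    static final Pattern URL_PATTERN =
        Pattern.compile("^(https?|ftp)://[A-Za-z0-9.-]+(:\\d+)?(/\\S*)?$");

    // Matches every input against the pattern 'rounds' times and
    // returns the total number of successful matches.
    public static int countMatches(String[] inputs, int rounds) {
        int matched = 0;
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < inputs.length; i++) {
                if (URL_PATTERN.matcher(inputs[i]).matches()) {
                    matched++;
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Sample inputs are assumptions: two plausible URLs and one garbage string.
        String[] inputs = {
            "http://www.variety.com/",
            "</div></a>",
            "ftp://example.org/file.txt"
        };
        countMatches(inputs, 10000); // warm-up so JIT compilation is not timed
        long start = System.nanoTime();
        int matched = countMatches(inputs, 100000);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println(System.getProperty("java.version") + ": "
            + matched + " matches in " + elapsedMs + " ms");
    }
}
```

Running the same class file under a 1.5 and a 1.6 runtime and comparing the printed times gives a rough indication only; a real measurement would use the actual patterns from the patch and a larger URL sample.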
> OUTLINK URLS SHOULD BE VALIDATED
> --------------------------------
>
> KEY: NUTCH-505
> URL: HTTPS://ISSUES.APACHE.ORG/JIRA/BROWSE/NUTCH-505
> PROJECT: NUTCH
> ISSUE TYPE: IMPROVEMENT
> REPORTER: DOĞACAN GÜNEY
> ASSIGNEE: DOĞACAN GÜNEY
> PRIORITY: MINOR
> FIX FOR: 1.0.0
>
> ATTACHMENTS: FILTERED.TXT, NUTCH-505-V2.PATCH, NUTCH-505-V3.PATCH, NUTCH-505.PATCH, NUTCH-505.PATCH, NUTCH-505_DRAFT.PATCH, NUTCH-505_DRAFT_V2.PATCH
>
>
> SEE DISCUSSION HERE:
> HTTP://WWW.NABBLE.COM/FETCHING-HTTP%3A--WWW.VARIETY.COM-%3C-DIV%3E%3C-A%3E-TF3961692.HTML
> PARSE PLUGINS MAY EXTRACT GARBAGE URLS FROM PAGES. WE
NEED A URL VALIDATION SYSTEM THAT TESTS THESE URLS AND
FILTERS OUT GARBAGE.
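
The kind of check described above could be sketched as follows. This is a minimal illustration, not the approach taken in the attached patches: the class name OutlinkUrlValidator, the accepted schemes, and the host pattern are all assumptions chosen for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class OutlinkUrlValidator {
    // Assumption: only these schemes are worth following.
    private static final Pattern SCHEME =
        Pattern.compile("^(http|https|ftp)$", Pattern.CASE_INSENSITIVE);

    // Assumption: a host is dot-separated labels of letters, digits and
    // hyphens, where no label starts or ends with a hyphen.
    private static final Pattern HOST = Pattern.compile(
        "^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?"
        + "(\\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)*$");

    // Returns true only if the string parses as a URI and has an
    // accepted scheme and a well-formed host.
    public static boolean isValid(String url) {
        if (url == null) {
            return false;
        }
        URI uri;
        try {
            uri = new URI(url.trim());
        } catch (URISyntaxException e) {
            return false; // e.g. raw '<' or '>' left over from HTML markup
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false; // relative, opaque (mailto:, javascript:) or host-less
        }
        return SCHEME.matcher(scheme).matches() && HOST.matcher(host).matches();
    }

    // Keeps only the outlinks that pass isValid, preserving order.
    public static List<String> filter(List<String> outlinks) {
        List<String> kept = new ArrayList<String>();
        for (String url : outlinks) {
            if (isValid(url)) {
                kept.add(url);
            }
        }
        return kept;
    }
}
```

With this sketch, a link such as http://www.variety.com/ is kept, while a string like </div></a> or http://www.variety.com/<div><a> is dropped because it does not parse as a URI.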
--
THIS MESSAGE IS AUTOMATICALLY GENERATED BY JIRA.
-
YOU CAN REPLY TO THIS EMAIL TO ADD A COMMENT TO THE ISSUE
ONLINE.