>please use an attachment, not in the mail body, mailers breaks
>body content.
<...>
>provide test example as attachment too, I will plug them
>in test/HTML
The attached tar.gz contains the context patch for HTMLparser.c of
libxml2-2.6.24 (now with htmlParseLookupSequence) and the test HTML
file "chunk-boundary-cdata.html". The test file triggers the error in
libxml2 because its closing "</script>" tag falls exactly on the
4096-byte boundary. To reproduce the failure, neither the number of
characters in the test HTML file nor the number of bytes read by
testHTML may be changed(!). The character alignment has to match
exactly to trigger the error.
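To make the alignment requirement concrete, here is a minimal sketch of how a file with the end tag straddling the boundary could be generated. It is not the attached test file: the layout, the dot padding, the script body, and the names `build_html` and `split_chunks` are assumptions made for illustration, and so is the choice of letting exactly two bytes ("</") of the end tag fall into the first chunk. Whether such a file reproduces the failure has not been checked here; the attached file remains the reference.

```python
CHUNK_SIZE = 4096  # assumed size of one read in the push test


def build_html(split_at=2, chunk_size=CHUNK_SIZE):
    """Return HTML bytes whose "</script>" tag straddles chunk_size.

    split_at is the number of bytes of the end tag that land in the
    first chunk (hypothetical parameter, 2 means "</" | "script>").
    """
    head = b"<html><body>\n"
    open_tag = b"<script>"
    script_body = b"var x = 1;"
    end_tag = b"</script>"
    tail = b'\n<a href="test">link</a>\n</body></html>\n'

    # Everything before the end tag must add up to chunk_size - split_at.
    fixed = len(head) + len(open_tag) + len(script_body)
    pad = chunk_size - split_at - fixed
    if pad < 0:
        raise ValueError("chunk_size too small for this layout")
    padding = b"." * pad  # plain text in <body>, placed before <script>

    return head + padding + open_tag + script_body + end_tag + tail


def split_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut data into consecutive pieces the way a fixed-size reader would."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


if __name__ == "__main__":
    first, second = split_chunks(build_html())[:2]
    print(first[-12:], second[:12])
```

Adding or removing a single byte anywhere before the end tag moves it off the boundary, which is why the file must not be edited.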
Before the patch, libxml2-2.6.24 fails the following test with this
simple test HTML file:
./testHTML --push --sax --debug chunk-boundary-cdata.html
SAX.setDocumentLocator()
SAX.startDocument()
SAX.startElement(html)
SAX.startElement(body)
SAX.characters(.............................., 1000)
SAX.characters(...........................
.., 1000)
SAX.characters(.............................., 1000)
SAX.characters(...........................
.., 1000)
SAX.characters(.............................., 74)
SAX.startElement(script)
SAX.error: Invalid char in CDATA 0x0
SAX.cdata(</, 2)
SAX.error: htmlParseEndTag: '</' not found
SAX.cdata(cript>
<a href="test", 26)
SAX.error: Unexpected end tag : a
SAX.cdata(
, 1)
SAX.endElement(script)
SAX.endElement(body)
SAX.ignorableWhitespace(
, 1)
SAX.endElement(html)
SAX.ignorableWhitespace(
, 1)
SAX.endDocument()
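The error lines in a trace like the one above can be picked out mechanically when comparing runs. The helper below is only a sketch; `sax_errors` is a hypothetical name, and it assumes the trace has been captured as text. It would also report a line of element content that happens to start with "SAX.error:", which is acceptable for a quick check.

```python
def sax_errors(log):
    """Return the lines of a captured SAX debug trace that report errors."""
    return [line for line in log.splitlines()
            if line.startswith("SAX.error:")]


if __name__ == "__main__":
    sample = (
        "SAX.startElement(script)\n"
        "SAX.error: Invalid char in CDATA 0x0\n"
        "SAX.cdata(</, 2)\n"
        "SAX.error: htmlParseEndTag: '</' not found\n"
    )
    for line in sax_errors(sample):
        print(line)
```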
After the patch, the result is correct.
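For readers without the attachment at hand, the general idea of a chunk-safe lookup can be sketched as follows. This is not the patch and does not mirror the libxml2 code; `PushScanner`, `feed`, and the hold-back strategy are hypothetical choices made for illustration, and the match is case-sensitive for brevity. The scanner holds back the last few unmatched bytes, so an end tag split across two chunks is still matched as a whole.

```python
class PushScanner:
    """Collect raw script text fed in chunks until END_TAG is seen."""

    END_TAG = b"</script>"

    def __init__(self):
        self.pending = b""    # bytes received but not yet emitted
        self.emitted = []     # script text pieces handed to the caller
        self.closed = False   # True once END_TAG has been found

    def feed(self, chunk):
        if self.closed:
            return  # this sketch ignores input after the end tag
        self.pending += chunk
        pos = self.pending.find(self.END_TAG)
        if pos >= 0:
            # Complete end tag available: emit what precedes it.
            if pos:
                self.emitted.append(self.pending[:pos])
            self.pending = self.pending[pos + len(self.END_TAG):]
            self.closed = True
            return
        # No complete end tag yet. Hold back a tail that could be the
        # beginning of one, and emit only the bytes before that tail.
        keep = len(self.END_TAG) - 1
        safe = len(self.pending) - keep
        if safe > 0:
            self.emitted.append(self.pending[:safe])
            self.pending = self.pending[safe:]


if __name__ == "__main__":
    scanner = PushScanner()
    scanner.feed(b"var x = 1;</")            # chunk ends inside the tag
    scanner.feed(b'script>\n<a href="test">')
    print(b"".join(scanner.emitted), scanner.closed)
```

A scanner that emitted every chunk as soon as it arrived would hand "</" to the caller as script text, which is the kind of symptom visible in the trace before the patch.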
Cyrill
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mail.gnome.org/mailman/listinfo/xml