Daniel Veillard <veillard redhat.com> writes:
>> shows this for every document I get back that parses:
>>
>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
>>
>> Here's the relevant bit of the loader again:
>>
>> # The parserContext and resulting document
>> parserContext = libxml2.parserCtxt(_obj=pctx)
>
> What is pctx??? I find it suspicious that you could provide a C parser
> context here.

This is inside a document loader implementation. The parser context is
passed in; here's the function again:

def loader(url, pctx, ctx, type):
    doc = None
    context_object = None
    if type:
        context_object = libxslt.stylesheet(_obj=ctx)
    else:
        context_object = libxslt.transformCtxt(_obj=ctx)
    # The parserContext and resulting document
    parserContext = libxml2.parserCtxt(_obj=pctx)
    doc = None
    if url == "/one":
        doc = parserContext.htmlCtxtReadFile("file2.html", "UTF8", 1)
    else:
        doc = parserContext.ctxtReadDoc("""<document>
<h1>this is xml</h1>
</document>""", url, "UTF8", 0)
    return doc

And here's the set:

try:
    libxslt.setLoaderFunc(loader)
except Exception, e:
    # Whoops! serious error

Note the pctx in the loader arg list.
>
>> doc = None
>> if url == "/one":
>>     doc = parserContext.htmlCtxtReadFile("file2.html", "UTF8", 1)
>> else:
>>     doc = parserContext.ctxtReadDoc("""<document>
>
> Just use htmlReadFile and forget about trying to address the parser
> context directly. With the Python overhead you won't gain anything by
> creating a separately accessible object. The less you touch things
> through Python the better it will be, really. That said, HTML parsing
> works for me when using htmlReadFile.

From a loader?
I thought it was not possible. I'll try it and see!
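
For reference, a minimal sketch of what a loader following that suggestion
might look like. It keeps the same four-argument signature but ignores pctx
entirely and dispatches to module-level read functions instead. The helper
name make_loader is hypothetical, and the two read functions are passed in
so that the dispatch can be tried without the bindings installed. Whether
documents built this way are accepted when returned from a loader is exactly
the open question here, so treat this as an untested assumption.

```python
def make_loader(read_html_file, read_xml_doc):
    """Build a loader callback that never touches the parser context.

    read_html_file(filename, encoding, options) and
    read_xml_doc(text, url, encoding, options) are injected by the caller.
    With the bindings installed they would presumably be
    libxml2.htmlReadFile and libxml2.readDoc (an assumption, not verified
    in this thread).
    """
    def loader(url, pctx, ctx, type):
        # pctx and ctx are accepted only because the callback signature
        # requires them; they are deliberately left unused.
        if url == "/one":
            # Same file name, encoding and options as the original loader.
            return read_html_file("file2.html", "UTF8", 1)
        return read_xml_doc(
            "<document>\n<h1>this is xml</h1>\n</document>",
            url, "UTF8", 0)
    return loader
```

With the bindings available, the registration would then presumably read
libxslt.setLoaderFunc(make_loader(libxml2.htmlReadFile, libxml2.readDoc)).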
--
Nic Ferrier
http://www.tapsellferrier.co.uk for all your tapsell ferrier needs
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mail.gnome.org/mailman/listinfo/xml