On Wed, Jan 03, 2007 at 11:53:55AM +0200, Jean Jordaan wrote:
> Hi there
>
> I'd like to find the encoding of an XML document, as detected by
> libxml2, using the Python bindings. From lxml, I can get it like this:
>
> >>> et
> <etree._ElementTree object at 0xb7cc992c>
> >>> et.docinfo.encoding
> 'windows-1252'
>
> According to the lxml API docs, lxml gets this information from libxml2 (see
> http://codespeak.net/lxml/api.html#parsers )
>
> How do I get at it without depending on lxml? The only way I've been
> able to find is using debugDumpDocumentHead, which just prints to
> stdout.
>
> >>> dh = xml.debugDumpDocumentHead(xml)
> DOCUMENT
> version=1.0
> encoding=windows-1252
> standalone=true
Hum, it's a string attached to the xmlDoc. It's available directly in C,
but there is no specific API to extract it. As a result, the autogenerated
bindings don't seem to have a way to extract the information. Could you
file a bug in Bugzilla asking for that functionality? The simplest fix is
probably to provide a custom accessor function, specifically at the Python
binding level.

Daniel
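
Until such an accessor exists in the bindings, one possible stopgap is to read
the XML declaration with the expat parser from the Python standard library.
This is only a sketch and not the libxml2 route discussed above: the helper
name declared_encoding is hypothetical, and it reports the encoding *declared*
in the document, which is not necessarily what libxml2 would detect (for
example from a byte order mark when no declaration is present).

```python
import xml.parsers.expat


def declared_encoding(data):
    """Return the encoding named in the XML declaration, or None if absent."""
    found = {}

    def on_xml_decl(version, encoding, standalone):
        # expat passes encoding=None when the declaration omits it.
        found["encoding"] = encoding

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return found.get("encoding")


sample = b'<?xml version="1.0" encoding="windows-1252" standalone="yes"?><root/>'
print(declared_encoding(sample))
```

The handler is never called for a document with no XML declaration, so the
helper returns None in that case rather than guessing a default.
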
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mail.gnome.org/mailman/listinfo/xml