Thread: Re: strange transformCtxt free-ing problem




user name
2007-01-15 19:13:59
While waiting for a proper test file, I did some more checking, this time
with Valgrind (unfortunately, Valgrind and Python tend to work in a very
adversarial manner, so this isn't very easy).  I think I now understand
where the original trouble comes from.

The pythonDocLoaderFuncWrapper function in python/libxslt.c creates a
parser context, pctxt, using xmlNewParserCtxt().  Apparently the reason
for doing this is so that later, after calling the user's loader, some
additional error checking and cleanup can be done.  pctxt is then
converted into a Python object [pctxtobj =
libxml_xmlParserCtxtPtrWrap(pctxt)] which is passed to the user's loader
function (the routine 'loader' in your test program).

Within the loader function which you posted, you create a new object:
  parserContext = libxml2.parserCtxt(_obj=pctx)
where 'pctx' is pctxtobj.  However, parserContext is only a local Python
object, so at the end of the loader function Python very kindly calls
upon its garbage collector to dispose of it.  That action causes
xmlFreeParserCtxt to be called for the underlying parser context pointer,
which in this instance is the (original C) variable pctxt.  That, in
turn, causes nothing but trouble for the remainder of the code within
pythonDocLoaderFuncWrapper, which goes on to use the now-freed pctxt.
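
The sequence described above can be modelled in a few lines of plain
Python.  This is only a sketch and does not use libxml2 or libxslt at
all: FakeCContext, OwningWrapper, call_loader and the two loader
functions are hypothetical names invented for illustration.  It also
assumes that the C wrapper frees the context itself as part of its own
cleanup, which would match the "double free" that glibc reported.

```python
import gc


class FakeCContext:
    """Hypothetical stand-in for the C-level parser context (pctxt)."""

    def __init__(self):
        self.free_calls = 0

    def free(self):
        # A real C free could not safely be repeated; here we only count.
        self.free_calls += 1


class OwningWrapper:
    """Hypothetical wrapper whose finalizer frees the wrapped object."""

    def __init__(self, _obj=None):
        self._o = _obj

    def __del__(self):
        if self._o is not None:
            self._o.free()
        self._o = None


def loader_that_frees(pctx):
    # Mirrors the posted loader: the wrapper is a local, so it is
    # finalized when the function returns, which frees pctx.
    wrapper = OwningWrapper(_obj=pctx)


def loader_that_detaches(pctx):
    # Gives up ownership before returning, so the finalizer does nothing.
    wrapper = OwningWrapper(_obj=pctx)
    wrapper._o = None


def call_loader(loader):
    """Models the C side: create the context, call the loader, clean up."""
    pctxt = FakeCContext()
    loader(pctxt)
    gc.collect()   # make sure any pending finalizers have run
    pctxt.free()   # assumed cleanup done by the C wrapper itself
    return pctxt.free_calls


print(call_loader(loader_that_frees))     # 2 on CPython: freed twice
print(call_loader(loader_that_detaches))  # 1: freed only by the caller
```

In this model the second loader avoids the double free by clearing the
wrapper's reference before it goes out of scope.  Whether an equivalent
step is safe with the real libxml2 Python objects is an assumption that
would have to be checked against the bindings actually installed.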

That's about as far as I can go, since it now would appear to be a
problem with the basic design of the loader code.  Please let me know if
I have made some error in my analysis described above, or if I can
further assist in any way.

Bill

William M. Brack wrote:
> Could you provide the file ("file2.html") that you are using for this
> test which fails?  If I use a file like libxml2/test/HTML/doc2.htm:
>
> billbbopt ~/gnomesvn/work $ ln -s HTML/doc2.htm file2.html
> billbbopt ~/gnomesvn/work $ python bug.py
> ./file2.html:10: HTML parser error : Misplaced DOCTYPE declaration
> <!-- END Naviscope Javascript --><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Tra
>                                  ^
> <?xml version="1.0"?>
> <html>
>   <head/>
>   <body>
>     <div>
> <!-- saved from url=(0016)http://intranet/ -->
> <!-- BEGIN Naviscope Javascript -->
> <!-- END Naviscope Javascript -->
> <!-- saved from url=(0027)http://www.agents-tech.com/ -->
>     </div>
>     <div>this is xml</div>
>   </body>
> </html>
>
> which seems to indicate that at least something is working.
>
> (note that I'm using the latest SVN for both libxslt and libxml2)
>
>
> Bill
>
> Nic James Ferrier wrote:
>> Daniel Veillard <veillardredhat.com> writes:
>>
>>> Nic said:
>>>>  *** glibc detected *** double free or corruption (!prev): 0x081b6300
>>>> ***
>>>>  Aborted
>>>>
>>>   But did you update libxslt too and make install for it too ? Please
>>> do
>>> he fixed the problems in libxslt not in libxml2,
>>
>> Ah!
>>
>> Yes. It stopped segfaulting. I can't get it to parse the HTML... but
>> it has stopped segfaulting.
>>
>>   doc.dump(sys.stdout)
>>
>> shows this for every document I get back that parses:
>>
>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
>> "http://www.w3.org/TR/REC-html40/loose.dtd">
>>
>> Here's the relevant bit of the loader again:
>>
>>   def loader(url, pctx, ctx, type):
>>       doc = None
>>       context_object = None
>>       if type:
>>           context_object = libxslt.stylesheet(_obj=ctx)
>>       else:
>>           context_object = libxslt.transformCtxt(_obj=ctx)
>>       # The parserContext and resulting document
>>       parserContext = libxml2.parserCtxt(_obj=pctx)
>>       doc = None
>>       if url == "/one":
>>           doc = parserContext.htmlCtxtReadFile("file2.html", "UTF8", 1)
>>       else:
>>           doc = parserContext.ctxtReadDoc("""<document>
>>   <h1>this is xml</h1>
>>   </document>""", url, "UTF8", 0)
>>       return doc
>>
>>
>> so when I ask for "/one" from my stylesheet I get back (practically)
>> nothing.
>>
>> --
>> Nic Ferrier
>> http://www.tapsellferrier.co.uk   for all your tapsell ferrier needs
>> _______________________________________________
>> xml mailing list, project page  http://xmlsoft.org/
>> xmlgnome.org
>> http://mail.gnome.org/mailman/listinfo/xml
>>
>
>

