Thread: Re: xmlReadFile with filename instead of URL?




Re: xmlReadFile with filename instead of URL?
2007-03-27 20:16:30
Hi Daniel,

>  Works just fine here, if the "foo%2Fbar.xml" file is present. If it is
> absent, then the unescaping is tested. I don't understand why that doesn't
> work for you.

The behaviour only seems to trigger when you configure --without-zlib. I
don't know why yet, but there are zlib-specific #ifdefs in the loading
and URL mangling code, so there could be something funny going on that
isn't triggered when zlib is enabled.

Michael

-- 
Print XML with Prince!
http://www.princexml.com

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml

Re: xmlReadFile with filename instead of URL?
2007-03-29 18:43:43
> The behaviour only seems to trigger when you configure --without-zlib. I
> don't know why yet, but there are zlib-specific #ifdefs in the loading
> and URL mangling code, so there could be something funny going on that
> isn't triggered when zlib is enabled.

Okay, I found the bug, it's very simple.

Here is the comment for xmlFileOpen:

  * Wrapper around xmlFileOpen_real that try it with an unescaped
  * version of filename, if this fails fallback to filename

However, the code does *not* do this:

     unescaped = xmlURIUnescapeString(filename, 0, NULL);
     if (unescaped != NULL) {
         retval = xmlFileOpen_real(unescaped);
         xmlFree(unescaped);
     } else {
         retval = xmlFileOpen_real(filename);
     }
     return retval;

The code is unescaping the filename first and trying to load that. If it
fails, then it fails. Shouldn't it try to load the filename as-is
first, and if *that* fails try unescaping it? Or better yet, not try
unescaping it at all; I mean, since when did filenames use % escapes anyway?
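
To illustrate what unescaping does to such a name, here is a minimal sketch in plain C. The helper percent_unescape is hypothetical and written only for this example; it is not libxml2's xmlURIUnescapeString, and it assumes that a "%" followed by two hex digits is decoded while anything else is copied through unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_value(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper (not a libxml2 function): decodes %XX sequences.
 * Returns a malloc'ed string the caller must free, or NULL on
 * allocation failure. */
char *percent_unescape(const char *s) {
    assert(s != NULL);
    /* The decoded string is never longer than the input. */
    char *out = malloc(strlen(s) + 1);
    if (out == NULL) return NULL;
    char *w = out;
    while (*s != '\0') {
        int hi = -1, lo = -1;
        /* Short-circuit evaluation keeps us from reading past the end. */
        if (s[0] == '%' && (hi = hex_value((unsigned char)s[1])) >= 0
                        && (lo = hex_value((unsigned char)s[2])) >= 0) {
            *w++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *w++ = *s++;
        }
    }
    *w = '\0';
    return out;
}
```

Passing "hello%2Fworld.xml" to this helper gives back "hello/world.xml", which is a different path from the file that is actually on disk.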

So I suggest this patch to xmlFileOpen in xmlIO.c:

     retval = xmlFileOpen_real(filename);
     if (retval == NULL) {
         unescaped = xmlURIUnescapeString(filename, 0, NULL);
         if (unescaped != NULL) {
             retval = xmlFileOpen_real(unescaped);
             xmlFree(unescaped);
         }
     }
     return retval;

With this code the file "hello%2Fworld.xml" will be loaded first, and
only if it is not found will "hello/world.xml" be loaded. But yeah, I
would rather delete that entire if test, as it seems to me that any URL
unescaping should be handled a lot earlier, before xmlFileOpen sees it.
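
The difference between the two orderings can be sketched without libxml2 at all. Everything below is an assumption made for illustration: the "file system" is just an array of names, lookup stands in for xmlFileOpen_real, and the unescaped form of the name is passed in by the caller rather than computed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real open call: "succeeds" only when
 * name is present in the files array. Returns the matching entry or NULL. */
static const char *lookup(const char *const *files, size_t n,
                          const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(files[i], name) == 0) return files[i];
    }
    return NULL;
}

/* Mirrors the current order: if an unescaped form exists, only that
 * form is tried, so a file whose name contains a literal %2F is missed. */
const char *open_unescaped_first(const char *const *files, size_t n,
                                 const char *filename,
                                 const char *unescaped) {
    assert(filename != NULL);
    if (unescaped != NULL) return lookup(files, n, unescaped);
    return lookup(files, n, filename);
}

/* Mirrors the suggested order: try the name as given, then fall back
 * to the unescaped form only if that fails. */
const char *open_as_is_first(const char *const *files, size_t n,
                             const char *filename,
                             const char *unescaped) {
    assert(filename != NULL);
    const char *hit = lookup(files, n, filename);
    if (hit == NULL && unescaped != NULL) {
        hit = lookup(files, n, unescaped);
    }
    return hit;
}
```

With only "hello%2Fworld.xml" in the array, open_unescaped_first returns NULL while open_as_is_first finds the file; with only "hello/world.xml" present, open_as_is_first still finds it through the fallback.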

Michael

-- 
Print XML with Prince!
http://www.princexml.com

