Thread: Patch for XML declaration detection (standalone=-2)




Patch for XML declaration detection (standalone=-2)
user name
2006-12-04 00:11:39
Hi,

Here is a patch that fixes a minor bug in my earlier patch that enabled
detection of whether an XML declaration was specified. It also adds some
documentation explaining the different values of "standalone":

  standalone=1	---> standalone="yes"
  standalone=0	---> standalone="no"
  standalone=-1	---> no XML declaration
  standalone=-2	---> XML declaration, but no standalone attribute

The use case for all this is that I wish to use the XML reader, read up
to the start tag of the root element, and then check the standalone
value in order to see if an XML declaration was seen. This is in order
to do content sniffing between XML and HTML.
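
As a rough illustration of the check described above, here is a minimal
sketch in plain C. It does not call libxml2; the enum names and the two
helper functions are hypothetical and only encode the four values listed
in this mail.

```c
#include <stddef.h>

/* Hypothetical names for the four "standalone" values described above. */
enum {
    SA_YES        =  1,  /* standalone="yes" */
    SA_NO         =  0,  /* standalone="no" */
    SA_NO_XMLDECL = -1,  /* no XML declaration */
    SA_NO_ATTR    = -2   /* XML declaration, but no standalone attribute */
};

/* Returns 1 if an XML declaration was seen, 0 otherwise.
 * Every value except -1 implies that a declaration was present. */
int has_xml_declaration(int standalone)
{
    return standalone != SA_NO_XMLDECL;
}

/* Returns a human-readable description of a standalone value. */
const char *describe_standalone(int standalone)
{
    switch (standalone) {
    case SA_YES:        return "standalone=\"yes\"";
    case SA_NO:         return "standalone=\"no\"";
    case SA_NO_XMLDECL: return "no XML declaration";
    case SA_NO_ATTR:    return "XML declaration, but no standalone attribute";
    default:            return "unknown";
    }
}
```

A sniffer along these lines would treat the input as XML whenever
has_xml_declaration() returns 1, and fall back to other heuristics
otherwise.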

The bug was that standalone was defaulting to -1 instead of -2 in
xmlParseSDDecl, even though if we get to xmlParseSDDecl it means that an
XML declaration has been seen. It was only triggering in situations like
this:

<?xml version="1.0"  ?>

Note the space after the version attribute, which was causing
xmlParseSDDecl to be called and then to return -1 straight away, as no
standalone attribute was found. Now it returns -2, indicating that no
standalone attribute was found, but the XML declaration is present.
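
To make the effect of the default value concrete, here is a toy model in
plain C. It is not the libxml2 implementation: toy_parse_sddecl is a
hypothetical function that only mimics the behaviour described above,
taking the text that follows the version attribute.

```c
#include <ctype.h>
#include <string.h>

/* Advance past any leading white space. */
static const char *skip_blanks(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    return s;
}

/* Toy model, NOT libxml2 code. `rest` is the text that follows the
 * version attribute inside the XML declaration.
 * Returns 1 for "yes", 0 for "no", and the default (-2) when the
 * standalone attribute is missing or has some other value. */
int toy_parse_sddecl(const char *rest)
{
    int standalone = -2;  /* patched default; the old default was -1 */

    rest = skip_blanks(rest);
    if (strncmp(rest, "standalone", 10) != 0)
        return standalone;          /* e.g. only "  ?>" was left */
    rest = skip_blanks(rest + 10);
    if (*rest != '=')
        return standalone;
    rest = skip_blanks(rest + 1);
    if (strncmp(rest, "\"yes\"", 5) == 0 || strncmp(rest, "'yes'", 5) == 0)
        return 1;
    if (strncmp(rest, "\"no\"", 4) == 0 || strncmp(rest, "'no'", 4) == 0)
        return 0;
    return standalone;
}
```

With the declaration shown above, the text after the version attribute is
two spaces followed by "?>", so the model falls through to the default.
With the old default of -1 that result would be indistinguishable from
"no XML declaration".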

Michael


Index: parser.c
===================================================================
RCS file: /cvs/gnome/libxml2/parser.c,v
retrieving revision 1.462
diff -u -r1.462 parser.c
--- parser.c	15 Oct 2006 20:32:48 -0000	1.462
+++ parser.c	4 Dec 2006 00:05:50 -0000
@@ -8925,12 +8925,17 @@
  *  - element types with element content, if white space occurs directly
  *    within any instance of those types.
  *
- * Returns 1 if standalone, 0 otherwise
+ * Returns:
+ *   1 if standalone="yes"
+ *   0 if standalone="no"
+ *  -2 if standalone attribute is missing or invalid
+ *	  (A standalone value of -2 means that the XML declaration was found,
+ *	   but no value was specified for the standalone attribute).
  */
 
 int
 xmlParseSDDecl(xmlParserCtxtPtr ctxt) {
-    int standalone = -1;
+    int standalone = -2;
 
     SKIP_BLANKS;
     if (CMP10(CUR_PTR, 's', 't', 'a', 'n', 'd', 'a', 'l', 'o', 'n', 'e')) {
Index: include/libxml/tree.h
===================================================================
RCS file: /cvs/gnome/libxml2/include/libxml/tree.h,v
retrieving revision 1.154
diff -u -r1.154 tree.h
--- include/libxml/tree.h	25 Oct 2006 16:06:29 -0000	1.154
+++ include/libxml/tree.h	4 Dec 2006 00:05:50 -0000
@@ -503,7 +503,12 @@
 
     /* End of common part */
     int             compression;/* level of zlib compression */
-    int             standalone; /* standalone document (no external refs) */
+    int             standalone; /* standalone document (no external refs)
+				     1 if standalone="yes"
+				     0 if standalone="no"
+				    -1 if there is no XML declaration
+				    -2 if there is an XML declaration, but no
+					standalone attribute was specified */
     struct _xmlDtd  *intSubset;	/* the document internal subset */
     struct _xmlDtd  *extSubset;	/* the document external subset */
     struct _xmlNs   *oldNs;	/* Global namespace, the old way */
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xmlgnome.org
http://mail.gnome.org/mailman/listinfo/xml
Patch for XML declaration detection (standalone=-2)
user name
2006-12-04 09:27:36
On Mon, Dec 04, 2006 at 11:11:39AM +1100, Michael Day wrote:
> Hi,
> 
> Here is a patch that fixes a minor bug in my earlier patch that enabled
> detection of whether an XML declaration was specified. It also adds some
> documentation explaining the different values of "standalone":
> 
>  standalone=1	---> standalone="yes"
>  standalone=0	---> standalone="no"
>  standalone=-1	---> no XML declaration
>  standalone=-2	---> XML declaration, but no standalone attribute
> 
> The use case for all this is that I wish to use the XML reader, read up
> to the start tag of the root element, and then check the standalone
> value in order to see if an XML declaration was seen. This is in order
> to do content sniffing between XML and HTML.
> 
> The bug was that standalone was defaulting to -1 instead of -2 in
> xmlParseSDDecl, even though if we get to xmlParseSDDecl it means that an
> XML declaration has been seen. It was only triggering in situations like
> this:
> 
> <?xml version="1.0"  ?>
> 
> note the space after the version attribute, which was causing
> xmlParseSDDecl to be called, then straight away returning -1, as no
> standalone attribute was found. Now it returns -2, indicating that no
> standalone attribute was found, but the XML declaration is present.

  Okay, this looks clean. I hope this new extra value won't be a problem
in existing code; I doubt it will, so I applied the patch and committed it.

  Thanks a lot!

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillardredhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/