|
List Info
Thread: xmlwriter, attributes and encoding
|
|
| xmlwriter, attributes and encoding |

|
2006-08-15 23:06:48 |
When using xmlwriter attributes , document encoding that has
been set is
not passed to xmlAttrSerializeTxtContent, so character
references are
written when needed.
The xmlDocPtr passed to xmlAttrSerializeTxtContent is only
used to check
encoding, so I was wondering if the code in the attached
patch is fine
to fix this (it fakes a document so any changes within
xmlAttrSerializeTxtContent to manipulate the passed document
could
possibly blow up - though i dont see any need/reason to ever
do this).
Rob
Index: xmlwriter.c
============================================================
=======
RCS file: /cvs/gnome/gnome-xml/xmlwriter.c,v
retrieving revision 1.36
diff -c -r1.36 xmlwriter.c
*** xmlwriter.c 27 Apr 2006 08:10:25 -0000 1.36
--- xmlwriter.c 11 Aug 2006 20:02:29 -0000
***************
*** 1425,1432 ****
break;
case XML_TEXTWRITER_ATTRIBUTE:
buf = NULL;
!
xmlAttrSerializeTxtContent(writer->out->buffer, NULL,
! NULL,
content);
break;
default:
break;
--- 1425,1441 ----
break;
case XML_TEXTWRITER_ATTRIBUTE:
buf = NULL;
!
! if (writer->out->encoder) {
! xmlDoc doc;
! memset(&doc, 0,
sizeof(xmlDoc));
! doc.encoding = (xmlChar
*)writer->out->encoder->name;
!
xmlAttrSerializeTxtContent(writer->out->buffer,
&doc,
! NULL,
content);
! } else {
!
xmlAttrSerializeTxtContent(writer->out->buffer, NULL,
! NULL,
content);
! }
break;
default:
break;
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mai
l.gnome.org/mailman/listinfo/xml
|
|
| xmlwriter, attributes and encoding |

|
2006-08-16 08:49:40 |
On Tue, Aug 15, 2006 at 07:06:48PM -0400, Rob Richards
wrote:
> When using xmlwriter attributes , document encoding
that has been set is
> not passed to xmlAttrSerializeTxtContent, so character
references are
> written when needed.
> The xmlDocPtr passed to xmlAttrSerializeTxtContent is
only used to check
> encoding, so I was wondering if the code in the
attached patch is fine
> to fix this (it fakes a document so any changes within
> xmlAttrSerializeTxtContent to manipulate the passed
document could
> possibly blow up - though i dont see any need/reason to
ever do this).
Hum, that's a bit nasty, because none of the fields in
the document
structure are initialized (except encoding), so it
internally use an
API based not on the API description but on its
implementation. If at
some point someone add something like entity lookup to
xmlAttrSerializeTxtContent or just check the document type,
suddently
the writer would crash in an unexpected way.
So fields should at least be zero initialized, which is
probably
a bit costly if done for every single attribute emitted. So
while passing
a document might be a good idea, it's better to create and
initialize that
document when getting the xmlTextWriterPtr. Maybe adding a
doc in
struct _xmlTextWriter and making sure it's at least
minimally initialized
and properly freed with the xmlTextWriter is a good idea.
That sounds
far more clean to me, but a bit more work indeed
Daniel
--
Red Hat Virtualization group http://redhat.com/v
irtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ |
Rpmfind RPM search engine http://rpmfind.net/
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mai
l.gnome.org/mailman/listinfo/xml
|
|
| xmlwriter, attributes and encoding |

|
2006-08-16 11:54:49 |
Daniel Veillard wrote:
> On Tue, Aug 15, 2006 at 07:06:48PM -0400, Rob Richards
wrote:
>
>> When using xmlwriter attributes , document encoding
that has been set is
>> not passed to xmlAttrSerializeTxtContent, so
character references are
>> written when needed.
>> The xmlDocPtr passed to xmlAttrSerializeTxtContent
is only used to check
>> encoding, so I was wondering if the code in the
attached patch is fine
>> to fix this (it fakes a document so any changes
within
>> xmlAttrSerializeTxtContent to manipulate the passed
document could
>> possibly blow up - though i dont see any
need/reason to ever do this).
>>
>
> Hum, that's a bit nasty, because none of the fields
in the document
> structure are initialized (except encoding), so it
internally use an
> API based not on the API description but on its
implementation. If at
> some point someone add something like entity lookup to
> xmlAttrSerializeTxtContent or just check the document
type, suddently
> the writer would crash in an unexpected way.
> So fields should at least be zero initialized, which
is probably
> a bit costly if done for every single attribute
emitted. So while passing
> a document might be a good idea, it's better to create
and initialize that
> document when getting the xmlTextWriterPtr. Maybe
adding a doc in
> struct _xmlTextWriter and making sure it's at least
minimally initialized
> and properly freed with the xmlTextWriter is a good
idea. That sounds
> far more clean to me, but a bit more work indeed
>
> Daniel
>
Was tying to keep the change as minimal as possible, so how
about this one.
A doc has been added to the struct and is always created
within
xmlNewTextWriter(). This would at least provide a doc that
could be used
for this type of thing, rather than checking if one exists
and having to
create one if needed at different points within the
xmlwriter API. This
doc is also completely independent of any doc that may exist
within a
context.
Rob
Index: xmlwriter.c
============================================================
=======
RCS file: /cvs/gnome/gnome-xml/xmlwriter.c,v
retrieving revision 1.36
diff -c -r1.36 xmlwriter.c
*** xmlwriter.c 27 Apr 2006 08:10:25 -0000 1.36
--- xmlwriter.c 16 Aug 2006 10:14:40 -0000
***************
*** 91,96 ****
--- 91,97 ----
char qchar; /* character used for
quoting attribute values */
xmlParserCtxtPtr ctxt;
int no_doc_free;
+ xmlDocPtr doc;
};
static void xmlFreeTextWriterStackEntry(xmlLinkPtr lk);
***************
*** 215,220 ****
--- 216,224 ----
"xmlNewTextWriter : out of
memory!\n");
return NULL;
}
+
+ ret->doc = xmlNewDoc(NULL);
+
ret->no_doc_free = 0;
return ret;
***************
*** 487,492 ****
--- 491,499 ----
xmlFreeParserCtxt(writer->ctxt);
}
+ if (writer->doc != NULL)
+ xmlFreeDoc(writer->doc);
+
if (writer->ichar != NULL)
xmlFree(writer->ichar);
xmlFree(writer);
***************
*** 539,544 ****
--- 546,553 ----
if (encoder != NULL) {
writer->out->conv =
xmlBufferCreateSize(4000);
xmlCharEncOutFunc(encoder,
writer->out->conv, NULL);
+ if (writer->doc)
+ writer->doc->encoding = xmlStrdup((xmlChar
*)writer->out->encoder->name);
} else
writer->out->conv = NULL;
***************
*** 1425,1432 ****
break;
case XML_TEXTWRITER_ATTRIBUTE:
buf = NULL;
!
xmlAttrSerializeTxtContent(writer->out->buffer, NULL,
! NULL,
content);
break;
default:
break;
--- 1434,1441 ----
break;
case XML_TEXTWRITER_ATTRIBUTE:
buf = NULL;
!
xmlAttrSerializeTxtContent(writer->out->buffer,
writer->doc,
! NULL,
content);
break;
default:
break;
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mai
l.gnome.org/mailman/listinfo/xml
|
|
| xmlwriter, attributes and encoding |

|
2006-08-16 12:14:37 |
On Wed, Aug 16, 2006 at 07:54:49AM -0400, Rob Richards
wrote:
> Daniel Veillard wrote:
> >On Tue, Aug 15, 2006 at 07:06:48PM -0400, Rob
Richards wrote:
> >
> >>When using xmlwriter attributes , document
encoding that has been set is
> >>not passed to xmlAttrSerializeTxtContent, so
character references are
> >>written when needed.
> >>The xmlDocPtr passed to
xmlAttrSerializeTxtContent is only used to check
> >>encoding, so I was wondering if the code in the
attached patch is fine
> >>to fix this (it fakes a document so any changes
within
> >>xmlAttrSerializeTxtContent to manipulate the
passed document could
> >>possibly blow up - though i dont see any
need/reason to ever do this).
> >>
> >
> > Hum, that's a bit nasty, because none of the
fields in the document
> >structure are initialized (except encoding), so it
internally use an
> >API based not on the API description but on its
implementation. If at
> >some point someone add something like entity lookup
to
> >xmlAttrSerializeTxtContent or just check the
document type, suddently
> >the writer would crash in an unexpected way.
> > So fields should at least be zero initialized,
which is probably
> >a bit costly if done for every single attribute
emitted. So while passing
> >a document might be a good idea, it's better to
create and initialize that
> >document when getting the xmlTextWriterPtr. Maybe
adding a doc in
> >struct _xmlTextWriter and making sure it's at
least minimally initialized
> >and properly freed with the xmlTextWriter is a good
idea. That sounds
> >far more clean to me, but a bit more work indeed
> >
> >Daniel
> >
>
> Was tying to keep the change as minimal as possible, so
how about this one.
> A doc has been added to the struct and is always
created within
> xmlNewTextWriter(). This would at least provide a doc
that could be used
> for this type of thing, rather than checking if one
exists and having to
> create one if needed at different points within the
xmlwriter API. This
> doc is also completely independent of any doc that may
exist within a
> context.
> ***************
> *** 539,544 ****
> --- 546,553 ----
> if (encoder != NULL) {
> writer->out->conv =
xmlBufferCreateSize(4000);
> xmlCharEncOutFunc(encoder,
writer->out->conv, NULL);
> + if (writer->doc)
> + writer->doc->encoding = xmlStrdup((xmlChar
*)writer->out->encoder->name);
> } else
> writer->out->conv = NULL;
I would suggest to guard that a bit better as
if ((writer->doc != NULL) &&
(writer->doc->encoding == NULL))
just in case writer->doc gets inherited in some way in
the future or if
a default encoding shows up.
Otherwise it looks fine to me, just run the regression tests
for good measure
before commiting, one never knows
Daniel
--
Red Hat Virtualization group http://redhat.com/v
irtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ |
Rpmfind RPM search engine http://rpmfind.net/
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome.org
http://mai
l.gnome.org/mailman/listinfo/xml
|
|
[1-4]
|
|