Thread: using xmlTextReader efficiently




using xmlTextReader efficiently
Todd Ditchendorf
2006-08-24 03:52:09
I'd like to use xmlTextReader to parse and validate source documents against DTDs and Relax NG. I'm somewhat familiar with this already, as I've used libxml2 before.

It just dawned on me, however, that it would be a good idea to ask the list to confirm some assumptions I've made about using xmlTextReader. I've looked at the source, but I'm not the most experienced C programmer, so I wanted to double check with the experts.

Basically, I want to verify that you can use xmlTextReader like an ideal SAX parser that doesn't build an entire tree structure of source documents in memory.

I assume that this example:

http://xmlsoft.org/examples/reader2.c

does not build an in-memory tree of the entire source doc being validated, but rather only holds small portions of the document at a time. Is that correct?

Additionally, when taking similar action with a RELAX NG schema, does the same hold true for the source document (obviously, the schema doc has to be parsed into a tree in memory)?

Thanks for the advice!



Todd Ditchendorf

Scandalous Software - Cocoa Developer Tools



using xmlTextReader efficiently
Daniel Veillard
2006-08-24 10:46:27
On Wed, Aug 23, 2006 at 08:52:09PM -0700, Todd Ditchendorf wrote:
> I'd like to use xmlTextReader to parse and validate source documents
> against DTDs and Relax NG. I'm somewhat familiar with this already,
> as I've used libxml2 before.
>
> It just dawned on me, however, that it would be a good idea to ask
> the list to confirm some assumptions I've made about using
> xmlTextReader. I've looked at the source, but I'm not the most
> experienced C programmer, so I wanted to double check with the experts.
>
> Basically, I want to verify that you can use xmlTextReader like an
> ideal SAX parser that doesn't build an entire tree structure of
> source documents in memory.

  Right, basically it's a bit like a tree parser but with just a
sliding window of the document being constructed at a given time,
at the minimum the current node and its ancestors.

> I assume that this example:
>
> http://xmlsoft.org/examples/reader2.c
>
> does not build an in-memory tree of the entire source doc being
> validated, but rather only holds small portions of the document at a
> time. Is that correct?

  yes

> Additionally, when taking similar action with a RELAX NG schema, does
> the same hold true for the source document (obviously, the schema doc
> has to be parsed into a tree in memory)?

  yes, except that with *some* pieces of RNG schemas larger parts of the
tree need to be available. In RNG you may need to accumulate data, either
document data (the libxml2 way) or regexp data (the derivation method).

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillardredhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ |
Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xmlgnome.org
http://mail.gnome.org/mailman/listinfo/xml