
Thread: Getting element content non-recursively




Getting element content non-recursively
user name
2006-09-18 15:45:25

Hi all,

 

Is there a way in which I can get the content of an element that has children, without getting the content of those children also? For example if I have something like:

 

<node1>

    <node2>some text</node2>

</node1>

 

And I call xmlNodeGetContent on node1, I get "some text" and some blank characters around it. For this particular example I would want an empty string as node1 doesn't have any text content itself.

 

This is actually part of a larger problem that I am trying to solve, so there may be a better way. Basically, for my application, I have little fragments of XML that I need to add to various elements of other XML documents. I need to parse these fragments to ensure they are valid, so I was looking at the functions that add children to nodes, but I could not find anything that adds (and parses) a complete fragment of XML. Therefore, what I do instead is build a new document from my fragment (which catches any parse errors), then I iterate around all of the attributes and nodes in a recursive manner, adding them to my target. For each node, I make a new element with the name and content of the original, and here is where I hit my problem: the xmlNodeGetContent function is giving me the concatenated content of all children, but as I am handling recursion myself, I need only the content of the actual node I am referring to.

 

 

Any help much appreciated!

 

 

Thanks,

Caroline M.

Getting element content non-recursively
user name
2006-09-18 16:05:25
On Mon, Sep 18, 2006 at 04:45:25PM +0100, Caroline Middlebrook wrote:
> Hi all,
>
> Is there a way in which I can get the content of an element that has
> children, without getting the content of those children also? For example if
> I have something like:
>
> <node1>
>
>     <node2>some text</node2>
>
> </node1>
>
> And I call xmlNodeGetContent on node1, I get "some text" and some blank
> characters around it. For this particular example I would want an empty
> string as node1 doesn't have any text content itself.

  node->children should be a text node with

        "

      "

   in it, then node->children->next should be the element node node2,
and node->children->next->next should contain

                                "

 "

   as per your indentation. All spaces are significant in XML.
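
To make the tree shape described above concrete: the text that belongs to an element itself sits in its direct text-node children, so it can be collected by walking node->children and following ->next without ever descending into child elements. The sketch below is an illustration only and is not libxml2 code. `Node`, `ELEMENT_NODE`, `TEXT_NODE` and `direct_text` are hypothetical stand-ins, chosen so the example compiles without the library. With libxml2 itself, the same loop would presumably read the `type`, `content`, `children` and `next` fields of `xmlNode` and compare the type against `XML_TEXT_NODE`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the few xmlNode fields this sketch relies on. */
typedef enum { ELEMENT_NODE, TEXT_NODE } NodeType;

typedef struct Node {
    NodeType type;
    const char *content;    /* text for TEXT_NODE, NULL for elements */
    struct Node *children;  /* first child */
    struct Node *next;      /* next sibling */
} Node;

/* Concatenate the text of the node's direct text children only.
 * Child elements are skipped, not descended into.
 * Returns a malloc'd string (possibly empty) that the caller frees,
 * or NULL if allocation fails. */
char *direct_text(const Node *node)
{
    assert(node != NULL);

    /* First pass: measure how much text the direct children hold. */
    size_t len = 0;
    for (const Node *c = node->children; c != NULL; c = c->next)
        if (c->type == TEXT_NODE && c->content != NULL)
            len += strlen(c->content);

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy each text child in document order. */
    size_t pos = 0;
    for (const Node *c = node->children; c != NULL; c = c->next) {
        if (c->type == TEXT_NODE && c->content != NULL) {
            size_t n = strlen(c->content);
            memcpy(out + pos, c->content, n);
            pos += n;
        }
    }
    out[pos] = '\0';
    return out;
}
```

For the node1 example in this thread, a function like this would return only the whitespace surrounding node2 rather than an empty string, so a caller that wants "no text" would still have to trim the result or test whether it is all blank.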

> This is actually part of a larger problem that I am trying to solve, so
> there may be a better way. Basically, for my application, I have little
> fragments of XML that I need to add to various elements of other XML
> documents. I need to parse these fragments to ensure they are valid so I was
> looking at the functions that add children to nodes but I could not find
> anything that adds (and parses) a complete fragment of xml. Therefore, what

  There is no notion in XML of parsing "fragment of a document", it's defined
only in terms of parsing a full document.

> I do instead is build a new document from my fragment (which catches any
> parse errors),

  Parse errors in XML are *fatal*, the process MUST stop there if you find one.

> then I iterate around all of the attributes and nodes in a
> recursive manner adding them to my target. For each node, I make a new
> element with the name and content of the original and here is where I get my
> problem - the xmlNodeGetContent function is giving me the concatenated
> content of all children but as I am handling recursion myself, I need only
> the content of the actual node I am referring to.

  I'm apparently too jet-lagged to fully understand what you are doing here,
sorry!

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillardredhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xmlgnome.org
http://mail.gnome.org/mailman/listinfo/xml
Getting element content non-recursively
user name
2006-09-18 19:13:11
Caroline Middlebrook wrote:
> Hi all,
>
> Is there a way in which I can get the content of an element that has
> children, without getting the content of those children also? For example if
> I have something like:
>
> <node1>
>
>     <node2>some text</node2>
>
> </node1>
>
> And I call xmlNodeGetContent on node1, I get "some text" and some blank
> characters around it. For this particular example I would want an empty
> string as node1 doesn't have any text content itself.
>
> This is actually part of a larger problem that I am trying to solve, so
> there may be a better way. Basically, for my application, I have little
> fragments of XML that I need to add to various elements of other XML
> documents. I need to parse these fragments to ensure they are valid so I was
> looking at the functions that add children to nodes but I could not find
> anything that adds (and parses) a complete fragment of xml. Therefore, what
> I do instead is build a new document from my fragment (which catches any
> parse errors), then I iterate around all of the attributes and nodes in a
> recursive manner adding them to my target. For each node, I make a new
> element with the name and content of the original and here is where I get my
> problem - the xmlNodeGetContent function is giving me the concatenated
> content of all children but as I am handling recursion myself, I need only
> the content of the actual node I am referring to.
>
Hi Caroline,

This actually sounds like a job for XSLT. Have you not taken that
approach for a particular reason?

- Rush
Getting element content non-recursively
user name
2006-09-19 07:36:32
Hi Daniel,

>There is no notion in XML of parsing "fragment of a document", it's defined
>only in terms of parsing a full document.

Yes, so what I am trying to do instead, is take one XML document that has
been parsed, and add its contents to a node of an existing XML document.

>I'm apparently too jet-lagged to fully understand what you are doing here,
>sorry!

Well, my overall intention is to take a fragment of XML, and insert that
fragment including all children to an existing node of another document. As
that is not an operation that is inherently supported with XML, I am trying
to achieve the same effect by iterating over the nodes of the fragment in a
recursive fashion and adding each one to my target document. Thus at each
node I am doing something like: (pseudo code)

    Node* node = targetNode->make_new_element(sourceNode->get_name());
    // get_content() calls xmlNodeGetContent, which also returns the text of all descendants
    node->set_content(sourceNode->get_content());
    // copy every attribute of the source node
    for (iterator i = sourceNode->attributes_begin();
        i != sourceNode->attributes_end(); ++i)
    {
        node->add_attribute(*i);
    }
    // now call this function recursively with any children of sourceNode

The call to get_content() above is where the call is being made to
xmlNodeGetContent, and what I want here is the text of just the current node
only, and not its children.

Thanks,
Caroline M.



Getting element content non-recursively
user name
2006-09-19 07:38:29
Hi Rush,

>This actually sounds like a job for XSLT. Have you not taken that
>approach for a particular reason?
>- Rush

Yes, I am using a C++ library that has been written by somebody else in my
company. I can make small additions to it but nothing major. The library is
just a C++ wrapper around libxml, and doesn't support XSLT or any other XML
technologies.

Thanks.
Caroline M





Getting element content non-recursively
user name
2006-09-19 17:20:16
Caroline Middlebrook wrote:
> Hi Rush,
>
>>This actually sounds like a job for XSLT. Have you not taken that
>>approach for a particular reason?
>>- Rush
>
> Yes, I am using a C++ library that has been written by somebody else in my
> company. I can make small additions to it but nothing major. The library is
> just a C++ wrapper around libxml, and doesn't support XSLT or any other XML
> technologies.
>

Well, that's too bad. I have to admit that I don't quite see what you're
doing, but I will offer one idea to think about. Would it help you to
first iterate over all the elements in document order and save pointers
to them in a vector, then go through the vector in reverse order to do
your replacements? That would mean that you replace contents of the most
highly nested elements first. I have an application that needs to
replace attributes on certain elements, and that is the way I do it.
When I get the contents of a node, I get all the nested content too,
including whatever I may have changed already.

- Rush
Getting element content non-recursively
user name
2006-09-20 07:41:58
>Well, that's too bad. I have to admit that I don't quite see what you're
>doing, but I will offer one idea to think about.

Basically, I have some XML, which I want to insert into another piece of
XML. That's it! I just can't seem to find a way to do that correctly!

I am implementing an XML-based messaging protocol. All of the messages have
a strict format defined by a DTD and that is fine. However, a handful of the
messages have an element that is just a placeholder for any extra data
needed - that element may hold any amount of valid XML.

I need to pull those messages apart, and I need to put them back together
again. Pulling them apart is easy but when I put them back together I
struggle with this additional element because I can't seem to find a way to
copy this fragment of XML into the message I am currently building.

>Would it help you to first iterate over all the elements in document order
>and save pointers to them in a vector, then go through the vector in
>reverse order to do your replacements? That would mean that you replace
>contents of the most highly nested elements first. I have an application
>that needs to replace attributes on certain elements, and that is the way I
>do it.

I'm not sure. I don't think so because that recursion problem will still hit
me as I get to the top-most elements. All I need to do is get the text
content of a single element, without getting the content of all its children
too.

Caroline




Getting element content non-recursively
user name
2006-09-20 08:04:19
On Wed, Sep 20, 2006 at 08:41:58AM +0100, Caroline Middlebrook wrote:
> >Well, that's too bad. I have to admit that I don't quite see what you're
> >doing, but I will offer one idea to think about.
>
> Basically, I have some XML, which I want to insert into another piece of
> XML. That's it! I just can't seem to find a way to do that correctly!

  Copy and paste! Yes, you need 2 operations to be sure if you don't understand
libxml2 internals:
  http://xmlsoft.org/html/libxml-tree.html#xmlDocCopyNode
  http://xmlsoft.org/html/libxml-tree.html#xmlAddNextSibling
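
A minimal sketch of the two-step "copy and paste" suggested here, written as an illustration rather than as libxml2 code. `Node`, `new_node`, `copy_tree`, `add_child` and `free_tree` are hypothetical stand-ins so that the example compiles on its own. `copy_tree` plays the role of xmlDocCopyNode when asked for a recursive copy, and `add_child` the role of a linking call such as xmlAddNextSibling or xmlAddChild. The point it tries to show is that a deep copy carries text nodes along as ordinary children, so the content of each element never has to be read out separately.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a DOM node; not the libxml2 xmlNode type. */
typedef enum { ELEMENT_NODE, TEXT_NODE } NodeType;

typedef struct Node {
    NodeType type;
    char *name;      /* element name, NULL for text nodes */
    char *content;   /* text, NULL for elements */
    struct Node *parent, *children, *last, *next;
} Node;

static char *dup_str(const char *s)
{
    char *d = malloc(strlen(s) + 1);
    if (d != NULL)
        strcpy(d, s);
    return d;
}

/* Free a node together with all of its descendants. */
void free_tree(Node *n)
{
    if (n == NULL)
        return;
    for (Node *c = n->children; c != NULL; ) {
        Node *next = c->next;
        free_tree(c);
        c = next;
    }
    free(n->name);
    free(n->content);
    free(n);
}

/* Allocate an unlinked node; returns NULL if allocation fails. */
Node *new_node(NodeType type, const char *name, const char *content)
{
    Node *n = calloc(1, sizeof *n);
    if (n == NULL)
        return NULL;
    n->type = type;
    if (name != NULL)
        n->name = dup_str(name);
    if (content != NULL)
        n->content = dup_str(content);
    if ((name != NULL && n->name == NULL) ||
        (content != NULL && n->content == NULL)) {
        free_tree(n);
        return NULL;
    }
    return n;
}

/* Step 2, "paste": link child in as the last child of parent. */
Node *add_child(Node *parent, Node *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->next = NULL;
    if (parent->last != NULL)
        parent->last->next = child;
    else
        parent->children = child;
    parent->last = child;
    return child;
}

/* Step 1, "copy": duplicate src and everything below it. */
Node *copy_tree(const Node *src)
{
    Node *copy = new_node(src->type, src->name, src->content);
    if (copy == NULL)
        return NULL;
    for (const Node *c = src->children; c != NULL; c = c->next) {
        Node *child_copy = copy_tree(c);
        if (child_copy == NULL) {
            free_tree(copy);
            return NULL;
        }
        add_child(copy, child_copy);
    }
    return copy;
}
```

In use, the parsed fragment's root would be passed to `copy_tree` and the result handed to `add_child` on the placeholder element of the message being built. The source fragment stays intact and can be freed independently of the target.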

Daniel
