
Thread: Re: A whitespace issue in XML::LibXML




2007-07-20 18:04:36
Birgit Kellner wrote:
> Vaclav Barta wrote:
>> On Friday 20 July 2007 20:27, Birgit Kellner wrote:
>>> Petr Pajas wrote:
>>>> First of all, why do you do that "by hand"? To get all text nodes
>>>> from a subtree nicely concatenated, you can use e.g.
>>>>
>>>> $text = $node->findvalue('string(.)')
>>>>
>>> I should have been more specific on that. I'm not interested in all text
>>> nodes, but in text node children of the element <seg>, and in text node
>>>
>> seg/text()
>>
>>> children of the element <span> that can be contained within <seg>. <seg>
>>>
>> seg/span/text()
>>
>> XPath may not be very perlish, but it's quite useful...
>>
>> 	Bye
>> 		Vasek
>>
> Yes, but that wouldn't get me the proper sequence, no?
>
> Consider:
>
> <seg>This is <span>a cold breeze</span><note>Oh, and here's some text
> which deals with a completely different subject-matter, say, bunny
> rabbits on a balcony.</note> on an unbearably hot summer evening.</seg>
>
Assuming that $seg contains a reference to the "seg" node in the
example above:

my $seg_text = '';
for my $node ($seg->findnodes('text() | span'))
{
    $seg_text .= $node->findvalue('string(.)');
}

will produce "This is a cold breeze on an unbearably hot summer
evening."  Is this what you are trying to achieve, or does it at least
steer you in the right direction?

FYI, the 'text() | span' XPath expression above selects just the text
node and "span" element children of the "seg" element, skipping the
"note" element children (or any other child nodes, for that matter).
Because the two relative XPaths "text()" and "span" are combined with
the union operator (|), findnodes() returns a single node set with all
the nodes in document order, which is the proper sequence here.  Then,
as Petr suggested, findvalue('string(.)') returns the concatenated text
of the current node and any of its descendants.
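
In case it helps to see the difference side by side, here is a small
self-contained sketch.  It assumes the sample markup from earlier in the
thread typed on a single line (so there are no stray line breaks in the
text nodes); the variable names $separate and $union are just for
illustration.  It compares two separate queries with the single union
query:

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample markup from earlier in the thread, kept on one line (assumption).
my $xml = q{<seg>This is <span>a cold breeze</span><note>Oh, and here's some text which deals with a completely different subject-matter, say, bunny rabbits on a balcony.</note> on an unbearably hot summer evening.</seg>};

my $doc = XML::LibXML->new->parse_string($xml);
my $seg = $doc->documentElement;

# Two separate queries: each list is ordered on its own, but the
# interleaving of text nodes and <span> content is lost.
my $separate = join '',
    map { $_->findvalue('string(.)') }
    ( $seg->findnodes('text()'), $seg->findnodes('span/text()') );

# One union query: a single node set, returned in document order.
my $union = join '',
    map { $_->findvalue('string(.)') }
    $seg->findnodes('text() | span');

print "separate: $separate\n";
print "union:    $union\n";
```

With that input, the "separate" line puts "a cold breeze" after the rest
of the sentence, while the "union" line reads "This is a cold breeze on
an unbearably hot summer evening."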

HTH,
Richard
_______________________________________________
Perl-XML mailing list
Perl-XML@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
