
Thread: New to XML::SAX; problem with character data




New to XML::SAX; problem with character data
Netherlands
2007-09-21 01:38:57

Hi,

I'm new to the XML::SAX parser (and to XML parsing in general; until now I only parsed XML with regexes).
At first I couldn't understand how to get the attributes out of the hashes. I really missed an example showing how to do that, because there is an XML::SAX::Intro on CPAN, but no real example in it.

OK, but now the problem: I can't figure out how to get all the data from an element. It's a kind of comment tag that I need to parse, but it's long and has all kinds of characters in it.
What I have:
sub characters {
    my ($self, $characters) = @_;
    my $text = $characters->{Data};
    # strip leading and trailing whitespace
    $text =~ s/^\s*//;
    $text =~ s/\s*$//;
    return '' unless $text;
    # $current_element and %hash are set elsewhere in the handler
    if ($current_element eq 'Comment') {
        $hash{'Comment'} = $text;
    }
}
But the problem is that $text doesn't contain the complete comment entry. I know it stops here: "(3&apos" (where "(3" is the last thing stored).

Sorry if this has already been explained to someone else, but I couldn't find it on the internet or on this mailing list.

Can someone show me?

By the way, is the characters sub the right one to use to get the data between the tags?

Greetings,

Jelle

Re: New to XML::SAX; problem with character data
Czech Republic
2007-09-21 02:22:40
snaphitplanet.nl wrote:
> Hi,
> 
> I'm new to the XML::SAX parser (and to XML parsing in general; until
> now I only parsed XML with regexes).
> At first I couldn't understand how to get the attributes out of the
> hashes. I really missed an example showing how to do that, because
> there is an XML::SAX::Intro on CPAN, but no real example in it.

See these pages for more info including examples:
http://perl-xml.sourceforge.net/perl-sax/sax-2.1-ref.html
http://cpan.uwinnipeg.ca/htdocs/XML-SAX/Intro.html
http://perl-xml.sourceforge.net/faq/
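
On the attributes question, here is a minimal sketch, assuming the Perl SAX 2 layout in which each element hash carries an Attributes hash keyed by "{NamespaceURI}LocalName", with each entry holding Name and Value. The AttrHandler package and the sample event are hypothetical and only simulate what a parser would pass to start_element.

```perl
use strict;
use warnings;

package AttrHandler;    # hypothetical handler name, for illustration only

sub new { return bless { seen => {} }, shift }

sub start_element {
    my ($self, $element) = @_;
    # Each attribute is itself a hash; store its value under
    # element name => attribute name.
    for my $attr (values %{ $element->{Attributes} || {} }) {
        $self->{seen}{ $element->{Name} }{ $attr->{Name} } = $attr->{Value};
    }
}

package main;

# Simulated event for <Comment author="Jelle">; a real parser builds this hash.
my $handler = AttrHandler->new;
$handler->start_element({
    Name       => 'Comment',
    Attributes => {
        '{}author' => { Name => 'author', LocalName => 'author', Value => 'Jelle' },
    },
});
print $handler->{seen}{Comment}{author}, "\n";    # prints: Jelle
```

Looping over the values avoids having to build the "{NamespaceURI}LocalName" key by hand when the document uses no namespaces.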

> 
> OK, but now the problem: I can't figure out how to get all the data
> from an element. It's a kind of comment tag that I need to parse, but
> it's long and has all kinds of characters in it.
> What I have:
> sub characters {
>     my ($self, $characters) = @_;
>     my $text = $characters->{Data};
>     $text =~ s/^\s*//;
>     $text =~ s/\s*$//;
>     return '' unless $text;
>     if ($current_element eq 'Comment'){
>         $hash{'Comment'}=$text;
>     }
> }
> But the problem is that $text doesn't contain the complete comment
> entry. I know it stops here: "(3&apos" (where "(3" is the last thing
> stored).
> 
> Sorry if this has already been explained to someone else, but I
> couldn't find it on the internet or on this mailing list.
> 
> Can someone show me?
> 
> By the way, is the characters sub the right one to use to get the
> data between the tags?


Yes, but SAX does not guarantee that all the text data of an element is sent
in one event. You can get multiple consecutive characters events, which you
have to join yourself. How the text is split depends on the specific parser
you are using. For example, with XML::SAX::ExpatXS consecutive character data
are joined and delivered in one event by default, whereas XML::SAX::Expat
fires one event for each line plus an additional one for each line ending,
and so on.
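
A minimal sketch of the joining described above, not taken from the original thread: append every chunk to a buffer in characters and only trim and store the text in end_element. The CommentHandler package and its field names are hypothetical, and the events at the bottom are simulated by hand so the example runs without a parser installed.

```perl
use strict;
use warnings;

package CommentHandler;    # hypothetical handler name, for illustration only

sub new {
    my ($class) = @_;
    return bless { buffer => '', in_comment => 0, comments => [] }, $class;
}

sub start_element {
    my ($self, $element) = @_;
    if ($element->{Name} eq 'Comment') {
        $self->{in_comment} = 1;
        $self->{buffer}     = '';    # start a fresh buffer for this element
    }
}

sub characters {
    my ($self, $chars) = @_;
    # May be called several times per element: append, never assign.
    $self->{buffer} .= $chars->{Data} if $self->{in_comment};
}

sub end_element {
    my ($self, $element) = @_;
    return unless $element->{Name} eq 'Comment';
    # Trim only once the whole text has arrived.
    (my $text = $self->{buffer}) =~ s/^\s+|\s+$//g;
    push @{ $self->{comments} }, $text;
    $self->{in_comment} = 0;
}

package main;

# Simulate a parser that splits the text at an entity such as &apos;.
my $handler = CommentHandler->new;
$handler->start_element({ Name => 'Comment' });
$handler->characters({ Data => $_ }) for ("  size (3", "'", " x 4)\n");
$handler->end_element({ Name => 'Comment' });
print "$handler->{comments}[0]\n";    # prints: size (3' x 4)
```

With XML::SAX installed, the same object could be handed to a real parser, for example XML::SAX::ParserFactory->parser(Handler => $handler), followed by parse_uri or parse_string. Trimming inside characters, as in the original code, would also drop whitespace that legitimately falls at a chunk boundary, which is why the trim is deferred to end_element here.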

Petr


-- 
Petr Cimprich
Ginger Alliance
www.gingerall.com
_______________________________________________
Perl-XML mailing list
Perl-XMLlistserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
