Fredrik Lundh wrote:
> Chris Withers wrote:
>
>>> That's how escaping works, be it in XML, encodings, compression,
>>> whatever.
>> Well yes and no. I'd expect escaping to work such that whatever we're
>> dealing with can be round tripped, ie: parsed, serialized, parsed
>> again, etc.
>
> that's exactly how it works in ET, of course.
I didn't say it didn't
> cdata is character data; see
>
> http://www.w3.org/TR/html401/types.html#h-6.2
>
> that's not the same thing as a "CDATA section" (which is just one of
> several ways to store character data in an XML file).
Ug. How confusing :-(
> how things are stored doesn't matter; that's just a serialization
> detail:
>
> http://www.w3.org/TR/xml-infoset/#omitted
>
> What is not in the Information Set
>
> 6. Whether characters are represented by character references.
> 19. The boundaries of CDATA marked sections.
> ...
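A minimal sketch of the quoted point, assuming the standard-library `xml.etree.ElementTree` stands in for the ET under discussion. The sample documents are made up for illustration. It parses the same character data stored three ways (entity escapes, a CDATA section, numeric character references) and checks that the resulting trees carry identical text, and that the text survives a serialize/parse round trip even though the CDATA boundaries do not:

```python
import xml.etree.ElementTree as ET

# Three serializations of the same character data (illustrative inputs).
escaped = ET.fromstring("<p>a &lt; b &amp; c</p>")
cdata = ET.fromstring("<p><![CDATA[a < b & c]]></p>")
charref = ET.fromstring("<p>a &#60; b &#38; c</p>")

# The parsed tree only records the characters, not how they were stored.
same_text = escaped.text == cdata.text == charref.text

# Serializing picks one representation; parsing it again gives the same text.
serialized = ET.tostring(cdata, encoding="unicode")
reparsed = ET.fromstring(serialized)
round_trips = reparsed.text == cdata.text
```

The CDATA section is not reproduced in `serialized`; only the character data is preserved, which is the sense in which the storage form is a serialization detail.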
I'm not sure I follow what you're trying to say...
>> I and many others do not. When writing content into an html template,
>> that content often comes from other sources that spit out lumps of
>> html. Being able to insert them without escaping is a common use case.
>
> HTML might be similar to XML, but an XML parser cannot parse HTML, so
> you cannot insert HTML fragments into an XML document without either
> escaping it, or pre-processing it to make sure it's well-formed.
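A rough sketch of those two options, again assuming the standard-library `xml.etree.ElementTree`. The helper `insert_fragment` and the sample fragments are hypothetical, not part of ET: a fragment that parses as well-formed XML is appended as real elements, and anything else is stored as text, which the serializer escapes on output.

```python
import xml.etree.ElementTree as ET

def insert_fragment(parent, fragment):
    """Hypothetical helper: append `fragment` as markup if it is
    well-formed XML, otherwise store it as plain (escaped) text.
    Returns True when the fragment was inserted as elements."""
    try:
        parent.append(ET.fromstring(fragment))
        return True
    except ET.ParseError:
        # Not well-formed: keep it as character data after existing content.
        if len(parent):
            last = parent[-1]
            last.tail = (last.tail or "") + fragment
        else:
            parent.text = (parent.text or "") + fragment
        return False

body = ET.Element("div")
ok = insert_fragment(body, "<p>well-formed <b>XHTML</b></p>")
bad = insert_fragment(body, "<p>unclosed <br> tag")
html = ET.tostring(body, encoding="unicode")
```

Here the second fragment ends up in the output as escaped text rather than markup, since the XML parser rejects it.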
What about xhtml?
> if you want to embed HTML fragments in an ET tree, use ElementTidy or
> ElementSoup (or equivalent) to turn the fragment into properly nested
> and properly namespaced XHTML.
Fair enough...
> if you want to do unstructured string handling, use a template library
I'm using/building a templating library; it just happens that ET is an
implementation detail of that template library.
>> That's true, sometimes. That inserted lump may have come from a
>> process which can only spit out perfect html fragments, in which case
>> you're fine, or it may come from user input, in which case you're
>> doomed but will likely have happy customers
>
> the hackers will be happy, at least:
>
> http://en.wikipedia.org/wiki/Cross_site_scripting
user -> content author in this case.
Since they usually own and run the system to which they're adding
content, a much more effective attack would just be to turn the box
off :-P
cheers,
Chris
--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
_______________________________________________
XML-SIG maillist - XML-SIG python.org
http://mail.python.org/mailman/listinfo/xml-sig