Thread: @role in SVG




SVG in text/html (was: @role in SVG)
United States
2007-10-12 11:09:16
Hi, Henri-

Henri Sivonen wrote (on 10/12/2007 7:23 AM):
>
> We don't do inline SVG in text/html yet. Personally, I hope we'll get
> there. However, if we do, the main SVG complications will be the xlink
> mapping, the /> syntax and SVG-native camelCaps. I don't think it is a
> good idea to introduce more complications if we are already
> entertaining inline SVG in text/html as a possibility.

Thanks for outlining the challenges to integrating SVG into text/html,
from an HTML5 standpoint.  That's very helpful.

I also want that to happen, and would like to facilitate that when the
time comes.  Also like you, I do have certain concerns about how it's
done.  I'll give you my viewpoint (which is not necessarily shared by
the rest of the SVG or CDF WGs).

From a technical and market viewpoint (an odd pairing, perhaps), I feel
very strongly that SVG-in-HTML should maintain identical markup syntax
with standalone SVG (or SVG-in-XHTML, and probably X/HTML-in-SVG); any
differences between the two syntaxes would be actively harmful to SVG.
For example, someone who copy-pasted an SVG fragment from HTML and tried
to use it as a standalone file, or imported it into an SVG file (perhaps
in an automated mashup) would get unexpected and probably disastrous
results.  Those inconsistencies would leave casual authors with a bad
impression of SVG, and force advanced authors to make elaborate
workarounds.  The goal, from the perspective of the SVG WG, would be to
make it easier --not harder-- for authors, and to increase the use of
SVG (and specifically not to drive authors into the hands of vendors of
proprietary formats).  I'm not saying that the SVG WG is not willing to
consider reasonable compromises, just that the end result should be a
uniform syntax for SVG.

From a logistics standpoint, this work should be done in coordination
between the HTML, SVG, and CDF Working Groups.  All have a vested
interest in it, and each has a unique set of perspectives, needs, and
knowledge.  Perhaps we can begin talking about it at the upcoming Tech
Plenary.  We are all busy with other things right now, but opening the
dialog will prepare us for what we'll need to consider going forward.

Regards-
-Doug Schepers
W3C Staff Contact, SVG, CDF, and WebAPI


Re: SVG in text/html (was: @role in SVG)
Finland
2007-10-13 09:43:59
On Oct 12, 2007, at 19:09, Doug Schepers wrote:
> Henri Sivonen wrote (on 10/12/2007 7:23 AM):
>> We don't do inline SVG in text/html yet. Personally, I hope we'll
>> get there. However, if we do, the main SVG complications will be
>> the xlink mapping, the /> syntax and SVG-native camelCaps. I don't
>> think it is a good idea to introduce more complications if we are
>> already entertaining inline SVG in text/html as a possibility.
>
> Thanks for outlining the challenges to integrating SVG into
> text/html, from an HTML5 standpoint.  That's very helpful.
>
> I also want that to happen, and would like to facilitate that when
> the time comes.  Also like you, I do have certain concerns about
> how it's done.  I'll give you my viewpoint (which is not
> necessarily shared by the rest of the SVG or CDF WGs).
>
> From a technical and market viewpoint (an odd pairing, perhaps), I
> feel very strongly that SVG-in-HTML should maintain identical
> markup syntax with standalone SVG (or SVG-in-XHTML, and probably
> X/HTML-in-SVG); any differences between the two syntaxes would be
> actively harmful to SVG.

Do you mean you'd like to bring in the complication of arbitrary
namespace prefixes? I'd like to make the following deviations from
SVG-as-XML syntax:
  1) I'd like to minimize the need of tokenizer parametrization to
toggling case folding behavior and, if we must, CDATA sections.
Specifically, I think attribute tokenization should run the same code
as attribute tokenization for the HTML parts of text/html.
  2) I'd like to avoid supporting arbitrary namespace prefixes both in
order to sidestep issues in shipped IE versions and in order to relieve
authors of namespace syntax. (xlink: should probably be considered
non-arbitrary and hard-wired.)

More concretely, I've been thinking something like this might work:
  * Case folding in the tokenizer should be made conditional so that
potentially camelCap names in <svg> subtrees would not be case-folded.
    - Issue: Should case folding be toggled on and off (in which case
tokenizing "<svg " would happen in the case-folding state allowing
"<SvG ") or should names be collected unfolded and then whole names
conditionally case-folded (in which case we could require "<svg " to
be in lower case)?
    - Issue 2: If the latter, to avoid expensively case-folding whole
start tag tokens *including* attributes later on, the tokenizer should
probably have to know about tag names that turn on the case-preserving
mode before looking for attributes but the tree builder should be the
part of the parser telling the tokenizer to switch back to the case
folding mode. This would be ugly but probably necessary.
  * Start tag tokens should have a flag about the /> presence. The tree
builder would ignore this for HTML elements but would pop immediately
for SVG elements.
  * The <svg> element would establish "an SVG scope" in the tree
builder. The <svg> start tag token would itself be handled in the HTML
state of the tree builder so that the <svg> element would be subject to
foster parenting.
  * When in an SVG scope, the tree builder would ignore the HTML tree
building rules. This means that stray tags looking like HTML tags could
not cause the tree builder to pop out of the SVG scope. While in the SVG
scope, the tree builder would assign the SVG namespace URI to the
element nodes it creates.
    - Issue: What to do if there is a prefixed element?
  * When in the SVG scope, a start tag token would unconditionally
result in the corresponding element node to be appended to the current
node. (And if the /> flag is set on the token, the node would be popped
immediately.)
  * When in the SVG scope, an end tag token would cause a corresponding
element to be searched starting with the current node towards the start
of the SVG scope (and no further). If an element were found in scope,
the stack would be popped until that element got popped. If there were
no such element in scope, the end tag would be ignored. Any outcome but
a single pop would be a parse error.
  * When the current node is a foreignObject element in an SVG scope,
the start tag token <html> would establish a "nested HTML scope".
</html>, <body> and </body> would act like "normal" tokens in a nested
HTML scope. Specifically, any token other than </html> encountered in a
nested HTML scope would be unable to break out of the nested HTML
scope.
  * Attributes with the name "xlink:href" on the tokenization level
would be reported by the tokenizer as local name "href" in the XLink
namespace.
  * xmlns or xmlns:* attributes would have no meaning and would be
non-conforming except xmlns="http://www.w3.org/2000/svg" and
xmlns:xlink="http://www.w3.org/1999/xlink" would be allowed as
"talismans" on the <svg> start tag.

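[Editorial illustration, not part of the original message.] The SVG
scope rules in the list above can be sketched as a toy tree builder. This
is a minimal sketch under stated assumptions: the names Node,
SvgScopeBuilder, start_tag and end_tag are hypothetical, the tokenizer is
assumed to have already produced tag names and the /> flag, and the
foreignObject and foster-parenting rules are left out.

```python
from dataclasses import dataclass, field

SVG_NS = "http://www.w3.org/2000/svg"


@dataclass
class Node:
    name: str
    namespace: str
    children: list = field(default_factory=list)


class SvgScopeBuilder:
    """Toy tree builder covering only the 'SVG scope' rules (hypothetical)."""

    def __init__(self):
        self.root = Node("svg", SVG_NS)
        self.stack = [self.root]  # stack[0] marks the start of the SVG scope
        self.parse_errors = 0

    def start_tag(self, name, self_closing=False):
        # Unconditionally append to the current node, in the SVG namespace.
        node = Node(name, SVG_NS)
        self.stack[-1].children.append(node)
        # With the "/>" flag set, the node is popped immediately,
        # which is the same as never leaving it on the stack.
        if not self_closing:
            self.stack.append(node)

    def end_tag(self, name):
        # Search from the current node towards the start of the scope.
        for depth in range(len(self.stack) - 1, -1, -1):
            if self.stack[depth].name == name:
                if depth != len(self.stack) - 1:
                    self.parse_errors += 1  # more than a single pop
                del self.stack[depth:]
                return
        self.parse_errors += 1  # no such element in scope: ignore the tag


builder = SvgScopeBuilder()
builder.start_tag("g")
builder.start_tag("circle", self_closing=True)
builder.start_tag("p")   # a stray HTML-looking tag stays inside the scope
builder.end_tag("div")   # not in scope: ignored, counted as a parse error
builder.end_tag("g")     # pops "p" and "g": also a parse error
```

In this sketch the stray <p> simply becomes an SVG-namespace child of
<g>, which is the "cannot pop out of the SVG scope" behavior described
above.
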
The above trial balloon proposal is designed to optimize SVG
integration in text/html in *future* browsers in a way that would
create a namespace-aware DOM that current DOM-based SVG implementations
would grok immediately but would at the same time remove namespace
declaration syntax from the sight of authors. The proposal specifically
isn't designed to clone the colon-based namespaces-in-text/html
mechanism of IE. OTOH, it shouldn't interfere with it, either, except
perhaps for xlink:href, which could be worked around by introducing
href.

The approach outlined above could be used for MathML as well. However,
in that case, the tokenizer should probably be modified to switch to
MathML entity tables when the tree builder is in a MathML scope.

> From a logistics standpoint, this work should be done in
> coordination between the HTML, SVG, and CDF Working Groups.  All
> have a vested interest in it, and each has a unique set of
> perspectives, needs, and knowledge.  Perhaps we can begin talking
> about it at the upcoming Tech Plenary.  We are all busy with other
> things right now, but opening the dialog will prepare us for what
> we'll need to consider going forward.

I agree it would make sense to talk about it at the Tech Plenary.

-- 
Henri Sivonen
hsivoneniki.fi
http://hsivonen.iki.fi/




Re: SVG in text/html
Czech Republic
2007-10-13 10:48:13
Henri Sivonen wrote:

> The above trial balloon proposal is designed to optimize SVG
> integration in text/html in *future* browsers in a way that would
> create a namespace-aware DOM

Hmm, and why support SVG (or any other embedded vocabulary) in the HTML
serialization at all? The XML serialization can be used without any
problems for such purposes. I think the cleaner approach is to switch
from HTML to XML syntax if you want to use XML features. Trying to
emulate XML features in HTML syntax is the way to hell.

-- 
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirkakosek.cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------

Re: SVG in text/html (was: @role in SVG)
Norway
2007-10-13 11:22:55
On Sat, 13 Oct 2007 16:43:59 +0200, Henri Sivonen <hsivoneniki.fi> wrote:

> Do you mean you'd like to bring in the complication of arbitrary
> namespace prefixes? I'd like to make the following deviations from
> SVG-as-XML syntax:

I think it should be possible to have the same SVG markup work both
when parsed as XML and when parsed with the HTML parser (to the same
extent as you can have HTML markup that works both when parsed as XML
and when parsed with the HTML parser), and moreover, it should be
possible to write scripts that fix up the DOM afterwards for legacy UAs.


>   1) I'd like to minimize the need of tokenizer parametrization to
> toggling case folding behavior and, if we must, CDATA sections.

CDATA sections and the content model flag are interesting.

In legacy UAs, <script> and <style> will be parsed as CDATA elements,
and <title> and <textarea> as RCDATA. Doing the same in new UAs is nice
because that makes sure that content will degrade reasonably in legacy
UAs, and makes it easier to write scripts that fix the DOM for legacy
UAs. For <script>, <style> and <title> this would not be a problem,
since they normally only contain text, but <textArea> is more
problematic since it can contain elements. (<textArea> is new in SVG 1.2
and apparently there isn't much content using it yet. Renaming that
element would make this issue go away.)

Also, authors are already used to doing "<script>//<![CDATA[" when
working with markup that needs to work as both HTML and XHTML, so having
the same rules for SVG in HTML is likely what authors would expect.

Having all SVG elements be PCDATA (as in XML) would probably mean that
we also have to introduce CDATA sections (since authors don't want to
write "&amp;&amp;" in their scripts, and it would be harder to make
things work in legacy UAs).


> Specifically, I think attribute tokenization should run the same code
> as attribute tokenization for the HTML parts of text/html.
>   2) I'd like to avoid supporting arbitrary namespace prefixes both in
> order to sidestep issues in shipped IE versions and in order to relieve
> authors of namespace syntax. (xlink: should probably be considered
> non-arbitrary and hard-wired.)
>
> More concretely, I've been thinking something like this might work:
>   * Case folding in the tokenizer should be made conditional so that
> potentially camelCap names in <svg> subtrees would not be case-folded.
>     - Issue: Should case folding be toggled on and off (in which case
> tokenizing "<svg " would happen in the case-folding state allowing
> "<SvG ") or should names be collected unfolded and then whole names
> conditionally case-folded (in which case we could require "<svg " to
> be in lower case)?
>     - Issue 2: If the latter, to avoid expensively case-folding whole
> start tag tokens *including* attributes later on, the tokenizer should
> probably have to know about tag names that turn on the case-preserving
> mode before looking for attributes but the tree builder should be the
> part of the parser telling the tokenizer to switch back to the case
> folding mode. This would be ugly but probably necessary.

I don't think it's necessary to require the svg start tag to be
lowercase if doing so would be a performance problem, but I don't feel
strongly about it. It is however necessary to get the case of the
attributes of the svg start tag right because of (at least) the
viewBox="" attribute.
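
[Editorial illustration, not part of the original message.] The
conditional case folding discussed here can be sketched as follows. This
is only a sketch under assumptions: the function name fold_start_tag and
the set CASE_PRESERVING_ROOTS are hypothetical, only ASCII letters are
folded, and the tree builder is assumed to tell the tokenizer whether it
is currently inside an SVG scope.

```python
import string

# Fold ASCII letters only, leaving every other character untouched.
ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# Hypothetical: tag names the tokenizer knows will turn on case preservation.
CASE_PRESERVING_ROOTS = {"svg"}


def fold_start_tag(tag_name, attr_names, in_svg_scope):
    """Return (tag, attrs, preserve_case_afterwards) for one start tag."""
    if in_svg_scope:
        # Inside an <svg> subtree nothing is folded (linearGradient, viewBox).
        return tag_name, list(attr_names), True
    folded_tag = tag_name.translate(ASCII_FOLD)
    if folded_tag in CASE_PRESERVING_ROOTS:
        # "<SvG " is accepted, but its own attributes keep their case,
        # so viewBox on the root survives.
        return folded_tag, list(attr_names), True
    # Ordinary HTML start tag: fold both the tag and attribute names.
    return folded_tag, [a.translate(ASCII_FOLD) for a in attr_names], False


root = fold_start_tag("SvG", ["viewBox", "width"], in_svg_scope=False)
html = fold_start_tag("DIV", ["onClick"], in_svg_scope=False)
inner = fold_start_tag("linearGradient", ["gradientUnits"], in_svg_scope=True)
```

The sketch follows the variant in which the tokenizer recognizes the svg
tag name before it looks at attributes, which is what keeps viewBox
intact on the root element.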


>   * Start tag tokens should have a flag about the /> presence. The tree
> builder would ignore this for HTML elements but would pop immediately
> for SVG elements.

Doing so for "script", "style", "title" and "textArea" would mess up
legacy UAs badly.


>   * The <svg> element would establish "an SVG scope" in the tree
> builder. The <svg> start tag token would itself be handled in the HTML
> state of the tree builder so that the <svg> element would be subject to
> foster parenting.
>   * When in an SVG scope, the tree builder would ignore the HTML tree
> building rules. This means that stray tags looking like HTML tags could
> not cause the tree builder to pop out of the SVG scope. While in the SVG
> scope, the tree builder would assign the SVG namespace URI to the
> element nodes it creates.
>     - Issue: What to do if there is a prefixed element?

Do the same as what you do with a prefixed element outside SVG scope
(i.e., include the prefix and the colon in the local name).


>   * When in the SVG scope, a start tag token would unconditionally
> result in the corresponding element node to be appended to the current
> node. (And if the /> flag is set on the token, the node would be popped
> immediately.)
>   * When in the SVG scope, an end tag token would cause a corresponding
> element to be searched starting with the current node towards the start
> of the SVG scope (and no further). If an element were found in scope,
> the stack would be popped until that element got popped. If there were
> no such element in scope, the end tag would be ignored. Any outcome but
> a single pop would be a parse error.
>   * When the current node is a foreignObject element in an SVG scope,
> the start tag token <html> would establish a "nested HTML scope".
> </html>, <body> and </body> would act like "normal" tokens in a nested
> HTML scope. Specifically, any token other than </html> encountered in a
> nested HTML scope would be unable to break out of the nested HTML scope.

I think it makes more sense to make <foreignObject> itself switch back
to normal "in body". The common case seems to be to just have a <div> as
child when you use XHTML in a <foreignObject>.


>   * Attributes with the name "xlink:href" on the tokenization level
> would be reported by the tokenizer as local name "href" in the XLink
> namespace.
>   * xmlns or xmlns:* attributes would have no meaning and would be
> non-conforming except xmlns="http://www.w3.org/2000/svg" and
> xmlns:xlink="http://www.w3.org/1999/xlink" would be allowed as
> "talismans" on the <svg> start tag.

Allowing the xmlns="http://www.w3.org/1999/xhtml" talisman on the child
of foreignObject, too (perhaps only for <div>?).


> [...]

-- 
Simon Pieters
Opera Software


Re: SVG in text/html
United States
2007-10-13 15:20:41
Hi, Henri-

There's a small chance that I'm not as conversant in the details of the
HTML5 parser as you, so I don't know the constraints that it has when
dealing with SVG (or any XML, for that matter).  However, I was able to
follow the steps you described, even if I'm not fully aware of the
rationale behind all of them.  Thanks for the detailed analysis... it
was both edifying and encouraging.


Henri Sivonen wrote (on 10/13/2007 10:43 AM):
>
> Do you mean you'd like to bring in the complication of arbitrary
> namespace prefixes?

Not necessarily.  I'm fine with imposing certain limitations on SVG
content, assuming that it's a set of limitations that can be easily
obeyed by authoring tools (and which, preferably, existing authoring
tools abide by anyway).  The most important thing for me is that SVG
fragments from an HTML+SVG (SVG-in-HTML) compound document could be
extracted as standalone SVG documents; the second most important thing
is that the most likely content from standalone SVG documents should
work as an SVG fragment in HTML (this is second because I think it is
likely that this will be the case, given existing SVG content-creation
tools).


> I'd like to make the following deviations from
> SVG-as-XML syntax:
>  1) I'd like to minimize the need of tokenizer parametrization to
> toggling case folding behavior and, if we must, CDATA sections.

Strictly speaking, CDATA sections are not required in SVG, but as you
know, script will break in an XML parser if it doesn't escape its "<"
and "&" characters.  The majority of SVG authoring tools, I suspect, are
not script-aware: they are just drawing apps that export to SVG; people
savvy enough to be scripting can be expected to take precautions and
read FAQs to resolve their problems there.

Even drawing tools, though, are likely to use CSS, and may automatically
enclose it in a CDATA section "just to be safe".  It would be worthwhile
to look at the survey of tools and see if they do this, and if so, if
they can be encouraged to change this practice.

I would prefer that CDATA be allowed, but it's not a deal-breaker.  I
confess I don't know why it's a problem in the HTML parser, though, if
you care to explain.

Most tools do include XML prologs and DOCTYPES in their SVG output...
what effect will this have on a whole-file copy-paste into HTML, in
terms of parsing?


> Specifically, I think attribute tokenization should run the same code
> as attribute tokenization for the HTML parts of text/html.

Could you elaborate on that?  What are the implications?


>  2) I'd like to avoid supporting arbitrary namespace prefixes both in
> order to sidestep issues in shipped IE versions and in order to relieve
> authors of namespace syntax. (xlink: should probably be considered
> non-arbitrary and hard-wired.)

I think it's reasonable both to limit arbitrary namespace prefixes in
HTML+SVG, and to hard-wire the XLink namespace.  That SVG-fragment
content will still work as expected in a standalone SVG UA, and most
people trying to do clever things in namespaces will probably be using
XHTML+SVG anyway.


> The above trial balloon proposal is designed to optimize SVG
> integration in text/html in *future* browsers in a way that would
> create a namespace-aware DOM that current DOM-based SVG implementations
> would grok immediately but would at the same time remove namespace
> declaration syntax from the sight of authors. The proposal specifically
> isn't designed to clone the colon-based namespaces-in-text/html
> mechanism of IE. OTOH, it shouldn't interfere with it, either, except
> perhaps for xlink:href, which could be worked around by introducing
> href.

I'm still on the fence about 'null:href'.  Can you explain in detail why
this is so problematic in HTML5 (especially given that SVG isn't
natively supported in IE anyway)?


> The approach outlined above could be used for MathML as well. However,
> in that case, the tokenizer should probably be modified to switch to
> MathML entity tables when the tree builder is in a MathML scope.

I agree it's a good idea to look at the most common XML presentation
formats and generalize the solution.


> I agree it would make sense to talk about it at the Tech Plenary.

I'll coordinate with the respective chairs and try to lock down a time
and day.  We can communicate offlist about who would be good to have
around and what their attendance schedules are.

Regards-
-Doug Schepers
W3C Staff Contact, SVG, CDF, and WebAPI


Re: SVG in text/html
Czech Republic
2007-10-14 16:31:09
Doug Schepers wrote:

> Most tools do include XML prologs and DOCTYPES in their SVG output...
> what effect will this have on a whole-file copy-paste into HTML, in
> terms of parsing?

Many SVG tools use internal entities in produced content. This is a
showstopper not only for embedding of SVG into HTML, but also for
embedding into XML.

As SVG has to be normalized before inserting into HTML -- at least
!DOCTYPE has to be removed and entities expanded, most likely also CDATA
sections removed -- I don't see any compelling reason why not to
"normalize" HTML into XHTML before insertion of SVG fragment also. That
way there will be no need to mangle HTML syntax to accept SVG fragments.

If you really think that HTML should be able to accept SVG, MathML, ...
fragments then HTML syntax should be defined in a more generic manner as
an alternative serialization of XML Infoset.

-- 
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirkakosek.cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------

Re: SVG in text/html
United States
2007-10-15 18:31:29
Hi, Jirka-

Jirka Kosek wrote (on 10/14/2007 5:31 PM):
> Doug Schepers wrote:
>
>> Most tools do include XML prologs and DOCTYPES in their SVG output...
>> what effect will this have on a whole-file copy-paste into HTML, in
>> terms of parsing?
>
> Many SVG tools use internal entities in produced content.

I'm not sure that's correct.  Certainly, Illustrator does, and there's a
lot of Illustrator content... but off the top of my head, I can't think
of any others that are pernicious about that.  Maybe CorelDraw?  I don't
think Inkscape does.  Can you identify other common tools that do?

My point is that a relatively small number of authoring tools would need
to be changed to affect the majority of output... they could simply
offer a "Save as inline SVG" option, even.


> As SVG has to be normalized before inserting into HTML -- at least
> !DOCTYPE has to be removed and entities expanded, most likely also CDATA
> sections removed -- I don't see any compelling reason why not to
> "normalize" HTML into XHTML before insertion of SVG fragment also.


I don't agree with your reasoning.  Because the author might have to
normalize one document type, they should have to normalize the other as
well?  That's at least twice as much work, and probably quite a lot
more: the number of changes that would need to be done to a conforming
SVG fragment in order to align it to what's needed for HTML5 would be
far less work than converting a conforming HTML5 text/html file to
XHTML, and it's very likely that the HTML file would have more content
to be converted.  A simple script could strip out the offending SVG
detritus, and prepare the file for insertion into HTML very easily.
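
[Editorial illustration, not part of the original message.] A "simple
script" of the kind mentioned above might look like the following sketch.
It is not a tool from this thread: the function name prepare_for_html is
hypothetical, it assumes CDATA sections only wrap <style> or <script>
content that may be pasted verbatim, it assumes the internal subset
contains no "]" character, and it does not expand entity references,
which is the harder problem Jirka raises.

```python
import re

# XML declaration at the top of a standalone SVG file.
XML_DECL = re.compile(r"<\?xml[^>]*\?>\s*")
# DOCTYPE with an optional internal subset in [...] (no "]" inside assumed).
DOCTYPE = re.compile(r"<!DOCTYPE[^\[>]*(\[[^\]]*\])?\s*>\s*", re.IGNORECASE)
# CDATA section; the wrapped text is kept, the markers are dropped.
CDATA = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def prepare_for_html(svg_source: str) -> str:
    """Strip the XML declaration and DOCTYPE, and unwrap CDATA sections."""
    text = XML_DECL.sub("", svg_source)
    text = DOCTYPE.sub("", text)
    return CDATA.sub(lambda match: match.group(1), text)


sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE svg [ <!ENTITY ns "http://www.w3.org/2000/svg"> ]>\n'
    "<svg><style><![CDATA[circle { fill: red }]]></style></svg>"
)
cleaned = prepare_for_html(sample)
```

Regular expressions are used only to keep the sketch short; a real tool
would more safely parse the file with an XML parser and re-serialize it.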

We've already cited the fact that there are complications to changing
your host-language infrastructure... forcing authors to convert all
their pages to XHTML would result in almost nobody bothering to use any
inline SVG at all.


> That way there will be no need to mangle HTML syntax to accept
> SVG fragments.

I think "mangle" is hyperbole.


> If you really think that HTML should be able to accept SVG, MathML, ...
> fragments then HTML syntax should be defined in a more generic manner as
> an alternative serialization of XML Infoset.

The solution Henri (perhaps with input from others, like Sam, Simon, and
Anne?) put forth seems like it may be headed in that general direction,
but let's not jump to that conclusion.  If it can be special-cased for
SVG, I think that would be a win even if it didn't extend to any
arbitrary XML.


Regards-
-Doug Schepers
W3C Staff Contact, SVG, CDF, and WebAPI


Re: SVG in text/html
Finland
2007-10-16 05:37:15
On Oct 13, 2007, at 23:20, Doug Schepers wrote:

> Henri Sivonen wrote (on 10/13/2007 10:43 AM):
>> Do you mean you'd like to bring in the complication of arbitrary
>> namespace prefixes?
>
> Not necessarily.  I'm fine with imposing certain limitations on SVG
> content, assuming that it's a set of limitations that can be easily
> obeyed by authoring tools (and which, preferably, existing
> authoring tools abide by anyway).

It seems to me that using colonless element names is an easy
limitation for authoring tools to follow.

> The most important thing for me is that SVG fragments from an
> HTML+SVG (SVG-in-HTML) compound document could be extracted as
> standalone SVG documents; the second most important thing is that
> the most likely content from standalone SVG documents should work
> as an SVG fragment in HTML (this is second because I think it is
> likely that this will be the case, given existing SVG
> content-creation tools).

Do you mean the extraction from HTML should work on the source
copy-paste level as opposed to using a tool that incorporates an HTML
parser and an XML serializer? Even if the conforming case were
carefully specced to allow such copy-paste, content out there would
inevitably start to contain constructs that wouldn't be safe for
pasting into XML (like content that tries to be XHTML 1.0-as-text/html
is now unsafe for pasting into XML on the source level), so doing the
extraction using a parser followed by a serializer would be the safe
way to go.

>> I'd like to make the following deviations from
>> SVG-as-XML syntax:
>>  1) I'd like to minimize the need of tokenizer parametrization to
>> toggling case folding behavior and, if we must, CDATA sections.
>
> Strictly speaking, CDATA sections are not required in SVG, but as
> you know, script will break in an XML parser if it doesn't escape
> its "<" and "&" characters.  The majority of SVG authoring tools, I
> suspect, are not script-aware: they are just drawing apps that
> export to SVG; people savvy enough to be scripting can be expected
> to take precautions and read FAQs to resolve their problems there.
>
> Even drawing tools, though, are likely to use CSS, and may
> automatically enclose it in a CDATA section "just to be safe".  It
> would be worthwhile to look at the survey of tools and see if they
> do this, and if so, if they can be encouraged to change this practice.
>
> I would prefer that CDATA be allowed, but it's not a deal-breaker.
> I confess I don't know why it's a problem in the HTML parser,
> though, if you care to explain.

Introducing CDATA sections wholesale into text/html (also into the
HTML parts of the document) would be a problem because new CDATA-aware
parsers and old CDATA-unaware parsers would give incompatible parse
trees and the incompatibility wouldn't even add any expressiveness to
the language.

As for introducing CDATA sections for <svg> subtrees only, there's the
issue of whether to be consistent with the surrounding HTML syntax or
with XML syntax. Copy-pasteability suggests supporting XMLisms like
CDATA sections and /> in <svg> subtrees. Consistency with the
surrounding HTML would suggest not supporting CDATA sections.

The general problem with SVG <title>, <script>, <style> and <textArea>
is ensuring that they don't produce ungraceful results when an
SVG-in-text/html document is loaded in a legacy text/html browser. It
seems to me that authors who want to avoid <textArea> rendering as HTML
<textarea> in legacy browsers just have to avoid <textArea> in
SVG-in-text/html. <title> seems harmless enough when the surrounding
HTML already has a <title> of its own.

In the case of <style> and <script>, legacy browsers would try to treat
them as HTML <style> and <script>. Parsing them the same way as HTML
<style> and <script> in the case of SVG-in-text/html would at least
ensure that both old and new parsers agree on when the elements end
even when the script/style content touches edge cases. On the other
hand, having CDATA sections and not having element-specific
tokenization content models would be good for copying and pasting from
XML files.

I can't say off-hand which approach is the best.

> Most tools do include XML prologs and DOCTYPES in their SVG
> output... what effect will this have on a whole-file copy-paste
> into HTML, in terms of parsing?

You can't paste an XML declaration or a DOCTYPE in the middle of an
XHTML+SVG document, so from the conformance point of view I don't
think it is necessary to allow them to be pasted in the middle of
text/html. As for what should happen if you paste them in nonetheless,
I think the current behavior of the HTML5 parsing algorithm is
reasonable: the XML declaration turns into a comment node and the
doctype gets dropped.

>> Specifically, I think attribute tokenization should run the same
>> code as attribute tokenization for the HTML parts of text/html.
>
> Could you elaborate on that?  What are the implications?

Unquoted attributes would be treated as in text/html in general. XML
attribute value normalization wouldn't be performed. (That is, authors
shouldn't rely on the parser discarding white space around the value.
Authors simply shouldn't put extra spaces in there. This is already
good advice with XML when the author doesn't know the configuration of
the receiving XML parser.) White space between the close quote of a
previous attribute and the name of the next attribute wouldn't be
required.
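
[Editorial illustration, not part of the original message.] The three
implications above (unquoted values accepted, no value normalization, no
mandatory white space between attributes) can be sketched with a small
tokenizer. This is a simplified sketch, not the HTML5 tokenizer: the name
tokenize_attributes is hypothetical, and character references and error
handling are left out.

```python
import re

ATTR = re.compile(
    r"""\s*([^\s=/>]+)                          # attribute name
        (?:\s*=\s*
           (?:"([^"]*)"|'([^']*)'|([^\s>]+))    # quoted or unquoted value
        )?""",
    re.VERBOSE,
)


def tokenize_attributes(text):
    """Split the attribute part of a start tag into (name, value) pairs."""
    attrs = []
    pos = 0
    while pos < len(text):
        match = ATTR.match(text, pos)
        if match is None or match.end() == pos:
            break  # nothing recognizable left; a real tokenizer would recover
        # The value is whichever of the three alternatives matched, or ""
        # for an attribute without a value. It is kept exactly as written.
        value = next((g for g in match.groups()[1:] if g is not None), "")
        attrs.append((match.group(1), value))
        pos = match.end()
    return attrs


pairs = tokenize_attributes("""width=100 height="50"fill='red' d=" M0 0 " """)
```

Note that the padded value of d is kept with its surrounding spaces,
which is why authors are advised not to put extra spaces there.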

>>  2) I'd like to avoid supporting arbitrary namespace prefixes both
>> in order to sidestep issues in shipped IE versions and in order to
>> relieve authors of namespace syntax. (xlink: should probably be
>> considered non-arbitrary and hard-wired.)
>
> I think it's reasonable both to limit arbitrary namespace prefixes
> in HTML+SVG, and to hard-wire the XLink namespace.  That
> SVG-fragment content will still work as expected in a standalone SVG
> UA, and most people trying to do clever things in namespaces will
> probably be using XHTML+SVG anyway.

OK.

>> The above trial balloon proposal is designed to optimize SVG
>> integration in text/html in *future* browsers in a way that would
>> create a namespace-aware DOM that current DOM-based SVG
>> implementations would grok immediately but would at the same time
>> remove namespace declaration syntax from the sight of authors. The
>> proposal specifically isn't designed to clone the colon-based
>> namespaces-in-text/html mechanism of IE. OTOH, it shouldn't
>> interfere with it, either, except perhaps for xlink:href, which
>> could be worked around by introducing href.
>
> I'm still on the fence about 'null:href'.  Can you explain in
> detail why this is so problematic in HTML5 (especially given that
> SVG isn't natively supported in IE anyway)?

Perhaps special-casing xlink:href *only* isn't that bad, but
specifying new processing for names with colons *in general* carries
the risk of specifying something that's incompatible with what happens
when the syntax is fed to current IE.

I've got an impression that Microsoft doesn't want to change what they
do with names that contain colons, but I guess it is best if they
comment on that. (I don't currently have access to IE, so I can't test
what exactly happens with xlink:href.)
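
[Editorial illustration, not part of the original message.] Special-casing
xlink:href only, while leaving every other colon-containing name alone,
could look like the sketch below. The function name resolve_attribute and
the TALISMANS table are hypothetical; the "conforming" flag here covers
only the talisman rule from the earlier proposal and nothing else.

```python
XLINK_NS = "http://www.w3.org/1999/xlink"
SVG_NS = "http://www.w3.org/2000/svg"

# Namespace declarations tolerated as "talismans" on the <svg> start tag.
TALISMANS = {
    "xmlns": SVG_NS,
    "xmlns:xlink": XLINK_NS,
}


def resolve_attribute(name, value, on_svg_root=False):
    """Return (namespace, local_name, conforming) for one attribute."""
    if name == "xlink:href":
        # The single hard-wired prefix: local name "href" in the XLink namespace.
        return XLINK_NS, "href", True
    if name == "xmlns" or name.startswith("xmlns:"):
        # No namespace processing happens; only the exact talismans conform.
        conforming = on_svg_root and TALISMANS.get(name) == value
        return None, name, conforming
    # Any other name, colon or not, is kept as-is with no namespace.
    return None, name, True


link = resolve_attribute("xlink:href", "#shape")
talisman = resolve_attribute("xmlns:xlink", XLINK_NS, on_svg_root=True)
stray = resolve_attribute("xmlns:foo", "urn:example", on_svg_root=True)
other = resolve_attribute("foo:bar", "1")
```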

-- 
Henri Sivonen
hsivoneniki.fi
http://hsivonen.iki.fi/




Re: SVG in text/html
Norway
2007-10-16 06:10:42
On Tue, 16 Oct 2007 12:37:15 +0200, Henri Sivonen <hsivoneniki.fi> wrote:

>> Most tools do include XML prologs and DOCTYPES in their SVG output...
>> what effect will this have on a whole-file copy-paste into HTML, in
>> terms of parsing?
>
> You can't paste an XML declaration or a DOCTYPE in the middle of an
> XHTML+SVG document, so from the conformance point of view I don't think
> it is necessary to allow them to be pasted in the middle of text/html.
> As for what should happen if you paste them in nonetheless, I think the
> current behavior of the HTML5 parsing algorithm is reasonable: the XML
> declaration turns into a comment node and the doctype gets dropped.

To elaborate further: if the doctype had an internal subset, then the
doctype would end at the first >, effectively resulting in the
characters "]>" being shown on the page.

Entities would not be expanded. However, that's not a problem so long as
entities are only used for namespace declarations, since xmlns and
xmlns:* attributes are meaningless under Henri's proposal.
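
[Editorial illustration, not part of the original message.] The "doctype
ends at the first >" behavior can be shown with a few lines. This sketch
only models that one observation, not the full HTML5 DOCTYPE tokenizer
states; the function name split_doctype is hypothetical.

```python
def split_doctype(text):
    """Split pasted source at the first '>' after '<!DOCTYPE'."""
    if not text.upper().startswith("<!DOCTYPE"):
        raise ValueError("expected the text to start with a doctype")
    end = text.index(">") + 1
    # Everything up to the first '>' is consumed as the doctype (and dropped);
    # the remainder is tokenized as ordinary content.
    return text[:end], text[end:]


source = '<!DOCTYPE svg [ <!ENTITY ns "http://www.w3.org/2000/svg"> ]><svg/>'
doctype, rest = split_doctype(source)
```

Here the first > is the one closing the entity declaration, so the
leftover begins with the tail of the internal subset, " ]>", which ends
up as visible text.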

-- 
Simon Pieters
Opera Software

