Thread: "Must Ignore vs. Microformats"

2006-07-19 17:34:58
Hello,

On 7/19/06, Tantek Çelik <tantekcs.stanford.edu> wrote:
> On 7/19/06 8:37 AM, "Frances Berriman" <fberrimangmail.com> wrote:
>
> > http://cafe.elharo.com/xml/must-ignore-vs-microformats
> >
> > A friend of mine showed me this today.  Macroformats, over Microformats.
>
> The article is terrible and about 90% incorrect.  Unfortunately this is
> perhaps due in some part to the IBM article which, though decent overall,
> has some errors itself, and takes a walk through transcoding to XML and back
> which is interesting but perhaps unnecessary.
>
> The author of the "macroformats" article misses all the reasons that XML has
> failed on the Web, and all the specific design principles that have gone
> into microformats that were developed by learning from XML's failure.  In
> fact, he continues to push several of these reasons as actual *plusses* for
> XML (namespaces, invalidity, etc.)
>
> There will continue to be plenty of folks banging their heads against the
> wall and trying to push "plain old xml" (POX) on the Web, and they will
> likely continue to see the same amount of success as they have to date.
>
> What we can do to be helpful:
>
>
> 1. Dissect articles like this into a series of assertions/questions and put
> them on the wiki, e.g.:
>
> * "why would anyone write markup like this? It brings exactly nothing to the
> table."

(Sorry to bring up a point for XML, but... I know others will probably
bring this up outside of here, so I might as well do it here.)

One "good" thing about XML, IMO, is that for certain simple markups based on
XML, it's easier for a beginner-level or intermediate-level developer to
write a parser for it (as compared to writing a parser for Microformats,
since HTML is more difficult to parse).

(For example, writing a parser in C, C++, PHP, Java, C# or whatever.)

One example of such a simple format based on XML is RSS.

I'd say it is pretty easy for someone to write a parser for it since RSS is
such a simple markup.  (Although, technically, their parser will probably be
wrong and might choke and die if some fancy things are done with the XML...
like using namespaces, adding DTDs, etc.)
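
As a rough illustration of that point, here is a minimal sketch of the kind of
quick RSS parser a beginner might write. It is in Python using the standard
xml.etree.ElementTree module; the language, the sample feeds, and the function
name parse_rss_items are all assumptions made for this example and do not come
from the thread. It handles a plain RSS 2.0 feed, and silently finds nothing
once the items live in an XML namespace (as in RSS 1.0).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feeds, made up for this sketch.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
"""

# RSS 1.0 puts <item> in the http://purl.org/rss/1.0/ namespace.
SAMPLE_RSS1 = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.com/1">
    <title>First post</title>
    <link>http://example.com/1</link>
  </item>
</rdf:RDF>
"""


def parse_rss_items(xml_text):
    """Naive parser: return title/link for every un-namespaced <item>."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):  # matches only items with no namespace
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


if __name__ == "__main__":
    print(parse_rss_items(SAMPLE_RSS))   # two items
    print(parse_rss_items(SAMPLE_RSS1))  # empty list: namespaced items are missed
```

The second call shows the "technically wrong" failure mode in a mild form: the
parser does not crash, it just misses every item in the namespaced feed.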

OPML is probably another example of a simple XML markup.

And yes, I know both formats have a lot of problems.  But their simplicity
(in that respect) helps bring on developer adoption.  (Or at least, helps
bring on adoption by a certain kind of developer.)

Sure, the parsers they write might be technically wrong... but these
developers can "see" something going pretty quickly.  (Which might encourage
them to further develop their systems.  And maybe even eventually support
all the "advanced" stuff to make their parsers technically correct.)

Now, having said that, in other realms, Microformats are much, much easier
to parse.  (Like for in-browser technologies: CSS styling, JavaScript
manipulation, and user scripts such as Greasemonkey.)

(I even have a PHP parser written that makes parsing Microformats and other
kinds of semantic HTML dead easy... coming to you via LGPL eventually, once
I improve the HTML-repairing part of it.  Gotta compile tidy and see if that
can improve the HTML-repairing.)
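
To give a feel for what parsing semantic HTML by class name can look like
outside the browser, here is a minimal sketch in Python using the standard
html.parser module. It is not the PHP parser mentioned above, and it is far
from a complete microformats parser; the class ClassTextExtractor, the function
extract_by_class, and the sample markup are hypothetical names made up for
this example. It collects the text of elements whose class attribute contains
a wanted name (such as the hCard properties "fn" and "org") and tolerates
unclosed tags.

```python
from html.parser import HTMLParser

# Hypothetical hCard-like markup with an unclosed <br> and <span>.
SAMPLE_HTML = """
<div class="vcard">
  <a class="url fn" href="http://example.com/">Jane Example</a><br>
  <span class="org">Example Co.
</div>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text inside elements carrying one of the wanted classes."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []   # open elements as (tag, matched class names)
        self.found = {}   # class name -> list of collected text strings

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        matched = [c for c in classes if c in self.wanted]
        for c in matched:
            self.found.setdefault(c, []).append("")
        self.stack.append((tag, matched))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, discarding anything
        # left unclosed above it (for example <br> or a forgotten </span>).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Append text to the most recent entry of every class still open.
        active = {c for _, matched in self.stack for c in matched}
        for c in active:
            self.found[c][-1] += data


def extract_by_class(html_text, wanted):
    parser = ClassTextExtractor(wanted)
    parser.feed(html_text)
    parser.close()
    return {name: [t.strip() for t in texts]
            for name, texts in parser.found.items()}


if __name__ == "__main__":
    print(extract_by_class(SAMPLE_HTML, ["fn", "org"]))
```

A real parser would also need to handle nested elements of the same class,
value excerpting, and the other rules of each microformat; this sketch only
shows that class-based extraction from messy HTML is approachable.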

So, maybe we should address that point too.  Maybe something like...

Q: But writing parsers for Microformats is hard in language X...
A: You don't need to write a parser in language X, here's a list of
some parsers....


See ya

-- 
    Charles Iliya Krempeaux, B.Sc.

    charles  reptile.ca
    supercanadian  gmail.com

    developer weblog: http://ChangeLog.ca/
___________________________________________________________________________
 Make Television                                http://maketelevision.com/

_______________________________________________
microformats-discuss mailing list
microformats-discussmicroformats.org
http://microformats.org/mailman/listinfo/microformats-discuss

2006-07-19 17:55:17
On 7/19/06 10:34 AM, "Charles Iliya Krempeaux" <supercanadiangmail.com>
wrote:

> Hello,
>
> On 7/19/06, Tantek Çelik <tantekcs.stanford.edu> wrote:
>> On 7/19/06 8:37 AM, "Frances Berriman" <fberrimangmail.com> wrote:
>>
>>> http://cafe.elharo.com/xml/must-ignore-vs-microformats
>>>
>>> A friend of mine showed me this today.  Macroformats, over Microformats.
>>
>> The article is terrible and about 90% incorrect.  Unfortunately this is
>> perhaps due in some part to the IBM article which, though decent overall,
>> has some errors itself, and takes a walk through transcoding to XML and back
>> which is interesting but perhaps unnecessary.
>>
>> The author of the "macroformats" article misses all the reasons that XML has
>> failed on the Web, and all the specific design principles that have gone
>> into microformats that were developed by learning from XML's failure.  In
>> fact, he continues to push several of these reasons as actual *plusses* for
>> XML (namespaces, invalidity, etc.)
>>
>> There will continue to be plenty of folks banging their heads against the
>> wall and trying to push "plain old xml" (POX) on the Web, and they will
>> likely continue to see the same amount of success as they have to date.
>>
>> What we can do to be helpful:
>>
>>
>> 1. Dissect articles like this into a series of assertions/questions and put
>> them on the wiki, e.g.:
>>
>> * "why would anyone write markup like this? It brings exactly nothing to the
>> table."
>
> (Sorry to bring up a point for XML, but... I know others will probably
> bring this up outside of here, so I might as well do it here.)
>
> One "good" thing about XML, IMO, is that for certain simple markups
> based on XML, it's easier for a beginner-level or intermediate-level
> developer to write a parser for it (as compared to writing a parser
> for Microformats, since HTML is more difficult to parse).
>
> (For example, writing a parser in C, C++, PHP, Java, C# or whatever.)

I'll be perfectly frank.

This assumes that making it easy to write a parser is important.

That assumption is wrong.

Or, to put it more clearly, making such an assumption in a vacuum (which
many XML folks do) is wrong.

It is more important to make it easier to publish than it is to make it
easier to parse.

This is why the supposed "easier to parse" aspect of XML is incredibly
misleading.  It ignores both the need to be easier to publish, and the fact
that XML, in fact, is *harder* to publish.


> One example of such a simple format based on XML is RSS.
>
> I'd say it is pretty easy for someone to write a parser for it since
> RSS is such a simple markup.  (Although, technically, their parser
> will probably be wrong and might choke and die if some fancy things
> are done with the XML... like using namespaces, adding DTDs, etc.)

You're kidding, right?

RSS is the canonical example of an XML format gone wrong from the "purist"
standpoint, although it is the 2nd most popular XML format on the Web (after
XHTML).

Go look at any production RSS parser and understand its complexity.

It is certainly *not* pretty easy for someone to write a parser for RSS that
actually works with real RSS on the Web.


> OPML is probably another example of a simple XML markup.

Not really.  OPML postpones the real parsing to the question of what the
attributes mean.


> And yes, I know both formats have a lot of problems.  But their
> simplicity (in that respect) helps bring on developer adoption.  (Or
> at least, helps bring on adoption by a certain kind of developer.)

With all due respect (and as a developer myself), the developers don't
matter as much as the publishers.


> Now, having said that, in other realms, Microformats are much, much
> easier to parse.  (Like for in-browser technologies: CSS styling,
> JavaScript manipulation, and user scripts such as Greasemonkey.)

Yes, the "microformats are hard to parse" misconception has been debunked
quite a bit by the creation of simple open source parsers, for which this
community is to be commended.  You know who you are out there.

 http://microformats.org/wiki/implementations


> (I even have a PHP parser written that makes parsing Microformats and
> other kinds of semantic HTML dead easy... coming to you via LGPL
> eventually, once I improve the HTML-repairing part of it.  Gotta
> compile tidy and see if that can improve the HTML-repairing.)

Release early, release often.  Even if it is not "done", I encourage you to
release it because you might get help from folks to finish it.


> So, maybe we should address that point too.  Maybe something like...
>
> Q: But writing parsers for Microformats is hard in language X...
> A: You don't need to write a parser in language X, here's a list of
> some parsers....

Well said.

Thanks,

Tantek
