Thread: Microformats survey (for GRDDL)




2007-03-20 06:55:03
I've completed a survey of the current state of microformats with
regard to the GRDDL mechanism for data extraction. In principle GRDDL
is usable on all documents which use microformats.

Results tabulated at:

http://esw.w3.org/topic/CustomRdfDialects/GrddableMicroformats

(It's on the ESW Wiki - please correct any errors/omissions directly)

Short version: following a strict interpretation of the relevant
specs, no official microformats are currently usable with GRDDL.
Taking a loose view, around a third are usable right now.

Summary:
As anticipated, the weakest link is the non-existence of profile URIs.
Of the 18 microformats listed, only 3 have profile URIs directly
usable by GRDDL-aware agents (hCard, hCalendar & hReview), and none of
these URIs are endorsed by microformats.org.
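
To make the dependency on profile URIs concrete, here is a minimal
sketch (Python, standard library only) of the first thing a GRDDL-aware
agent has to do: read the profile attribute on head and any
link rel="transformation" elements. The function and class names are
my own, and the example.org URIs in the usage note are hypothetical
placeholders. It does not dereference the profile URIs to look for
transformations declared there, which is the step that fails when a
format has no profile URI at all.

```python
from html.parser import HTMLParser

# Profile URI that marks an XHTML document as a GRDDL source document.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


class HeadScanner(HTMLParser):
    """Collect profile URIs from <head> and hrefs of transformation links."""

    def __init__(self):
        super().__init__()
        self.profiles = []
        self.transformations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute is a whitespace-separated list of URIs.
            self.profiles = (attrs.get("profile") or "").split()
        elif tag == "link":
            rels = (attrs.get("rel") or "").lower().split()
            if "transformation" in rels and attrs.get("href"):
                self.transformations.append(attrs["href"])


def scan(html_text):
    """Return (profile URIs, transformation hrefs) found in the markup."""
    scanner = HeadScanner()
    scanner.feed(html_text)
    return scanner.profiles, scanner.transformations


def explicitly_grddl_enabled(html_text):
    """True if the document itself names the GRDDL profile and a transform."""
    profiles, transformations = scan(html_text)
    return GRDDL_PROFILE in profiles and bool(transformations)
```

Given a document whose head carries both the GRDDL profile and, say,
http://example.org/profile/hcard, scan() returns both URIs; an agent
would then fetch each remaining profile and look for transformations
declared in it. With no profile attribute there is nothing to follow.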

Only 1 of the 18 has an endorsed profile URI (XFN), and that isn't
GRDDL-enabled. It was suggested on microformats-discuss that the
relevant Wiki pages for the microformats could be used as interim
profile URIs, but again these aren't GRDDL-enabled. Most of the
microformats do have an XMDP expression of their profile, yet with a
couple of exceptions these are listed as source markup, i.e. not
really human- or machine-readable. It isn't obvious what the intended
purpose of this might be.

XSLT to RDF/XML is available in various stages of completion for 6 of
the 18 microformats listed (including the 3 with unofficial profile
URIs).

In other words, only 4 of these 18 formats (the 3 with unofficial
profile URIs, plus XFN) exploit the HTML specification fully for
disambiguation. Because the profile URIs corresponding to 3 of these
have appeared outside the microformats.org process, only one
format+profileURI combination may properly be called a microformat
(rather than 'semantic HTML'), and that one isn't GRDDLable.

While this limits the publisher's options when it comes to publishing
data in HTML, consumers may still use heuristics based on GRDDL or
similar mechanisms to extract data from microformat-enhanced documents
(i.e. screenscraping), with the obvious impact on the reliability &
authority of the data, questions of provenance etc.
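
As a rough illustration of the heuristic route, a consumer can look for
the root class names of the three formats mentioned above (vcard for
hCard, vevent for hCalendar, hreview for hReview) and note whether the
publisher declared any profile at all. This is only a sketch under my
own assumptions: the names are made up for the example, and a match
without a profile is a guess about the publisher's intent, not a
licensed interpretation.

```python
from html.parser import HTMLParser

# Root class names used by the respective microformats.
ROOT_CLASSES = {"vcard": "hCard", "vevent": "hCalendar", "hreview": "hReview"}


class RootClassScanner(HTMLParser):
    """Spot microformat root classes and whether <head> has any profile."""

    def __init__(self):
        super().__init__()
        self.has_profile = False
        self.found = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head" and (attrs.get("profile") or "").strip():
            self.has_profile = True
        # class is a whitespace-separated list of case-sensitive tokens.
        for token in (attrs.get("class") or "").split():
            if token in ROOT_CLASSES:
                self.found.add(ROOT_CLASSES[token])


def guess_microformats(html_text):
    """Return the formats that appear to be present and whether any
    profile was declared (if not, the result is pure screenscraping)."""
    scanner = RootClassScanner()
    scanner.feed(html_text)
    return {"formats": sorted(scanner.found), "declared": scanner.has_profile}
```

A page containing a div with class="vcard" but no profile attribute
comes back as hCard with declared set to False, which is exactly the
case where provenance and authority are in question.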

Cheers,
Danny.

-- 

http://dannyayers.com
_______________________________________________
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
