
Thread: GRDDL with HTML 4.01




GRDDL with HTML 4.01
2007-09-27 15:14:50
[I assume that GRDDL issues are on-topic here; if not, please suggest a
better forum. Thank you.]

I have been playing with using GRDDL, for example:

        <http://www.westmidlandbirdclub.com/belvide/>

but parsers barf on that page, presumably because it's HTML 4.01, not
XHTML.

Surely, as it's valid HTML, parsers should be able to convert it, on the
fly, to XHTML before extracting the RDF? Or have I misunderstood
something?

Are there parsers for GRDDL in HTML, which I've overlooked?

-- 
Andy Mabbett
_______________________________________________
microformats-discuss mailing list
microformats-discussmicroformats.org
http://microformats.org/mailman/listinfo/microformats-discuss

Re: GRDDL with HTML 4.01
2007-09-27 17:08:10
On 9/27/07, Andy Mabbett <andypigsonthewing.org.uk> wrote:
>
> [I assume that GRDDL issues are on-topic here; if not, please suggest a
> better forum. Thank you.]
>
> I have been playing with using GRDDL, for example:
>
>         <http://www.westmidlandbirdclub.com/belvide/>
>
> but parsers barf on that page, presumably because it's HTML 4.01, not
> XHTML.
>
> Surely, as it's valid HTML, parsers should be able to convert it, on the
> fly, to XHTML before extracting the RDF? Or have I misunderstood
> something?
>
> Are there parsers for GRDDL in HTML, which I've overlooked?
>

It is up to GRDDL implementors to add HTML parsing as they see fit. I
use an internal GRDDL parser that pipes everything through Tidy.
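
For anyone wanting to try the same approach, here is a minimal sketch of
piping a page through Tidy before handing it to a GRDDL processor. It is
not the internal parser mentioned above. It assumes the `tidy`
command-line tool is installed and on the PATH, and the function names
are made up for illustration.

```python
import subprocess


def tidy_command():
    """Build the argument list for converting HTML to XHTML with HTML Tidy."""
    return [
        "tidy",
        "-asxhtml",  # emit well-formed XHTML
        "-numeric",  # numeric character references instead of named entities
        "-quiet",    # suppress non-essential output
        "-utf8",     # treat input and output as UTF-8
    ]


def html_to_xhtml(html):
    """Pipe HTML text through Tidy and return the XHTML written to stdout.

    Assumes the `tidy` executable is available; raises RuntimeError if
    Tidy reports errors it could not repair.
    """
    result = subprocess.run(
        tidy_command(),
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,
    )
    # Tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
    if result.returncode > 1:
        raise RuntimeError(result.stderr.decode("utf-8", "replace"))
    return result.stdout.decode("utf-8")
```

The XHTML string returned by `html_to_xhtml` could then be passed to
whatever XSLT processor applies the GRDDL transformations.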

GRDDL implementors can also choose default profiles. For instance,
Triplr automatically looks for some common microformats (hCard and
hCalendar, IIRC).

Triplr parses the above-linked page fine:
http://triplr.org/rdf/www.westmidlandbirdclub.com/belvide/

It doesn't provide any information beyond the hCard. The Geo doesn't
parse either.

Triplr can't parse any of the actual GRDDL data. You ought to use a
profile page - perhaps a specific profile for your whole site with
links to different transformations. I've put together some profiles:
http://tommorris.org/profiles/tommorris
http://tommorris.org/profiles/nsfw
http://tommorris.org/profiles/votelinks

Using data-view on the source document is not good practice. There's
no reason you can't but Triplr doesn't seem to be reading it. Better
just to make an XHTML page and shove the relevant links to the XSLT on
there.

The W3C hosts an official reference implementation GRDDL service:
http://www.w3.org/2007/08/grddl/
This should not read your page as it's designed to work closely to
spec - i.e. XHTML and XML, not HTML 4.

You can also get GRDDL support from irc.freenode.net #swig

Yours,

-- 
Tom Morris
http://tommorris.org/

Re: GRDDL with HTML 4.01
2007-09-27 17:50:21
In message
<d375f00f0709271508v5c2101aay3abe9fe8a829da7amail.gmail.com>, Tom
Morris <bbtommorrisgmail.com> writes

[My HTML 4.01 page with GRDDL mark-up:

        <http://www.westmidlandbirdclub.com/belvide/> ]

>> Surely

>> parsers should be able to convert it, on the
>> fly, to XHTML before extracting the RDF? Or have I misunderstood
>> something?
>>
>> Are there parsers for GRDDL in HTML, which I've overlooked?

>It is up to GRDDL implementors to add HTML parsing as they see fit.

Understood.

>I use an internal GRDDL parser that pipes everything
>through Tidy.
>
>GRDDL implementors can also choose default profiles. For instance,
>Triplr automatically looks for some common microformats (hCard and
>hCalendar, IIRC).

Am I correct in thinking you'd parse any page with hCard that way, with
or without it having GRDDL mark-up?

>Triplr can't parse any of the actual GRDDL data.

Not sure what you mean, here.

> You ought to use a
>profile page - perhaps a specific profile for your whole site with
>links to different transformations.

What would be the advantage of that, for me or the site's users?

>Using data-view on the source document is not good practice. There's
>no reason you can't but Triplr doesn't seem to be reading it.

Again, not sure what you mean, here.

>The W3C hosts an official reference implementation GRDDL service:
>http://www.w3.org/2007/08/grddl/
>This should not read your page as it's designed to work closely to
>spec - i.e. XHTML and XML, not HTML 4.

That seems a rather short-sighted view, if the intention is to allow
publishers to join the semantic web with the least effort.

Thanks for your help.

-- 
Andy Mabbett

Re: GRDDL with HTML 4.01
2007-10-02 07:57:11
Sorry, I've been procrastinating and avoiding my e-mail.

On 9/27/07, Andy Mabbett <andypigsonthewing.org.uk> wrote:
> In message
> <d375f00f0709271508v5c2101aay3abe9fe8a829da7amail.gmail.com>, Tom
> Morris <bbtommorrisgmail.com> writes
> Am I correct in thinking you'd parse any page with hCard that way, with
> or without it having GRDDL mark-up?
>
>

Yes, Triplr should parse most pages with hCard (and whichever other
common microformats it has built in) without any profile URIs. Next
time I see dajobe on IRC, I'll ask him which microformats it
auto-detects.

> >Triplr can't parse any of the actual GRDDL data.
>
> Not sure what you mean, here.
>

Triplr is not reading the data-view URI from your page (it's not
expecting it to be there) and thus not making the correct inferences.

> > You ought to use a
> >profile page - perhaps a specific profile for your whole site with
> >links to different transformations.
>
> What would be the advantage of that, for me or the site's users?
>

Well, making one of your pages into a profile for all the others
won't really help you, nor the site's users. But it's a workaround until
we get off our collective arses and put up profiles for the other
microformats and non-official microformats (GeoURL in this case).

> >Using data-view on the source document is not good practice. There's
> >no reason you can't but Triplr doesn't seem to be reading it.
>
> Again, not sure what you mean, here.
>

Okay, your page has got the <http://www.w3.org/2003/g/data-view>
profile URI. Technically, a GRDDL processor ought to pick that up and
use it to interpret the contents of your page in accordance with what
is in the link[rel='transformation'] elements. But - in practice -
that's not how it's implemented. You should really have a separate
profile URI. Most GRDDL processors aren't written to recognise the
link[rel='transformation'] elements on the 'source' page, but on the
'profile' page.
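
To make the 'technically ought to' case concrete, here is a rough
sketch of how a processor could pick up the data-view profile and the
link[rel='transformation'] elements straight from an HTML 4.01 source
page. It uses Python's tolerant standard-library HTML parser rather than
an XML parser, so it does not need XHTML. The class and function names
and the example.org stylesheet URL are made up for illustration; this is
not how Triplr or the W3C service is implemented.

```python
from html.parser import HTMLParser

DATA_VIEW = "http://www.w3.org/2003/g/data-view"


class GrddlLinkFinder(HTMLParser):
    """Collect head/@profile URIs and rel="transformation" links."""

    def __init__(self):
        super().__init__()
        self.profiles = []
        self.transformations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute is a space-separated list of URIs.
            self.profiles = (attrs.get("profile") or "").split()
        elif tag in ("link", "a"):
            rels = (attrs.get("rel") or "").split()
            if "transformation" in rels and attrs.get("href"):
                self.transformations.append(attrs["href"])


def find_transformations(html):
    """Return transformation URLs, but only if the page opts in to GRDDL."""
    finder = GrddlLinkFinder()
    finder.feed(html)
    finder.close()
    if DATA_VIEW not in finder.profiles:
        return []
    return finder.transformations


# Hypothetical HTML 4.01 page; the stylesheet URL is a placeholder.
sample = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html><head profile="http://www.w3.org/2003/g/data-view">
<title>Example</title>
<link rel="transformation" href="http://example.org/hcard2rdf.xsl">
</head><body><p>Hello</p></body></html>"""

print(find_transformations(sample))  # ['http://example.org/hcard2rdf.xsl']
```

A processor that only looks for transformations on the 'profile' page
would skip this step and fetch each profile URI instead.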

If you switched your page over so that it used links to profile URIs,
rather than using the page itself as a profile for how to interpret the
page, there are other advantages - GRDDL processors can often be
optimised by locally storing the relevant transformations. In my
internal processor, for instance, I have a local copy of the hCard
transformation - it's invoked both by the string "vcard" being in the
document and by the hCard profile URI: <http://www.w3.org/2006/03/hcard>.
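
As an illustration only (this is not the internal processor itself),
the local-copy optimisation might look something like the sketch below.
The local file path and the function name are hypothetical; the two
triggers are the ones described above, the hCard profile URI and the
string "vcard" appearing in the document.

```python
HCARD_PROFILE = "http://www.w3.org/2006/03/hcard"

# Hypothetical mapping from profile URI to a locally stored XSLT file,
# so the processor need not fetch the profile or the transformation.
LOCAL_TRANSFORMS = {
    HCARD_PROFILE: "transforms/hcard2rdf.xsl",
}


def pick_local_transforms(document_text, profile_uris):
    """Return the local XSLT paths that apply to a document.

    A transformation is selected if its profile URI is listed on the
    page, or (for hCard) if the string "vcard" occurs in the document.
    """
    chosen = []
    for uri in profile_uris:
        path = LOCAL_TRANSFORMS.get(uri)
        if path and path not in chosen:
            chosen.append(path)
    # Crude sniffing: any occurrence of "vcard" triggers the hCard transform.
    if "vcard" in document_text:
        path = LOCAL_TRANSFORMS[HCARD_PROFILE]
        if path not in chosen:
            chosen.append(path)
    return chosen
```

The substring check is deliberately crude; a stricter processor would
look for "vcard" as a class name rather than anywhere in the text.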

Unfortunately, you are in the position where the profile URIs don't
exist at the moment. That's why I'm suggesting you simply have a
profile page on your site - a bit like mine:
<http://tommorris.org/profiles/tommorris>
It would simply contain a list of all the XSL files you currently have
in head/link.
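
A sketch of what generating such a profile page could look like
follows. It assumes the GRDDL convention of marking each stylesheet link
with rel="profileTransformation" on a page whose head carries the
data-view profile. The function name, page title and example.org URLs
are placeholders, not the real site's files.

```python
from xml.sax.saxutils import escape, quoteattr

DATA_VIEW = "http://www.w3.org/2003/g/data-view"


def build_profile_page(title, xsl_urls):
    """Return a minimal XHTML profile page listing the given XSLT URLs."""
    items = "\n".join(
        "      <li><a rel=\"profileTransformation\" href=%s>%s</a></li>"
        % (quoteattr(url), escape(url))
        for url in xsl_urls
    )
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        '  <head profile="%s">\n'
        "    <title>%s</title>\n"
        "  </head>\n"
        "  <body>\n"
        "    <ul>\n%s\n    </ul>\n"
        "  </body>\n"
        "</html>\n"
    ) % (DATA_VIEW, escape(title), items)


# Placeholder stylesheet URLs standing in for the site's own XSL files.
page = build_profile_page(
    "Site GRDDL profile",
    ["http://example.org/xsl/hcard2rdf.xsl", "http://example.org/xsl/geo2rdf.xsl"],
)
print(page)
```

Each source page would then name the profile page's URI in its own
head profile attribute instead of carrying the transformation links
itself.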

You could edit this page, <http://www.westmidlandbirdclub.com/site/>, to
turn it into a GRDDL profile for the site, then just point to it from
the profile URIs. "About this Site" and "Colophon" style pages seem
like ideal candidates for being GRDDL profiles.

> >The W3C hosts an official reference implementation GRDDL service:
> >http://www.w3.org/2007/08/grddl/
> >This should not read your page as it's designed to work closely to
> >spec - i.e. XHTML and XML, not HTML 4.
>
> That seems a rather short-sighted view, if the intention is to allow
> publishers to join the semantic web with the least effort.
>

It's not really short-sighted. The W3C implementation is a reference
implementation that's built as closely as possible to the spec. Most
GRDDL implementations will not follow the spec to the letter, and will
extend it to do things like (a) automatically detecting microformats
and (b) supporting HTML 4. The reference implementation is not really
intended for public use, but just to show what the page should look
like to a 'base' GRDDL implementation.

If you have specific implementation things you wish to
discuss, feel
free to e-mail or IM me off-list.

-- 
Tom Morris
http://tommorris.org/