
Thread: Re: Parsing XFN in PHP




2008-04-10 14:20:26
Julian Bond wrote:
> Ryan Parman <ryan.lists.warpsharegmail.com> Thu, 10 Apr 2008 09:05:47
>> As someone with a background in parsing RSS/Atom, I can say from
>> years of experience that RSS is only occasionally XML and that you
>> typically find far more HTML in a feed than XML. And parsing HTML can
>> be a bitch.
>
> Big snip.
>
> Woah! That's enough to put one off even starting on parsing and
> reading uF. Which makes uF all a bit pointless. Oh dear. :(
>
> I suspect though that this Gordian knot can be cut. It seems quite
> likely that any page marked up with uF is good enough that HTML-Tidy
> won't remove too many uF marked up elements. If that's the case, then
> Fetch html -> HTML-Tidy -> XML parsing is going to get 99% of the job
> done and successfully extract the uF marked data.

Aside re 'nofollow':

If you're scrubbing HTMLish character streams with arbitrary other code
to make XHTML, do take care that you're not accidentally scrubbing
rel='nofollow' from comment areas while leaving in potentially
mischievous "rel='me'" claims. I don't know the default behaviour of
HTML Tidy or similar tools, but this risk is worth bearing in mind.
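
One way to test a cleanup step for this, as a rough Python sketch rather
than anything tied to a particular tool: collect the rel tokens of each
link before and after cleanup, and flag any link that had 'nofollow' going
in but comes out with 'me' and no 'nofollow'. The names RelCollector,
collect_rels and unsafe_me_links are made up for illustration, and the
sketch assumes each href appears only once per page and that the cleanup
step leaves href values unchanged.

```python
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Record the rel tokens of every <a href=...> element, keyed by href."""

    def __init__(self):
        super().__init__()
        self.rels = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated token list; compare case-insensitively.
        # Assumption: each href occurs once, so a later duplicate overwrites.
        self.rels[href] = set((attr.get("rel") or "").lower().split())


def collect_rels(markup):
    """Parse (possibly messy) HTML and return {href: set of rel tokens}."""
    parser = RelCollector()
    parser.feed(markup)
    parser.close()
    return parser.rels


def unsafe_me_links(before, after):
    """Return hrefs that lost 'nofollow' during cleanup but kept 'me'."""
    rels_before = collect_rels(before)
    rels_after = collect_rels(after)
    return sorted(
        href
        for href, tokens in rels_after.items()
        if "me" in tokens
        and "nofollow" not in tokens
        and "nofollow" in rels_before.get(href, set())
    )
```

Feeding it the raw page and the cleaned-up output gives a list of links
whose 'me' claim should not be trusted; an empty list means the cleanup
did not introduce this particular problem.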

Per http://microformats.org/wiki/xfn-clarifications#me_nofollow_interaction
    "If a link has the rel value "nofollow", then a "me" rel value DOES
    NOT indicate an identity relationship. That is, only rel attributes with
    the value "me", and WITHOUT the value "nofollow" indicate an identity
    relationship assertion."
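
In code, that rule comes down to a check on the tokens of the rel
attribute. A minimal Python sketch (the function name is_identity_link is
only illustrative, and it assumes you already have the raw rel string in
hand):

```python
def is_identity_link(rel_value):
    """True only if rel contains 'me' and does not contain 'nofollow'."""
    # rel holds space-separated tokens; match whole tokens, ignoring case,
    # so that a value such as "met" is not mistaken for "me".
    tokens = rel_value.lower().split()
    return "me" in tokens and "nofollow" not in tokens
```

So rel="me" counts as an identity claim, while rel="me nofollow" does not.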

While it might seem odd for a 'nofollow' to be stripped while leaving a
'me' in there, I've seen enough hostility to the 'nofollow' idea
floating around that it is certainly possible some HTML cleanup tools
will drop that markup. For example,
http://meiert.com/en/blog/20070106/nofollow-still-considered-harmful/
http://foolswisdom.com/do-follow-wordpress/
http://www.itst.org/nonofollow/
http://www.nonofollow.net/
http://www.unintentionallyblank.co.uk/2007/02/20/on-the-redundancy-of-nofollow/
etc...

cheers,

Dan

--
http://danbri.org/
_______________________________________________
microformats-discuss mailing list
microformats-discussmicroformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
