
Thread: Modify data from rss feed before inserting?




2007-06-25 20:09:29
Hi all,

I've discovered that by using news.google.com with an advanced search I can get extremely good results for the type of news I'm trying to put on my upstart Pligg-based site via the Pligg RSS importer module, which is built on Magpie. The problem I am having is that the Google RSS feed has some extra information that I'd like to get rid of. Take a look at an example:

--snip--
<item>

<title>Indianapolis Colts lead ESPY nominations - Times Picayune</title>

<link>http://news.google.com/news/url?sa=T&ct=us/2-0&fd=R&url=http://www.nola.com/newsflash/sports/index.ssf%3F/base/sports-13/1182776353191090.xml%26storylist%3D&cid=1117574525&ei=CRyARvatAYiy0AH_vtE0</link>

<guid isPermaLink="false">tag:news.google.com,2005:cluster=429cd57d</guid>

<pubDate>Mon, 25 Jun 2007 13:14:06 GMT</pubDate>

<description><br><table border=0 width= valign=top cellpadding=2 cellspacing=7><tr><td valign=top class=j><a href="http://news.google.com/news/url?sa=T&ct=us/2-0&fd=R&url=http://www.nola.com/newsflash/sports/index.ssf%3F/base/sports-13/1182776353191090.xml%26storylist%3D&cid=1117574525&ei=CRyARvatAYiy0AH_vtE0">Indianapolis Colts lead ESPY nominations</a><br><font size=-1><font color=#6f6f6f>Times Picayune,&nbsp;LA&nbsp;-</font> <nobr>6 hours ago</nobr></font><br><font size=-1>The Arthur Ashe Courage Award is presented to individuals whose contributions transcend <b>sports</b>. The new Jimmy V Award for Perseverance will be presented to <b>...</b></font><br></table>
</description>
</item>

--snip--

Unfortunately, each one of these is somewhat problematic.

For the title, I would like to remove the '- Times Picayune' portion. Is there a way to strip this with PHP before the item is submitted? The '-' is always there, followed by the source.
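
A minimal sketch of the title cleanup, assuming the source name always follows the last ' - ' in the title. The function name strip_title_source is hypothetical and is not part of Magpie or Pligg:

```php
<?php
// Hypothetical helper: drop the trailing " - Source" suffix from a feed title.
function strip_title_source($title)
{
    // Use the LAST " - " so that hyphens inside the headline itself survive.
    $pos = strrpos($title, ' - ');
    if ($pos === false) {
        return $title; // no suffix found; leave the title unchanged
    }
    return rtrim(substr($title, 0, $pos));
}

echo strip_title_source('Indianapolis Colts lead ESPY nominations - Times Picayune'), "\n";
// prints: Indianapolis Colts lead ESPY nominations
```

Searching from the right is a design choice: a headline such as 'Colts - Saints preview - Times Picayune' would lose only the final segment.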

For the link, I'd really like to use only the portion between 'url=' and '&cid'.
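
A rough sketch of the link extraction, assuming the link arrives with a literal '&' before 'cid' (not '&amp;') as in the example above. The function name extract_target_url is hypothetical. Decoding the result with rawurldecode is an extra step beyond taking the substring, added because the target address is percent-encoded inside the Google URL:

```php
<?php
// Hypothetical helper: pull the real article address out of a Google News redirect link.
function extract_target_url($link)
{
    $start = strpos($link, 'url=');
    if ($start === false) {
        return $link; // not a redirect link; leave it unchanged
    }
    $start += strlen('url=');

    // Stop at '&cid' if present, otherwise take the rest of the string.
    $end = strpos($link, '&cid', $start);
    $raw = ($end === false)
        ? substr($link, $start)
        : substr($link, $start, $end - $start);

    // The target is percent-encoded (%3F, %26, ...), so decode it.
    return rawurldecode($raw);
}

$link = 'http://news.google.com/news/url?sa=T&ct=us/2-0&fd=R&url=http://www.nola.com/newsflash/sports/index.ssf%3F/base/sports-13/1182776353191090.xml%26storylist%3D&cid=1117574525&ei=CRyARvatAYiy0AH_vtE0';
echo extract_target_url($link), "\n";
// prints: http://www.nola.com/newsflash/sports/index.ssf?/base/sports-13/1182776353191090.xml&storylist=
```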

Lastly, for the description, I would like to keep only what's between '<font size=-1>' and '</font>'.
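
A possible sketch for the description, assuming it arrives as decoded HTML as in the example above. Note that '<font size=-1>' appears twice in that example, and the first occurrence wraps the source and timestamp, so this sketch takes the last one. The function name extract_summary is hypothetical, and removing the remaining tags with strip_tags is an added step, not something the feed requires:

```php
<?php
// Hypothetical helper: keep only the summary text from a Google News description.
function extract_summary($description)
{
    $open = '<font size=-1>';

    // The summary sits in the LAST <font size=-1> block; the first one
    // holds the source name and the "6 hours ago" timestamp.
    $start = strrpos($description, $open);
    if ($start === false) {
        return $description; // marker not found; leave it unchanged
    }
    $start += strlen($open);

    $end = strpos($description, '</font>', $start);
    $inner = ($end === false)
        ? substr($description, $start)
        : substr($description, $start, $end - $start);

    // Drop leftover inline tags such as <b> and trim whitespace.
    return trim(strip_tags($inner));
}

$sample = '<br><font size=-1><font color=#6f6f6f>Times Picayune</font> <nobr>6 hours ago</nobr></font>'
        . '<br><font size=-1>Contributions transcend <b>sports</b>.</font><br>';
echo extract_summary($sample), "\n";
// prints: Contributions transcend sports.
```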

Is there any hope of hacking this together? I know a fair amount of PHP, but I haven't dealt with keeping only a portion of a string based on given criteria. Any help is much appreciated.

_______________________________________________
Magpierss-general mailing list
Magpierss-generallists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/magpierss-general
