RSS and XPath

April 08, 2004
coldfusion

I came across a handy reference article on xml.com today that gives XPath queries for RSS and Atom feeds. Just last week I was attempting to parse an RSS 1.0 feed in CFMX using the XMLSearch function. I'm running into problems, however, due to the namespaces in RSS 1.0. Here's the code I'm using:

<cfhttp url="http://www.fullasagoog.com/xml/ColdFusionMX.xml" method="get" />
<cfset rss = XMLParse(cfhttp.filecontent)>

<!--- get an array of items --->
<cfset items = XMLSearch(rss, "/rdf:RDF/item")>
<cfdump var="#items#">

The result is that the items array is empty. I think this is a namespace issue, but I'm not really sure. Is this a bug? Anyone have an idea?

BTW, if you're looking to parse RSS 0.9x or 2.0 with XPath, check out this older blog post.



Comments

Why not just use XmlSearch(rss, "//item")? That way you don't have to worry about what version of RSS you're parsing; you'll always get an array of items.
Hi Roger, I tried that previously as well; it also returned an empty array. Were you able to get that to work? -pete
This works: XMLSearch(rss,"/rdf:RDF/:item"), as does this: XMLSearch(rss,"//:item"). Because of the namespaces, you have to explicitly specify that 'item' has an empty namespace prefix.
Ah! Good to know, thanks Sean!
Pete: I took a sec to dig around in JournURL's aggregator code, and found what I'm actually using: XmlSearch(myRSS, "//*[name()='item']")
This seems to be the one that works best for most feeds we try to parse: XmlSearch(myRSS, "//*[name()='item']"). I have a question about looping through the resulting array: what should the XPath be when looking for the description, title and item of the children? Thanks...
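On looping through the array: one possible approach, offered as a rough sketch and not a tested answer, is to skip XPath for the children and read them straight off each element that XmlSearch returns. The sketch assumes myRSS is the result of XMLParse() on an RSS 1.0 feed and that each item carries unprefixed title, link and description children. The variable names items, item, i and field are made up for illustration.

```cfml
<!--- Sketch only: assumes myRSS is a parsed RSS 1.0 document --->
<cfset items = XmlSearch(myRSS, "//*[name()='item']")>

<cfloop from="1" to="#ArrayLen(items)#" index="i">
    <!--- Each array entry is an XML element for one item --->
    <cfset item = items[i]>

    <cfloop list="title,link,description" index="field">
        <!--- Guard against feeds that omit one of the children --->
        <cfif StructKeyExists(item, field)>
            <cfoutput>#field#: #HTMLEditFormat(item[field].XmlText)#<br></cfoutput>
        </cfif>
    </cfloop>
    <cfoutput><hr></cfoutput>
</cfloop>
```

Prefixed children, such as a Dublin Core date, would need bracket notation with the full name, for example item["dc:date"].XmlText, and the prefix used can differ from feed to feed.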


Foundeo Inc.