Regex is awfully nasty for parsing HTML; you'd be better off with the HTML object.
YQL (Yahoo Query Language) lets you query websites with a SQL-like syntax, for example:
The URL it generates: http://query.yahooapis.com/v1/public...uct'%5D%22
So you can see what you'd load with Excel.
You use XPath to return the parts you're interested in and then loop through those; in the above case, that's all the ul elements with a class of product.
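To make the "select with XPath, then loop" idea concrete, here's a rough sketch in Python rather than Excel. It doesn't touch YQL at all: it just runs the same kind of XPath over a small, made-up page held in a string. The markup, the PAGE name and the product_lists helper are all my own assumptions for illustration, and the standard library's ElementTree only handles well-formed markup and a limited XPath subset, so real pages may need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a downloaded page.
PAGE = """
<html><body>
  <ul class="product"><li>Red shirt</li><li>19.99</li></ul>
  <ul class="nav"><li>Home</li></ul>
  <ul class="product"><li>Blue jeans</li><li>35.00</li></ul>
</body></html>
"""


def product_lists(markup):
    """Return the <li> texts of every <ul class="product"> in the markup."""
    root = ET.fromstring(markup)
    results = []
    # ElementTree's limited XPath: the equivalent of //ul[@class='product'].
    for ul in root.iterfind(".//ul[@class='product']"):
        # Loop through each matched list and collect its item texts.
        results.append([li.text for li in ul.findall("li")])
    return results


if __name__ == "__main__":
    for items in product_lists(PAGE):
        print(items)
```

The ul with class nav is skipped because the predicate only matches class="product", which is the same filtering the XPath in the query does.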
It starts getting really nifty, though, when you have multiple websites with the same layout, like arcadia, since it allows you to use an IN clause:
http://developer.yahoo.com/yql/conso...oduct%27%5D%22
So you can aggregate data from several sites using common clauses. This outputs:
http://query.yahooapis.com/v1/public...agnostics=true
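For anyone who wants to put these query strings together in code, here's a minimal sketch of how the statement with an IN clause could be built and URL-encoded. It only builds strings and makes no request. The function names, the example.invalid addresses and the endpoint argument are hypothetical, and the statement shape (select * from html where url ... and xpath="...") plus the q parameter are my assumption of what the truncated links above contain.

```python
from urllib.parse import quote


def build_yql(urls, xpath):
    """Build a YQL-style statement that applies one XPath to one or more pages."""
    if len(urls) == 1:
        where = 'url="{}"'.format(urls[0])
    else:
        # Several sites with the same layout: list them in an IN clause.
        quoted = ", ".join('"{}"'.format(u) for u in urls)
        where = "url in ({})".format(quoted)
    return 'select * from html where {} and xpath="{}"'.format(where, xpath)


def build_request(endpoint, statement):
    """Percent-encode the statement and append it as the q parameter (assumed name)."""
    return "{}?q={}".format(endpoint, quote(statement, safe=""))


if __name__ == "__main__":
    # Hypothetical site addresses and endpoint, for illustration only.
    sites = ["http://shop-a.example.invalid/", "http://shop-b.example.invalid/"]
    statement = build_yql(sites, "//ul[@class='product']")
    print(statement)
    print(build_request("http://endpoint.example.invalid/yql", statement))
```

The encoded request ends in %27%5D%22, which is just the closing '] and " of the XPath clause after percent-encoding.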