Need a Specialized Spider/Scraper
Posted by: marko
Date: March 06, 2007 10:04PM

Hey folks,

I'd like to ask what kind of bill I'd be looking at for a specialized web scraper. I'm looking for something that will
1) take a keyword list
2) for each keyword, pull the URLs of the top 10 ranking pages (or more; the program should leave this up to my discretion).
3) look at the top 100, 1000, etc. (number at my discretion) Yahoo-indexed backlinks these pages have and find all the linking sites they have in common
4) output an Excel spreadsheet (or allow me to export the data to such a spreadsheet) where the common backlinks are all listed. They should be ranked by most common source, so if one site links to 5 of the top 10 and another links to only two, the one linking to 5 would be listed first. Note: the spreadsheet should also list which of the top 10-20 are being linked to from the site.
5) In addition, the scraper would get 1000 characters of text above and 1000 characters of text below the link in question. Alternately, if it could take a screenshot and put that into the spreadsheet, that would be fine too. The idea is to get the context of the link.
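
For anyone pricing this out, steps 3 to 5 might be sketched roughly as below. This is only a minimal illustration, assuming the backlinks for each top-ranking page have already been fetched somehow; the function names (rank_common_backlinks, link_context) and the min_shared parameter are made up for the example and do not come from any existing tool.

```python
from collections import defaultdict


def rank_common_backlinks(backlinks_by_result, min_shared=2):
    """Rank linking sites by how many of the top results they link to.

    backlinks_by_result maps each top-ranking URL to the collection of
    sites that link to it (assumed to be fetched elsewhere).
    Returns a list of (linking_site, [result URLs it links to]),
    with the most common source first.
    """
    targets_by_source = defaultdict(set)
    for result_url, sources in backlinks_by_result.items():
        for source in sources:
            targets_by_source[source].add(result_url)

    # Keep only sites shared by at least min_shared results.
    shared = [
        (source, sorted(targets))
        for source, targets in targets_by_source.items()
        if len(targets) >= min_shared
    ]
    # Most results linked first; ties broken alphabetically by site.
    shared.sort(key=lambda item: (-len(item[1]), item[0]))
    return shared


def link_context(page_text, link_url, width=1000):
    """Return up to `width` characters before and after the first
    occurrence of link_url in page_text, or None if it is absent."""
    pos = page_text.find(link_url)
    if pos == -1:
        return None
    end = pos + len(link_url)
    return page_text[max(0, pos - width):pos], page_text[end:end + width]
```

Listing results by URL rather than rank matches the note that rank changes continuously. The context function works on raw page text, so a real version would probably need to decide whether "text" means the HTML source or the rendered text.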

The data in the sheet would look like this:

keyword a
site - links to all 10 results
site - links to results a,b,c, and d (named by url, not rank, since rank continuously changes)
site - links to...

keyword b
site 4 - links to...
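
The layout above could be exported as a CSV file, which Excel opens directly. A rough sketch, assuming the per-keyword data is already a list of (site, result URLs) pairs sorted by most common source; the function name and column names are invented for the example:

```python
import csv


def write_backlink_report(report, out):
    """Write one row per (keyword, linking site) to the file object `out`.

    report maps each keyword to a list of (linking_site, [result URLs]),
    already sorted so the most common source comes first.
    """
    writer = csv.writer(out)
    writer.writerow(["keyword", "linking_site", "results_linked", "result_urls"])
    for keyword, ranked_sites in report.items():
        for site, result_urls in ranked_sites:
            # Results are named by URL, joined with spaces in one cell.
            writer.writerow([keyword, site, len(result_urls), " ".join(result_urls)])
```

When writing to a real file, open it with newline="" as the csv module documentation recommends, for example open("report.csv", "w", newline="").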

Any ideas on price? I've found spider code skeletons on the web that I figure can be adapted. The common backlinks part would work like ' tool. But I wouldn't use that tool itself, cuz then I'd be using their resources excessively and I don't think that would be fair. Anyone here able to do that?

Re: Need a Specialized Spider/Scraper
Posted by: cyst
Date: April 20, 2007 01:55PM

Do you prefer this to be a web app? This doesn't sound difficult. PM me.

