Alexa Web Search Platform
Amazon have announced the public beta launch of the Alexa Web Search Platform, a service that lets developers more or less build their own search engines using Alexa’s computing and storage resources.
The quick tour gives more information.
It works like this: you define the pages you want to access from Alexa’s archive (“100 Terabytes of Web content spanning 4 billion pages and 8 million sites”), develop an application to run queries on that data, and then download the results or publish them via a web service.
The pricing structure is interesting, and the total cost is probably difficult to predict for any given application:
– $1 per CPU hour
– $1 per GB/year of user storage (up to 13 TB…)
– $1 per 50 GB processed
– $1 per GB uploaded/downloaded
– $1 for every 4,000 user-published web service requests
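To get a feel for how these charges add up, here is a rough back-of-the-envelope estimator in Python. The rates come straight from the list above; the function name, the parameter names and the example workload are my own, and it assumes billing is simply linear with no minimum charges, rounding or free allowances, which the announcement may or may not bear out.

```python
# Rates taken from the pricing list above, in US dollars.
RATE_PER_CPU_HOUR = 1.0          # $1 per CPU hour
RATE_PER_GB_YEAR_STORED = 1.0    # $1 per GB/year of user storage
GB_PROCESSED_PER_DOLLAR = 50.0   # $1 per 50 GB processed
RATE_PER_GB_TRANSFERRED = 1.0    # $1 per GB uploaded/downloaded
REQUESTS_PER_DOLLAR = 4000.0     # $1 per 4,000 published web service requests


def estimate_cost(cpu_hours, storage_gb_years, gb_processed,
                  gb_transferred, requests):
    """Return a cost breakdown in dollars, assuming purely linear billing."""
    breakdown = {
        "cpu": cpu_hours * RATE_PER_CPU_HOUR,
        "storage": storage_gb_years * RATE_PER_GB_YEAR_STORED,
        "processing": gb_processed / GB_PROCESSED_PER_DOLLAR,
        "transfer": gb_transferred * RATE_PER_GB_TRANSFERRED,
        "requests": requests / REQUESTS_PER_DOLLAR,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    # Hypothetical workload: 100 CPU hours, 10 GB stored for a year,
    # 500 GB processed, 5 GB transferred, 200,000 web service requests.
    for item, dollars in estimate_cost(100, 10, 500, 5, 200_000).items():
        print(f"{item:>10}: ${dollars:,.2f}")
```

For that made-up workload the total comes to $175, with CPU time and published requests dominating. The hard part in practice is guessing the inputs, since CPU hours and GB processed depend on how much of the archive your application ends up touching.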
It looks like the starting point is always Alexa’s web archive: you can’t influence what they’re actually crawling or archiving, or how often. If they let you direct their crawler as well, it could well take on the Google Search Appliance, which personally I think is prohibitively expensive.