Nice high-level overview of data mining and where the industry is headed. And the comments were just as illuminating as the article itself. From the information given here it would seem it’s worth seriously exploring MapReduce hosted in Amazon’s cloud.
We are in the midst of a data mining renaissance.
Traditionally, data warehousing implementations were large, complex and expensive, meaning only the largest companies could afford them. Teradata pioneered the initial market for corporate data warehousing solutions and still maintains a segment lead, something HP’s CEO Mark Hurd, who previously ran Teradata’s former parent NCR, knows all too well. More recent entrants into the data warehousing and business intelligence market, like Netezza, have emerged with cost-effective, appliance-based approaches. Others in this arena include Greenplum, recent Microsoft acquisition DATAllegro and, of course, IBM, Oracle and SAP.
But the web changed the way we publish and consume information and, in doing so, created a new opportunity to measure and monetize it. Faced with more user data, logging information and web content than anyone thought one system could handle, the major web companies developed highly scaled data warehousing solutions themselves. Armed with these tools, they connected better with their customers by building better recommendation engines, more targeted advertising networks and more intricate campaigns.