Whilst nice, how is this going to handle the changing nature of the web? It's nice that it detects "lists" and such, but a few changes to a site's CSS are going to trash that automation, right?
I'm also fairly sure you'll break (either directly, or on a user's behalf) a few EULAs that specifically ban scraping.
replies(2):