AnyBook4Less.com: Order from a Major Online Bookstore
Title: Spidering Hacks by Kevin Hemenway, Tara Calishain
ISBN: 0-596-00577-6
Publisher: O'Reilly & Associates
Pub. Date: 01 November, 2003
Format: Paperback
Volumes: 1
List Price(USD): $24.95
Average Customer Rating: 5 (3 reviews)
Rating: 5
Summary: Great Book
Comment: Are you ready to be the next Google? It is widely known that Google pulled out in front of (and largely obsoleted) major search engine players like AltaVista and Yahoo mainly because of Google's highly accurate search results -- you find what you search for. They are so confident in their search engine spiders that they even have an "I'm Feeling Lucky" button to transport you to the first search result found -- it's arrogance, but well-deserved arrogance. In a sentence, Google works.
Enter Kevin Hemenway and Tara Calishain's latest O'Reilly book: Spidering Hacks. Continuing in the O'Reilly "Hacks" tradition, this comprehensive guidebook provides a hundred clear, useful tools for designing and implementing the next generation -- or maybe just your own customized -- spider (or bot, if you prefer).
So why build your own spider? Well, if you have a large website, your spider could check link integrity, HTML standards compliance, and meta tags. If you are researching a topic and Google is not returning what you want, creating your own spider might be just what you need. This handy book (with examples in Perl) will show you how to:
* Create a site-friendly bot that won't get you banned by webmasters (Hack #16 -- Respecting Your Scrapee's Bandwidth, and Hack #17 -- Respecting robots.txt)
* Collect graphics, audio, and video. Hacks #33 through #42 step you through collecting media files. Specific examples include scraping films from www.ifilm.com (Hack #24), gathering movies from the Library of Congress (Hack #35), and archiving images from Webshots. You'll have your own personalized library in no time.
* Get weblog-free Google results. Weblogs (aka blogs) are amazingly popular these days. With Google's PageRank algorithm, that means they get heavy emphasis in your search results. Hack #50 trims the search results by eliminating those annoying blogs.
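To give a flavor of the "site-friendly bot" idea in the first bullet, here is a minimal sketch of honoring robots.txt and pausing between requests. It is written in Python rather than the book's Perl and is not taken from the book; the user-agent name, the sample robots.txt rules, the example.com URLs, and the helper names are all assumptions made up for illustration.

```python
import time
from urllib import robotparser

USER_AGENT = "example-spider"  # hypothetical bot name, not from the book

# Hypothetical robots.txt content; a real bot would download the site's own file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been fetched."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_fetch_plan(parser, urls, default_delay=1.0):
    """Return the URLs the bot may fetch and the pause to use between requests."""
    # Use the site's Crawl-delay when it declares one, else fall back to a default.
    delay = parser.crawl_delay(USER_AGENT) or default_delay
    allowed = [url for url in urls if parser.can_fetch(USER_AGENT, url)]
    return allowed, float(delay)


def crawl(urls, fetch, parser, sleep=time.sleep):
    """Fetch only the allowed URLs, sleeping between consecutive requests."""
    allowed, delay = polite_fetch_plan(parser, urls)
    results = {}
    for index, url in enumerate(allowed):
        if index:  # no pause is needed before the very first request
            sleep(delay)
        results[url] = fetch(url)
    return results
```

Here `fetch` is whatever download function you supply (for example, a thin wrapper around `urllib.request.urlopen`), so the politeness logic stays separate from the network code and can be tried out without touching a real site.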
In addition, you'll find multiple hacks covering Amazon.com and RSS feeds. The book includes much information regarding spider automation (e.g., running your spiders as cron jobs). You'll find content filtering and even a hack using PHP code (Hack #84).
This book is extraordinarily helpful and is a great resource for any Perl hacker. I highly recommend it to any computer hobbyist interested in data mining, spidering, and scraping. Well done, O'Reilly!
Rating: 5
Summary: A fresh idea
Comment: Spidering Hacks, like other O'Reilly "Hacks" books, lives up to the tradition. This book shows some Internet guru tips and tricks. Although the book is pretty straightforward overall, please note that:
1. Even though the book has introductory chapter(s) on Perl, this is not really a Perl newbie book. Make sure you at least have knowledge equivalent to "Learning Perl" before you touch this book.
2. For advanced programmers, this book may not live up to expectations. Most of the book is about extracting information from websites (which one can easily do using a web browser, BTW). However, the ideas and techniques will be new to novice and intermediate programmers.
3. The third problem with this book is that it's more useful for a North American audience than for overseas readers. (However, you can modify the techniques.)
4. If you are not so much into programming, you might not like this book. Alternatively, you can just download the examples and run them yourself. Also, if you are not a power user OR don't have the time/skill/interest to spider the web, you don't need this book. IT DOES NOT REVEAL SECRETS TO REVOLUTIONIZE YOUR WEB EXPERIENCE OR LET YOU ACCESS ANY WEBSITE ILLEGALLY.
Overall, it's a good book.
Rating: 5
Summary: fun to read
Comment: Like other O'Reilly Hacks books, this one is easy to read and follow. Inside this book, you can find lots of ways to have automated Perl scripts do things for you. The other related book is Perl & LWP.
Title: Google Hacks: 100 Industrial-Strength Tips & Tools by Tara Calishain, Rael Dornfest ISBN: 0596004478 Publisher: O'Reilly & Associates Pub. Date: 01 February, 2003 List Price(USD): $24.95
Title: Amazon Hacks: 100 Industrial-Strength Tips and Tools by Paul Bausch ISBN: 0596005423 Publisher: O'Reilly & Associates Pub. Date: 20 August, 2003 List Price(USD): $24.95
Title: eBay Hacks: 100 Industrial-Strength Tips and Tools by David A. Karp ISBN: 0596005644 Publisher: O'Reilly & Associates Pub. Date: 25 August, 2003 List Price(USD): $24.95
Title: Windows XP Hacks by Preston Gralla ISBN: 0596005113 Publisher: O'Reilly & Associates Pub. Date: 22 August, 2003 List Price(USD): $24.95
Title: Linux Server Hacks by Rob Flickenger ISBN: 0596004613 Publisher: O'Reilly & Associates Pub. Date: January, 2003 List Price(USD): $24.95
Copyright © 2001-2021