WWW::Spyder 0.19

WWW::Spyder 0.19 Ranking & Summary

File size: 0.017 MB
Platform: Any Platform
License: Perl Artistic License
Downloads: 815
Date added: 2007-08-02
Publisher: Ashley Pond V.

WWW::Spyder 0.19 description

WWW::Spyder is a Perl module that acts like a web spider.

It returns plain text, HTML, and other information for each page crawled, and it can decide which pages to fetch and parse by comparing supplied terms against the text in links as well as page content.

METHODS

$spyder->new()

Construct a new spyder object. Without at least the seed() set, or go_to_seed() turned on, the spyder isn't ready to crawl.

$spyder = WWW::Spyder->new(shift || die "Gimme a URL!\n");
# ...or...
$spyder = WWW::Spyder->new( %options );

Options include: sleep_base (in seconds), exit_on (hash of methods and settings). Examples below.

$spyder->seed($url)

Adds a URL (or URLs) to the top of the queues for crawling. If the spyder is constructed with a single scalar argument, that is considered the seed_url.

$spyder->bell([bool])

This prints a bell ("\a") to STDERR on every successfully crawled page. It might seem annoying, but it is an excellent way to know your spyder is behaving and working. A true value turns it on. Right now it can't be turned off.

$spyder->spyder_time([bool])

Returns raw seconds since Spyder was created if given a boolean value, otherwise returns "D day(s) HH::MM:SS."
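The two return styles can be illustrated with a small stand-alone sketch. It is not the module's code: format_elapsed is a hypothetical helper, and the single-colon "HH:MM:SS" layout is an assumption about how the formatted string is meant to look.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Spyder): turn raw seconds
# into a "D day(s) HH:MM:SS" string. The separators are an assumption.
sub format_elapsed {
    my ($seconds) = @_;
    my $days = int( $seconds / 86_400 );    # whole days
    my $rest = $seconds % 86_400;           # seconds left in the last day
    return sprintf '%d day(s) %02d:%02d:%02d',
        $days,
        int( $rest / 3600 ),
        int( ( $rest % 3600 ) / 60 ),
        $rest % 60;
}

# Illustrative value only: one day, two hours, three minutes, four seconds.
print format_elapsed(93_784), "\n";
```

With a real spyder, the raw figure would presumably come from $spyder->spyder_time(1).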

$spyder->terms([list of terms to match])

The more terms, the more the spyder is going to grasp at. If you give a straight list of strings, they will be turned into very open regexes. E.g.: "king" would match "sulking" and "kinglet" but not "King." It is case sensitive right now. If you want more specific matching or different behavior, pass your own regexes instead of strings.

$spyder->terms( qr/\bkings?\b/i, qr/\bqueens?\b/i );

terms() is only settable once right now, then it's a done deal.
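The difference between a plain string term and a hand-written regex can be seen in a short stand-alone sketch. It assumes a string is compiled into an unanchored, case-sensitive pattern, which matches the "king" example above; term_to_regex is a hypothetical helper, not the module's own routine.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Spyder): a plain string becomes
# a very open, case-sensitive regex; a qr// pattern is passed through.
sub term_to_regex {
    my ($term) = @_;
    return ref($term) eq 'Regexp' ? $term : qr/\Q$term\E/;
}

my $open   = term_to_regex('king');             # from a plain string
my $strict = term_to_regex(qr/\bkings?\b/i);    # caller-supplied regex

# Report which sample texts each pattern accepts.
for my $text ( 'sulking', 'kinglet', 'King', 'two kings' ) {
    printf "%-10s open=%-3s strict=%s\n",
        $text,
        ( $text =~ $open   ? 'yes' : 'no' ),
        ( $text =~ $strict ? 'yes' : 'no' );
}
```

The open pattern accepts "sulking" and "kinglet" but rejects "King", while the word-bounded, case-insensitive pattern does the reverse.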

$spyder->spyder_data()

A comma-formatted number of kilobytes retrieved so far. Don't give it an argument. It's a set/get routine.
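For readers unfamiliar with the idiom, comma formatting of this kind is commonly done in Perl with a repeated substitution. The sketch below is an illustration only: commify_kb is a hypothetical helper, and treating a kilobyte as 1024 bytes is an assumption, since the description does not say how the module counts.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Spyder): convert a byte count
# to whole kilobytes and insert thousands separators.
sub commify_kb {
    my ($bytes) = @_;
    my $kb = int( $bytes / 1024 );    # assumes 1 KB = 1024 bytes
    # Peel off three digits at a time from the right until none remain.
    1 while $kb =~ s/^(\d+)(\d{3})/$1,$2/;
    return $kb;
}

# Illustrative value only.
print commify_kb(5_242_880_000), " KB\n";
```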

$spyder->slept()

Returns the total number of seconds the spyder has slept while running. Useful for getting accurate page/time counts (spyder performance) discounting the added courtesy naps.
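The arithmetic behind "discounting the courtesy naps" can be sketched as follows. The helper name and the figures are hypothetical; with a real spyder the elapsed and slept values would presumably come from $spyder->spyder_time(1) and $spyder->slept(), and the page count from a counter you keep yourself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Spyder): pages fetched per
# second of active time, i.e. elapsed time minus time spent sleeping.
sub pages_per_active_second {
    my ( $pages, $elapsed, $slept ) = @_;
    my $active = $elapsed - $slept;
    return 0 if $active <= 0;    # avoid dividing by zero or a negative span
    return $pages / $active;
}

# Illustrative numbers only: 40 pages in 300 seconds, 220 of them asleep.
printf "%.2f pages/sec while active\n",
    pages_per_active_second( 40, 300, 220 );
```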

$spyder->UA->...

The underlying LWP::UserAgent object. You can, I do believe, reset its settings by calling methods on the UA. Here are the initialized values you might want to tweak (see LWP::UserAgent for more information):

$spyder->UA->timeout(30);
$spyder->UA->max_size(250_000);
$spyder->UA->agent('Mozilla/5.0');

Changing the agent name can hurt your spyder because some servers won't return content unless it's requested by a "browser" they recognize.

You should probably add your email with from() as well.

$spyder->UA->from('bluefintuna@fish.net');

$spyder->cookie_file([local_file])

They live in $ENV{HOME}/spyderCookie by default but you can set your own file if you prefer or want to save different cookie files for different spyders.
