
Text::Scraper 0.02


Text::Scraper 0.02 Ranking & Summary

File size: 0.045 MB
Platform: Any Platform
License: Perl Artistic License
Price:
Downloads: 798
Date added: 2007-08-22
Publisher: Chris McEwan

Text::Scraper 0.02 description

Text::Scraper extracts structured data from (un)structured text.

SYNOPSIS

use Text::Scraper;

use LWP::Simple;
use Data::Dumper;

#
# 1. Get our template and source text
#
my $tmpl = Text::Scraper->slurp(*DATA);
# LWP::Simple's get() returns undef on failure and does not set $!
my $src = get('http://search.cpan.org/recent') || die "Could not fetch source page\n";

#
# 2. Extract data from source
#
my $obj = Text::Scraper->new(tmpl => $tmpl);
my $data = $obj->scrape($src);

#
# 3. Do something really neat... (left as exercise)
#
print "Newest Submission: ", $data->[0]{submissions}[0]{name}, "\n\n";
print "Scraper model:\n", Dumper($obj), "\n\n";
print "Parsed model:\n", Dumper($data), "\n\n";

__DATA__

<div class=path><center><table><tr>
<?tmpl stuff pre_nav ?>
<td class=datecell><span><big><b> <?tmpl var date_string ?> </b></big></span></td>
<?tmpl stuff post_nav ?>
</tr></table></center></div>

<ul>
<?tmpl loop submissions ?>
<li><a href="<?tmpl var link ?>"><?tmpl var name ?></a>
<?tmpl if has_description ?>
<small> -- <?tmpl var description ?></small>
<?tmpl end has_description ?>
</li>
<?tmpl end submissions ?>
</ul>
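
The synopsis reads one field with $data->[0]{submissions}[0]{name}, which suggests the result is an array reference of matches, each holding a list of submissions. The following is a minimal sketch of walking such a structure. It runs on a hand-built sample rather than on real scraper output: only the name lookup is confirmed by the synopsis, while the link and date_string keys, the sample values, and the helper submission_lines are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like the synopsis result. Only
# $data->[0]{submissions}[0]{name} is confirmed by the synopsis;
# the other keys are assumed from the template's variable names.
my $data = [
    {
        date_string => 'sample date',
        submissions => [
            { name => 'Example-Dist-0.02', link => '/dist/Example-Dist-0.02/' },
            { name => 'Other-Dist-1.00',   link => '/dist/Other-Dist-1.00/' },
        ],
    },
];

# Flatten every submission of every match into "name => link" lines.
sub submission_lines {
    my ($parsed) = @_;
    my @lines;
    for my $match (@$parsed) {
        for my $item (@{ $match->{submissions} || [] }) {
            push @lines, sprintf '%s => %s', $item->{name}, $item->{link} // '';
        }
    }
    return @lines;
}

print "$_\n" for submission_lines($data);
```

Because the nesting mirrors the template's loop, the traversal code needs one foreach per loop level and nothing else.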

ABSTRACT

Text::Scraper provides a fully functional base class for quickly developing screen scrapers and other text extraction tools. Programmatically generated text, such as dynamic webpages, is trivially reverse-engineered.

Using templates, the programmer is freed from staring at fragile, heavily escaped regular expressions, mapping capture groups to named variables or wrestling with the DOM and badly formed HTML. In addition, extracted data can be hierarchical, which is beyond the capabilities of vanilla regular expressions.
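
For comparison, here is a rough sketch of the hand-written regular expression approach that the paragraph above argues against. It uses core Perl only, not Text::Scraper, and the sample HTML and the field names link, name and description are made up for illustration. Note the two matching stages needed to get nested records, and the manual mapping of capture groups to names.

```perl
use strict;
use warnings;

# Hypothetical input resembling a list of submissions.
my $html = <<'HTML';
<ul>
<li><a href="/dist/A">A-1.0</a><small> -- first thing</small></li>
<li><a href="/dist/B">B-2.0</a></li>
</ul>
HTML

# Stage 1 isolates the list; stage 2 matches each item inside it.
# One regex alone cannot hand back the nested list of records.
my @items;
if ($html =~ m{<ul>(.*?)</ul>}s) {
    my $list = $1;
    while ($list =~ m{<li><a\s+href="([^"]*)">(.*?)</a>(?:<small>\s*--\s*(.*?)</small>)?\s*</li>}gs) {
        # Numbered capture groups must be mapped to names by hand.
        my %item = (link => $1, name => $2);
        $item{description} = $3 if defined $3;
        push @items, \%item;
    }
}

printf "%s (%s)\n", $_->{name}, $_->{description} // 'no description' for @items;
```

Every change to the page markup means editing the escaped pattern and re-checking which group number feeds which field, which is the fragility a template is meant to hide.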

Text::Scraper's functionality overlaps with some existing CPAN modules: Template::Extract and WWW::Scraper. Text::Scraper is much more lightweight than either and has a more general application domain than the latter. It has no dependencies on other frameworks, modules or design decisions. On average, Text::Scraper benchmarks around 250% faster than Template::Extract, and uses significantly less memory.

Unlike both existing modules, Text::Scraper generalizes its functionality to allow the programmer to refine template capture groups beyond (.*?), fully redefine the template syntax and introduce new template constructs bound to custom classes.
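
To show why refining a capture group beyond (.*?) can matter, here is a small sketch in plain Perl regular expressions. It does not use Text::Scraper's customization interface, which this description does not document; the helper capture_both and the sample inputs are hypothetical.

```perl
use strict;
use warnings;

# Return what a loose group and a refined group capture from the same text.
sub capture_both {
    my ($text) = @_;
    # Default-style group: accepts any content between the tags.
    my ($loose) = $text =~ m|<b>(.*?)</b>|;
    # Refined group: accepts only an ISO-style date (YYYY-MM-DD).
    my ($refined) = $text =~ m|<b>(\d{4}-\d{2}-\d{2})</b>|;
    return ($loose, $refined);
}

for my $text ('<b>2007-08-22</b>', '<b>not a date</b>') {
    my ($loose, $refined) = capture_both($text);
    printf "%-20s loose=%s refined=%s\n",
        $text, $loose // 'undef', $refined // 'undef';
}
```

The loose group captures both inputs, while the refined group rejects the second, so a tighter pattern can double as input validation.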

Related Software
screen-scraper is a tool for extracting data from Web sites. It works much like a database that allows you to mine the data of the World Wide Web. It provides a graphical interface for designating URLs, data elements to be extracted, and scripting logic to traverse pages and work with mined data. Once these items have been created, screen-scraper can be invoked from external languages such as .NET, Java, PHP, and Active Server Pages.
Text::ScriptTemplate is a standalone ASP/JSP/PHP-style template processor.
PerlIO is a Perl module that loads PerlIO layers on demand and serves as the root of the PerlIO::* namespace.
XcplayC is a text-GUI for XMMS based on xcplay.
StealIt is a service menu for taking ownership of a selected file or directory.
WWW::Scraper::Dice is a Perl module that scrapes Dice: (skills, locations) => (title, location, residue).
Convert::Transcribe is a Perl extension for transcribing natural languages.
WWW::Scraper::BAJobs is a Perl module that scrapes BAJobs.com.