Text::Ngrams 1.9

Text::Ngrams 1.9 Summary

File size: 0.036 MB
Platform: Any Platform
License: Perl Artistic License
Downloads: 827
Date added: 2007-08-22
Publisher: Simon Cozens

Text::Ngrams 1.9 description

Text::Ngrams is a Perl module for flexible n-gram analysis (for characters, words, and more).
SYNOPSIS
For default character n-gram analysis of a string:
use Text::Ngrams;
my $ng3 = Text::Ngrams->new;
$ng3->process_text('abcdefg1235678hijklmnop');
print $ng3->to_string;
my @ngramsarray = $ng3->get_ngrams;
One can also feed tokens manually:
use Text::Ngrams;
my $ng3 = Text::Ngrams->new;
$ng3->feed_tokens('a');
$ng3->feed_tokens('b');
$ng3->feed_tokens('c');
$ng3->feed_tokens('d');
$ng3->feed_tokens('e');
$ng3->feed_tokens('f');
$ng3->feed_tokens('g');
$ng3->feed_tokens('h');
We can choose n-grams of various sizes, e.g.:
my $ng = Text::Ngrams->new( windowsize => 6 );
or different types of n-grams, e.g.:
my $ng = Text::Ngrams->new( type => 'byte' );
my $ng = Text::Ngrams->new( type => 'word' );
my $ng = Text::Ngrams->new( type => 'utf8' );
To process a list of files:
$ng->process_files('somefile.txt', 'otherfile.txt');
This module implements text n-gram analysis, supporting several types of analysis, including character and word n-grams.
The module Text::Ngrams is very flexible. For example, it allows a user to manually feed a sequence of arbitrary tokens. It handles several types of tokens (character, word), and also allows a lot of flexibility in the automatic recognition and feeding of tokens and in the way they are combined into an n-gram. It counts all n-gram frequencies up to the maximal specified length. The output format is meant to be human-readable, while also being loadable by the module.
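To illustrate what "counting all n-gram frequencies up to the maximal specified length" means, here is a minimal sketch in plain Perl. It does not use Text::Ngrams and is not the module's implementation: the function name count_char_ngrams is hypothetical, and the sketch skips the module's token recognition and its output format. It only counts raw character n-grams of length 1 up to a given maximum.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Ngrams):
# count every character n-gram of length 1..$max in $text.
# Returns a hash reference mapping n-gram string => frequency.
sub count_char_ngrams {
    my ($text, $max) = @_;
    my %count;
    my $len = length $text;
    for my $n (1 .. $max) {
        # The range is empty when the text is shorter than $n.
        for my $start (0 .. $len - $n) {
            $count{ substr($text, $start, $n) }++;
        }
    }
    return \%count;
}

my $counts = count_char_ngrams('abcabc', 3);

# Print shorter n-grams first, then alphabetically.
for my $gram (sort { length($a) <=> length($b) || $a cmp $b } keys %$counts) {
    printf "%-3s %d\n", $gram, $counts->{$gram};
}
```

For the input 'abcabc' with a maximum length of 3, this counts three unigrams, three bigrams, and three trigrams; for instance, 'abc' occurs twice and 'bca' once.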
The module can be used from the command line through the script ngrams.pl provided with the package.
Version restrictions:
- If a user customizes a type, a resulting n-gram may be ambiguous, so that two different n-grams are counted as one. With the predefined types of n-grams, this should not happen. For example, if a user allows a token to contain a space and also uses a space as the n-gram separator, then a trigram such as "x x x x" is ambiguous.
- Method process_file does not handle multi-line tokens by default. This could be fixed, but it does not seem to be worth the added code complexity. There are various ways around this if one really needs such tokens: one way is to preprocess the input; another is to read as much text as necessary at a time and then use process_text, which does handle multi-line tokens.
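The ambiguity described in the first restriction can be reproduced in a few lines of plain Perl. This is a sketch under stated assumptions, not the module's code: the helper ngram_key is hypothetical and simply joins tokens with a separator, which is enough to show how two different trigrams collapse into one key when tokens may themselves contain the separator.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Ngrams):
# render an n-gram by joining its tokens with a separator.
sub ngram_key {
    my ($sep, @tokens) = @_;
    return join $sep, @tokens;
}

# Two different trigrams whose tokens may contain a space.
my @first  = ('x x', 'x', 'x');
my @second = ('x', 'x', 'x x');

# Count them using a space as the n-gram separator.
my %count;
$count{ ngram_key(' ', @$_) }++ for \@first, \@second;

# Both trigrams end up under the single key "x x x x".
print "$_ => $count{$_}\n" for sort keys %count;
```

Choosing a separator that cannot occur inside a token avoids the collision.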

Related Software
NGramJ is an n-gram library for NLP with Java.
Text::Graph is a Perl extension for generating text-based graphs.
lhs2tex is a preprocessor to generate LaTeX from literate Haskell sources.
Text::NSP::Measures is a Perl module for computing association scores of n-grams.
Text::Kakasi is a Perl frontend to kakasi.
Tiltilation is an action-packed ball-rolling game.
auto-nng is software for analysis and classification of data using artificial neural networks.
Text::Kakasi::JP is a Japanese Perl extension for Text::Kakasi.