searches
js-search 1.0
js-search is a JavaScript indexing and searching library.
A client-side library for building a simple inverted index, and searching it.
You can download the source code from SVN with the following command:
svn checkout http://js-search.googlecode.com/svn/trunk/ js-search
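The idea of a simple inverted index can be sketched in a few lines. The following is an illustration in Python, not js-search's own code or API; the function names `build_index` and `search` and the AND-only matching are assumptions made for the example.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

docs = {1: "Hello world", 2: "World in motion", 3: "Hey crazy"}
index = build_index(docs)
print(sorted(search(index, "world")))  # [1, 2]
```

Lookups touch only the posting sets of the query words, which is why an inverted index stays fast as the number of documents grows.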
Search::Tools 0.01
Search::Tools is a set of tools for building search applications.
SYNOPSIS
use Search::Tools;
my $re = Search::Tools->regexp(query => 'the quick brown fox');
my $snipper = Search::Tools->snipper(query => $re);
my $hiliter = Search::Tools->hiliter(query => $re);
for my $result (@search_results)
{
    print $hiliter->light( $snipper->snip( $result->summary ) );
}
Search::Tools is a set of utilities for building search applications. Rather than adhering to a particular search application, the goal of Search::Tools is to provide general-purpose methods for common search application features. Think of Search::Tools like a toolbox rather than a hammer.
Examples include:
Parsing search queries for the meaningful keywords
Rich regular expressions for locating keywords in the original indexed documents
Contextual snippets showing query keywords
Highlighting of keywords in context
Search::Tools is derived from some of the features in HTML::HiLiter and SWISH::HiLiter, but has been rewritten with an eye to accommodating more general-purpose features.
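The snippet-then-highlight pipeline from the synopsis can be illustrated with a rough Python sketch. This is not Search::Tools' implementation; `keyword_regex`, `snip`, `highlight`, the 20-character context window and the `<b>` tags are all assumptions chosen for the example, and a non-empty query is assumed.

```python
import re

def keyword_regex(query):
    """Build one case-insensitive pattern matching any query word."""
    words = [re.escape(w) for w in query.split()]
    return re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)

def snip(text, pattern, context=20):
    """Return the text around the first keyword match."""
    m = pattern.search(text)
    if m is None:
        return text[: 2 * context]
    start = max(0, m.start() - context)
    end = min(len(text), m.end() + context)
    return text[start:end]

def highlight(text, pattern, open_tag="<b>", close_tag="</b>"):
    """Wrap every keyword match in the given tags."""
    return pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)

pattern = keyword_regex("quick fox")
summary = "Once upon a time the quick brown fox jumped over the lazy dog."
print(highlight(snip(summary, pattern), pattern))
```

Sharing a single compiled pattern between the snipper and the highlighter mirrors the way the synopsis passes `$re` to both objects.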
Search::Lemur 1.00
Search::Lemur is a Perl class to query a Lemur server and parse the results.
SYNOPSIS
use Search::Lemur;
my $lem = Search::Lemur->new("http://url/to/lemur.cgi");
# run some queries, and get back an array of results
# a query with a single term:
my @results1 = $lem->query("encryption");
# a query with two terms:
my @results2 = $lem->query("encryption MD5");
# get corpus term frequency of MD5:
my $md5ctf = $results2[1]->ctf();
This module makes it easy to interact with a Lemur Toolkit for Language Modeling and Information Retrieval server for information retrieval exercises. For more information on Lemur, see http://www.lemurproject.org/.
This module takes care of all parsing of responses from the server. You can just pass a query as a space-separated list of terms, and the module will give you back an array of result objects.
DGS Search 0.9.6
DGS Search was created to provide an easy-to-install search utility.
DGS Search is aimed at supporting small to medium-sized web sites.
Search::QueryParser 0.91
Search::QueryParser parses a query string into a data structure suitable for external search engines.
SYNOPSIS
my $qp = new Search::QueryParser;
my $s = '+mandatoryWord -excludedWord +field:word "exact phrase"';
my $query = $qp->parse($s) or die "Error in query : " . $qp->err;
$someIndexer->search($query);
# query with comparison operators and implicit plus (second arg is true)
$query = $qp->parse("txt~^foo.* date>=01.01.2001 date<...
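To make "a data structure suitable for external search engines" concrete, here is a hedged Python sketch that splits a query like the one in the synopsis into mandatory, optional and excluded terms. It is not Search::QueryParser's grammar: the `TOKEN` pattern, the `parse` function and the dictionary layout are assumptions for illustration, and comparison operators such as `>=` are not handled.

```python
import re

# Optional +/- sign, optional "field:" prefix, then a quoted phrase or a bare word.
TOKEN = re.compile(r'([+-]?)(?:(\w+):)?(?:"([^"]*)"|(\S+))')

def parse(query):
    """Group query terms under '+' (mandatory), '' (optional) and '-' (excluded)."""
    result = {"+": [], "": [], "-": []}
    for sign, field, phrase, word in TOKEN.findall(query):
        result[sign].append({
            "field": field,
            "value": phrase if phrase else word,
            "is_phrase": bool(phrase),
        })
    return result

parsed = parse('+mandatoryWord -excludedWord +field:word "exact phrase"')
for sign, terms in parsed.items():
    print(repr(sign), terms)
```

Keeping the three groups separate lets a backend translate them directly into its own "must", "should" and "must not" clauses.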
Search::VectorSpace 0.02
Search::VectorSpace is a very basic vector-space search engine.
SYNOPSIS
use Search::VectorSpace;
my @docs = ...;
my $engine = Search::VectorSpace->new( docs => \@docs, threshold => .04);
$engine->build_index();
while ( my $query = <> ) {
    my %results = $engine->search( $query );
    print join "\n", keys %results;
}
This module takes a list of documents (in English) and builds a simple in-memory search engine using a vector space model. Documents are stored as PDL objects, and after the initial indexing phase, the search should be very fast. This implementation applies a rudimentary stop list to filter out very common words, and uses a cosine measure to calculate document similarity. All documents above a user-configurable similarity threshold are returned.
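The approach described above (stop list, term vectors, cosine measure, similarity threshold) can be sketched as follows. This is a plain-Python illustration, not the module's PDL-based code; the stop-word set, the raw term-count weighting and the function names are assumptions.

```python
import math
import re
from collections import Counter

# A tiny, hypothetical stop list; the module's own list is not shown in the source.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def vectorize(text):
    """Turn text into a sparse term-count vector, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def cosine(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def search(docs, query, threshold=0.04):
    """Return {document: similarity} for documents scoring above the threshold."""
    q = vectorize(query)
    scores = {doc: cosine(vectorize(doc), q) for doc in docs}
    return {doc: s for doc, s in scores.items() if s > threshold}

docs = ["The cat sat on the mat", "Dogs chase the cat", "Stock markets fell"]
results = search(docs, "cat")
for doc, score in results.items():
    print(round(score, 3), doc)
```

A real engine would compute the document vectors once at indexing time rather than on every query, which is the step `build_index` performs in the synopsis.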
Search::Glimpse 0.02
Search::Glimpse is a Perl extension to communicate with a Glimpse server.
SYNOPSIS
use Search::Glimpse;
my $glimpse = Search::Glimpse->new;
my @results = $glimpse->search("search this string");
ABSTRACT
This module is an extension for using a Glimpse server from Perl.
Quick hack to connect to glimpse server.
new
Creates a new glimpse object.
search
Search on a glimpse object
hits
Returns the number of hits...
files
Returns the number of files...
List::Search 0.3
List::Search is a Perl module for fast searching of sorted lists.
SYNOPSIS
use List::Search qw( list_search nlist_search custom_list_search );
# Create a list to search
my @list = sort qw( bravo charlie delta );
# Search for a value, returns the index of first match
print list_search( 'alpha', \@list ); # 0
print list_search( 'charlie', \@list ); # 1
print list_search( 'zebra', \@list ); # -1
# Search numerically
my @numbers = sort { $a <=> $b } ( 10, 20, 100, 200, );
print nlist_search( 20, \@numbers ); # 1
# Search using some other comparison
my $cmp_code = sub { lc( $_[0] ) cmp lc( $_[1] ) };
my @custom_list = sort { $cmp_code->( $a, $b ) } qw( FOO bar BAZ bundy );
print custom_list_search( $cmp_code, 'foo', \@custom_list );
This module lets you quickly search a sorted list. It returns the index of the first entry that matches or, if there is no exact match, the index of the first entry that is greater than the search key.
For example, in the list my @list = qw( bob dave fred ); searching for 'dave' will return 1, as $list[1] eq 'dave'. Searching for 'charles' will also return 1, as 'dave' is the first entry that is greater than 'charles'.
If no entry matches or is greater than the key, then -1 is returned. You can either check for this or use it as an index to get the last value in the list. Which approach you choose will depend on what you are trying to do.
The actual searching is done using a binary search which is very fast.
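The described behaviour (index of the first entry equal to or greater than the key, -1 if there is none) maps directly onto a lower-bound binary search. Below is a short Python sketch using the standard `bisect` module; the function name is borrowed from the synopsis for readability, but this is an illustration and not the module's code.

```python
from bisect import bisect_left

def list_search(key, sorted_list):
    """Index of the first entry >= key, or -1 if every entry is smaller."""
    i = bisect_left(sorted_list, key)
    return i if i < len(sorted_list) else -1

names = ["bob", "dave", "fred"]
print(list_search("dave", names))     # 1
print(list_search("charles", names))  # 1
print(list_search("zebra", names))    # -1
```

Each call halves the remaining range at every step, so the cost grows with the logarithm of the list length.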
WWW::Search 2.488
WWW::Search is a collection of Perl modules which provide an API to WWW search engines.
We include two applications built from this library: AutoSearch (a program to automate tracking of search results over time), and a small demonstration program to drive the library. Back-ends for other search engines and more sophisticated clients are currently under development.
WWW::Search includes AutoSearch, a program to automate web-based searches.
pro-search 0.17.2
pro-search is a crawler for FTP servers, SMB shares, HTTP servers, and DC++ networks.
Search::FreeText 0.05
Search::FreeText is a free text indexing module for medium-to-large text corpuses.
SYNOPSIS
my $text = new Search::FreeText(-db => ['DB_File', "stories.db"]);
$text->open_index();
$text->clear_index();
$text->index_document(1, "Hello world");
$text->index_document(2, "World in motion");
$text->index_document(3, "Cruel crazy beautiful world");
$text->index_document(4, "Hey crazy");
$text->close_index();
$text->open_index();
foreach ($text->search("Crazy", 10)) {
    print "$_->[0], $_->[1]\n";
};
$text->close_index();
This module provides free text searching in a relatively open manner. It allows a persistent inverted file index to be constructed and managed (within limits), and then to be searched fairly efficiently. The module depends on a DBM module of some kind to manage the inverted file (DB_File is usually the best choice, as it is quite fast, quite scalable, and accepts the long values that are needed for performance).
The free text searching algorithm used is the BM25 weighting scheme described in Robertson, S. E., Walker, S., Beaulieu, M. M., Gatford, M., and Payne, A. (1995). Okapi at TREC-4, in NIST Special Publication 500-236, the Fourth Text Retrieval Conference (TREC-4), pages 73-96.
Much of the module depends on an open lexical analysis system, which is implemented by Search::FreeText::LexicalAnalysis. This is where all the word splitting and stemming is handled (Lingua::Stem is used for the stemming).
Using the module is quite simple: you can open an index and close it, and while it is open you add documents as strings, each with a key of your own choosing. You can search the corpus using a string, and you get back a list of matches, each an array of your own document key and a relevance measure. So, for example, the keys might be database table keys, URLs, file names, anything like that will do. This makes Search::FreeText a very useful package to implement fairly efficient and high quality search systems.
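The workflow above (index documents under your own keys, then get back key and relevance pairs) can be sketched with an in-memory index and a BM25-style score. This Python illustration is not the module's code: the `TinyIndex` class, the parameters `k1=1.2` and `b=0.75`, and the smoothed IDF variant are assumptions, and the stemming and persistent DBM storage the module provides are left out.

```python
import math
import re
from collections import Counter

class TinyIndex:
    """In-memory toy index scored with a common BM25 formulation."""

    def __init__(self, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.docs = {}  # key -> Counter of term frequencies

    def index_document(self, key, text):
        self.docs[key] = Counter(re.findall(r"\w+", text.lower()))

    def search(self, query, limit):
        """Return up to `limit` (key, relevance) pairs, best match first."""
        n = len(self.docs)
        if n == 0:
            return []
        avg_len = sum(sum(tf.values()) for tf in self.docs.values()) / n
        scores = Counter()
        for term in set(re.findall(r"\w+", query.lower())):
            df = sum(1 for tf in self.docs.values() if term in tf)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            for key, tf in self.docs.items():
                f = tf[term]
                if f:
                    length = sum(tf.values())
                    denom = f + self.k1 * (1 - self.b + self.b * length / avg_len)
                    scores[key] += idf * f * (self.k1 + 1) / denom
        return scores.most_common(limit)

text = TinyIndex()
text.index_document(1, "Hello world")
text.index_document(2, "World in motion")
text.index_document(3, "Cruel crazy beautiful world")
text.index_document(4, "Hey crazy")
hits = text.search("Crazy", 10)
for key, relevance in hits:
    print(key, round(relevance, 3))
```

With these documents the shorter "Hey crazy" ranks above the longer one, which shows the length normalisation that distinguishes BM25 from a plain term count.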
Google Search 0.1
Google Search is a desktop tool with which you can search for anything you want on the Google engine directly from your desktop.
Installation:
To compile use qmake then make
Example:
bash: qmake mio.pro
bash: make
bash: ./mio
It uses Firefox only. I will include Konqueror in the next update.
Assign it a global shortcut.
WWW::Search::YahooNews 1.00
WWW::Search::YahooNews is a Perl backend for searching Yahoo News.
SYNOPSIS
use WWW::Search;
$query = "Bob Hope";
$search = new WWW::Search('YahooNews');
$search->native_query(WWW::Search::escape_query($query));
$search->maximum_to_retrieve(100);
while (my $result = $search->next_result()) {
    $url = $result->url;
    $title = $result->title;
    $desc = $result->description;
    print "$title ($url)\n$desc\n";
}
This class is a Yahoo specialization of WWW::Search. It handles making and interpreting Yahoo News searches. Yahoo allows searching a wide variety of news sources, such as SEC and PRWire, to name a few. http://www.search.news.yahoo.com.
HOW DOES IT WORK?
native_setup_search is called (from WWW::Search::setup_search) before we do anything. It initializes our private variables (which all begin with underscore) and sets up a URL to the first results page in {_next_url}.
native_retrieve_some is called (from WWW::Search::retrieve_some) whenever more hits are needed. It calls WWW::Search::http_request to fetch the page specified by {_next_url}. It then parses this page, appending any search hits it finds to {cache}. If it finds a "next" button in the text, it sets {_next_url} to point to the page for the next set of results; otherwise it sets it to undef to indicate we're done.
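The retrieval loop described above (fetch the page at the next URL, append its hits to a cache, follow the "next" link until there is none) can be sketched in Python. Nothing here is WWW::Search code: `fetch_page` is a stand-in that reads from an in-memory dictionary instead of making an HTTP request, and the page layout is invented for the example.

```python
def fetch_page(url, pages):
    """Stand-in for an HTTP request: look the page up in an in-memory dict."""
    return pages[url]

def retrieve_all(first_url, pages):
    """Follow 'next' links from first_url, collecting hits from every page."""
    cache = []
    next_url = first_url
    while next_url is not None:
        page = fetch_page(next_url, pages)
        cache.extend(page["hits"])   # append this page's hits to the cache
        next_url = page.get("next")  # None plays the role of undef: we are done
    return cache

pages = {
    "page1": {"hits": ["a", "b"], "next": "page2"},
    "page2": {"hits": ["c"], "next": None},
}
print(retrieve_all("page1", pages))  # ['a', 'b', 'c']
```

The real backend fetches one page per call and only when more hits are needed; the sketch drains all pages at once to keep the control flow visible.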
WWW::Search::Nomade 1.3
WWW::Search::Nomade is a Perl class for searching Nomade.
SYNOPSIS
use WWW::Search;
my $oSearch = new WWW::Search('Nomade');
$oSearch->maximum_to_retrieve(100);
#$oSearch ->{_debug}=1;
# Create request
$oSearch->native_query(WWW::Search::escape_query("cgi"));
# or Make an international search (on google db)
$oSearch->native_query(WWW::Search::escape_query("cgi"),
{ opt => 1 });
print "I find ", $oSearch->approximate_result_count(),"\n";
while (my $oResult = $oSearch->next_result())
{ print "Url:", $oResult->url,"\n","Titre:", $oResult->title,"\n"; }
This class is a Nomade specialization of WWW::Search. It handles making and interpreting searches on Nomade (http://www.Nomade.fr), a French search engine.
This class exports no public interface; all interaction should be done through WWW::Search objects.
Search::Dict 5.8.8
Search::Dict is a Perl module to search for a key in a dictionary file.
SYNOPSIS
use Search::Dict;
look *FILEHANDLE, $key, $dict, $fold;
use Search::Dict;
look *FILEHANDLE, $params;
Sets file position in FILEHANDLE to be first line greater than or equal (stringwise) to $key. Returns the new file position, or -1 if an error occurs.
The flags specify dictionary order and case folding:
If $dict is true, search by dictionary order (ignore anything but word characters and whitespace). The default is to honour all characters.
If $fold is true, ignore case. The default is to honour case.
If there are only three arguments and the third argument is a hash reference, the keys of that hash can have values dict, fold, and comp or xfrm (see below), and their corresponding values will be used as the parameters.
If a comparison subroutine (comp) is defined, it must return less than zero, zero, or greater than zero, if the first comparand is less than, equal, or greater than the second comparand.
If a transformation subroutine (xfrm) is defined, its value is used to transform the lines read from the filehandle before their comparison.
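Positioning a filehandle at the first line greater than or equal to a key can be done without reading the whole file, as long as the file is sorted. The Python sketch below shows one way to do it, with a binary search over byte offsets followed by a short linear scan. It is an illustration, not the module's source: only the `fold` option is modelled, the file is assumed to be opened in binary mode with newline-terminated lines, and error handling (the -1 return) is omitted.

```python
import io

def look(fh, key, fold=False):
    """Seek fh to the first line >= key in a sorted file; return that offset."""
    norm = bytes.lower if fold else (lambda b: b)
    key = norm(key)
    fh.seek(0, io.SEEK_END)
    lo, hi = 0, fh.tell()
    # Narrow [lo, hi): the first full line after offset lo is known to be < key.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        fh.seek(mid)
        fh.readline()                # skip the line we landed in the middle of
        line = fh.readline()
        if line and norm(line.rstrip(b"\n")) < key:
            lo = mid
        else:
            hi = mid
    # Linear scan from lo to the first line that is >= key (or end of file).
    fh.seek(lo)
    if lo:
        fh.readline()
    while True:
        pos = fh.tell()
        line = fh.readline()
        if not line or norm(line.rstrip(b"\n")) >= key:
            fh.seek(pos)
            return pos

words = io.BytesIO(b"apple\nbanana\ncherry\ndate\n")
print(look(words, b"c"))   # 13
print(words.readline())    # b'cherry\n'
```

Because the handle is left at the returned offset, the caller can simply keep reading lines from there, which is how `look` is meant to be used.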