lucene
Lucene 0.13
Lucene is a Perl API to the C++ port of the Lucene search engine.
SYNOPSIS
Initialize/Empty Lucene index
my $analyzer = new Lucene::Analysis::Standard::StandardAnalyzer();
my $store = Lucene::Store::FSDirectory->getDirectory("/home/lucene", 1);
my $tmp_writer = new Lucene::Index::IndexWriter($store, $analyzer, 1);
$tmp_writer->close;
undef $tmp_writer;
Choose your Analyzer (string tokenizer)
# lowercases text and splits it at non-letter characters
my $analyzer = new Lucene::Analysis::SimpleAnalyzer();
# same as before and removes stop words
my $analyzer = new Lucene::Analysis::StopAnalyzer();
# same as before but you provide your own stop words
my $analyzer = new Lucene::Analysis::StopAnalyzer([qw/that this in or and/]);
# splits text at whitespace characters
my $analyzer = new Lucene::Analysis::WhitespaceAnalyzer();
# lowercases text, tokenizes it based on a grammar that
# leaves named authorities intact (e-mails, company names,
# web hostnames, IP addresses, etc.) and removes stop words
my $analyzer = new Lucene::Analysis::Standard::StandardAnalyzer();
# same as before but you provide your own stop words
my $analyzer = new Lucene::Analysis::Standard::StandardAnalyzer([qw/that this in or and/]);
# takes string as it is (only when using clucene-0.9.17 or above)
my $analyzer = new Lucene::Analysis::KeywordAnalyzer();
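To make the differences between these analyzers concrete, here is a minimal Python sketch of the tokenization rules described in the comments above. It does not call Lucene or CLucene. The function names and the stop-word list are assumptions chosen for illustration, and the letter test is limited to ASCII for brevity.

```python
import re

# Hypothetical stop-word list for illustration; not Lucene's built-in list.
STOP_WORDS = {"that", "this", "in", "or", "and"}

def simple_analyze(text):
    """Lowercase the text and split it at non-letter (non a-z) characters."""
    return [tok for tok in re.split(r"[^a-z]+", text.lower()) if tok]

def stop_analyze(text, stop_words=STOP_WORDS):
    """Same as simple_analyze, then drop stop words."""
    return [tok for tok in simple_analyze(text) if tok not in stop_words]

def whitespace_analyze(text):
    """Split at whitespace only; case and punctuation are preserved."""
    return text.split()

def keyword_analyze(text):
    """Take the string as it is: the whole input becomes a single token."""
    return [text]

sample = "Lucene and Perl: search in text"
print(simple_analyze(sample))
print(stop_analyze(sample))
print(whitespace_analyze(sample))
print(keyword_analyze(sample))
```

The grammar-based StandardAnalyzer is deliberately left out of the sketch, since its tokenization rules are not spelled out in this synopsis.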
Create a custom Analyzer
package MyAnalyzer;
use base 'Lucene::Analysis::Analyzer';
# You MUST call SUPER::new if you implement new()
sub new {
    my $class = shift;
    my $self = $class->SUPER::new();
    # ...
    return $self;
}
sub tokenStream {
    my ($self, $field, $reader) = @_;
    my $ret = new Lucene::Analysis::StandardTokenizer($reader);
    # keyword fields are returned unfiltered
    if ($field eq "MyKeywordField") {
        return $ret;
    }
    # all other fields are lowercased and stripped of stop words
    $ret = new Lucene::Analysis::LowerCaseFilter($ret);
    $ret = new Lucene::Analysis::StopFilter($ret, [qw/foo bar bax/]);
    return $ret;
}
package main;
my $analyzer = new MyAnalyzer;
Choose your Store (storage engine)
# in-memory storage
my $store = new Lucene::Store::RAMDirectory();
# disk-based storage
my $store = Lucene::Store::FSDirectory->getDirectory("/home/lucene", 0);
Open and configure an IndexWriter
my $writer = new Lucene::Index::IndexWriter($store, $analyzer, 0);
# optional settings for power users
$writer->setMergeFactor(100);
$writer->setUseCompoundFile(0);
$writer->setMaxFieldLength(255);
$writer->setMinMergeDocs(10);
$writer->setMaxMergeDocs(100);
Create Documents and add Fields
my $doc = new Lucene::Document;
# field gets analyzed, indexed and stored
$doc->add(Lucene::Document::Field->Text("content", $content));
# field gets indexed and stored
$doc->add(Lucene::Document::Field->Keyword("isbn", $isbn));
# field gets just stored
$doc->add(Lucene::Document::Field->UnIndexed("sales_rank", $sales_rank));
# field gets analyzed and indexed
$doc->add(Lucene::Document::Field->UnStored("categories", $categories));
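The four field kinds differ only in whether a value is analyzed, indexed, and stored. The following Python sketch models that with a toy in-memory index. It is an illustration under stated assumptions, not how Lucene stores data: `ToyIndex`, `FIELD_KINDS`, and the whitespace-and-lowercase "analysis" are all hypothetical.

```python
from collections import defaultdict

# (analyzed, indexed, stored) flags for the four field kinds described above.
FIELD_KINDS = {
    "Text":      (True,  True,  True),
    "Keyword":   (False, True,  True),
    "UnIndexed": (False, False, True),
    "UnStored":  (True,  True,  False),
}

class ToyIndex:
    """A toy in-memory index used only to illustrate the field flags."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids
        self.stored = []                  # doc id -> {field: original value}

    def add(self, fields):
        """Add one document given as a list of (kind, name, value) tuples."""
        doc_id = len(self.stored)
        kept = {}
        for kind, name, value in fields:
            analyzed, indexed, stored = FIELD_KINDS[kind]
            if indexed:
                # Analyzed fields are split into lowercase terms;
                # non-analyzed fields are indexed as one exact term.
                terms = value.lower().split() if analyzed else [value]
                for term in terms:
                    self.postings[(name, term)].add(doc_id)
            if stored:
                kept[name] = value
        self.stored.append(kept)
        return doc_id

    def search(self, name, term):
        """Return the sorted ids of documents containing term in field name."""
        return sorted(self.postings.get((name, term), set()))

index = ToyIndex()
index.add([
    ("Text", "content", "Full Text Search"),
    ("Keyword", "isbn", "0-13-110362-8"),
    ("UnIndexed", "sales_rank", "42"),
    ("UnStored", "categories", "Books Computing"),
])
print(index.search("content", "text"))    # searchable and retrievable
print(index.search("sales_rank", "42"))   # stored only, so not searchable
print(index.stored[0])                    # "categories" is searchable but not kept
```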
Download (0.018MB)
Added: 2007-05-10 License: Perl Artistic License Price:
897 downloads
Plucene 1.25
Plucene is a Perl port of the Lucene search engine.
SYNOPSIS
Create Documents by adding Fields:
my $doc = Plucene::Document->new;
$doc->add(Plucene::Document::Field->Text(content => $content));
$doc->add(Plucene::Document::Field->Text(author => "Your Name"));
Choose Your Analyser and add documents to an Index Writer
my $analyzer = Plucene::Analysis::SimpleAnalyzer->new();
my $writer = Plucene::Index::Writer->new("my_index", $analyzer, 1);
$writer->add_document($doc);
undef $writer; # close
Search by building a Query
my $parser = Plucene::QueryParser->new({
    analyzer => Plucene::Analysis::SimpleAnalyzer->new(),
    default  => "text" # Default field for non-specified queries
});
my $query = $parser->parse('author:"Your Name"');
Then pass the Query to an IndexSearcher and collect hits
my $searcher = Plucene::Search::IndexSearcher->new("my_index");
my @docs;
my $hc = Plucene::Search::HitCollector->new(collect => sub {
    my ($self, $doc, $score) = @_;
    push @docs, $searcher->doc($doc);
});
$searcher->search_hc($query => $hc);
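The hit collector is a callback: the searcher walks the index and calls the collector once per matching document, passing a document number and a score. A minimal Python sketch of that pattern follows. `ToySearcher` is hypothetical, and its score (the number of phrase occurrences) is an assumption for illustration, not Plucene's scoring.

```python
class ToySearcher:
    """A toy searcher over a list of dicts; illustrates the callback pattern only."""

    def __init__(self, docs):
        self.docs = docs  # list of {field: text}

    def doc(self, doc_id):
        """Fetch a document by its number, as the collector above does."""
        return self.docs[doc_id]

    def search_hc(self, field, phrase, collect):
        """Call collect(doc_id, score) for every document whose field contains phrase."""
        needle = phrase.lower()
        for doc_id, doc in enumerate(self.docs):
            count = doc.get(field, "").lower().count(needle)
            if count:
                collect(doc_id, float(count))

searcher = ToySearcher([
    {"author": "Your Name", "content": "first"},
    {"author": "Someone Else", "content": "second"},
    {"author": "your name and Your Name", "content": "third"},
])
hits = []
searcher.search_hc("author", "your name", lambda doc_id, score: hits.append((doc_id, score)))
docs = [searcher.doc(doc_id) for doc_id, _ in hits]
print(hits)
```

Because the collector decides what to keep, the caller can gather full documents, only the top scores, or just a count, without the searcher building a result list first.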
Download (0.32MB)
Added: 2007-06-23 License: Perl Artistic License Price:
856 downloads
PyLucene 2.1.0
PyLucene is a GCJ-compiled version of Java Lucene integrated with Python via SWIG.
PyLucene's goal is to allow you to use Lucene's text indexing and searching capabilities from Python. It is designed to be API-compatible with the latest version of Java Lucene.
PyLucene is supported on Mac OS X, Linux and Windows. Binaries for PyLucene are available below. See the INSTALL file for information about building PyLucene from sources.
Installation:
To build PyLucene from sources, please see the INSTALL file.
To install PyLucene binaries you just downloaded:
- install the files in the python directory into Python's site-packages directory
- if you downloaded binaries with Berkeley DB support, install the files in the db directory into the directory containing your Berkeley DB shared libraries, such as /usr/local/BerkeleyDB.4.3/lib
- if you are installing Unix (Mac OS X or Linux) binaries, install the files in the gcj directory into /usr/local/lib
Enhancements:
- This release wraps the newly-released Java Lucene 2.1.0.
Download (4.1MB)
Added: 2007-02-20 License: MIT/X Consortium License Price:
979 downloads
Apache Lucene 2.2.0
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java.
Apache Lucene is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Enhancements:
- Many new features, optimizations, and bugfixes have been added since 2.1.0, including "point-in-time" searching over NFS, payloads, function queries, and new APIs for pre-analyzed fields.
- This release includes index format changes that are not readable by older versions.
- It can both read and update older Lucene indexes.
- Adding to an index with an older format will cause it to be converted to the newer format.
Download (2.5MB)
Added: 2007-06-21 License: The Apache License 2.0 Price:
880 downloads
kio-clucene 0.1.0
kio-clucene provides an indexed search engine for KDE.
kio-clucene is a KDE search ioslave that uses clucene, a C++ implementation of lucene, a well-known full-featured text search engine library. kio-clucene gives any KDE application friendly access to searching the content of files inside directories, archives, and many virtual file systems. Optical character recognition (OCR) is supported by using a program like gocr, allowing the text content of images to be indexed.
Main features:
kio-clucene can search for
- full text content
- mimetype
- metainfo
- mod. dates
- path names
- multifield queries are possible (a multifield search-dialog can be used (or not))
kio-clucene can index
- local files
- remote files (ftp,pop3,... (basic support))
- archives (basic local support)
- read-only data stored on CDs and DVDs
- text inside images, using optical character recognition (OCR) programs like gocr
kio-clucene stores previews of
- all images
- all videos
kio-clucene can search on
- a single index
- multiple indexes
- Multithreaded search-daemon.
Integration with KDE
- kio-clucene integrates well with the existing KDE framework. You can use kio-clucene from
- konqueror location-bar
- kicker "run command"
- "Alt+F2" "run command"
- any apps, using the openfile dialog
- Example: from kpdf, you could open the file-dialog and search very fast for your pdfs like adobe acrobat reader can do.
The ioslave is built to be user-friendly
- it works a bit like the KDE remote ioslave
- it can call a search dialog that allows multifield queries
- it displays recent queries
- recent queries can be saved to desktop-files
- => this almost emulates "virtual folders"
Search-engine
- The search engine is based on clucene, a c++ implementation of apache/lucene. clucene allows for sophisticated queries like:
- proximity-words queries
- but kio-clucene does not yet implement queries of the form *substring.
Display of the results
- So far only standard KDE-listviews are used to display the URLs of the matching documents. No kpart is provided for the display/sorting of the results like beagle can do.
Instantaneous file indexing
- This is not provided by the indexer, but any KDE application can make use of DCOP to request the indexing of any URL. Instantaneous indexing is a kernel/kdecore problem. But Suse 9.3 already ships with beagle installed, and an inotify-enabled kernel. So those features should soon be available with KDE too.
kio-clucene build-system is Scons-enabled
- kio-clucene makes use of the SCons build and install system.
Download (0.44MB)
Added: 2007-02-01 License: GPL (GNU General Public License) Price:
997 downloads
PyLUcene SHell 0.2.0
PyLUcene SHell (Plush) is an interactive shell to inspect a Lucene store.
Main features:
- View store information.
- View indexes definition.
- Search using the Lucene Query Parser Syntax.
- Sort result list.
- Browse by document number.
- Top term occurrences for a field, matching a regex.
- Supports PyLucene 1.9.1 and 2.0.0.
- Interactive shell with Emacs-like command history and editing features.
- Command line tool and thus scriptable.
- Easy installation, no java required.
- Can load NXLucene analyzers.
- Plush is free software distributed under the GNU GPL.
- Plush is written in python and can be easily customized.
Installation:
You need python 2.4 (2.3 not tested) with the readline support (--enable-readline)
Plush requires PyLucene which is easy to install using binaries.
Here is an example on how to install the latest PyLucene 2.0.0-3 (Lucene 2.0.0-453447) on ubuntu:
cd /tmp
wget http://downloads.osafoundation.org/PyLucene/linux/ubuntu/PyLucene-2.0.0-3.tar.gz
tar xzvf PyLucene-2.0.0-3.tar.gz
cd PyLucene-2.0.0-3/
sudo cp -r python/* /usr/lib/python2.4/site-packages/
sudo cp -r gcj/* /usr/local/lib
Visit the PyLucene site for other pre-built binaries.
Plush is a pure python package that you can get from the Python Cheese Shop
tar xzvf plush-X.Y.Z.tar.gz
cd plush
Install plush either with:
sudo make install
or using the pythonic way:
python setup.py build
sudo python setup.py install
That's all: no .jar or Java required.
Download (0.022MB)
Added: 2007-02-28 License: Perl Artistic License Price:
969 downloads
Lucene::Search::Highlight 0.01
Lucene::Search::Highlight is a Perl module for highlighting terms in Lucene search results.
SYNOPSIS
Load highlight classes into namespace
use Lucene::Search::Highlight;
Create Formatter and Query Scorer
my $formatter = new Lucene::Search::Highlight::SimpleHTMLFormatter("<b>", "</b>");
my $scorer = new Lucene::Search::Highlight::QueryScorer($query);
Create Highlighter
my $highlighter = new Lucene::Search::Highlight::Highlighter($formatter, $scorer);
Get best fragments with highlighted terms
my $fragement = $highlighter->getBestFragment($analyzer, $field, $text);
my $fragements = $highlighter->getBestFragments($analyzer, $field, $text, $num_fragements, $separator);
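The idea behind a formatter, a scorer, and a "best fragment" can be sketched in a few lines of Python. This is a simplified illustration, not the module's algorithm: it assumes fragments are whole sentences and that a fragment's score is the number of query-term matches it contains. The function names are hypothetical.

```python
import re

def highlight(text, terms, pre="<b>", post="</b>"):
    """Wrap every whole-word, case-insensitive match of a query term in pre/post."""
    if not terms:
        return text
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)

def best_fragment(text, terms, pre="<b>", post="</b>"):
    """Return the highlighted sentence that contains the most query-term matches."""
    fragments = [f.strip() for f in re.split(r"(?<=[.!?])\s+", text) if f.strip()]
    if not fragments:
        return ""
    wanted = [t.lower() for t in terms]

    def score(fragment):
        words = re.findall(r"\w+", fragment.lower())
        return sum(words.count(t) for t in wanted)

    # max() keeps the earliest fragment when several share the top score.
    return highlight(max(fragments, key=score), terms, pre, post)

text = "Lucene is a search library. Nothing relevant here. Perl wraps it."
print(best_fragment(text, ["lucene", "search"]))
```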
Download (0.006MB)
Added: 2007-04-03 License: Perl Artistic License Price:
936 downloads
Lire - Lucene Image REtrieval 0.5.4
Lire, the Lucene Image REtrieval library, is a simple way to create a Lucene index of image features for content-based image retrieval (CBIR).
The features used are taken from the MPEG-7 standard: ScalableColor, ColorLayout and EdgeHistogram. Methods for searching the index are also provided.
The LIRE library is part of the Caliph & Emir project and aims to provide the CBIR features of Caliph & Emir to other Java projects in an easy and lightweight way.
Creating an Index
Use DocumentBuilderFactory to create a DocumentBuilder, which will create Lucene Documents from images. Add these documents to an index like this:
System.out.println(">> Indexing " + images.size() + " files.");
DocumentBuilder builder = DocumentBuilderFactory.getExtensiveDocumentBuilder();
IndexWriter iw = new IndexWriter(indexPath, new SimpleAnalyzer(), true);
int count = 0;
long time = System.currentTimeMillis();
for (String identifier : images) {
    Document doc = builder.createDocument(new FileInputStream(identifier), identifier);
    iw.addDocument(doc);
    count++;
    if (count % 25 == 0) System.out.println(count + " files indexed.");
}
long timeTaken = (System.currentTimeMillis() - time);
float sec = ((float) timeTaken) / 1000f;
System.out.println(sec + " seconds taken, " + (timeTaken / count) + " ms per image.");
iw.optimize();
iw.close();
Searching in an Index
Use the ImageSearcherFactory for creating an ImageSearcher, which will retrieve the images for you from the index.
IndexReader reader = IndexReader.open(indexPath);
ImageSearcher searcher = ImageSearcherFactory.createDefaultSearcher();
FileInputStream imageStream = new FileInputStream("image.jpg");
BufferedImage bimg = ImageIO.read(imageStream);
// searching for an image:
ImageSearchHits hits = null;
hits = searcher.search(bimg, reader);
for (int i = 0; i < 5; i++) {
    System.out.println(hits.score(i) + ": " + hits.doc(i).getField(DocumentBuilder.FIELD_NAME_IDENTIFIER).stringValue());
}
// searching for a document:
Document document = hits.doc(0);
hits = searcher.search(document, reader);
for (int i = 0; i < 5; i++) {
    System.out.println(hits.score(i) + ": " + hits.doc(i).getField(DocumentBuilder.FIELD_NAME_IDENTIFIER).stringValue());
}
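Content-based retrieval of this kind boils down to two steps: extract a feature vector per image at indexing time, then rank indexed images by their distance to the query image's vector. The Python sketch below shows that flow with a plain quantized RGB histogram and an L1 distance. Both are stand-ins chosen for brevity and are not the MPEG-7 descriptors Lire uses. All names are hypothetical, and images are represented as lists of `(r, g, b)` tuples.

```python
def color_histogram(pixels, bins=4):
    """Quantize each RGB channel into `bins` levels and return a normalized histogram."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # c * bins // 256 maps a channel value 0..255 to a level 0..bins-1.
        idx = (r * bins // 256) * bins * bins + (g * bins // 256) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [v / total for v in hist]

def l1_distance(a, b):
    """Sum of absolute differences between two histograms of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))

def search(query_pixels, index, top_n=5):
    """index maps identifier -> histogram; return (distance, identifier) pairs, closest first."""
    query = color_histogram(query_pixels)
    ranked = sorted((l1_distance(query, hist), ident) for ident, hist in index.items())
    return ranked[:top_n]

# Indexing: one feature vector per (tiny, synthetic) image.
index = {
    "red.jpg": color_histogram([(255, 0, 0)] * 4),
    "blue.jpg": color_histogram([(0, 0, 255)] * 4),
}
# Searching: a dark red query image ranks the red image first.
for distance, identifier in search([(200, 10, 10)] * 4, index):
    print(distance, identifier)
```

Note that this sketch ranks by distance, so smaller is better; the source does not state how Lire's own scores are oriented.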
Enhancements:
- An issue where the scalable color descriptor (color histogram) was not compliant with the MPEG-7 standard was fixed.
- The color only search was changed to use the color layout descriptor and a bug in the edge histogram descriptor was hunted down.
- The LireDemo GUI application has also been updated: A new function for creating image mosaics has been introduced and the indexing of digital photos is now faster than ever as only the EXIF thumbnails - if available - are used instead of the whole image.
Download (MB)
Added: 2007-07-10 License: GPL (GNU General Public License) Price:
848 downloads
LibreSource 2.2
LibreSource is a software platform dedicated to software development and management of distributed communities.
LibreSource offers configuration management with the generic synchronization module So6 from INRIA research. It includes numerous tools dealing with project and user management, including bug trackers, forums, wiki pages, mailing lists, etc. An integrated search engine (Lucene) indexes the data hosted on the platform in full text mode.
With its naming tree, LibreSource makes it possible to hierarchize access to documents and resources. It makes it easy to define public and private areas. Its decentralized approach makes it possible to develop a set of robust applications while disregarding the underlying software architecture.
Enhancements:
- Integration of Jabber instant messaging (Tigase server and Jeti client).
- LDAP group resource (map groups of users from LDAP groups).
- Dropbox resource (upload files to a confidential area).
- Form resource (create an electronic form online).
- Smart security rights behaviour.
- RSS feeds on most of the resources, public or private.
- Files included into a download area are easier to sort and to comment.
- Forums show the thread content inside the response page.
- A best children macro has been added.
- Several bugs have been fixed.
- Performance improvements.
- A smaller memory footprint.
- Better packaging.
Download (79.4MB)
Added: 2007-05-16 License: QPL (QT Public License) Price:
927 downloads
Apache Nutch 0.9
The Nutch project is Web search software that builds on Lucene Java, adding Web-specific components such as a crawler, a link-graph database, and parsers for HTML and other document formats.
Interesting files include:
docs/api/index.html
Javadocs for the Nutch software.
CHANGES.txt
Log of changes to Nutch.
For the latest information about Nutch, please visit our website at:
http://lucene.apache.org/nutch/
and our wiki, at:
http://wiki.apache.org/nutch/
To get started using Nutch, read the tutorial:
http://lucene.apache.org/nutch/tutorial.html
Enhancements:
- This release includes several critical bugfixes, as well as key speedups.
Download (68MB)
Added: 2007-04-06 License: GPL (GNU General Public License) Price:
938 downloads
Pauker 1.7.5
Pauker is a generic flashcard program written in Java. It relies on a combination of ultra-short-term, short-term, and long-term memory.
You can use it to efficiently learn all the things you never want to forget, like vocabulary, capitals, important dates, etc.
Main features:
- completely free (OpenSource, GPL)
- flash card based,
- learning application,
- written in java
- using the leitner cardfile system,
- and works offline without the need of an internet connection.
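The Leitner cardfile idea mentioned in the list above can be sketched as follows: a card answered correctly moves one compartment forward, and a card answered wrongly goes back to the first compartment. This Python sketch shows only that generic rule. The class name, the number of compartments, and the answer comparison are assumptions, and Pauker's own timing between ultra-short-term, short-term, and long-term memory is not modeled.

```python
class LeitnerBox:
    """Generic Leitner cardfile: promote on a correct answer, demote to the start otherwise."""

    def __init__(self, compartments=3):
        self.compartments = [[] for _ in range(compartments)]

    def add_card(self, question, answer):
        """New cards always start in the first compartment."""
        self.compartments[0].append((question, answer))

    def review(self, level, given_answer):
        """Review the oldest card in compartment `level`; return True if the answer was right."""
        card = self.compartments[level].pop(0)
        correct = given_answer.strip().lower() == card[1].strip().lower()
        if correct:
            # Cards in the last compartment stay there.
            target = min(level + 1, len(self.compartments) - 1)
        else:
            target = 0
        self.compartments[target].append(card)
        return correct

box = LeitnerBox()
box.add_card("Capital of France?", "Paris")
print(box.review(0, "paris"))  # correct: the card moves to compartment 1
print(box.review(1, "Lyon"))   # wrong: the card returns to compartment 0
```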
Enhancements:
- Pauker now works with accidentally unpacked lesson files.
- The French translation was updated.
- Wrong newline characters are shown when an answer was mistyped.
- The chosen repeating method when inserting new cards is now saved between program starts.
- An unobtrusive button for donations was added.
- The internal search engine was updated to Apache Lucene v2.1.
- The internal keyboard focus and default button handling were cleaned up.
Download (2.9MB)
Added: 2007-05-16 License: GPL (GNU General Public License) Price:
898 downloads
Jetspeed 2.0
Jetspeed provides a JSR-168 compliant enterprise portal.
Jetspeed-2 is a full implementation of the Java Portlet API. It is fully compliant with the Portlet Specification 1.0 (JSR-168), has passed the TCK (Technology Compatibility Kit) suite, and is fully certified to the Java Portlet Standard.
Notable features include security components backed by LDAP and database implementations, and some robust administration interfaces. Custom portals can be built and deployed using the Jetspeed plugin for Maven.
Developers can use the Jetspeed PSML language to assemble portlets, and the Apache Portals Bridges project to bridge portals with existing technologies including Struts, JSF, PHP, and Perl.
For GUI designers, Jetspeed comes with several built-in templates used to decorate portals and portlets.
Main features:
- Fully compliant with Java Portlet API Standard 1.0 (JSR 168)
- Passed JSR-168 TCK Compatibility Test Suite
- J2EE Security based on JAAS Standard, JAAS DB Portal Security Policy
- LDAP Support for User Authentication
- Spring-based Components and Scalable Architecture
- Configurable Pipeline Request Processor
- Auto Deployment of Portlet Applications
- Jetspeed Component Java API
- Jetspeed AJAX XML API
- Declarative Security Constraints and JAAS Database Security Policy
- Runtime Portlet API Standard Role-based Security
- Portal Content Management and Navigations: Pages, Menus, Folders, Links
- Multithreaded Aggregation Engine
- PSML Folder CMS Navigations, Menus, Links
- Jetspeed SSO (Single Sign-on)
- Rules-based Profiler for page and resource location
- Integrates with most popular databases including Derby, MySQL, MS SQL, Postgres, Oracle, DB2, Hypersonic
- Client-independent capability engine (HTML, XHTML, WML, VML)
- Internationalization: Localized Portal Resources in 12 Languages
- Statistics Logging Engine
- Portlet Registry
- Full Text Search of Portlet Resources with Lucene
- User Registration
- Forgotten Password
- Rich Login and Password Configuration Management
Download (66.5MB)
Added: 2007-02-06 License: The Apache License Price:
991 downloads
Pyndexter 0.2
Pyndexter (pronounced poindexter) is an abstraction layer for full-text indexing engines. It presents a uniform query syntax to the user, includes a basic but functional pure-Python indexer, and has adapters for Hype, Hyperestraier, Lucene, Lupy, Pyndex, Swish-e and Xapian.
How do I install it?
Pyndexter should be installable with setuptools:
easy_install pyndexter
Enhancements:
- The API has been revamped considerably and is now much more flexible and extensible.
Download (0.052MB)
Added: 2007-02-21 License: BSD License Price:
975 downloads
Castore 1.1.2
Castore (from CApitalization & STORagE) follows a user-centered design approach to build an open archive platform, intended for creating institutional repositories managed by librarians in their respective institutions.
With this system, authors can store, convert (to XML), fully index, manage, preserve, promote, and distribute their digital documents.
It uses an assembly of components (Tomcat, OO, JDO, Saxon, Lucene, etc.) to build middleware applications, relying on XML and any relational database.
The system was developed with a component architecture (J2EE) so that the platform can be integrated into any intranet environment at minimal development cost.
Download (18.3MB)
Added: 2006-02-02 License: CeCILL (CeCILL Free Software License Agreement) Price:
1359 downloads
eSearch 1.1ea1
eSearch provides a Java-based search engine.
eSearch is a server-side, Java-based search engine that supplies basic search capabilities for Web use. Its basic capabilities can be extended to include intelligent agents and other expert-system behaviors.
Main features:
- Free and Open Source
- Written in Java, and therefore platform-independent
- Uses the Lucene API for text and index operations
- Supplies basic search capabilities comparable to other existing search engines, for both intranet and Internet use
- Allows the basic capabilities to be extended to include intelligent agents and other expert-system behaviors, with the end result of smarter searching than is possible with existing search engines
- Scalable to very large collections and numbers of users
- Can dynamically index multiple servers and sources simultaneously
- Includes a developer kit with the eSearch tag library, samples, and documentation
- Administration Center that lets you control all crawl and search operations in your enterprise
- Extensible architecture: you can plug in new third-party components
- Can be easily integrated with other Web applications, e.g. ePortal, eContent, eForum, eFAQ, eHelpDesk
Download (0.20MB)
Added: 2007-02-26 License: The Apache License Price:
971 downloads
Copyright Notice:
Software piracy is theft. Using cracks, passwords, serial numbers, registration codes, or key generators is illegal and prevents future software development. The Lucene search above lists only software in full, demo, and trial versions for free download. Download links come directly from our mirror sites or publisher sites; torrent files and links from rapidshare.com, yousendit.com, or megaupload.com are not allowed.