FileArchiveIndexer 1.29
FileArchiveIndexer 1.29 description
The system manages a large archive of files that changes regularly as files are added, moved, and removed. OCR software is used to index the content of image files, such as hard-copy paper documents saved as PDF files.
The initial indexing of a large archive can take weeks. This document gives an overview of how the system works and why certain design decisions were made. For technical API documentation, please see FileArchiveIndexer. This application relies heavily on POSIX. It is meant to run on Linux. Portability is not a goal of this package.
In maintaining a database that indexes an archive of files, there are two things we want to do. The first is to know where the files are; we refer to this as the Update Step. The other is to actually index the files; this is the Indexing Step.
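The separation of the two steps can be illustrated with a minimal Python sketch. This is not the package's own code or API: the function names, the in-memory dictionaries standing in for the database, and the use of a SHA-256 content digest to recognise unchanged data are all assumptions made for illustration.

```python
import hashlib
import os


def content_digest(path, chunk_size=65536):
    """Return a hex digest of the file's bytes (SHA-256 is an assumed choice)."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def update_step(root, locations):
    """Update Step: rebuild the path -> digest map for everything under root."""
    locations.clear()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            locations[path] = content_digest(path)
    return locations


def indexing_step(locations, indexed, extract_text):
    """Indexing Step: index each distinct digest once; return the new digests."""
    newly_indexed = []
    for path, digest in sorted(locations.items()):
        if digest in indexed:
            continue  # same bytes already indexed, e.g. a renamed or moved file
        indexed[digest] = extract_text(path)  # the slow part (OCR in the real system)
        newly_indexed.append(digest)
    return newly_indexed
```

Because the index in this sketch is keyed by content rather than by path, running `update_step` again after a rename changes only the location map, and the next `indexing_step` has nothing new to do.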
Major Features:
- The system will deal with a large archive of documents residing in a filesystem.
- If a file is renamed or moved, its contents should not be reindexed. If a file's data has not changed, there must be a way to acknowledge and reflect that.
- Indexing particular document types can be extremely time-intensive; for example, reading an image with OCR and turning it into text may take minutes.
- Multiple indexing processes should be able to run, possibly even from different computers, without interfering with each other. In short, a queue system manages the indexing.
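The queue requirement in the last feature can be sketched as follows. This is only an illustration under stated assumptions, not the package's actual mechanism: the table layout, the function names, and the choice of SQLite are hypothetical. Each worker claims one pending item inside a write-locking transaction, so two workers never receive the same item.

```python
import sqlite3


def open_queue(db_path):
    """Open (and create if needed) a hypothetical queue table."""
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are started and ended explicitly below.
    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        " digest TEXT PRIMARY KEY,"
        " claimed_by TEXT)"
    )
    return conn


def enqueue(conn, digest):
    """Add a content digest to the queue unless it is already present."""
    conn.execute("INSERT OR IGNORE INTO queue (digest) VALUES (?)", (digest,))


def claim_next(conn, worker_id):
    """Atomically claim one unclaimed item; return its digest or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT digest FROM queue"
            " WHERE claimed_by IS NULL ORDER BY digest LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE queue SET claimed_by = ? WHERE digest = ?",
                (worker_id, row[0]),
            )
        conn.execute("COMMIT")
        return None if row is None else row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

A worker would call `claim_next` in a loop, run the slow OCR on the claimed item, and stop when it returns `None`. SQLite file locking is only dependable for processes on one machine; workers on different computers would need a database server in place of the SQLite file assumed here.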