PDFlib TET 2.2
PDFlib TET (Text Extraction Toolkit) is software for reliably extracting text information from any PDF file.
In addition to low-level text retrieval, TET contains advanced content analysis algorithms for determining word boundaries and removing redundant text (such as shadows and artificial bold). Using the auxiliary pCOS interface you can retrieve arbitrary objects from the PDF, such as metadata and hypertext.
Fully functional evaluation versions of TET including documentation and samples are available from the TET download page for all supported platforms. Purchasing a license and applying the license key will fully enable the evaluation version for production deployment.
With PDFlib TET you can:
- extract text from PDF, e.g. to store it in a database
- implement a search engine for processing PDF
- convert the text content of PDF pages to XML for processing with other tools
- process PDFs based on their contents
Supported PDF Input
PDFlib TET supports all relevant flavors of PDF input:
- all PDF versions up to PDF 1.7 (Acrobat 8)
- all font and encoding types: base 14 fonts, TrueType, PostScript, OpenType, CID fonts
- encrypted PDF with 40- and 128-bit encryption (appropriate permission settings or password required)
Unicode
Although text in PDF is usually not encoded in Unicode, PDFlib TET will normalize the text from a PDF document to Unicode:
- TET converts all text contents to Unicode. In C the text will be returned in the UTF-8 or UTF-16 formats, and as native Unicode strings in all other language bindings.
- Ligatures and other multi-character glyphs will be decomposed into a sequence of their constituent Unicode characters.
- Vendor-specific Unicode assignments (Private Use Area, PUA) are identified, and mapped to characters in the common Unicode area if possible.
- Glyphs without appropriate Unicode mappings are identified as such, and are mapped to a configurable replacement character.
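The normalization steps listed above can be approximated with standard Unicode tooling. The following is a minimal sketch using Python's unicodedata module, not TET's actual implementation. The function name normalize_extracted, the pua_map parameter, and the use of U+FFFD as the replacement character are assumptions made for illustration.

```python
import unicodedata

# Assumed default; the text above only says the replacement is configurable.
REPLACEMENT = "\ufffd"


def normalize_extracted(text, pua_map=None, replacement=REPLACEMENT):
    """Decompose ligatures and map Private Use Area characters (illustrative)."""
    pua_map = pua_map or {}
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Co":
            # Private Use Area: use a known mapping if there is one,
            # otherwise fall back to the replacement character.
            out.append(pua_map.get(ch, replacement))
        else:
            # NFKC expands compatibility ligatures such as U+FB01 into "fi".
            out.append(unicodedata.normalize("NFKC", ch))
    return "".join(out)


ligature_text = normalize_extracted("\ufb01nd o\ufb03ce")
unmapped_pua = normalize_extracted("a\ue000b")
mapped_pua = normalize_extracted("a\ue000b", {"\ue000": "x"})
```

Note that NFKC also folds other compatibility characters (for example full-width letters), which may or may not be desirable for a given extraction task.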
Full CJK Support
TET includes full support for extracting Chinese, Japanese, and Korean text. All predefined CJK CMaps (encodings) are recognized; horizontal and vertical writing modes are supported.
Content Analysis and Word Identification
TET can be used to retrieve low-level glyph information, but also includes advanced algorithms for content analysis:
- Detect word boundaries to retrieve words instead of characters.
- Recombine the parts of hyphenated words.
- Remove duplicate instances of text, e.g. shadow and artificial bold text.
- Recombine paragraphs into reading order.
- Reorder text which is scattered over the page.
- Reconstruct lines of text.
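Two of the steps above, word boundary detection and recombining hyphenated words, can be illustrated with simple heuristics. This is a rough sketch under stated assumptions, not TET's algorithm: glyphs are assumed to arrive as (character, x position, width) tuples on a single line, and group_words, dehyphenate, and gap_ratio are hypothetical names.

```python
def group_words(glyphs, gap_ratio=0.25):
    """Group glyphs on one line into words based on horizontal gaps.

    glyphs: list of (char, x, width) tuples sorted left to right.
    A gap wider than gap_ratio * glyph width starts a new word.
    """
    words, current, prev_end = [], "", None
    for ch, x, width in glyphs:
        if prev_end is not None and x - prev_end > gap_ratio * width:
            words.append(current)
            current = ""
        current += ch
        prev_end = x + width
    if current:
        words.append(current)
    return words


def dehyphenate(lines):
    """Join lines into one string, merging words split by a trailing hyphen.

    Naive: a genuine hyphen at a line end (e.g. "low-" / "level") is lost.
    """
    text = ""
    for line in lines:
        line = line.strip()
        if text.endswith("-"):
            text = text[:-1] + line
        elif text:
            text += " " + line
        else:
            text = line
    return text


words = group_words([("H", 0, 6), ("i", 6, 3), ("y", 12, 6), ("o", 18, 6)])
joined = dehyphenate(["text extrac-", "tion toolkit"])
```

A production word finder would also need font size, baseline, and dictionary information to decide ambiguous cases.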
Geometry
TET provides precise metrics for the text, such as the position on the page, glyph widths, and text direction. Specific areas on the page can be excluded from or included in the text extraction, e.g. to ignore headers, footers, or margins.
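Area-based filtering of this kind can be sketched as a simple bounding-box test. The example below is an illustration only and does not use the TET API: words are assumed to be (text, x, y) tuples in PDF points with the origin at the lower left, and words_in_area is a hypothetical helper.

```python
def words_in_area(words, area):
    """Keep only words whose reference point lies inside the area.

    words: list of (text, x, y) tuples.
    area: (left, bottom, right, top) in the same coordinate system.
    """
    left, bottom, right, top = area
    return [
        text
        for text, x, y in words
        if left <= x <= right and bottom <= y <= top
    ]


# A 612 x 792 point page with a header near the top and a footer near the bottom.
page_words = [("Header", 72, 770), ("Body", 72, 400), ("Footer", 72, 20)]
body_only = words_in_area(page_words, (0, 50, 612, 740))
```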
Version restrictions:
- Unlicensed versions support all features, but will only process PDF documents with up to 10 pages and 1 MB size. Evaluation versions of TET must not be used for production purposes, but only for evaluating the product. Using TET for production purposes requires a valid TET license.
Enhancements:
- repair mode for damaged PDF: repairs documents which were rejected by earlier versions of TET
- support for PDF 1.7, the file format of Acrobat 8
- support for AES-encrypted PDF (appropriate password required)
- TET command-line tool: extract the text based on article threads in the document
- updated pCOS interface (the same pCOS as in PDFlib 7)
- Perl language binding
- many new heuristics and workarounds
- Unicode mappings for more documents
- improvements in the Wordfinder
- various bug fixes
- TET Plugin for Acrobat as a free tool and TET technology demo
MTP ExpeDat 1.7-1
MTP ExpeDat is a high-performance file server with matching clients.
The ExpeDat applications function similarly to FTP servers, clients and browsers, but offer superior throughput and reliability.
Main features:
Performance
- Two to Four times faster WAN speed
- Instant file browsing
- File sizes up to 1 Terabyte
Security
- AES Encryption
- User authentication
- Private, shadow, PAM, and NIS password support
Ease of Use
- Graphical and Command Line interfaces
- Zero-Configuration server deployment
- Support for all major platforms
Configuration
- Bandwidth regulation and management
- Scriptable command line client
- Server-side packaging and compression plug-ins
Version restrictions:
- Free trial expires after 21 days.
Enhancements:
- Rename and Delete files on the server
- System users can view the entire file system
- Transfer multiple sources with one command
- Match remote sources with wildcards (server-side globbing).
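Server-side globbing means the server expands the wildcard against its own file list, so the client does not need to list the directory first. ExpeDat's exact matching rules are not described here; the sketch below assumes shell-style patterns and uses Python's standard fnmatch module, with match_remote as a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def match_remote(names, pattern):
    """Return the server-side file names that match a shell-style pattern."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return [name for name in names if fnmatchcase(name, pattern)]


server_files = ["a.log", "b.txt", "c.log", "run1.dat", "run22.dat"]
logs = match_remote(server_files, "*.log")
single_digit_runs = match_remote(server_files, "run?.dat")
```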
Big Faceless Java PDF Viewer 2.11.6
A Swing component that can display PDF documents.
Big Faceless Java PDF Viewer is intended for customers who don't require the full API. The PDF Viewer can be installed as an applet, as an application, or via Java Web Start, or it can be embedded in a Swing application.
Printing, saving, text search, forms, digital signatures, and annotations are some of the many features available - the viewer can be tailored to include just the features you need, and is a cost-effective solution for those needing the features of Adobe Acrobat on a Java platform.
Essentially a cut-down version of the full PDF Library, the Java PDF Viewer is a more cost-effective solution for those who don't require API access to the main library.
Unlike its big brother, the viewer cannot create new PDFs and any edits must be made through the Swing interface. That interface is supplied with a large number of "features" (the full list is here) which can be enabled to customize the interface; ideal for distributing a limited functionality viewer as part of the
Major Features:
- Swing component for displaying PDF documents
- Customizable feature set includes printing, saving, search/selection, forms, bookmarks, reorderable thumbnails, annotations and more.
- Full support for PDFs up to Acrobat 9
- Viewer can be controlled from JavaScript
- Applet size under 1MB
- Localized in English, French, Spanish, German, Japanese and Chinese
Enhancements:
- All BFO warning (and debug, if enabled) messages are now logged using log4j if available, or java.util.logging.* if not.
- Large synchronization audit fixed some issues when reading from one PDF in multiple threads, particularly when using the viewer.
- Fixed problem that could result in corruption when repeatedly saving a PDF containing compressed XRef tables and multiple revisions.
- Modified AnnotationStamp so Acrobat correctly handles custom stamps, and added some new types of standard stamp.
- Added FDF.getXFDF() method, for exporting XFDF from a PDF, and added support for this to the Viewer. Fixed FDF export, broken some releases ago.
- More PDF/A related fixes, and added a number of new OutputProfile.Features.
- Fixed RichText content in text fields
- Added subsetting for CFF fonts
- Fixed error with some Windows JDKs when rasterizing JPEG images
- Fixed unusual cases of Linear & Radial shading, added preliminary support for Coons & Tensor-Product Patch shading (Type 6 & 7).
- Fixed blending, broken in previous release and resulting in opaque highlight annotations.
- Correctly handle masked images where the mask and image are different sizes, as created by Luratech products.
- Added PageExtractor.Image.getUniqueId(), to identify extracted images.
- Added PageExtractor.Image.getMetaData() and Text.getFontMetaData(), to return any embedded XMP MetaData for those items in the PDF.
- Fixed long-standing bug in CCITT G3 2D encoding, which is now rarely seen except in very old TIFF images.
- Further fixes to handle some types of corrupt PDF
- Viewer: Page Up/Page Down/Home and End keys navigate correctly through the PDF via the Viewport ActionMap. Standard ScrollBar keybindings are overridden; the previous KeyListener approach didn't work reliably and was removed.
- Viewer: Fixed missing glyphs in some fonts after a search was run.
- Viewer: PDFTool (the class run by "java -jar bfopdf.jar") has more options.
- Viewer: Stamps no longer slowly grow when clicked on due to rounding error
- Viewer: Calling DocumentPanel.redraw(page) will work as expected.
- Viewer: Don't allow JavaScript to be run on the console if no PDF is loaded
- Viewer: Fixed page jump when zoom level is changed in Column view
- Viewer: Don't fail under 1.6 when saving PDFs to filenames containing an invalid regex backreference.
- Viewer: Improved display under OS X, including addition of a Dock Icon.
- Viewer: Added keyboard shortcuts to "Open Recent" menu.
- Viewer: Thumbnail panel now scrolls to follow the current page, and renders more reliably.
Requirements: Java 2 Standard Edition Runtime Environment
Zoidcom 0.6.7
The Zoidcom network library is a high-level, UDP-based networking library providing features for automatic replication of game objects.
This is achieved by multiplexing and demultiplexing object information from and into bitstreams, which make it easily possible to avoid sending redundant data. Bools only take one single bit, integers and floats are stripped down to as many bits as needed.
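The bit-level packing described above can be sketched as follows. Zoidcom itself is a C++ library and its bitstream API is not shown in this text, so the BitWriter and BitReader classes below are hypothetical and only illustrate the idea: a bool costs one bit, and an integer costs only as many bits as the caller declares.

```python
class BitWriter:
    """Accumulates values bit by bit (most significant bit first)."""

    def __init__(self):
        self.value = 0
        self.nbits = 0

    def write(self, value, bits):
        if not 0 <= value < (1 << bits):
            raise ValueError("value does not fit in the given number of bits")
        self.value = (self.value << bits) | value
        self.nbits += bits

    def write_bool(self, flag):
        self.write(1 if flag else 0, 1)

    def to_bytes(self):
        # Pad with zero bits up to the next byte boundary.
        pad = (-self.nbits) % 8
        return (self.value << pad).to_bytes((self.nbits + pad) // 8, "big")


class BitReader:
    """Reads values back in the order and widths they were written."""

    def __init__(self, data):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, bits):
        self.remaining -= bits
        return (self.value >> self.remaining) & ((1 << bits) - 1)

    def read_bool(self):
        return self.read(1) == 1


writer = BitWriter()
writer.write_bool(True)   # 1 bit
writer.write(5, 3)        # 3 bits are enough for values 0..7
writer.write(300, 9)      # 9 bits are enough for values 0..511
packet = writer.to_bytes()

reader = BitReader(packet)
decoded = (reader.read_bool(), reader.read(3), reader.read(9))
```

The three values occupy 13 bits, so they fit in two bytes after padding, where a naive encoding with one byte for the bool and four bytes per integer would need nine.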
A great deal of the tedious work that appears when attempting to develop an efficient network protocol is handled by Zoidcom, e.g. deciding when to send which data to which client, how to get it efficiently and reliably through the line, what to do when data gets lost and a lot more.
Main features:
Easy And Flexible
- simple and straightforward C++ API
- no additional build steps needed
- custom memory management
Connectivity
- fast, UDP based networking protocol using bitstreams - very little control data overhead
- NAT friendly - the whole system can operate on one single UDP port
- maximize the payload/header ratio by sending fewer big packets instead of lots of small ones
- dynamic bandwidth distribution & limitation
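The benefit of sending fewer, larger packets can be shown with simple arithmetic. The sketch below assumes IPv4 without options (20-byte header) plus the 8-byte UDP header, and ignores link-layer framing and any control data Zoidcom adds; bytes_on_wire is a hypothetical helper.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
UDP_HEADER = 8    # bytes, fixed UDP header size


def bytes_on_wire(payload_sizes, per_packet_overhead=IPV4_HEADER + UDP_HEADER):
    """Total bytes sent when each payload travels in its own packet."""
    return sum(size + per_packet_overhead for size in payload_sizes)


messages = [20] * 10                        # ten 20-byte updates
separate = bytes_on_wire(messages)          # one packet per update
combined = bytes_on_wire([sum(messages)])   # all updates in one packet

separate_efficiency = sum(messages) / separate
combined_efficiency = sum(messages) / combined
```

Under these assumptions the ten separate packets carry 200 bytes of payload in 480 bytes on the wire, while the single combined packet needs only 228 bytes.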
Advanced Object Replication - Make best use of the available bandwidth
- delegate ownership of objects to arbitrary clients (e.g. for AI processing or player control)
- adjustable object priorities
- adjustable object relevance per client (e.g. field of view)
- object subscription groups (e.g. for different, distinct game areas)
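One way to picture how priorities and per-client relevance interact with limited bandwidth is a greedy selection per update tick. This is an illustrative sketch only; Zoidcom's actual scheduling is not described here, and select_updates, the dictionary fields, and the scoring formula (priority multiplied by relevance) are all assumptions.

```python
def select_updates(objects, relevance, budget_bytes):
    """Pick object updates for one client within a byte budget.

    objects: list of dicts with "id", "priority", and "size" (bytes).
    relevance: maps object id to a 0..1 relevance for this client.
    """
    scored = [(obj["priority"] * relevance.get(obj["id"], 0.0), obj) for obj in objects]
    # Objects with zero relevance (e.g. outside the field of view) are never sent.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)

    chosen, used = [], 0
    for _, obj in scored:
        if used + obj["size"] <= budget_bytes:
            chosen.append(obj["id"])
            used += obj["size"]
    return chosen


objects = [
    {"id": "player", "priority": 10, "size": 40},
    {"id": "crate", "priority": 2, "size": 30},
    {"id": "far_npc", "priority": 8, "size": 40},
]
relevance = {"player": 1.0, "crate": 1.0, "far_npc": 0.0}

tight = select_updates(objects, relevance, budget_bytes=60)
roomy = select_updates(objects, relevance, budget_bytes=80)
```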
Object Events - As easy as communication can become
- direct communication between authority and (groups of) replicated object(s)
- no need to pass around object messages throughout your whole application
- full control - intercept object events for debugging, monitoring and cheat protection
- object based file transfer - transmit files easily in the background of a running session - the higher the object's priority, the faster the transfer will complete
Automatic Object State Synchronization
- synchronize bools, ints, floats and strings
- interpolate ints and floats
- only send data that really changed
- full control
  - minimum and maximum update frequency for each single data item
  - intercept, manipulate and prevent data updates for debugging, monitoring and cheat protection
- efficient
  - adjust the amount of relevant bits per replicated item
  - default values to avoid transmission of redundant data
  - interpolation with adjustable interpolation strength or totally custom interpolation
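Three of the ideas in this list (sending only changed data, limiting the number of relevant bits, and interpolating with an adjustable strength) can be sketched in a few lines. The helpers changed_fields, quantize, dequantize, and interpolate are hypothetical and are not part of Zoidcom's C++ API; they only illustrate the concepts.

```python
def changed_fields(previous, current):
    """Return only the fields whose values differ from the previous state."""
    missing = object()
    return {k: v for k, v in current.items() if previous.get(k, missing) != v}


def quantize(value, low, high, bits):
    """Map a float in [low, high] to an integer that fits in `bits` bits."""
    steps = (1 << bits) - 1
    clamped = min(max(value, low), high)
    return round((clamped - low) / (high - low) * steps)


def dequantize(code, low, high, bits):
    """Inverse of quantize, up to the precision lost by rounding."""
    steps = (1 << bits) - 1
    return low + code / steps * (high - low)


def interpolate(current, target, strength):
    """Move a fraction `strength` (0..1) of the way from current to target."""
    return current + (target - current) * strength


old_state = {"x": 1.0, "y": 2.0, "alive": True}
new_state = {"x": 1.5, "y": 2.0, "alive": True}
delta = changed_fields(old_state, new_state)

code = quantize(1.0, 0.0, 1.0, bits=8)
restored = dequantize(quantize(0.3, 0.0, 1.0, bits=8), 0.0, 1.0, bits=8)
smoothed = interpolate(0.0, 10.0, 0.5)
```

With 8 bits over the range 0 to 1 the round trip is accurate to within about 0.002, which shows the trade-off between bandwidth and precision that the "relevant bits" setting controls.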
Useful Utilities
- asynchronous hostname resolution
- accurate ping measurement
- LAN session discovery
- quickrequest feature - talk to a remote host without creating a full connection
- connection grouping
- extensive connection statistics
Enhancements:
- This release adds a method for defining the default relevance of an object to all connections.