
Free data extractor v3.3 software for linux

Results 1 - 15 of about 4892
ccextractor 0.30

ccextractor is a fast closed captions extractor for MPEG files.
ccextractor is mostly a mildly optimized C port of McPoodle's excellent but painfully slow Perl script SCC_RIP. It lets you rip the raw closed captions (read: subtitles) data from a number of sources, such as DVD or ReplayTV.
As an added bonus compared to the original SCC_RIP, ccextractor can extract subtitles from the HDTV transport streams that are becoming more common.
At this point ccextractor extracts the line 21 captions (which must legally be present for a number of years until the transition to digital is complete). Note that in most .ts files you can find, there will be subtitle data for both analog (EIA-608) and digital (EIA-708) decoders. AFAIK there are no freely available EIA-708 rippers.
Anyway, since line 21 captions will be available for some time, we have time to build a decent 708 ripper.
Basic Usage
For details on CC, please go to McPoodle's page:
http://www.geocities.com/mcpoodle43/SCC_TOOLS/DOCS/SCC_TOOLS.HTML
You will need his tools to use ccextractor's output.
The basic idea is that you get the raw closed caption dump from ccextractor.
Then you need other tools (which vary depending on what you want to do) to continue processing.
To get a transcript from a .ts file in .srt format (I assume this will be the most common use), do this:
ccextractor -12 input_file
-12 means "extract both subtitle tracks" (the technical name is "fields", but "tracks" is easier to understand). 1 is almost always English. 2 is Spanish on HBO (at least in the few samples I've seen) but could be anything. Just extract both of them and check.
Example: ccextractor -12 house315.ts
ccextractor will create two files, called house315_1.bin and house315_2.bin.
Then use McPoodle's RAW2SCC to create a temporary SCC file (SCC refers to Scenarist, the program whose native format this originally was; it's not important here).
raw2scc house315_1.bin
This creates house315_1.scc
From this .scc file, you can get the final .srt by using McPoodle's CCASDI:
ccasdi -s house315_1.srt
Which looks like this (just 3 random lines shown).
514
00:24:07,400 --> 00:24:09,300
They've got another trial
going on at Duke.

515
00:24:09,367 --> 00:24:12,567
15% extend their lives
beyond five years.

516
00:24:12,634 --> 00:24:13,701
If you're positive
for protein PHF--
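
Once the .srt file exists, it is easy to post-process. The following is a minimal Python sketch, not part of ccextractor or McPoodle's tools, that reads cues in the format shown above. It assumes cues are separated by blank lines, as in a standard .srt file; the function name parse_srt and the SAMPLE text are only illustrative.

```python
import re
from datetime import timedelta

# One SRT timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
CUE_TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt(text):
    """Return a list of (index, start, end, caption) tuples from SRT text."""
    cues = []
    # Assumption: cues are separated by at least one blank line.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # not a complete cue
        match = CUE_TIME.fullmatch(lines[1].strip())
        if match is None:
            continue  # second line is not a timing line
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
        end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
        cues.append((int(lines[0]), start, end, " ".join(lines[2:])))
    return cues


SAMPLE = """514
00:24:07,400 --> 00:24:09,300
They've got another trial
going on at Duke.

515
00:24:09,367 --> 00:24:12,567
15% extend their lives
beyond five years.
"""

for index, start, end, caption in parse_srt(SAMPLE):
    print(index, (end - start).total_seconds(), caption)
```

Joining the caption lines with a space gives a plain transcript; keep them separate if the original line breaks matter.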
Enhancements:
- This release adds support for DVR-MS files.
- It improves the CC decoder.
- There are several bugfixes, a major speed boost (20%-40%), improved timing for non-TS files, improved format autodetection, and other minor improvements.
Download (0.033MB)
Added: 2007-05-24 License: GPL (GNU General Public License) Price:
893 downloads
XML Extractor 0.3.0

XML Extractor is a set of tools for transforming XML-like markup into entities or well-formed XML files.

The source code XML metadata extraction tools are intended for extracting XML-like markup embedded in source code comments and transforming it into syntactically correct external entities or well-formed XML files.

This can be used for JavaDoc-like code annotation, providing structured comments, or even embedding metadata used by the build process or configuration management tools.
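
The general idea of pulling XML-like markup out of comments can be sketched in Python as follows. This is not XML Extractor's own code or markup convention, which the description does not spell out. It assumes, purely for illustration, that the markup sits in lines starting with a "#" comment prefix; the function name extract_comment_xml and the wrapper element "extracted" are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_comment_xml(source, prefix="#"):
    """Collect comment lines, strip the prefix, and parse the result as XML."""
    fragments = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            fragments.append(stripped[len(prefix):].strip())
    # Wrap everything in one root element (hypothetical name) so that
    # several fragments still form a single well-formed document.
    document = "<extracted>" + "".join(fragments) + "</extracted>"
    # ET.fromstring raises xml.etree.ElementTree.ParseError if the
    # collected markup is not well-formed.
    return ET.fromstring(document)


SOURCE = '''
# <doc name="add">
#   <summary>Add two numbers.</summary>
# </doc>
def add(a, b):
    return a + b
'''

root = extract_comment_xml(SOURCE)
print(root.find("doc").get("name"), root.find("doc/summary").text)
```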

INSTALLATION

For info and options about installing this tool, type:
# python setup.py --help

USAGE

To see usage info for this tool, type:
# python xlf_to_wfx_cli.py --help
Download (0.020MB)
Added: 2006-10-04 License: LGPL (GNU Lesser General Public License) Price:
1116 downloads
Flat File Extractor 0.2.2

Flat File Extractor can be used for reading different flat file structures and printing them in different formats. ffe is a command line tool developed in a GNU/Linux environment and distributed under the GNU General Public License, version 2 or later.
Main areas of use are:
- Extracting particular fields or records from a flat file
- Converting data from one format to another, e.g. from CSV to fixed length
- Verifying a flat file structure
- Testing tool for flat file development
- Displaying flat file content in human readable form
Main features:
- Command-line tool
- Reads standard input and writes to standard output as default
- One input file can contain several types of records (lines)
- Fields in a flat file can be fixed length or separated
- Input file structure and output definitions are independent, meaning one output format can be used with several input files
- Input file structure and output format are freely configurable; they are not predefined
- Output can be formatted as, for example, fixed length, separated, tokenized, XML, or SQL
- ffe tries to guess the input format, so the user need not give it as a parameter
Enhancements:
- Configuration keyword const has been added
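
To illustrate the CSV to fixed length conversion listed under the main areas of use, here is a small Python sketch. It is not ffe and does not use ffe's configuration syntax; the function name csv_to_fixed and the field widths are assumptions made for the example.

```python
import csv
import io


def csv_to_fixed(csv_text, widths):
    """Convert separated records to fixed-length records, one per line."""
    out = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Pad each field with spaces; anything longer than its width is cut.
        out.append("".join(field[:w].ljust(w) for field, w in zip(row, widths)))
    return "\n".join(out)


print(csv_to_fixed("john,ripper,23\nscott,tiger,45\n", widths=[8, 8, 3]))
```

Keeping the widths separate from the input parsing mirrors the point made above that input structure and output definition are independent.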
Download (0.23MB)
Added: 2007-05-30 License: GPL (GNU General Public License) Price:
882 downloads
Unix configuration extractor 4

The Unix configuration extractor is a script that runs on the server to extract necessary security configurations. This script doesn't make any changes to the server other than creating the dump files.
Download (19KB)
Added: 2009-03-31 License: Freeware Price: Free
206 downloads
Data::Generate 0.01

Data::Generate allows you to create various types of synthetic data by parsing "regex-like" data creation rules.

This module generates data by parsing given text statements (data creation rules). These statements are a flexible and powerful regex-like way to control the production of synthetic data. Think of a program that, instead of selecting data which matches a regex filter expression, produces it. For example, from the rule [a-c], the generator would produce the array a,b,c. The module works as follows:

Specify data creation rules.
my $generator = Data::Generate::parse('VC(24) [0-9][2-3]');
At this step you first define one kind of output data type (e.g. VC(24) = "output is a string with max length 24") and then use the rest of the expression to define what it should look like. If parsing is successful, a Data Generator object is instantiated.

Get data
my $Data = $generator->get_unique_data(10);
To actually get the data, call the get_unique_data method with the desired number of output values. The generator returns the values in an array reference. Please note that the output format is fixed according to the data type.
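
The "produce instead of match" idea can be sketched in a few lines of Python. This is not the Data::Generate implementation and handles only a tiny subset of what the module describes: a rule made of bracket classes such as [a-c] or [0-9][2-3]. The function names expand_class and generate are hypothetical, and the VC(24) type prefix is not interpreted.

```python
import itertools
import re


def expand_class(body):
    """Expand the inside of a bracket class such as 'a-c' or '0-9'."""
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # A range like a-c: every character from body[i] to body[i + 2].
            chars.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


def generate(rule):
    """Return every string the bracket classes in the rule can produce."""
    # Text outside brackets is ignored in this sketch.
    classes = re.findall(r"\[([^\]]+)\]", rule)
    pools = [expand_class(c) for c in classes]
    return ["".join(combo) for combo in itertools.product(*pools)]


print(generate("[a-c]"))
print(len(generate("[0-9][2-3]")))
```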

Download (0.025MB)
Added: 2007-03-31 License: Perl Artistic License Price:
937 downloads
Data.FormValidator 0.04

Data.FormValidator's aim is to bring all the benefits of the Perl module Data::FormValidator over to JavaScript, using the same input profiles (they can be dumped into JavaScript objects using the Perl module Data::JavaScript).
Data.FormValidator library lets you define profiles which declare the required and optional fields and any constraints they might have.
The results are provided as an object which makes it easy to handle missing and invalid results, return error messages about which constraints failed, or process the resulting valid data.
IMPORTANT NOTE: JavaScript form validation is NOT a replacement for data validation in your backend scripts. This is the primary reason this module was written... so that it would be easy to share the same validation profile for both the frontend (via Data.FormValidator.js) and backend (via Data::FormValidator.pm).
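
The profile idea described above (required fields, optional fields, constraints, and a result that separates valid, missing and invalid input) can be sketched in Python. This is not the Data.FormValidator or Data::FormValidator API; the PROFILE layout, the check function and the example patterns are assumptions chosen only to show the shape of the approach.

```python
import re

# Hypothetical profile: which fields are required or optional, and which
# pattern (if any) each field must fully match.
PROFILE = {
    "required": ["name", "email"],
    "optional": ["phone"],
    "constraints": {
        "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
        "phone": re.compile(r"[0-9 +-]{7,}"),
    },
}


def check(data, profile):
    """Return (valid, missing, invalid) for the submitted form data."""
    known = profile["required"] + profile["optional"]
    missing = [f for f in profile["required"] if not data.get(f)]
    valid, invalid = {}, []
    for field in known:
        value = data.get(field)
        if not value:
            continue  # absent fields are reported via 'missing' if required
        pattern = profile["constraints"].get(field)
        if pattern is not None and not pattern.fullmatch(value):
            invalid.append(field)
        else:
            valid[field] = value
    return valid, missing, invalid


valid, missing, invalid = check({"name": "Ann", "email": "nope"}, PROFILE)
print(valid, missing, invalid)
```

Because the profile is plain data, the same structure could be shared between a frontend and a backend check, which is the motivation the note above gives.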
Enhancements:
- Fixed a problem where some functions were not terminated by a semicolon, which caused JavaScript compactors to produce broken code.
Download (0.047MB)
Added: 2006-01-20 License: GPL (GNU General Public License) Price:
1372 downloads
Erwin Data Structures 2.1.58633

Erwin Data Structures is a library that is meant to be the ultimate data structure library for mixed usage of C and C++.

Arbitrary key and value types are implemented by template files that don't use C++ templates, but are instantiated by a Perl script.

This way, mixed usage in C and C++ is possible. However, a C++ interface is generated to support the advantages of the C++ language. No templates, no void*.
Erwin contains a number of tools, too, all of them written in Perl. The following list shows the data structures and tools, together with some typical examples.
Download (0.67MB)
Added: 2007-02-09 License: Freely Distributable Price:
997 downloads
Data::ENAML 0.03

Data::ENAML is a Perl extension for ENAML data representation.

SYNOPSIS

use Data::ENAML qw (serialize deserialize);

print serialize(login => {nick => 'Schop',
    email => 'ariel@atheist.org.il',
    tagline => 'If I had no modem I would not lose Regina'});

$struct = deserialize('bad-nick: {nick: "c00l dewd" text: "spaces not allowed"}');

ENAML stands for ENAML is Not A Markup Language. (And as we all know, GNU's Not Unix, Pine Is Not Email, Wine Is Not an Emulator, LAME Ain't an MP3 Encoder, and so on.)

ENAML was defined by Robey Pointer for use in Say2, check http://www.lag.net/say2.

Download (0.004MB)
Added: 2006-11-15 License: Perl Artistic License Price:
1073 downloads
Fast Data Transfer 0.8.0

Fast Data Transfer is an application for efficient data transfers that is capable of reading and writing at disk speed over wide area networks (with standard TCP).
It can be used to stream a large set of files across the network, so a large dataset composed of thousands of files can be sent or received at full speed, without the network transfer restarting between files.
The project is written in Java, runs on all major platforms, and is easy to use.
Main features:
- Streams a dataset (list of files) continuously, using a managed pool of buffers through one or more TCP sockets.
- Uses independent threads to read and write on each physical device
- Transfers data in parallel on multiple TCP streams, when necessary
- Uses appropriate-sized buffers for disk I/O and for the network
- Restores the files from buffers asynchronously
- Resumes a file transfer session without loss, when needed
Download (0.35MB)
Added: 2007-08-21 License: Other/Proprietary License Price:
797 downloads
Obscure-Extractor-GTK 0.2

Obscure-Extractor-GTK can extract data from simple and unusual archives as used by games, e.g. Neverwinter Nights, Homeworld 2, BloodRayne.

Mostly a framework where I can easily add new modules when I want to have a look at the inner workings of games, though the Delphi version has some more advanced stuff like support for old InstallShield archives that would need to be ported.

Download (0.012MB)
Added: 2006-07-24 License: GPL (GNU General Public License) Price:
1202 downloads
Data::Report 0.06

Data::Report provides a framework for flexible reporting.

Data::Report is a flexible, plugin-driven reporting framework. It makes it easy to define reports that can be produced in text, HTML and CSV. Textual ornaments like extra empty lines, dashed lines, and cell lines can be added in a way similar to HTML style sheets.

The Data::Report framework consists of three parts:
The plugins

Plugins implement a specific type of report. Standard plugins provided are Data::Report::Plugin::Text for textual reports, Data::Report::Plugin::Html for HTML reports, and Data::Report::Plugin::Csv for CSV (comma-separated) files.
Users can, and are encouraged to, develop their own plugins to handle different styles and types of reports.

The base class
The base class Data::Report::Base implements the functionality common to all reporters, plus a number of utility functions the plugins can use.

The factory
The actual Data::Report module is a factory that creates a reporter for a given report type by selecting the appropriate plugin and returning an instance thereof.
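
The three-part layout described above (plugins, a shared base class, and a factory that picks the plugin) can be sketched in Python. This is only an illustration of the pattern, not the Data::Report API; the class names, the PLUGINS table and create_reporter are all hypothetical.

```python
import csv
import io


class BaseReporter:
    """Functionality common to all reporters: collecting rows."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.rows = []

    def add(self, row):
        # Store each row as strings, in field order.
        self.rows.append([str(row[f]) for f in self.fields])

    def render(self):
        raise NotImplementedError


class TextReporter(BaseReporter):
    def render(self):
        table = [self.fields] + self.rows
        widths = [max(len(r[i]) for r in table) for i in range(len(self.fields))]
        return "\n".join(
            "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
            for r in table
        )


class CsvReporter(BaseReporter):
    def render(self):
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(self.fields)
        writer.writerows(self.rows)
        return buf.getvalue()


# The factory: map a report type to the plugin that implements it.
PLUGINS = {"text": TextReporter, "csv": CsvReporter}


def create_reporter(report_type, fields):
    try:
        plugin = PLUGINS[report_type]
    except KeyError:
        raise ValueError(f"no plugin for report type {report_type!r}") from None
    return plugin(fields)


reporter = create_reporter("text", ["acct", "amount"])
reporter.add({"acct": "one", "amount": 5})
print(reporter.render())
```

Adding a new output style then means writing one more subclass and registering it in the table, without touching the callers.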

Download (0.016MB)
Added: 2007-03-31 License: Perl Artistic License Price:
937 downloads
Data::TreeDumper 0.33

Data::TreeDumper is an improved replacement for Data::Dumper. Powerful filtering capability.

SYNOPSIS

use Data::TreeDumper ;

my $sub = sub {} ;

my $s =
{
  A =>
  {
    a =>
    {
    }
    , bbbbbb => $sub
    , c123 => $sub
    , d => \$sub
  }

  , C =>
  {
    b =>
    {
      a =>
      {
        a =>
        {
        }

        , b => sub
        {
        }
        , c => 42
      }

    }
  }
  , ARRAY => [qw(elment_1 element_2 element_3)]
} ;


#-------------------------------------------------------------------
# package setup data
#-------------------------------------------------------------------

$Data::TreeDumper::Useascii = 0 ;
$Data::TreeDumper::Maxdepth = 2 ;

print DumpTree($s, 'title') ;
print DumpTree($s, 'title', MAX_DEPTH => 1) ;
# $s2 stands for a second data structure; it is not defined in this synopsis
print DumpTrees
  (
  [$s, "title", MAX_DEPTH => 1]
  , [$s2, "other_title", DISPLAY_ADDRESS => 0]
  , USE_ASCII => 1
  , MAX_DEPTH => 5
  ) ;

Output:

title:
|- A [H1]
|  |- a [H2]
|  |- bbbbbb = CODE(0x8139fa0) [C3]
|  |- c123 [C4 -> C3]
|  `- d [R5]
|     `- REF(0x8139fb8) [R5 -> C3]
|- ARRAY [A6]
|  |- 0 [S7] = elment_1
|  |- 1 [S8] = element_2
|  `- 2 [S9] = element_3
`- C [H10]
   `- b [H11]
      `- a [H12]
         |- a [H13]
         |- b = CODE(0x81ab130) [C14]
         `- c [S15] = 42
Download (0.026MB)
Added: 2007-07-06 License: Perl Artistic License Price:
840 downloads
Data::Serializer 0.41

Data::Serializer package contains modules that serialize data structures.

SYNOPSIS

use Data::Serializer;

$obj = Data::Serializer->new();

$obj = Data::Serializer->new(
    serializer => 'Storable',
    digester => 'MD5',
    cipher => 'DES',
    secret => 'my secret',
    compress => 1,
);

$serialized = $obj->serialize({a => [1,2,3],b => 5});
$deserialized = $obj->deserialize($serialized);
print "$deserialized->{b}\n";

Provides a unified interface to the various serializing modules currently available. Adds the functionality of both compression and encryption.

EXAMPLES

Please see Data::Serializer::Cookbook(3)

METHODS

new - constructor
$obj = Data::Serializer->new();


$obj = Data::Serializer->new(
    serializer => 'Data::Dumper',
    digester => 'SHA-256',
    cipher => 'Blowfish',
    secret => undef,
    portable => 1,
    compress => 0,
    serializer_token => 1,
    options => {},
);

new is the constructor for Data::Serializer objects.

The default serializer is Data::Dumper
The default digester is SHA-256
The default cipher is Blowfish
The default secret is undef
The default portable is 1
The default encoding is hex
The default compress is 0
The default compressor is Compress::Zlib
The default serializer_token is 1
The default options is {} (pass nothing on to serializer)

serialize - serialize reference

$serialized = $obj->serialize({a => [1,2,3],b => 5});

Serializes the reference specified.
Will compress if compress is a true value.
Will encrypt if secret is defined.

deserialize - deserialize reference

$deserialized = $obj->deserialize($serialized);

Reverses the process of serialization and returns a copy of the original serialized reference.

freeze - synonym for serialize
$serialized = $obj->freeze({a => [1,2,3],b => 5});

thaw - synonym for deserialize
$deserialized = $obj->thaw($serialized);

raw_serialize - serialize reference in raw form
$serialized = $obj->raw_serialize({a => [1,2,3],b => 5});

This is a straight pass through to the underlying serializer, nothing else is done. (no encoding, encryption, compression, etc)

raw_deserialize - deserialize reference in raw form
$deserialized = $obj->raw_deserialize($serialized);

This is a straight pass through to the underlying serializer, nothing else is done. (no encoding, encryption, compression, etc)

secret - specify secret for use with encryption
$obj->secret('mysecret');

Changes the setting of secret for the Data::Serializer object. It can also be set in the constructor. If specified, the object will use encryption.

portable - encodes/decodes serialized data

Uses encoding method to ascii armor serialized data

Aids in the portability of serialized data.

compress - compression of data

Compresses serialized data. Default is not to use it. Will compress if set to a true value $obj->compress(1);

serializer - change the serializer

Currently have 8 supported serializers: Storable, FreezeThaw, Data::Denter, Config::General, YAML, PHP::Serialization, XML::Dumper, and Data::Dumper.
Default is to use Data::Dumper.

Each serializer has its own caveats about usage especially when dealing with cyclical data structures or CODE references. Please see the appropriate documentation in those modules for further information.

cipher - change the cipher method

Utilizes Crypt::CBC and can support any cipher method that it supports.

digester - change digesting method

Uses Digest so can support any digesting method that it supports. Digesting function is used internally by the encryption routine as part of data verification.

compressor - change the compression module

This method is included for possible future inclusion of alternate compression methods. Currently Compress::Zlib is the only supported compressor.

encoding - change encoding method

Encodes data structure in ascii friendly manner. Currently the only valid options are hex, or b64.

The b64 option uses Base64 encoding provided by MIME::Base64, but strips out newlines.

serializer_token - add usage hint to data

Data::Serializer prepends a token that identifies what was used to process its data. This is used internally to allow runtime determination of how to extract Serialized data. Disabling this feature is not recommended.
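
The layering described in this section (serialize, optionally compress, ASCII-armor with hex or b64, and prepend a token that says how the data was processed) can be sketched in Python. This is not Data::Serializer's wire format: the token layout "^json|zlib|hex^" is invented for the example, JSON stands in for the Perl serializers, and encryption is left out entirely.

```python
import base64
import json
import zlib


def serialize(data, compress=False, encoding="hex"):
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    if compress:
        payload = zlib.compress(payload)
    if encoding == "hex":
        armored = payload.hex()
    elif encoding == "b64":
        # b64encode emits no newlines, so nothing needs stripping here.
        armored = base64.b64encode(payload).decode("ascii")
    else:
        raise ValueError(f"unknown encoding {encoding!r}")
    # Hypothetical token: serializer|compressor|encoding between carets.
    token = f"^json|{'zlib' if compress else ''}|{encoding}^"
    return token + armored


def deserialize(text):
    if not text.startswith("^"):
        raise ValueError("missing token")
    header, armored = text[1:].split("^", 1)
    _serializer, compressor, encoding = header.split("|")
    # The token tells us at runtime how to undo each layer.
    payload = bytes.fromhex(armored) if encoding == "hex" else base64.b64decode(armored)
    if compressor == "zlib":
        payload = zlib.decompress(payload)
    return json.loads(payload.decode("utf-8"))


packed = serialize({"a": [1, 2, 3], "b": 5}, compress=True, encoding="b64")
print(deserialize(packed)["b"])
```

Reading the token first is what allows the receiving side to decode data without being told which options were used, which is the purpose the serializer_token description gives.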

options - pass options through to underlying serializer

Currently this is supported only by Config::General and XML::Dumper.

my $obj = Data::Serializer->new(serializer => 'Config::General',
    options => {
        -LowerCaseNames => 1,
        -UseApacheInclude => 1,
        -MergeDuplicateBlocks => 1,
        -AutoTrue => 1,
        -InterPolateVars => 1
    },
) or die "$!\n";

or

my $obj = Data::Serializer->new(serializer => 'XML::Dumper',
    options => { dtd => 1, }
) or die "$!\n";

store - serialize data and write it to a file (or file handle)
$obj->store({a => [1,2,3],b => 5},$file, [$mode, $perm]);

or

$obj->store({a => [1,2,3],b => 5},$fh);

Serializes the reference specified using the serialize method and writes it out to the specified file or filehandle.

If a file path is specified you may specify an optional mode and permission as the next two arguments. See IO::File for examples.

Throws an exception if it is unable to write to the specified file.

retrieve - read data from file (or file handle) and return it after deserialization

my $ref = $obj->retrieve($file);

or

my $ref = $obj->retrieve($fh);

Reads first line of supplied file or filehandle and returns it deserialized.

Download (0.025MB)
Added: 2007-07-12 License: Perl Artistic License Price:
834 downloads
Data::CGIForm 0.4

Data::CGIForm is a Perl module with form data interface.

Data::CGIForm is yet another way to parse and handle CGI form data. The main motivation behind this module was a simple specification based validator that could handle multiple values.
You probably don't want to use this module. CGI::Validate is a much more feature-complete take on getting this sort of work done. You may then ask why this is on the CPAN; I ask that of myself from time to time....

SYNOPSIS

my %spec = (
    username => qr/^([a-z0-9]+)$/,
    password => {
        regexp => qr/^([a-z0-9+])$/,
        filter => [qw(strip_leading_ws strip_trailing_ws)],
    },
    email => {
        regexp => qr/^([a-z0-9@.]+)$/,
        filter => \&qualify_domain,
        optional => 1,
        errors => {
            empty => "You didn't enter an email address.",
            invalid => 'Bad [% key %]: "[% value %]"',
        },
        extra_test => \&check_email_addr,
    },
    email2 => {
        equal_to => 'email',
        errors => {
            unequal => 'Both email addresses must be the same.',
        },
    },
);

my $r = $ENV{MOD_PERL} ? Apache::Request->instance : CGI->new;

my $form = Data::CGIForm->new(datasource => $r, spec => \%spec);


my @params = $form->params;
foreach my $param (@params) {
    next unless my $error_string = $form->error($param);

    print STDERR $error_string;
}

if ($form->error('username')) {
    handle_error($form->username, $form->error('username'));
}

my $email = $form->param('email');
my $password = $form->password;

Download (0.012MB)
Added: 2006-10-04 License: Perl Artistic License Price:
1115 downloads
Sowa Data Capacitor 0.0.0_dev0

Sowa Data Capacitor is a unified Java API for accessing data in several different forms, such as XML, database or memory.

Installation:

Sowa Data Capacitor currently needs no dependencies (except Ant), but it will probably gain some.

To build it you need:

* Apache Ant
* JDK of Java 2

To use it you need:

* JVM of Java 2

To build it, just run build in the base directory.

Version Convention

The first number (major) is the API version, except that 0 and 1 mark a change in maturity.
It changes only if the API is completely rewritten.

The second number (minor) marks small API changes (especially additions), which may, but need not, break plugins, but do not break applications.

The third number (patch) marks an improvement that does not change the API.

Dev marks a version for developers (of plugins, optionally of applications); Alpha
and Beta mark test releases.

The last number marks the order of releases.
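
As an illustration of this convention, the following Python sketch splits a version string such as 0.0.0_dev0 into its parts. It is not part of Sowa Data Capacitor. Only the "_dev" form appears in this listing; the "_alpha" and "_beta" spellings, and the names Version and parse_version, are assumptions.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int   # API version (0 and 1 mark maturity)
    minor: int   # small API changes, mostly additions
    patch: int   # improvements that leave the API unchanged
    stage: str   # dev, alpha or beta (alpha/beta spelling is assumed)
    order: int   # order of releases within the stage


PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)_(dev|alpha|beta)(\d+)", re.IGNORECASE)


def parse_version(text):
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch, stage, order = match.groups()
    return Version(int(major), int(minor), int(patch), stage.lower(), int(order))


print(parse_version("0.0.0_dev0"))
```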
Download (0.015MB)
Added: 2006-03-24 License: GPL (GNU General Public License) Price:
1309 downloads