Data::CTable 1.03
Data::CTable is a Perl module that helps you read, write, and manipulate tabular data.
SYNOPSIS
## Read some data files in various tabular formats
use Data::CTable;
my $People = Data::CTable->new("people.merge.mac.txt");
my $Stats = Data::CTable->new("stats.tabs.unix.txt");
## Clean stray whitespace in fields
$People->clean_ws();
$Stats ->clean_ws();
## Retrieve columns
my $First = $People->col('FirstName');
my $Last = $People->col('LastName');
## Calculate a new column based on two others
my $Full = [map {"$First->[$_] $Last->[$_]"} @{$People->all()}];
## Add new column to the table
$People->col(FullName => $Full);
## Another way to calculate a new column
$People->col('Key');
$People->calc(sub {no strict 'vars'; $Key = "$Last,$First";});
## "Left join" records matching Stats:PersonID to People:Key
$Stats->join($People, PersonID => 'Key');
## Find certain records
$Stats->select_all();
$Stats->select(Department => sub {/Sale/i }); ## Sales depts
$Stats->omit (Department => sub {/Resale/i}); ## not Resales
$Stats->select(UsageIndex => sub {$_ > 20.0}); ## high usage
## Sort the found records
$Stats->sortspec('DeptNum', {SortType => 'Integer'});
$Stats->sortspec('UsageIndex', {SortType => 'Number'});
$Stats->sort([qw(DeptNum UsageIndex Last First)]);
## Make copy of table with only found/sorted data, in order
my $Report = $Stats->snapshot();
## Write an output file
$Report->write(_FileName => "Rept.txt", _LineEnding => "mac");
## Print a final progress message.
$Stats->progress("Done!");
## Dozens more methods and parameters available...
OVERVIEW
Data::CTable is a comprehensive utility for reading, writing, manipulating, cleaning and otherwise transforming tabular data. The distribution includes several illustrative subclasses and utility scripts.
A Columnar Table represents a table as a hash of data columns, making it easy to do data cleanup, formatting, searching, calculations, joins, or other complex operations.
The object's hash keys are the field names, and the hash values hold the data columns (as array references).
Tables also store a "selection" -- a list of selected / sorted record numbers, and a "field list" -- an ordered list of all or some fields to be operated on. The select() and sort() methods manipulate the selection list. Later, you can optionally rewrite the table in memory or on disk to reflect changes in the selection list or field list.
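The columnar layout described above can be sketched in plain Perl, without using Data::CTable itself. The variable names (%table, @fields, @selection) and the sample data are illustrative assumptions, not the module's internals.

```perl
use strict;
use warnings;

# Hypothetical table: hash keys are field names, values are column arrays.
my %table = (
    First => [ 'Ann',   'Bob',   'Cy'    ],
    Last  => [ 'Zeman', 'Young', 'Xu'    ],
    Dept  => [ 'Sales', 'Ops',   'Sales' ],
);

# Field list: an ordered subset of the fields to operate on.
my @fields = qw(Last First);

# Selection: the record numbers that are "selected" (here, everyone in Sales).
my @selection = grep { $table{Dept}[$_] eq 'Sales' } 0 .. $#{ $table{Dept} };

# Sorting reorders only the selection; the columns themselves are untouched.
@selection = sort { $table{Last}[$a] cmp $table{Last}[$b] } @selection;

# Render the selected records, restricted to the field list, in order.
my @rows = map {
    my $rec = $_;
    join ',', map { $table{$_}[$rec] } @fields;
} @selection;

print "$_\n" for @rows;
```

Because the selection is just a list of record numbers, selecting and sorting never copy or move the column data.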
Data::CTable reads and writes any tabular text file format including Merge, CSV, Tab-delimited, and variants. It transparently detects, reads, and preserves Unix, Mac, and/or DOS line endings and tab or comma field delimiters -- regardless of the runtime platform.
In addition to reading data files, CTable is a good way to gather, store, and operate on tabular data in memory, and to export data to delimited text files to be read by other programs or interactive productivity applications.
To achieve extremely fast data loading, CTable caches data file contents using the Storable module. This can be helpful in CGI environments or when operating on very large data files. CTable can read an entire cached table of about 120 megabytes into memory in about 10 seconds on an average mid-range computer.
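As a rough illustration of the caching idea, the sketch below parses a tab-delimited file once and reuses a Storable cache on later loads. The load_columns helper, the ".cache" file naming, and the freshness check are assumptions made for this example; they are not how Data::CTable names or validates its caches.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical loader: parse a tab-delimited file into columns, or reuse a
# Storable cache file when it is at least as new as the source file.
sub load_columns {
    my ($file) = @_;
    my $cache = "$file.cache";
    if (-e $cache && (stat $cache)[9] >= (stat $file)[9]) {
        return retrieve($cache);
    }
    open my $fh, '<', $file or die "open $file: $!";
    chomp(my $header = <$fh>);
    my @names = split /\t/, $header;
    my %cols  = map { $_ => [] } @names;
    while (my $line = <$fh>) {
        chomp $line;
        my @vals = split /\t/, $line, -1;
        push @{ $cols{ $names[$_] } }, $vals[$_] for 0 .. $#names;
    }
    close $fh;
    store(\%cols, $cache);
    return \%cols;
}

# Write a small sample file into a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/people.txt";
open my $out, '>', $file or die "open $file: $!";
print {$out} "First\tLast\nAnn\tZeman\nBob\tYoung\n";
close $out;

my $first_load  = load_columns($file);   # parses the file and writes the cache
my $second_load = load_columns($file);   # read back from the cache
print "$second_load->{Last}[1]\n";
```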
For simple data-driven applications needing to store and quickly retrieve simple tabular data sets, CTable provides a credible alternative to DBM files or SQL.
For data hygiene applications, CTable forms the foundation for writing utility scripts or compilers to transfer data from external sources, such as FileMaker, Excel, Access, personal organizers, etc. into compiled or validated formats -- or even as a gateway to loading data into SQL databases or other destinations. You can easily write short, repeatable scripts in Perl to do reporting, error checking, analysis, or validation that would be hard to duplicate in less-flexible application environments.
The data representation is simple and open so you can directly access the data in the object if you feel like it -- or you can use accessors to request "clean" structures containing only the data or copies of it. Or you can build your own columns in memory and then, when you're ready, turn them into a table object using the very flexible new() method.
The highly factored interface and implementation allow fine-grained subclassing so you can easily create useful lightweight subclasses. Several subclasses are included with the distribution.
Most defaults and parameters can be customized by subclassing, overridden at the instance level (avoiding the need to subclass too often), and further overridden via optional named-parameter arguments to most major method calls.
Data::ICal 0.11
Data::ICal is a Perl module that generates iCalendar (RFC 2445) calendar files.
SYNOPSIS
use Data::ICal;
my $calendar = Data::ICal->new();
my $vtodo = Data::ICal::Entry::Todo->new();
$vtodo->add_properties(
# ... see Data::ICal::Entry::Todo documentation
);
# ... or
$calendar = Data::ICal->new(filename => 'foo.ics'); # parse existing file
$calendar = Data::ICal->new(data => 'BEGIN:VCALENDAR...'); # parse an in-memory string
$calendar->add_entry($vtodo);
print $calendar->as_string;
# Or, if you're printing to something you want Google to read:
print $calendar->as_string(fold => 0);
A Data::ICal object represents a VCALENDAR object as defined in the iCalendar protocol (RFC 2445, MIME type "text/calendar"), as implemented in many popular calendaring programs such as Apple's iCal.
Each Data::ICal object is a collection of "entries", which are objects of a subclass of Data::ICal::Entry. The types of entries defined by iCalendar (which refers to them as "components") include events, to-do items, journal entries, free/busy time indicators, and time zone descriptors; in addition, events and to-do items can contain alarm entries. (Currently, Data::ICal only implements to-do items and events.)
Data::ICal is a subclass of Data::ICal::Entry; see its manpage for more methods applicable to Data::ICal.
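The fold option in the synopsis refers to iCalendar line folding: RFC 2445 recommends breaking content lines longer than 75 octets by inserting a CRLF followed by a single space or tab. The sketch below shows the idea in plain Perl. The fold_line and unfold_text helpers are hypothetical and assume ASCII text; they are not Data::ICal's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: fold one content line so that no physical line is
# longer than $width characters (ASCII assumed, so characters == octets).
sub fold_line {
    my ($line, $width) = @_;
    $width ||= 75;
    my @parts = (substr($line, 0, $width, ''));
    # Continuation lines start with one space, leaving $width - 1 for data.
    push @parts, ' ' . substr($line, 0, $width - 1, '') while length $line;
    return join "\r\n", @parts;
}

# Hypothetical helper: undo folding by removing CRLF + one space or tab.
sub unfold_text {
    my ($text) = @_;
    $text =~ s/\r\n[ \t]//g;
    return $text;
}

my $long   = 'SUMMARY:' . join(' ', ('go to sleep') x 12);
my $folded = fold_line($long);
print "$folded\r\n";
```

A reader that does not unfold lines sees the continuation fragments as broken properties, which is presumably why the synopsis offers a way to turn folding off.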
Data::Walker 1.05
Data::Walker is a tool for navigating through Perl data structures.
SYNOPSIS
Without any explicit objects:
use Data::Walker;
Data::Walker->cli( $data_structure );
Object-style invocation:
use Data::Walker;
my $w = new Data::Walker;
$w->walk( $data_structure );
$w->ls("-al");
$w->pwd;
$w->cli;
Importing methods into the current package:
use Data::Walker qw(:direct);
walk $data_structure;
ls "-al";
pwd;
cli;
This module allows you to "walk" an arbitrary Perl data structure in the same way that you can walk a directory tree from a UNIX command line. It reuses familiar UNIX commands (such as "ls", "cd", "pwd") and applies these to data structures.
It has a command-line interface which behaves like a UNIX shell. You can also use object-style syntax to invoke the CLI commands from outside the CLI. Data::Walker objects are encapsulated, so that you can hop into and out of a CLI without losing state, and you can have several Data::Walker objects pointing at different structures.
The main functions can also be imported and used directly from within the Perl debugger's CLI.
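To make the directory-tree analogy concrete, here is a minimal sketch in plain Perl that treats hash keys and array indices as path components. The cd and ls subroutines below are hypothetical stand-ins written for this example; they are not Data::Walker's implementation and do not reproduce its options.

```perl
use strict;
use warnings;

my $data = {
    users => [
        { name => 'ann', roles => [ 'admin', 'dev' ] },
        { name => 'bob', roles => [ 'dev' ] },
    ],
    version => 3,
};

# Hypothetical helper: follow a "/"-separated path, treating hash keys and
# array indices like directory names.
sub cd {
    my ($node, $path) = @_;
    for my $step (grep { length } split m{/}, $path) {
        if    (ref $node eq 'HASH')  { $node = $node->{$step} }
        elsif (ref $node eq 'ARRAY') { $node = $node->[$step] }
        else  { die "cannot descend into a scalar at '$step'\n" }
    }
    return $node;
}

# Hypothetical helper: list the "entries" of a node, like ls in a directory.
sub ls {
    my ($node) = @_;
    return sort keys %$node if ref $node eq 'HASH';
    return 0 .. $#$node     if ref $node eq 'ARRAY';
    return ();
}

print join(' ', ls($data)), "\n";
print join(' ', ls(cd($data, 'users/0'))), "\n";
print cd($data, 'users/0/roles/1'), "\n";
```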
Data::Stag 0.10
Data::Stag is a Perl module for working with structured-tag data structures.
SYNOPSIS
# PROCEDURAL USAGE
use Data::Stag qw(:all);
$doc = stag_parse($file);
@persons = stag_find($doc, "person");
foreach $p (@persons) {
printf "%s, %s phone: %s\n",
stag_sget($p, "family_name"),
stag_sget($p, "given_name"),
stag_sget($p, "phone_no"),
;
}
# OBJECT-ORIENTED USAGE
use Data::Stag;
$doc = Data::Stag->parse($file);
@persons = $doc->find("person");
foreach $p (@persons) {
printf "%s, %s phone: %s\n",
$p->sget("family_name"),
$p->sget("given_name"),
$p->sget("phone_no"),
;
}
This module is for manipulating data as hierarchical tag/value pairs (Structured TAGs or Simple Tree AGgregates).
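One way to picture hierarchical tag/value pairs is as nested two-element arrays, where each value is either a scalar or a list of child pairs. The sketch below uses that layout in plain Perl. The find_nodes and get_scalar helpers and the sample data are assumptions for illustration, loosely mirroring the find and sget calls in the synopsis rather than reproducing Data::Stag.

```perl
use strict;
use warnings;

# Each node is [ tag => value ]; value is a scalar or an array ref of nodes.
my $doc = [ personset => [
    [ person => [
        [ family_name => 'Smith' ],
        [ given_name  => 'Ann'   ],
        [ phone_no    => '555-0100' ],
    ] ],
    [ person => [
        [ family_name => 'Jones' ],
        [ given_name  => 'Bo'    ],
    ] ],
] ];

# Hypothetical helper: collect every node with the given tag, at any depth.
sub find_nodes {
    my ($node, $tag) = @_;
    my ($name, $value) = @$node;
    my @found = $name eq $tag ? ($node) : ();
    push @found, map { find_nodes($_, $tag) } @$value if ref $value eq 'ARRAY';
    return @found;
}

# Hypothetical helper: first scalar value directly under $node with tag $tag.
sub get_scalar {
    my ($node, $tag) = @_;
    my $kids = $node->[1];
    return undef unless ref $kids eq 'ARRAY';
    for my $kid (@$kids) {
        return $kid->[1] if $kid->[0] eq $tag && !ref $kid->[1];
    }
    return undef;
}

for my $p (find_nodes($doc, 'person')) {
    printf "%s, %s phone: %s\n",
        map { get_scalar($p, $_) // '(none)' } qw(family_name given_name phone_no);
}
```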
Data::Page 2.00
Data::Page is a Perl module that helps when paging through sets of results.
SYNOPSIS
use Data::Page;
my $page = Data::Page->new();
$page->total_entries($total_entries);
$page->entries_per_page($entries_per_page);
$page->current_page($current_page);
print " First page: ", $page->first_page, "\n";
print " Last page: ", $page->last_page, "\n";
print "First entry on page: ", $page->first, "\n";
print " Last entry on page: ", $page->last, "\n";
When searching through large amounts of data, it is often the case that a result set is returned that is larger than we want to display on one page. This results in wanting to page through various pages of data. The maths behind this is unfortunately fiddly, hence this module.
The main concept is that you pass in the number of total entries, the number of entries per page, and the current page number. You can then call methods to find out how many pages of information there are, and what number the first and last entries on the current page really are.
For example, say we wished to page through the integers from 1 to 100 with 20 entries per page. The first page would consist of 1-20, the second page from 21-40, the third page from 41-60, the fourth page from 61-80 and the fifth page from 81-100. This module would help you work this out.
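The arithmetic behind the example above can be sketched in a few lines of plain Perl. The page_bounds helper is hypothetical, and its clamping of out-of-range page numbers is an assumption for this sketch; it illustrates the calculation and is not Data::Page's implementation.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: returns (last_page, first_entry, last_entry)
# for a 1-based page number.
sub page_bounds {
    my ($total, $per_page, $page) = @_;
    my $last_page = $total ? ceil($total / $per_page) : 1;
    $page = 1          if $page < 1;
    $page = $last_page if $page > $last_page;
    my $first = $total ? ($page - 1) * $per_page + 1 : 0;
    my $last  = $page * $per_page;
    $last = $total if $last > $total;   # the final page may be short
    return ($last_page, $first, $last);
}

for my $page (1 .. 5) {
    my ($last_page, $first, $last) = page_bounds(100, 20, $page);
    print "page $page of $last_page: entries $first-$last\n";
}
```

The fiddly parts are the short final page and the empty result set, which is where hand-written paging code tends to go wrong.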
Fast Data Transfer 0.8.0
Fast Data Transfer is an application for efficient data transfers that is capable of reading and writing at disk speed.
It can be used to stream a large set of files across the network, so a large dataset composed of thousands of files can be sent or received at full speed, without the network transfer restarting between files.
The project is written in Java, runs on all major platforms, and is easy to use.
Main features:
- Streams a dataset (list of files) continuously, using a managed pool of buffers through one or more TCP sockets.
- Uses independent threads to read and write on each physical device
- Transfers data in parallel on multiple TCP streams, when necessary
- Uses appropriate-sized buffers for disk I/O and for the network
- Restores the files from buffers asynchronously
- Resumes a file transfer session without loss, when needed
MIDI Data Miner 3
MIDI Data Miner uses a neural network to learn correlations between notes and control changes in a MIDI file.
After training, MDM can augment a live MIDI stream, adding control changes based on the notes received.
Briefly: connect a MIDI device, open the preferences, and set the MIDI input and output ports. Open a MIDI file, create a new neural net, and train it. Now play notes on your MIDI device; MDM will automatically add control changes.
MDM creates a collection of note and controller pairs from the file and uses them to train a net. The notes are used as inputs and the controls as outputs. The window is the number of previous notes considered when determining the output for the current note. Larger windows give less predictable, more interesting responses. Enter a number in the window box and press enter to set the window size.
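The role of the window can be illustrated with a small sketch that turns a sequence of note/controller pairs into training examples. Everything here is an assumption made for illustration: the training_pairs helper, the sample events, and the choice to include the current note alongside the previous ones. MDM's actual encoding is not described in this listing.

```perl
use strict;
use warnings;

# Hypothetical (note, controller value) pairs extracted from a MIDI track.
my @events = ( [60, 10], [62, 20], [64, 30], [65, 40], [67, 50] );

# Build training examples: input = the current note plus the $window notes
# before it; output = the controller value paired with the current note.
sub training_pairs {
    my ($events, $window) = @_;
    my @pairs;
    for my $i ($window .. $#$events) {
        my @input = map { $events->[$_][0] } ($i - $window) .. $i;
        push @pairs, { input => \@input, output => $events->[$i][1] };
    }
    return @pairs;
}

my @pairs = training_pairs(\@events, 2);
printf "%s -> %d\n", join(' ', @{ $_->{input} }), $_->{output} for @pairs;
```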
This is an early alpha release, buggy and incomplete. Please send feedback! Note that this alpha version only reads from the first track in a MIDI file, and only uses the first controller. Look at the included MIDI file in the working/midi folder for details.
Data::Startup 0.04
Data::Startup is a Perl module that provides a startup options class with override and config methods.
######
# Subroutine interface
#
use Data::Startup qw(config override);
$options = override(\%default_options, @option_list );
$options = override(\%default_options, \@option_list );
$options = override(\%default_options, \%option_list );
@options_list = config(\%options );
($key, $old_value) = config(\%options, $key);
($key, $old_value) = config(\%options, $key => $new_value );
@old_options_list = config(\%options, @option_list);
@old_options_list = config(\%options, \@option_list);
@old_options_list = config(\%options, \%option_list);
######
# Object interface
#
use Data::Startup;
$startup_options = $class->Data::Startup::new( @option_list );
$startup_options = $class->Data::Startup::new( \@option_list );
$startup_options = $class->Data::Startup::new( \%option_list );
$options = $startup_options->override( @option_list );
$options = $startup_options->override( \@option_list );
$options = $startup_options->override( \%option_list );
@options_list = $options->config( );
($key, $old_value) = $options->config($key);
($key, $old_value) = $options->config($key => $new_value );
@old_options_list = $options->config(@option_list);
@old_options_list = $options->config(\@option_list);
@old_options_list = $options->config(\%option_list);
# Note: May use [@option_list] instead of \@option_list
# and {@option_list} instead of \%option_list
Often a group of subroutines can be tailored to different situations with a few variables, say global ones. However, global variables pollute namespaces, become mangled when the functions are multi-threaded, and probably have many other faults that are not worth the time to discover.
As is well documented in the literature, object-oriented programming does not have these faults. This module's class of objects provides objectized options for a group of subroutines, or encapsulated options when the methods are used directly on an option object.
The Data::Startup class provides a way to input options in a very liberal manner, as any of
- arrays, a reference to an array, or a reference to a hash
- a reference to an array or a reference to a hash
- a reference to a hash
- a reference to an array
- many other combos
without having to cut and paste specialized, tailored code into each subroutine/method.
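The liberal input handling described above can be sketched as a small normalizer that accepts a flat list, an array reference, or a hash reference. The normalize_options and override_options helpers below are hypothetical, written for this example; they show the general technique and are not Data::Startup's code.

```perl
use strict;
use warnings;

# Hypothetical helper: accept options as a flat list, an array ref, or a
# hash ref, and return a fresh hash ref.
sub normalize_options {
    my @args = @_;
    if (@args == 1 && ref $args[0] eq 'HASH')  { return { %{ $args[0] } } }
    if (@args == 1 && ref $args[0] eq 'ARRAY') { return { @{ $args[0] } } }
    return { @args };
}

# Hypothetical override: defaults first, caller-supplied options win.
sub override_options {
    my ($defaults, @rest) = @_;
    return { %$defaults, %{ normalize_options(@rest) } };
}

my %defaults = (verbose => 0, indent => 2);
my $from_list  = override_options(\%defaults,   verbose => 1);
my $from_array = override_options(\%defaults, [ verbose => 1 ]);
my $from_hash  = override_options(\%defaults, { verbose => 1 });

print join(',', map { "$_=$from_list->{$_}" } sort keys %$from_list), "\n";
```

Each call returns a new hash, so the defaults are never modified and no global variables are needed.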
Data::Stag::HashDB 0.10
Data::Stag::HashDB is a Perl module used for building indexes over Stag files or objects.
SYNOPSIS
# parsing a file into a hash
my $hdb = Data::Stag::HashDB->new;
$hdb->unique_key("ss_details/social_security_no");
$hdb->record_type("person");
my $obj = {};
$hdb->index_hash($obj);
Data::Stag->parse(-file=>$fn, -handler=>$hdb);
my $person = $obj->{'999-9999-9999'};
print $person->xml;
# indexing an existing stag tree into a hash
my $personset = Data::Stag->parse($fn);
my $hdb = Data::Stag::HashDB->new;
$hdb->unique_key("ss_details/social_security_no");
$hdb->record_type("person");
my $obj = {};
$hdb->index_hash($obj);
$personset->sax($hdb);
my $person = $obj->{'999-9999-9999'};
print $person->xml;
You need to provide a record_type: this is the type of element that will be indexed.
You need to provide a unique_key: this is a single value used to index the record_types.
For example, if we have data in the stag structure below, and if ss_no is unique (we assume it is), then we can index all the people in the database using the code above.
publicinfo:
  persondata:
    person:
      ss_details:
        social_security_no:
      name:
      address:
There is a subclass of this class called Data::Stag::StagDB, which makes the hash persistent.
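The indexing idea (one record type, one unique key path) can be shown with ordinary nested hashes. The value_at and build_index helpers and the sample records are assumptions for this sketch; Data::Stag::HashDB works on Stag nodes and event streams rather than on plain hashes like these.

```perl
use strict;
use warnings;

# Hypothetical records, each shaped like a "person" element.
my @people = (
    { name => 'Ann', ss_details => { social_security_no => '999-9999-9999' } },
    { name => 'Bob', ss_details => { social_security_no => '111-1111-1111' } },
);

# Hypothetical helper: follow a "/"-separated key path into nested hashes.
sub value_at {
    my ($rec, $path) = @_;
    for my $step (split m{/}, $path) {
        return undef unless ref $rec eq 'HASH';
        $rec = $rec->{$step};
    }
    return $rec;
}

# Build the index: unique key value => record. Die on duplicate keys.
sub build_index {
    my ($records, $key_path) = @_;
    my %index;
    for my $rec (@$records) {
        my $key = value_at($rec, $key_path);
        next unless defined $key;
        die "duplicate key '$key'\n" if exists $index{$key};
        $index{$key} = $rec;
    }
    return \%index;
}

my $index = build_index(\@people, 'ss_details/social_security_no');
print $index->{'999-9999-9999'}{name}, "\n";
```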
Data::DPath::Builder 0.00_01
Data::DPath::Builder is a SAX handler for building an XPath tree.
SYNOPSIS
use AnySAXParser;
use Data::DPath::Builder;
$builder = Data::DPath::Builder->new();
$parser = AnySAXParser->new( Handler => $builder );
$root_node = $parser->parse( Source => [SOURCE] );
Data::DPath::Builder is a SAX handler for building a Data::DPath tree.
Data::DPath::Builder is used by creating a new instance of Data::DPath::Builder and providing it as the Handler for a SAX parser. Calling parse() on the SAX parser will return the root node of the tree built from that parse.
Sunrise Data Dictionary 1.00
Sunrise Data Dictionary is a library for hashtable storage of arbitrary data objects.
Sunrise Data Dictionary library can participate in external reference counting systems or use its own built-in reference counting. It comes with a variety of hash functions and allows the use of runtime supplied hash functions via callback mechanism. The source code is well documented.
The Sunrise Data Dictionary was specifically designed for use within the Afelio and Callweaver telephony servers; the implementation focuses on performance and scalability.
Enhancements:
- This is the initial release of the full API (all header files) and a developer snapshot of the implementation.
Data::ICal::Entry::Todo 0.11
Data::ICal::Entry::Todo is a Perl module that represents a to-do entry in an iCalendar file.
SYNOPSIS
my $vtodo = Data::ICal::Entry::Todo->new();
$vtodo->add_properties(
summary => "go to sleep",
status => "INCOMPLETE",
# Dat*e*::ICal is not a typo here
dtstart => Date::ICal->new( epoch => time )->ical,
);
$calendar->add_entry($vtodo);
$vtodo->add_entry($alarm);
A Data::ICal::Entry::Todo object represents a single to-do entry in an iCalendar file. (Note that the iCalendar RFC refers to entries as "components".) It is a subclass of Data::ICal::Entry and accepts all of its methods.
Local Data Manager 6.6.5
Local Data Manager is a collection of cooperating programs that select, capture, manage, and distribute arbitrary data products.
The system is designed for event-driven data distribution, and is currently used in the Unidata Internet Data Distribution (IDD) project. The LDM system includes network client and server programs and their shared protocols.
An important characteristic of the LDM is its support for flexible, site-specific configuration.
Enhancements:
- Fixes for timestamp bugs.
DOG Data Organizer 0.4.2
DOG Data Organizer provides a bookmark organizer for various bookmark types.
DOG is a personal knowledge manager based on topic maps. It currently specializes in managing bookmarks.
It imports and exports Netscape, Mozilla, and KDE2 (XBEL) bookmark files, and it imports KDE1 bookmarks and Windows IE Favorites.

Unicode Data Browser 1.5
UnicodeDataBrowser is a browser for the UnicodeData.txt file, which contains much useful information but is not easily read by humans.
The browser creates a scrollable table in which columns represent properties.
The table may be sorted on any column. Abbreviations are expanded and characters cross-referenced in decomposition and casing fields are named. Regular expression search restricted to a selected column is available. The set of characters for which information is displayed may be restricted to those characters matching a regular expression on a specified property.
Each such filtering operation applies to the output of the previous filtering operation unless the table is reset to the original full set of characters, so filtering on multiple properties is possible.
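The chained filtering described above can be sketched in plain Perl over a few sample lines in the semicolon-separated UnicodeData.txt layout. The filter_rows helper, the %col column map, and the choice of sample characters are assumptions for this example; the browser itself is a separate application and its internals are not shown in this listing.

```perl
use strict;
use warnings;

# A few sample lines in UnicodeData.txt format (semicolon-separated fields).
my @lines = (
    '0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;',
    '0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;',
    '0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041',
    '0391;GREEK CAPITAL LETTER ALPHA;Lu;0;L;;;;;N;;;;03B1;',
);

# Column positions for the properties used in this sketch.
my %col = (code => 0, name => 1, category => 2);

my @all  = map { [ split /;/, $_, -1 ] } @lines;
my @rows = @all;     # current view; assign @all again to reset the table

# Hypothetical helper: keep rows whose given column matches the regex.
sub filter_rows {
    my ($rows, $column, $regex) = @_;
    return grep { $_->[ $col{$column} ] =~ $regex } @$rows;
}

# Each filter applies to the output of the previous one.
@rows = filter_rows(\@rows, category => qr/^Lu$/);    # uppercase letters
@rows = filter_rows(\@rows, name     => qr/^LATIN/);  # ...that are Latin

# Sort the remaining rows on a column, as the table view would.
@rows = sort { $a->[ $col{name} ] cmp $b->[ $col{name} ] } @rows;
print "$_->[0] $_->[1]\n" for @rows;
```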
Enhancements: Adds canonical decomposition info for Hangul syllables.