whitespace
Results 1 - 15 of about 105
Whitespace 1.02
Whitespace is a Perl module that cleans up various types of bogus whitespace in source files.
SYNOPSIS
use Whitespace;
# Instantiate a whitespace object with
# both input and output files specified
$ws = new Whitespace($infile, $outfile);
# Instantiate a whitespace object with
# only the input files specified (in-place cleanup)
$ws2 = new Whitespace($infile);
# Detect the whitespace
$ret = $ws->detect();
detect returns undef if it is unable to operate on the given file. The error that caused the undef can be retrieved using error:
print $ws->error() . "\n" unless defined $ret;
The types of whitespace detected are returned as a hash, which can be retrieved using the status method. If the file only had leading, trailing and end-of-line whitespace (say, on 3 lines), the populated hash might look like this:
%stat = %{$ws->status()};
print map "$_ => $stat{$_}\n", sort keys %stat;
eol => 3
indent => 0
leading => 1
spacetab => 0
trailing => 1
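To make the five counters concrete, here is a minimal Python sketch of a detector that would produce a hash like the one above. It is not the module's implementation: the function name detect_whitespace and the definition of each category (leading and trailing meaning a blank line at the start or end of the file, indent meaning eight or more leading spaces, spacetab meaning a space directly followed by a tab, eol meaning spaces or tabs before the newline) are assumptions made for illustration.

```python
import re


def detect_whitespace(text: str) -> dict:
    """Count assumed categories of bogus whitespace in a file's contents."""
    lines = text.split("\n")
    # A final newline leaves one empty piece after the split; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return {
        # lines ending in spaces or tabs
        "eol": sum(1 for line in lines if re.search(r"[ \t]+$", line)),
        # lines indented with eight or more spaces (assumed threshold)
        "indent": sum(1 for line in lines if re.match(r" {8}", line)),
        # 1 if the file starts with a blank line, else 0
        "leading": int(bool(lines) and lines[0].strip() == ""),
        # lines containing a space directly followed by a tab
        "spacetab": sum(1 for line in lines if " \t" in line),
        # 1 if the file ends with a blank line, else 0
        "trailing": int(bool(lines) and lines[-1].strip() == ""),
    }


sample = "\nfoo \nbar\t\nbaz  \nqux\n\n"
stat = detect_whitespace(sample)
for key in sorted(stat):
    print(f"{key} => {stat[key]}")
```

With this sample (one blank line at each end and three lines ending in spaces or tabs), the printed counters match the example hash.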
Cleanup can be performed for all types of whitespace or for just one given type, using the following methods.
If an outfile is given, the cleaned contents are written to this file. If not, the contents are replaced in-place. undef is returned if there was an error writing the file.
# To clean up all the whitespace
$ret = $ws->cleanup();
# To clean up leading whitespace only
$leadstat = $ws->leadclean();
# To clean up trailing whitespace only
$trailstat = $ws->trailclean();
# To clean up indentation whitespace only
$indentstat = $ws->indentclean();
# To clean up space-followed-by-tab only
$sftstat = $ws->spacetabclean();
# To clean up end-of-line whitespace only
$eolstat = $ws->eolclean();
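The per-type cleanup idea can be sketched in a few lines of Python. The three helper names below are hypothetical and the regular expressions reflect one assumed reading of "end-of-line", "leading" and "trailing" whitespace; this is not how the Perl module itself is written.

```python
import re


def strip_eol(text: str) -> str:
    """Remove spaces and tabs at the end of every line."""
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


def strip_leading_blank_lines(text: str) -> str:
    """Remove blank lines at the very start of the text."""
    return re.sub(r"\A(?:[ \t]*\n)+", "", text)


def strip_trailing_blank_lines(text: str) -> str:
    """Collapse blank lines at the very end of the text into one newline."""
    return re.sub(r"(?:\n[ \t]*)+\Z", "\n", text)


def clean_all(text: str) -> str:
    """Apply every cleanup in turn, like a combined cleanup pass."""
    return strip_trailing_blank_lines(strip_leading_blank_lines(strip_eol(text)))


print(repr(clean_all("\n  \nfoo \nbar\t\n\n\n")))
```

Keeping each cleanup as its own function mirrors the choice between cleaning everything and cleaning a single type.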
Download (0.004MB)
Added: 2007-05-10 License: Perl Artistic License Price:
897 downloads
HTML-Strip-Whitespace 0.1.5
HTML-Strip-Whitespace is a Perl module to strip whitespace out of HTML pages. As opposed to other solutions (like HTML-Clean), it does not touch whitespace inside <pre> tags (and, in the future, possibly other whitespace-aware markup).
Using HTML-Tidy (or otherwise the HTML Tidy library) for that may be preferable, because it is probably faster, but this module still exists and will be maintained.
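The behaviour described (collapse whitespace everywhere except inside pre elements) can be illustrated with a short Python sketch. The function name collapse_outside_pre is hypothetical, and the regex-based splitting is a simplification chosen for brevity; a real implementation would more likely use an HTML parser.

```python
import re

# Capturing group so that re.split keeps the <pre> blocks in the result.
PRE_BLOCK = re.compile(r"(<pre\b.*?</pre>)", re.DOTALL | re.IGNORECASE)


def collapse_outside_pre(html: str) -> str:
    """Collapse runs of whitespace to one space, leaving <pre> blocks intact."""
    parts = PRE_BLOCK.split(html)
    cleaned = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd indices are the captured <pre>...</pre> blocks.
            cleaned.append(part)
        else:
            cleaned.append(re.sub(r"\s+", " ", part))
    return "".join(cleaned)


page = "<p>a   b</p>\n\n<pre>x\n  y</pre>\n<p>c</p>"
print(collapse_outside_pre(page))
```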
Download (0.005MB)
Added: 2007-03-20 License: Perl Artistic License Price:
949 downloads
dtRdr::doc::Book::whitespace 0.11.2
dtRdr::doc::Book::whitespace is a Perl module that documents issues with whitespace.
Synopsis
Weird things happen when whitespace doesn't count, but sort of counts.
The annotations rely on a reliable character position, which can be very different from the byte offset due to character encoding and whitespace collapsing. Thus, we have to establish conventions for whitespace which can be consistently applied in all of these situations.
All your spaces are belong to one position.
The general rule is that any amount of whitespace, whether spanning a tag or not, is treated as a single space character.
Nesty Books
This becomes a little difficult with book formats that contain (rendered) nested content nodes. Because of these types of books, a position needs to be able to map from global to local so that the position in a parent can be calculated given the position in a child. See dtRdr::doc::Book::annotree for "math is fun."
As for whitespace, we have to adopt a convention that a space at the end or beginning of a node needs to belong somewhere. In these examples, I'll use square brackets to represent the opening and closing of node XML tags.
[a[b][c[d]]]
[a [b][c[d]]]
[ a [b][c[d]]]
[ a[ b][c[d] ] ]
The above are not necessarily intended to be equivalent; they are just representative situations.
Because line breaks and/or indentation from manual editing and/or conversion tools are so common, the situation almost always looks like this in reality.
[ a [ b ] [ c [ d ] ] ]
This should basically reduce into the following:
[a [b ][c [d ]]]
Note that:
no node starts with a space
there are no consecutive spaces, regardless of tag boundaries
This convention is important because it needs to be shared between the book base class (which does the annotation-insertion xml munging) and the individual book plugins (which build the annotation offset table to allow for position math.)
I still need to prove it, but I believe that even this should be equivalent to the canonical example above.
[ a[ b][ c[ d] ]]
And, to be pragmatic, this is not really worth chasing, since nested content nodes which are accessible both individually and from within the parent are an impossible-to-resolve-into-a-pagewise-reader concept.
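One possible reading of this convention can be written down as a short Python sketch, using the same square-bracket notation. The function name normalize_spaces is hypothetical, and the rule it applies (a run of whitespace becomes one space attached directly after the preceding text character, before any intervening tags) is an assumption inferred from the examples, not the project's actual code.

```python
def normalize_spaces(markup: str) -> str:
    """Collapse whitespace across tag boundaries; '[' and ']' stand for tags."""
    out = []        # characters emitted so far
    last_text = -1  # index in out of the last non-tag character, -1 if none
    for ch in markup:
        if ch in "[]":
            out.append(ch)
        elif ch.isspace():
            # Emit at most one space, placed right after the last text
            # character so that no node starts with a space.
            if last_text >= 0 and out[last_text] != " ":
                out.insert(last_text + 1, " ")
                last_text += 1
        else:
            out.append(ch)
            last_text = len(out) - 1
    return "".join(out)


print(normalize_spaces("[ a [ b ] [ c [ d ] ] ]"))
print(normalize_spaces("[ a[ b][ c[ d] ]]"))
```

Under this reading, both the heavily spaced input and the last example reduce to the canonical form, which is consistent with the equivalence the text suspects but does not prove.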
Download (0.77MB)
Added: 2007-07-12 License: Perl Artistic License Price:
835 downloads
jTokeniser 2.0
jTokeniser is a Java library for tokenising strings into a list of tokens.
Main features:
- WhiteSpaceTokeniser - this splits a string on all occurrences of whitespace, which includes spaces, tabs, newlines and linefeeds.
- StringTokeniser - this is basically the same as java.util.StringTokenizer with some extra methods (and extends Tokeniser). Its default behaviour is to act as a WhiteSpaceTokeniser; however, you can specify a set of characters to be used as word delimiters.
- RegexTokeniser - this tokeniser is much more flexible, as you can use regular expressions to define what a token is. So, "\w+" means that whenever it matches one or more word characters, it will consider that a word. By default, it uses a regular expression equivalent to a whitespace tokeniser.
- RegexSeparatorTokeniser - this can be thought of as an advanced StringTokeniser. Whereas StringTokeniser is limited to defining delimiters as a set of individual characters, RegexSeparatorTokeniser can use regular expressions for a richer and more flexible approach.
- BreakIteratorTokeniser - one of the most sophisticated tokenisers in the library, although it should only be used on natural-language strings to isolate words. It comes with built-in rules about how to find words, knowing how to disregard punctuation, etc.
- SentenceTokeniser - this also uses a BreakIterator like the above, but tuned towards finding sentence boundaries. The "tokens" in this tokeniser are in fact individual sentences.
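The difference between splitting on whitespace, matching tokens with a pattern, and splitting on a separator pattern can be shown with Python's standard library. This is only an analogy for the tokeniser styles listed above; it does not use or reproduce the jTokeniser API, and the sample strings are made up.

```python
import re

text = "Hello, world!  Tabs\tand\nnewlines."

# Whitespace tokenising: split on any run of whitespace.
whitespace_tokens = text.split()

# Token-pattern tokenising: every match of \w+ is a token,
# so punctuation is dropped.
word_tokens = re.findall(r"\w+", text)

# Separator-pattern tokenising: the regex describes the delimiters
# (commas and whitespace here) rather than the tokens.
separator_tokens = [t for t in re.split(r"[,\s]+", "a, b,c  d") if t]

print(whitespace_tokens)
print(word_tokens)
print(separator_tokens)
```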
Enhancements:
- This release includes an easy-to-use GUI front-end for using the tokenisers interactively, out of the box.
- This is especially useful for experimenting with tokenisers, perhaps within a teaching environment.
- It is also handy for those without the Java experience needed to use the library API directly.
Download (0.083MB)
Added: 2006-07-20 License: GPL (GNU General Public License) Price:
1192 downloads
Blatte::Parser 0.9.4
Blatte::Parser is a Perl module that contains a parser for Blatte syntax.
SYNOPSIS
use Blatte::Parser;
$parser = new Blatte::Parser();
$perl_expr = $parser->parse(INPUT);
or
$parsed_expr = $parser->expr(INPUT);
if (defined($parsed_expr)) {
    $perl_expr = $parsed_expr->transform();
}
METHODS
$parser->parse(INPUT)
Parses the first Blatte expression in INPUT and returns the corresponding Perl string, or undef if an error occurred.
INPUT may be a string or a reference to a string. If it's the latter, then after a successful parse, the parsed expression will be removed from the beginning of the string.
$parser->expr(INPUT)
Like parse(), except the result is not converted to Perl; it's left in Blatte's internal parse-tree format, which uses the Blatte::Syntax family of objects.
$parser->eof(INPUT)
Tests INPUT for end-of-file. Leading whitespace is removed from INPUT with consume_whitespace and, if nothing remains, true is returned; otherwise undef is returned.
Download (0.031MB)
Added: 2007-04-20 License: Perl Artistic License Price:
917 downloads
ShiftJIS::Regexp 1.00
ShiftJIS::Regexp is a Perl module that provides regular expressions in Shift-JIS.
SYNOPSIS
use ShiftJIS::Regexp qw(:all);
# these two are equivalent:
match($string, '\p{Hiragana}{2}\p{Digit}{2}');
match($string, '\pH{2}\pD{2}');
This module provides some functions to use regular expressions in Shift-JIS on a byte-oriented perl.
A legal Shift-JIS character in this module must match the following regular expression:
[\x00-\x7F\xA1-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]
To avoid false matches in the multibyte encoding, this module uses an anchoring technique to ensure that each match position falls on a character boundary.
cf. perlfaq6, "How can I match strings with multibyte characters?"
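To see why character boundaries matter, the following Python sketch applies the legal-character expression above to raw bytes. The helper names split_chars and find_char are hypothetical, and the approach (tokenise into characters first, then compare whole characters) is a simplified stand-in for the module's anchoring technique, not a port of it.

```python
import re

# One legal Shift-JIS character: a single byte, or a lead byte plus a trail byte.
SJIS_CHAR = rb"[\x00-\x7F\xA1-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]"


def split_chars(data: bytes) -> list:
    """Split bytes into Shift-JIS characters, or raise if malformed."""
    if not re.fullmatch(rb"(?:" + SJIS_CHAR + rb")*", data):
        raise ValueError("not a well-formed Shift-JIS byte string")
    return re.findall(SJIS_CHAR, data)


def find_char(data: bytes, target: bytes) -> int:
    """Return the character index of target, or -1 if it is absent."""
    chars = split_chars(data)
    return chars.index(target) if target in chars else -1


# b"\x83A" is one double-byte character whose trail byte equals ASCII "A".
data = b"\x83AB"
print(b"A" in data)            # naive byte search reports a false match
print(find_char(data, b"A"))   # boundary-aware search does not
```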
Functions
re(PATTERN)
re(PATTERN, MODIFIER)
Returns a regular expression parsable by a byte-oriented perl.
PATTERN is specified as a string. MODIFIER is specified as a string. Modifiers in the following list are allowed.
i case-insensitive pattern (only for ASCII alphabets)
I case-insensitive pattern (Greek, Cyrillic, fullwidth Latin)
j hiragana-katakana-insensitive pattern (but halfwidth katakana are not considered)
s treat string as single line
m treat string as multiple lines
x ignore whitespace (i.e. [\x20\n\r\t\f]) unless backslashed or inside a character class; but comments are not recognized!
o once parsed (not compiled!) and the result is cached internally.
o modifier
while (<>) {
    print replace($_, '(perl)', '$1', 'igo');
}
is more efficient than
while (<>) {
    print replace($_, '(perl)', '$1', 'ig');
}
because in the latter case the pattern is parsed every time the function is called.
match(STRING, PATTERN)
match(STRING, PATTERN, MODIFIER)
An emulation of the m// operator that is aware of Shift-JIS. But, to emulate @list = $string =~ m/PATTERN/g, the pattern should be parenthesized (capturing parentheses are not added automatically).
@list = match($string, '\pH', 'g'); # wrong; returns garbage!
@list = match($string, '(\pH)', 'g'); # good
PATTERN is specified as a string. MODIFIER is specified as a string.
i,I,j,s,m,x,o please see re().
g match globally
z tell the function the pattern matches an empty string (sorry, due to the poor auto-detection)
replace(STRING or SCALAR REF, PATTERN, REPLACEMENT)
replace(STRING or SCALAR REF, PATTERN, REPLACEMENT, MODIFIER)
An emulation of s/// operator but aware of Shift-JIS.
If a reference to a scalar is specified as the first argument, substitutes the referent scalar and returns the number of substitutions made. If a string (not a reference) is specified as the first argument, returns the substituted string and the specified string is unaffected.
MODIFIER is specified as a string.
i,I,j,s,m,x,o please see re().
g,z please see match().
jsplit(PATTERN or ARRAY REF of [PATTERN, MODIFIER], STRING)
jsplit(PATTERN or ARRAY REF of [PATTERN, MODIFIER], STRING, LIMIT)
An emulation of CORE::split but aware of Shift-JIS.
In scalar/void context, it does not split into the @_ array; in scalar context, it only returns the number of fields found.
PATTERN is specified as a string. But ' ' as PATTERN has no special meaning; it splits the string on a single space, similarly to CORE::split / /.
When you want to split the string on whitespace, pass an undefined value as PATTERN or use the splitspace() function.
jsplit(undef, " \x81\x40 This is \x81\x40 perl.");
splitspace(" \x81\x40 This is \x81\x40 perl.");
# ('This', 'is', 'perl.')
If you want to pass a pattern with modifiers, specify an arrayref of [PATTERN, MODIFIER] as the first argument. You can also use "Embedded Modifiers".
MODIFIER is specified as a string.
i,I,j,s,m,x,o please see re().
splitspace(STRING)
splitspace(STRING, LIMIT)
This function emulates CORE::split(' ', STRING, LIMIT). It returns the list obtained by splitting STRING on whitespace, including "\x81\x40" (IDEOGRAPHIC SPACE). Leading whitespace characters do not produce any field.
Note: splitspace(STRING, LIMIT) is equivalent to jsplit(undef, STRING, LIMIT).
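The whitespace-splitting behaviour, including the double-byte IDEOGRAPHIC SPACE, can be sketched in Python over raw Shift-JIS bytes. The function name split_space and the set of bytes treated as whitespace are assumptions for illustration; LIMIT handling is left out, and bytes that do not form a legal character are silently skipped by this sketch.

```python
import re

# One legal Shift-JIS character (single byte, or lead byte plus trail byte).
SJIS_CHAR = re.compile(
    rb"[\x00-\x7F\xA1-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]"
)

# Assumed whitespace set: ASCII whitespace plus IDEOGRAPHIC SPACE (0x8140).
SPACES = {b" ", b"\t", b"\n", b"\r", b"\f", b"\x81\x40"}


def split_space(data: bytes) -> list:
    """Split Shift-JIS bytes on whitespace, character by character."""
    fields = []
    current = b""
    for char in SJIS_CHAR.findall(data):
        if char in SPACES:
            if current:          # leading or repeated spaces give no field
                fields.append(current)
                current = b""
        else:
            current += char
    if current:
        fields.append(current)
    return fields


print(split_space(b" \x81\x40 This is \x81\x40 perl."))
# The bytes 0x81 0x40 also occur across a character boundary in
# b"\x83\x81@"; working per character avoids splitting there.
print(split_space(b"\x83\x81@"))
```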
splitchar(STRING)
splitchar(STRING, LIMIT)
This function emulates CORE::split(//, STRING, LIMIT). It returns the list obtained by splitting STRING into characters.
Note: splitchar(STRING, LIMIT) is equivalent to jsplit('', STRING, LIMIT).
Download (0.035MB)
Added: 2007-08-08 License: Perl Artistic License Price:
811 downloads
HTML::Clean 0.8
HTML::Clean is a Perl module that cleans up HTML code for web browsers, not humans.
SYNOPSIS
use HTML::Clean;
$h = new HTML::Clean($filename); # or..
$h = new HTML::Clean($htmlcode);
$h->compat();
$h->strip();
$data = $h->data();
print $$data;
The HTML::Clean module encapsulates a number of common techniques for minimizing the size of HTML files. You can typically save between 10% and 50% of the size of an HTML file using these methods. It provides the following features:
Remove unneeded whitespace (beginning of line, etc.)
Remove unneeded META elements.
Remove HTML comments (except for styles, JavaScript and SSI)
Replace tags with equivalent shorter tags (<strong> --> <b>)
etc.
The entire process is configurable, so you can pick and choose what you want to clean.
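A few of the listed techniques can be illustrated with regular expressions in Python. The function name shrink_html is hypothetical and this is a rough sketch, not HTML::Clean's implementation: it keeps SSI directives by skipping comments that start with "<!--#", but it makes no attempt to protect comments inside style or script elements.

```python
import re


def shrink_html(html: str) -> str:
    """Apply a few size-reducing rewrites to an HTML string."""
    # Replace <strong> and </strong> with the shorter <b> and </b>.
    html = re.sub(r"<(/?)strong>", r"<\1b>", html, flags=re.IGNORECASE)
    # Remove comments, except SSI directives such as <!--#include ... -->.
    html = re.sub(r"<!--(?!#).*?-->", "", html, flags=re.DOTALL)
    # Remove whitespace at the beginning of each line.
    html = re.sub(r"^[ \t]+", "", html, flags=re.MULTILINE)
    return html


source = '  <strong>Hi</strong><!-- note --><!--#include virtual="x" -->'
print(shrink_html(source))
```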
Download (0.047MB)
Added: 2007-08-07 License: Perl Artistic License Price:
808 downloads
wsdebug 0.1
wsdebug is a debugger for the more or less famous Whitespace programming language, coming along with a rather fast interpreter (wsi).
Most programming languages, like C or Perl, do not care about white space characters (like tab, space or newline/linefeed). The Whitespace programming language works just the other way round: it doesn't care about any character but those white space ones.
On the whole it's just another geeky language like Brainfuck and others, but more addictive.
If you've written a whole lot of instructions, you'll probably reach the point where you get lost. Then just put your script into wsdebug, step through your bunch of Whitespace instructions, and watch how each command manipulates the stack (or heap).
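Because only space, tab and linefeed are significant, a Whitespace source file is hard to read directly. The small Python sketch below makes the significant characters visible; the function name visible_source and the S/T/L letters are illustrative choices, not part of wsdebug.

```python
def visible_source(source: str) -> str:
    """Map space, tab and linefeed to S, T and L; drop everything else."""
    mapping = {" ": "S", "\t": "T", "\n": "L"}
    return "".join(mapping.get(ch, "") for ch in source)


# Any other character acts as a comment and is ignored.
print(visible_source("push \t\nhere"))
```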
Download (0.27MB)
Added: 2005-04-14 License: GPL (GNU General Public License) Price:
1654 downloads
XML::Filter::Dispatcher::AsStructHandler 0.52
XML::Filter::Dispatcher::AsStructHandler is a Perl module that converts a SAX stream into a simple, data-oriented structure.
SYNOPSIS
## Ordinarily used via XML::Filter::Dispatcher's as_data_struct()
## built-in extension function for XPath
This SAX2 handler builds a simple hash from XML. Text from each element and attribute is stored in the hash with a key of a relative path from the root down to the current element.
The goal is to produce a usable structure as simply and quickly as possible; use XML::Simple for more sophisticated applications.
The resulting data structure has one hash per element, one scalar per attribute, and one scalar per text string in each leaf element.
Warnings are emitted if any content other than whitespace is discarded.
The root element name is discarded.
If you are using namespaces, you must pass in the Namespaces option; if you are not, you must leave it out. Using namespaces without a Namespaces option, or vice versa, will not work.
Only start_document(), start_element(), characters(), end_element(), and end_document() are provided; so all comments, processing instructions etc., are discarded.
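The idea of building a hash keyed by the path below the root can be sketched with Python's xml.sax. The class name PathHandler, the "/" path separator and the "@" prefix for attributes are all assumptions for illustration; the Perl module's actual key format and nesting may differ, and this sketch does not emit warnings for discarded content.

```python
import xml.sax


class PathHandler(xml.sax.ContentHandler):
    """Collect leaf text and attributes into a flat dict keyed by path."""

    def __init__(self):
        super().__init__()
        self.data = {}
        self._path = []   # element names below the root
        self._depth = 0   # number of currently open elements
        self._text = []   # text chunks seen since the last tag

    def startElement(self, name, attrs):
        if self._depth > 0:      # the root element name is discarded
            self._path.append(name)
        self._depth += 1
        for key, value in attrs.items():
            self.data["/".join(self._path + ["@" + key])] = value
        self._text = []          # text before a child element is dropped

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        if text:
            self.data["/".join(self._path)] = text
        self._text = []
        self._depth -= 1
        if self._depth > 0:
            self._path.pop()


def to_struct(xml_text: str) -> dict:
    handler = PathHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.data


print(to_struct('<root id="1"><a>x</a><b><c k="v">y</c></b></root>'))
```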
Download (0.086MB)
Added: 2007-07-13 License: Perl Artistic License Price:
833 downloads
DParser 1.15
DParser is a simple but powerful tool for parsing. You can specify the form of the text to be parsed using a combination of regular expressions and grammar productions.
Because of the parsing technique (technically a scannerless GLR parser based on the Tomita algorithm), there are no restrictions on the grammar.
The grammar can be ambiguous, right or left recursive, and have any number of null productions; because there is no separate tokenizer, it can include whitespace in terminals and have terminals which are prefixes of other terminals.
DParser handles not just well-formed computer languages and data files, but just about any wacky situation that occurs in the real world.
Main features:
- Powerful GLR parsing
- Simple EBNF-style grammars and regular expression terminals
- Priorities and associativities for tokens and rules
- Built-in error recovery
- Speculative actions (for semantic disambiguation)
- Auto-building of parse tree (optionally)
- Final actions as you go, or on the complete parse tree
- Tree walkers and default actions (multi-pass compilation support)
- Symbol table built for ambiguous parsing
- Partial parses, recursive parsing, parsing starting with any non-terminal
- Whitespace can be specified as a subgrammar
- External (C call interface) tokenizers and external terminal scanners
- Good asymptotic efficiency
- Comes with ANSI-C, Python and Verilog grammars
- Comes with full source
- Portable C for easy compilation and linking
- BSD licence, so you can include it in your application without worrying about licensing
Enhancements:
- Removed call to exec in python interface (Brian Sabbey)
- Fix binary_op_left in python interface (Brian Sabbey)
Download (0.26MB)
Added: 2006-10-18 License: BSD License Price:
1103 downloads
Apache::WebSNMP 0.11
Apache::WebSNMP is a Perl module that allows for SNMP calls to be embedded in HTML.
SYNOPSIS
<html>
<body>
<snmp>
host=zoom.google.org
community=public
connect
interface=ifDescr.2
mac=ifPhysAddress.2
query
</snmp>
The interface <b>descriptor</b> for the ethernet card is <snmp> print(interface) </snmp>
and its mac address is <snmp> print(mac) </snmp>
</body>
</html>
The WebSNMP module allows one to embed SNMP commands directly into HTML code.
REQUIRES
This module requires the perl SNMP module, available at the CPAN site.
USAGE
The module allows for three different kinds of statements, surrounded by <snmp> and </snmp> HTML tags: configurations, variable assignments, and commands. A brief description of each type of statement follows:
Configuration:
The configuration statements let the user set which host to poll for SNMP information, as well as the SNMP community that the get statements will draw from. This takes the form of assigning values to the reserved variables host and community. All variables are assigned with the following syntax: variable_name=value
Note: there must not be any whitespace between the = and the name or the value. Thus, to set the SNMP host to machine.domain.net, we would issue the configuration statement:
<snmp>host=machine.domain.net</snmp>
If not specified, the default host is localhost, and the default community is public.
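The statement rules above can be illustrated with a small standalone sketch. It is written in Python rather than Perl and is not the module's implementation: the names parse_snmp_block and classify are hypothetical, and both treating any line without an = as a command and rejecting whitespace around = with an error are assumptions based only on the synopsis and the note above.

```python
import re

# Reserved configuration variables and their defaults, as stated in the text.
DEFAULTS = {"host": "localhost", "community": "public"}

# name=value with no whitespace on either side of the "=".
ASSIGNMENT = re.compile(r"^([A-Za-z_]\w*)=(\S+)$")


def classify(statement):
    """Return (kind, name, value) for one statement line (hypothetical helper)."""
    statement = statement.strip()
    match = ASSIGNMENT.match(statement)
    if match:
        name, value = match.groups()
        kind = "configuration" if name in DEFAULTS else "assignment"
        return kind, name, value
    if "=" in statement:
        # Assumption: whitespace around "=" is reported instead of being guessed at.
        raise ValueError(f"malformed assignment: {statement!r}")
    return "command", statement, None


def parse_snmp_block(block):
    """Split the body of one snmp block into config, variables and commands."""
    config = dict(DEFAULTS)
    variables = {}
    commands = []
    for line in block.split("\n"):
        if not line.strip():
            continue
        kind, name, value = classify(line)
        if kind == "configuration":
            config[name] = value
        elif kind == "assignment":
            variables[name] = value
        else:
            commands.append(name)
    return config, variables, commands
```

Applied to the body of the block in the synopsis, this yields the host and community settings, the two variables interface and mac, and the commands connect and query, in that order.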
Download (0.006MB)
Added: 2007-08-01 License: Perl Artistic License Price:
814 downloads
XML::Filter::DataIndenter 0.1
XML::Filter::DataIndenter is a SAX2 Indenter for data oriented XML.
SYNOPSIS
use XML::Filter::DataIndenter;
use XML::SAX::Machines qw( Pipeline );
Pipeline( XML::Filter::DataIndenter => *STDOUT );
ALPHA CODE ALERT: This is the first release. Feedback and patches welcome.
In data oriented XML, leaf elements (those which contain no elements) contain only character content; all other elements contain only child elements and ignorable whitespace. This filter consumes all whitespace not in leaf nodes and replaces it with whitespace that indents all elements. Character data in leaf elements is left untouched.
This filter assumes you're emitting data oriented XML. It will die if it sees non-whitespace character data outside of a leaf element. It also dies if it sees a start-tag / end-tag mismatch, just as a service to the programmer.
Processing instructions and comments are indented as though they were leaf elements except when they occur in leaf elements.
Example:
This document:
<a><?A?>
<!--A--><b><?B?><!--B-->B</b>
<!--A-->
</a>
gets reindented as:
<a>
  <?A?>
  <!--A-->
  <b><?B?><!--B-->B</b>
  <!--A-->
</a>
(plus or minus a space in each PI, depending on your XML writer).
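The indentation rule described above can be sketched in a few lines of Python. This is an illustration only, not the module's SAX2 implementation: it uses the standard library's xml.etree.ElementTree on a fully parsed tree, the function name indent_data_xml and the two-space step are assumptions, and comments and processing instructions are not handled because the default ElementTree parser discards them.

```python
import xml.etree.ElementTree as ET


def indent_data_xml(elem, level=0, step="  "):
    """Re-indent a data-oriented element tree in place (hypothetical helper)."""
    children = list(elem)
    if not children:
        # Leaf element: its character data is left exactly as it is.
        return
    if elem.text and elem.text.strip():
        raise ValueError(f"character data outside a leaf element in <{elem.tag}>")
    # Whitespace before the first child puts it one level deeper.
    elem.text = "\n" + step * (level + 1)
    for index, child in enumerate(children):
        if child.tail and child.tail.strip():
            raise ValueError(f"character data outside a leaf element in <{elem.tag}>")
        indent_data_xml(child, level + 1, step)
        is_last = index == len(children) - 1
        # The last child's tail lines the parent's end tag up with its start tag.
        child.tail = "\n" + step * (level if is_last else level + 1)


root = ET.fromstring("<a><b>B</b>  <c><d> x </d></c></a>")
indent_data_xml(root)
print(ET.tostring(root, encoding="unicode"))
```

As with the filter, the sketch refuses input that has non-whitespace text outside a leaf element, here by raising ValueError; the text of the leaf element d keeps its surrounding spaces.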
Download (0.003MB)
Added: 2007-07-11 License: Perl Artistic License Price:
835 downloads
smtpauth 0.94
smtpauth is an authenticating proxy for servers without SMTP AUTH.
Use smtpauth and stunnel programs to add SMTP AUTH (PLAIN, LOGIN) support to any SMTP server. Clients can authenticate over SSL port 465 or cleartext port 587, and authentication is fully logged via syslog.
Works with JBMail, Pegasus Mail, Mozilla Thunderbird, MS Outlook...
This software is really an interim solution until our favourite MTA(s) support SSL/TLS and SMTP AUTH directly. For now I prefer using external programs to provide this functionality rather than patching MTA source. I designed this software to work with my Postfix server, but smtpauth also works with sendmail and just about any other SMTP server.
Installation:
1. Compile and install binary.
make
Copy smtpauth to /usr/sbin, owned by root, mode 755
2. Create special user smtpauth with its own group, no login allowed.
Note that smtpauth will immediately exit with an error if invoked as root.
It must be run from a low privilege account, for security.
3. [For SSL, port 465] Configure stunnel.conf. Change domain for your site.
setuid = smtpauth
setgid = smtpauth
debug = auth.notice
client = no
[smtps]
accept = 465
exec = /usr/sbin/smtpauth
execargs = smtpauth domain 127.0.0.1
4. Configure /etc/smtpauth.conf
This file should only be readable by the smtpauth user, since it stores plain
passwords. Each line contains a username and a password separated by
whitespace. Blank lines and comment lines starting with # are ignored.
user1 pass1
user2 pass2
5. [For SSL, port 465] Start up stunnel
This will create a server running as smtpauth on port smtps/465. When SMTP clients
connect (SSL/TLS) the smtpauth program is launched and provides authentication
service through to 127.0.0.1:25, as a proxy. Your actual SMTP server will accept
mail because that connection is local. The mail headers will include X-SMTP-AUTH
indicating the username. Success and failures will be logged via syslog.
6. [For cleartext, port 587] Configure cleartext submission service in inetd
Since inetd (when started with -W) also supports wrapping, the smtpauth proxy
can be run straight out of here too. Note that this is somewhat risky, because
there will be no SSL/TLS encryption on the submission port (587).
Again, change domain for your site (e.g. mail.yoursite.tld)
submission stream tcp nowait smtpauth /usr/sbin/smtpauth smtpauth domain 127.0.0.1
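The credentials file format from step 4 is simple enough to illustrate with a short sketch. This is Python written for illustration, not code from smtpauth itself: the function name parse_smtpauth_conf is hypothetical, and rejecting lines that do not have exactly two whitespace-separated fields is an assumption (the description does not say how such lines are treated).

```python
def parse_smtpauth_conf(text):
    """Parse 'username password' lines into a dict (hypothetical helper)."""
    users = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Blank lines and comment lines starting with # are ignored.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            # Assumption: anything other than two fields is treated as an error.
            raise ValueError(f"line {lineno}: expected 'username password'")
        username, password = fields
        users[username] = password
    return users


sample = """\
# mail accounts
user1 pass1

user2    pass2
"""
print(parse_smtpauth_conf(sample))
```

Because the file holds plain passwords, any tool reading it should run as the smtpauth user and the file should stay unreadable to everyone else, as step 4 says.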
Download (0.011MB)
Added: 2006-03-21 License: GPL (GNU General Public License) Price:
1313 downloads
Geography::States 2.1
Geography::States is a Perl module that maps states and provinces to their codes, and vice versa.
SYNOPSIS
use Geography::States;
my $obj = Geography::States -> new (COUNTRY [, STRICT]);
EXAMPLES
my $canada = Geography::States -> new ('Canada');
my $name = $canada -> state ('NF'); # Newfoundland.
my $code = $canada -> state ('Ontario'); # ON.
my ($code, $name) = $canada -> state ('BC'); # BC, British Columbia.
my @all_states = $canada -> state; # List code/name pairs.
This module lets you map states and provinces to their codes, and codes to names of provinces and states.
The Geography::States -> new () call takes 1 or 2 arguments. The first, required, argument is the country we are interested in. Currently supported countries are USA, Brazil, Canada, The Netherlands, and Australia. If a second non-false argument is given, we use strict mode. In non-strict mode, we will map territories and alternative codes as well, while we do not do that in strict mode. For example, if the country is USA, in non-strict mode, we will map GU to Guam, while in strict mode, neither GU nor Guam will be found.
The state() method
All queries are done by calling the state method in the object. This method takes an optional argument. If an argument is given, then in scalar context, it will return the name of the state if a code of a state is given, and the code of a state, if the argument of the method is a name of a state. In list context, both the code and the state will be returned.
If no argument is given, then the state method in list context will return a list of all code/name pairs for that country. In scalar context, it will return the number of code/name pairs. Each code/name pair is a 2 element anonymous array.
Arguments can be given in a case insensitive way; if a name consists of multiple parts, the number of spaces does not matter, as long as there is some whitespace. (That is, "NewYork" is wrong, but "new YORK" is fine.)
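The lookup rules above (two-way mapping, strict mode, case-insensitive and whitespace-tolerant arguments) can be sketched as follows. This is Python written for illustration and does not use the module's data or code: the class name StateLookup and its constructor arguments are hypothetical, the tables hold only a couple of entries, and because Python has no scalar or list context the state method always returns a (code, name) pair.

```python
class StateLookup:
    """Two-way code/name lookup (hypothetical stand-in, tiny sample data)."""

    def __init__(self, states, territories=None, strict=False):
        table = dict(states)
        # In non-strict mode, territories are mapped as well.
        if not strict and territories:
            table.update(territories)
        self._by_code = {code.upper(): name for code, name in table.items()}
        self._by_name = {self._norm(name): code.upper() for code, name in table.items()}

    @staticmethod
    def _norm(text):
        # Lower-case and collapse runs of whitespace to a single space.
        return " ".join(text.split()).lower()

    def state(self, query=None):
        if query is None:
            # No argument: all code/name pairs, sorted by code.
            return sorted(self._by_code.items())
        code = query.strip().upper()
        if code in self._by_code:
            return code, self._by_code[code]
        code = self._by_name.get(self._norm(query))
        if code is None:
            return None
        return code, self._by_code[code]


usa = StateLookup({"NY": "New York"}, territories={"GU": "Guam"})
print(usa.state("new   YORK"))
print(usa.state("NewYork"))
```

With strict=True the same constructor call would leave out the territories table, so neither "GU" nor "Guam" would be found.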
Download (0.006MB)
Added: 2007-02-14 License: Perl Artistic License Price:
982 downloads
SaltShaker 1.4
SaltShaker is a Python script for shaking things in the open source Blender 3d system. A lot of information/comments are included for budding Blender Python script writers.
If you have programmed in a few other languages before, the key thing to remember with Python is that, instead of using curly brackets { } to delimit statement blocks, it uses the indentation of the code.
Confusing at first (especially when you mix tabs with spaces), this soon becomes second nature, for example:
def randomiseit(perc):
    pr = (Blender.Noise.random()*perc)
    # 50% of the time make it negative
    if (Blender.Noise.random()
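The listing above is cut off in the source. As a separate, self-contained illustration of indentation-delimited blocks, here is a sketch that runs outside Blender. It is not the SaltShaker code: it swaps Blender.Noise.random for the standard library's random.random, the 0.5 threshold is only inferred from the "50% of the time" comment, and the rng parameter is added here so the function can be exercised deterministically.

```python
import random


def randomise_it(perc, rng=random.random):
    """Return a random value in [-perc, perc] (hypothetical stand-in)."""
    value = rng() * perc
    # The indented line below is the body of the if; no braces are needed.
    if rng() < 0.5:
        value = -value
    # Dedenting back to this level ends the if block.
    return value


print(randomise_it(10))
```

Mixing tabs and spaces inside one block is what usually causes the confusion mentioned above; Python 3 rejects inconsistent mixtures with a TabError.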
Download (MB)
Added: 2007-01-04 License: GPL (GNU General Public License) Price:
1026 downloads