
RegexSR for Linux 1.0.0
A tool to create and test complex regular expressions.
RegexSR is a very easy-to-use and powerful tool (written in Java) to create and test complex regular expressions.
The plugin system provides extra functionality, such as transforming expressions into Java code, and lets users create their own extensions.
Features include testing regular expressions, handling text through regular expressions or plugins, renaming files, developing plugins, and managing expressions in the repository.
Pak 1.1
Pak transfers multiple, possibly very big, regular files between possibly different hosts you have shell access to.
It transmits segment IDs instead of file names and uses on-the-fly Blowfish-CBC encryption while being absolutely restartable with practically no loss of data already transmitted.
Encrypted pak streams can be stored in intermediary regular files on untrusted hosts. Several stored pak streams, even truncated ones, can be merged for re-piping without decryption. Integrity is never checked. File offsets of any magnitude are supported via recompilation (the default width is 64 bits). Either UNIX 95 or UNIX 98 conformance is required and sufficient.
The individual binaries are:
- pak - create a stream from regular files, based on a portion ("segment") list
- upak - completely write out regular file segments from a stream
- pakmerge - merge multiple streams in regular files into a single stream
- paksplit - split a stream into regular stream files of fixed size
- mklist - create a segment list
The name "stream" is somewhat unfortunately chosen. It has nothing to do with STREAMS or with stdio streams. It denotes a very simple file format.
Major Features:
- Transfer multiple, possibly very big, regular files between possibly different hosts you have shell access to without transmitting any file names.
- Supports on-the-fly Blowfish-CBC encryption
- Handles, transfers, and formats/scans file offsets of any magnitude in an easy and portable way. This is achieved with a handful of very naive and simple "bignum" functions.
- Portability and fun (as opposed to efficiency) were the primary aspects of development.
Enhancements:
- The programs "pak" and "pakmerge" were modified to flush stdout before printing "segment done" messages to stderr ("paksplit" and "upak" already did this for output files). This way, whenever the user sees a "segment done" message, they can be sure that even the last chunk of the segment was passed to the kernel via write().
Tenshi 0.8
Tenshi is a log monitoring program.
Queues can be set to send a notification as soon as there is a log line assigned to it, or to send periodic reports.
Additionally, uninteresting fields in the log lines (such as PID numbers) can be masked with the standard regular expression grouping operators ( ). This allows cleaner and more readable reports. All reports are separated by hostname and all messages are condensed when possible.
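The masking of grouped fields can be sketched in Python. This is only an illustration of the idea, not tenshi's own code or configuration syntax: the function name mask_groups, the placeholder "___", and the sample log line are assumptions, and the sketch assumes the groups are not nested.

```python
import re

def mask_groups(pattern, line, placeholder="___"):
    """Return line with every captured group replaced by placeholder.

    Returns None when the pattern does not match the line.
    Assumes the capturing groups in pattern are not nested.
    """
    match = re.search(pattern, line)
    if match is None:
        return None
    # Spans of the groups that took part in the match.
    spans = [match.span(i) for i in range(1, match.re.groups + 1)
             if match.span(i) != (-1, -1)]
    # Replace from right to left so earlier offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        line = line[:start] + placeholder + line[end:]
    return line

# Hypothetical log line: the PID and the user name are masked.
line = "sshd[4242]: Failed password for root from 10.0.0.5"
masked = mask_groups(r"sshd\[(\d+)\]: Failed password for (\S+)", line)
print(masked)
```

Lines that differ only in the masked fields become identical after masking, which is what makes it possible to condense them in a report.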
The program reads a configuration file and then forks a daemon for monitoring the specified log files.
Please read the example tenshi.conf and tenshi.8 man page for usage instructions.
Tenshi was formerly known as wasabi. The name was changed to tenshi after we were informed that wasabi is a registered trademark relating to another piece of software.
Chestnut FTP Search 0.4
Chestnut FTP Search is a web application to search for files on FTP servers.
The program is written in Python using the web.py framework. File indexes are stored in PostgreSQL or MySQL.
Main features:
- Four search modes: partial match, exact match, regular expression, shell pattern. Every mode can be case sensitive or case insensitive.
- A character set can be specified for a particular server to allow non-ASCII file names.
- Multi-threaded indexer
- i18n support
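The four search modes listed above can be illustrated with a minimal Python sketch. It is not taken from Chestnut FTP Search; the function name matches and the mode names are assumptions chosen for this example.

```python
import fnmatch
import re

def matches(name, query, mode, case_sensitive=True):
    """Check a file name against a query in one of four search modes."""
    flags = 0 if case_sensitive else re.IGNORECASE
    if mode == "partial":
        # The query may appear anywhere in the name, taken literally.
        return re.search(re.escape(query), name, flags) is not None
    if mode == "exact":
        # The whole name must equal the query.
        return re.fullmatch(re.escape(query), name, flags) is not None
    if mode == "regex":
        return re.search(query, name, flags) is not None
    if mode == "shell":
        # fnmatch.translate turns a shell pattern into a regex string.
        return re.match(fnmatch.translate(query), name, flags) is not None
    raise ValueError("unknown mode: " + mode)

print(matches("Song.MP3", "*.mp3", "shell", case_sensitive=False))
print(matches("readme.txt", "adm", "partial"))
```

Routing every mode through the re module keeps the case-sensitivity switch in one place, the flags argument.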
ItSucks 0.2.0 Beta 3
ItSucks is a Java web spider (web crawler) with the ability to download (and resume) files.
The application also provides a Swing GUI and a console interface. All backend functionality is also available in a separate library, so it can easily be used in other projects.
GPRename 2.4
GPRename is a GUI batch file renamer based on Gtk-Perl.
GPRename can rename files numerically, insert/delete characters at/between specified position(s), replace strings (with or without regular expressions), and change case.
Ragel State Machine Compiler 5.23
Ragel State Machine Compiler compiles state machines from regular languages.
Ragel can also be thought of as a finite state transducer compiler where output symbols represent blocks of code that get executed instead of written to the output stream.
When you wish to write down a regular language, you start with some simple regular language and build a bigger one using the regular language operators union, concatenation, Kleene star, intersection and subtraction.
This is precisely the way you describe to Ragel how to compile your finite state machines. Ragel also understands operators that embed actions into machines and operators that control any non-determinism in machines.
Ragel FSMs are closed under all of Ragel's regular language, action specification and priority assignment operators. This property allows arbitrary regular languages to be described. Complexity is limited only by available processing resources.
For example, you can make one machine that picks out specially formatted comments in C code, another machine that builds a list of all function declarations, and a third that identifies string constants, then "or" them all together to make a single machine that performs all of these tasks concurrently and independently on one pass of the input.
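The idea of "or"-ing several recognizers into one scanner can be approximated in Python. This is a rough analogy, not Ragel syntax or output: Python's re module is a backtracking matcher rather than a compiled state machine, at each position it takes the first alternative that matches rather than running all three independently, and the three patterns (in particular the one for declarations) are deliberately simplistic assumptions.

```python
import re

# Three recognizers joined by alternation; the group name tells which one fired.
TOKEN = re.compile(
    r"""
    (?P<doc_comment>/\*\*.*?\*/)                    # comments opening with /**
  | (?P<string>"(?:\\.|[^"\\])*")                   # string constants
  | (?P<declaration>\b\w+\s+\w+\s*\([^)]*\)\s*;)    # very rough declaration
    """,
    re.VERBOSE | re.DOTALL,
)

SOURCE = '''/** public api */
int add(int a, int b);
const char *name = "hello";
/* plain comment */
'''

# One left-to-right pass over the input collects all three kinds of match.
found = [(m.lastgroup, m.group()) for m in TOKEN.finditer(SOURCE)]
for kind, text in found:
    print(kind, "->", text)
```

The plain comment is skipped because only comments starting with "/**" are treated as specially formatted in this sketch.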
Main features:
- Describe arbitrary state machines using regular language operators and/or state tables.
- NFA to DFA conversion.
- Hopcroft's state minimization.
- Embed any number of actions into machines at arbitrary places.
- Control non-determinism using priorities on transitions.
- Visualize output with Graphviz.
- Use byte, double byte or word sized alphabets.
- Generate C/C++/Objective-C code with no dependencies.
- Choose from table or control flow driven output.
Enhancements:
- The documentation and the Ruby code generator were improved.
perlrecharclass 5.9.5
perlrecharclass documents Perl regular expression character classes.
The top level documentation about Perl regular expressions is found in perlre.
This manual page discusses the syntax and use of character classes in Perl Regular Expressions.
A character class is a way of denoting a set of characters, in such a way that one character of the set is matched. It's important to remember that matching a character class consumes exactly one character in the source string. (The source string is the string the regular expression is matched against.)
There are three types of character classes in Perl regular expressions: the dot, backslashed sequences, and the bracketed form.
The dot
The dot (or period), . is probably the most used, and certainly the most well-known character class. By default, a dot matches any character, except for the newline. The default can be changed to add matching the newline with the single line modifier: either for the entire regular expression using the /s modifier, or locally using (?s).
Here are some examples:
"a" =~ /./ # Match
"." =~ /./ # Match
"" =~ /./ # No match (dot has to match a character)
"\n" =~ /./ # No match (dot does not match a newline)
"\n" =~ /./s # Match (global single line modifier)
"\n" =~ /(?s:.)/ # Match (local single line modifier)
"ab" =~ /^.$/ # No match (dot matches one character)
Backslashed sequences
Perl regular expressions contain many backslashed sequences that constitute a character class. That is, they will match a single character, if that character belongs to a specific set of characters (defined by the sequence). A backslashed sequence is a sequence of characters starting with a backslash. Not all backslashed sequences are character classes; for a full list, see perlrebackslash.
Here's a list of the backslashed sequences, which are discussed in more detail below.
\d Match a digit character.
\D Match a non-digit character.
\w Match a "word" character.
\W Match a non-"word" character.
\s Match a white space character.
\S Match a non-white space character.
\h Match a horizontal white space character.
\H Match a character that isn't horizontal white space.
\v Match a vertical white space character.
\V Match a character that isn't vertical white space.
\pP, \p{Prop} Match a character matching a Unicode property.
\PP, \P{Prop} Match a character that doesn't match a Unicode property.
perlrebackslash 0.02
perlrebackslash documents Perl regular expression backslash sequences and escapes.
The top level documentation about Perl regular expressions is found in perlre.
This document describes all backslash and escape sequences. After explaining the role of the backslash, it lists all the sequences that have a special meaning in Perl regular expressions (in alphabetical order), then describes each of them.
Most sequences are described in detail in different documents; the primary purpose of this document is to have a quick reference guide describing all backslash and escape sequences.
The backslash
In a regular expression, the backslash can perform one of two tasks: it either takes away the special meaning of the character following it (for instance, \| matches a vertical bar; it's not an alternation), or it is the start of a backslash or escape sequence.
The rules determining what it is are quite simple: if the character following the backslash is a punctuation (non-word) character (that is, anything that is not a letter, digit or underscore), then the backslash just takes away the special meaning (if any) of the character following it.
If the character following the backslash is a letter or a digit, then the sequence may be special; if so, it's listed below. A few letters have not been used yet, and escaping them with a backslash is safe for now, but a future version of Perl may assign a special meaning to them. However, if you have warnings turned on, Perl will issue a warning if you use such a sequence.
It is, however, guaranteed that backslash or escape sequences never have a punctuation character following the backslash, not now, and not in a future version of Perl 5. So it is safe to put a backslash in front of a non-word character.
Note that the backslash itself is special; if you want to match a backslash, you have to escape the backslash with a backslash: /\\/ matches a single backslash.
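The same escaping rules for punctuation can be tried out in Python, whose re module treats a backslash before a punctuation character the same way. The sketch below is a cross-check in another language, not Perl code; the variable names and the sample text are assumptions made for illustration.

```python
import re

# Escaped vertical bar: a literal "|", not alternation.
literal_bar = re.fullmatch(r"a\|b", "a|b") is not None

# Unescaped vertical bar: alternation, so only "a" or "b" alone can match.
alternation = re.fullmatch(r"a|b", "a|b") is not None

# Matching a single backslash takes an escaped backslash in the pattern.
single_backslash = re.fullmatch(r"\\", "\\") is not None

# re.escape puts a backslash in front of the characters that are special
# in a pattern, so the escaped pattern matches the text literally.
text = "1+1=2 (really?)"
escaped_matches = re.fullmatch(re.escape(text), text) is not None
unescaped_matches = re.fullmatch(text, text) is not None

print(literal_bar, alternation, single_backslash)
print(escaped_matches, unescaped_matches)
```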
htmlobserver 0.8a
htmlobserver is a program that downloads a Web page at regular intervals, and searches it for regular expressions.
All HTML tags are removed, and the remaining text is searched for regular expressions, which can be defined in a list.
Matching rows are displayed in a panel, and different alarms may be triggered. The alarms are repeated when the result set changes.
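The strip-then-search step can be sketched in Python with the standard library. This is not htmlobserver's code: the class name TextExtractor, the function matching_rows, and the sample page are assumptions, and the download and alarm parts are left out.

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text between tags, dropping the tags themselves."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def matching_rows(html, patterns):
    """Return the text rows of html that match any regex in patterns."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    compiled = [re.compile(p) for p in patterns]
    return [row.strip() for row in text.splitlines()
            if row.strip() and any(r.search(row) for r in compiled)]

# Hypothetical page content standing in for a downloaded document.
PAGE = """<html><body>
<p>Build <b>1042</b> passed</p>
<p>Build <b>1043</b> FAILED</p>
<p>Disk usage: 91%</p>
</body></html>"""

rows = matching_rows(PAGE, [r"FAILED", r"\b9\d%"])
print(rows)
```

Comparing the returned list with the one from the previous run is one simple way to detect that the result set has changed.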
DiskSearch 1.2.0
DiskSearch is a tool for searching for files on removable media disks.
For instance, you can search for songs on your MP3-CDs or for a document on your backup DVDs. For advanced queries there is a regular expression search mode.
The search is based on a simple database file which needs to be filled once by adding all your disks to it.
monq.jfa 1.1.1
monq.jfa is a class library for fast and flexible text filtering with regular expressions.
In contrast to java.util.regex, monq.jfa lets you bind a regular expression to an action that is called automatically whenever a match is spotted in an input stream.
In addition, it is possible to combine tens of thousands of regex/action pairs into a single machine (called a DFA).
The DFA filters input to output by looking for matches of all regular expressions in parallel, calling their actions to reformat the text or to incrementally build up a data structure.
The filtering speed is 1.5 MB/s on a 2.6 GHz P4 and is largely independent of the number of regex/action pairs.
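The regex/action pairing can be mimicked in Python for illustration. This is not the monq.jfa API and it does not build a DFA: Python's re module backtracks, so its speed does depend on the number of alternatives. The function name build_filter is an assumption, and the sketch assumes the individual regexes contain no named groups of their own.

```python
import re

def build_filter(pairs):
    """Combine (regex, action) pairs into one text filter.

    Each action receives the matched text and returns its replacement.
    Text that matches no regex is copied to the output unchanged.
    """
    names = ["r%d" % i for i in range(len(pairs))]
    combined = re.compile("|".join(
        "(?P<%s>%s)" % (name, regex)
        for name, (regex, _) in zip(names, pairs)))
    actions = dict(zip(names, (action for _, action in pairs)))

    def run(text):
        # m.lastgroup names the alternative that matched; call its action.
        return combined.sub(lambda m: actions[m.lastgroup](m.group()), text)

    return run

# Two example pairs: double every number, lower-case runs of capitals.
text_filter = build_filter([
    (r"\d+", lambda s: str(int(s) * 2)),
    (r"[A-Z]{2,}", str.lower),
])
output = text_filter("ID 21 and ABC")
print(output)
```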
Enhancements:
- This is a bugfix release to correct improper unsynchronized reuse of an object within monq.jfa.actions.Printf.
- The user visible symptom would be garbled output.
PCRE 7.2
PCRE is a library that implements Perl 5-style regular expressions.
The current implementation corresponds to Perl 5.005. PCRE is used by many programs, including Exim, Postfix, and PHP.
Enhancements:
- Some more features from Perl 5.10 have been added.
- A few bugs were fixed.
- A couple of performance-enhancing refactorings were done.
LinkChecker 4.7
LinkChecker checks HTML documents for broken links.
Output can be colored or normal text, HTML, SQL, CSV, or a sitemap graph in DOT, GML, or XML format. Supported link types are HTTP/1.1 and 1.0, HTTPS, FTP, mailto:, news:, nntp:, Gopher, Telnet, and local files.
Main features:
- recursive checking
- multithreading
- output in colored or normal text, HTML, SQL, CSV or a sitemap graph in GML or XML.
- HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Gopher, Telnet and local file links support
- restriction of link checking with regular expression filters for URLs
- proxy support
- username/password authorization for HTTP and FTP
- robots.txt exclusion protocol support
- i18n support
- a command line interface
- a (Fast)CGI web interface (requires HTTP server)
Enhancements:
- Log output with certain Unicode characters is fixed now.
- XML output has been improved.
- Gopher URLs are now deprecated.
dk.brics.automaton 1.10-1
dk.brics.automaton is a Java package that contains a DFA/NFA implementation.
In contrast to many other automaton/regexp packages, this package is fast, compact, and implements real, unrestricted regular operations. It uses a symbolic representation based on intervals of Unicode characters.