Unicode::Map8 0.12
Unicode::Map8 0.12 Ranking & Summary
File size: 0.10 MB
Platform: Any Platform
License: Perl Artistic License
Price:
Downloads: 812
Date added: 2007-08-20
Publisher: Gisle Aas
Unicode::Map8 0.12 description
Unicode::Map8 is a mapping table between 8-bit chars and Unicode.
SYNOPSIS
require Unicode::Map8;
my $no_map = Unicode::Map8->new("ISO646-NO") || die;
my $l1_map = Unicode::Map8->new("latin1") || die;
my $ustr = $no_map->to16("V}re norske tegn b|r {res\n");
my $lstr = $l1_map->to8($ustr);
print $lstr;
print $no_map->tou("V}re norske tegn b|r {res\n")->utf8;
The Unicode::Map8 class implements efficient mapping tables between 8-bit character sets and 16-bit character sets like Unicode. The tables are efficient both in terms of space allocated and translation speed. The 16-bit strings are assumed to use network byte order.
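To make "16-bit strings in network byte order" concrete, here is a small sketch that uses only core Perl's pack and unpack. Unicode::Map8 is not involved, and the two helper names are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Unicode::Map8.
# Pack a list of 16-bit code values into a big-endian (network order) string.
sub codes_to_ucs2be { return pack("n*", @_) }

# Unpack such a string back into a list of code values.
sub ucs2be_to_codes { return unpack("n*", $_[0]) }

# "V" followed by LATIN SMALL LETTER A WITH RING ABOVE
my $ustr = codes_to_ucs2be(0x0056, 0x00E5);

# Show the raw bytes: the high byte of each code comes first.
print join(" ", map { sprintf "%02X", $_ } unpack "C*", $ustr), "\n";  # prints "00 56 00 E5"
```

Each character occupies two bytes with the most significant byte first, which is the layout that to8() is described as consuming and to16() as producing.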
The following methods are available:
$m = Unicode::Map8->new( [$charset] )
The object constructor creates new instances of the Unicode::Map8 class. It takes an optional argument that specifies the name of an 8-bit character set to initialize mappings from. The argument can also be the name of a mapping file. If the charset/file cannot be located, then the constructor returns undef.
If you omit the argument, then an empty mapping table is constructed. You must then add mapping pairs to it using the addpair() method described below.
$m->addpair( $u8, $u16 );
Adds a new mapping pair to the mapping object. It takes two arguments. The first is the code value in the 8-bit character set and the second is the corresponding code value in the 16-bit character set. The same codes can be used multiple times (but using the same pair has no effect). The first definition for a code is the one that is used.
Consider the following example:
$m->addpair(0x20, 0x0020);
$m->addpair(0x20, 0x00A0);
$m->addpair(0xA0, 0x00A0);
It means that the characters 0x20 and 0xA0 in the 8-bit charset map to themselves in the 16-bit set, but in the 16-bit character set 0x00A0 maps to 0x20.
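The "first definition wins" rule can be reproduced with a toy model in plain Perl. This is only a sketch of the documented behaviour using two hashes, one per direction; it is not how Unicode::Map8 stores its tables, and model_addpair is a hypothetical name.

```perl
use strict;
use warnings;

# Toy model only: two hashes standing in for the two directions of a table.
my (%to16, %to8);

sub model_addpair {
    my ($u8, $u16) = @_;
    # Keep the first definition seen for each code, in each direction.
    $to16{$u8} = $u16 unless exists $to16{$u8};
    $to8{$u16} = $u8  unless exists $to8{$u16};
}

# The same three calls as in the example above.
model_addpair(0x20, 0x0020);
model_addpair(0x20, 0x00A0);
model_addpair(0xA0, 0x00A0);

printf "8-bit 0x%02X -> 16-bit 0x%04X\n", $_, $to16{$_} for sort { $a <=> $b } keys %to16;
printf "16-bit 0x%04X -> 8-bit 0x%02X\n", $_, $to8{$_}  for sort { $a <=> $b } keys %to8;
```

The second call does not change where 8-bit 0x20 goes, but it is the first to define a target for 16-bit 0x00A0, so the third call only adds the 8-bit 0xA0 direction.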
$m->default_to8( $u8 )
Set the code of the default character to use when mapping from 16-bit to 8-bit strings. If there is no mapping pair defined for a character then this default is substituted by to8() and recode8().
$m->default_to16( $u16 )
Set the code of the default character to use when mapping from 8-bit to 16-bit strings. If there is no mapping pair defined for a character then this default is used by to16(), tou() and recode8().
$m->nostrict;
All undefined mappings are replaced with the identity mapping. Undefined characters are normally just removed (or replaced with the default if defined) when converting between character sets.
$m->to8( $ustr );
Converts a 16-bit character string to the corresponding string in the 8-bit character set.
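The handling of unmapped characters described above (use the mapping if there is one, otherwise the default if one is set, otherwise drop the character) can be sketched in plain Perl. The function model_to8 and its table are hypothetical stand-ins written for this illustration; the real module works on its own internal tables.

```perl
use strict;
use warnings;

# Toy model of the documented fallback order; not the module's code.
sub model_to8 {
    my ($table, $default, $ustr) = @_;
    my $out = '';
    for my $code (unpack "n*", $ustr) {          # 16-bit units, network byte order
        if    (exists $table->{$code}) { $out .= chr $table->{$code} }
        elsif (defined $default)       { $out .= chr $default }
        # else: no mapping and no default, so the character is dropped
    }
    return $out;
}

my %table = (0x0041 => 0x41, 0x0042 => 0x42);    # only "A" and "B" are mapped
my $ustr  = pack "n*", 0x0041, 0x20AC, 0x0042;   # "A", EURO SIGN, "B"

print model_to8(\%table, undef,    $ustr), "\n"; # prints "AB"
print model_to8(\%table, ord("?"), $ustr), "\n"; # prints "A?B"
```

Passing undef as the default corresponds to never having called default_to8(); passing a code corresponds to having set one.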
$m->to16( $str );
Converts an 8-bit character string to the corresponding string in the 16-bit character set.
$m->tou( $str );
Same as to16() but returns a Unicode::String object instead of a plain UCS2 string.
$m->recode8($m2, $str);
Map the string $str from one 8-bit character set ($m) to another one ($m2). Since we assume we know the mappings towards the common 16-bit encoding we can use this to convert between any of the 8-bit character sets.
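The idea of going through the common 16-bit encoding can be sketched as a two-step lookup. The tables below are partial and hand-written for illustration (only the three Norwegian letters used in the SYNOPSIS string), every other byte is simply passed through unchanged, and model_recode8 is a hypothetical name rather than the module's implementation.

```perl
use strict;
use warnings;

# Partial, hand-written tables for illustration only.
# ISO646-NO byte => Unicode code, for the characters "{", "|" and "}".
my %no_to16 = (0x7B => 0x00E6, 0x7C => 0x00F8, 0x7D => 0x00E5);
# Unicode code => Latin-1 byte, for the same three letters.
my %l1_to8  = (0x00E6 => 0xE6, 0x00F8 => 0xF8, 0x00E5 => 0xE5);

sub model_recode8 {
    my ($from16, $to8, $str) = @_;
    my $out = '';
    for my $byte (unpack "C*", $str) {
        # Step 1: source byte to the common 16-bit code (identity if not listed).
        my $u16 = exists $from16->{$byte} ? $from16->{$byte} : $byte;
        # Step 2: 16-bit code to the target byte (identity if not listed).
        $out .= chr(exists $to8->{$u16} ? $to8->{$u16} : $u16);
    }
    return $out;
}

my $latin1 = model_recode8(\%no_to16, \%l1_to8, "V}re norske tegn b|r {res\n");

# Show the first four bytes in hex: "V", a-ring, "r", "e".
print join(" ", map { sprintf "%02X", $_ } unpack "C*", substr($latin1, 0, 4)), "\n";  # prints "56 E5 72 65"
```

Because both tables meet at the 16-bit code, no direct ISO646-NO to Latin-1 table is needed, which is the point the paragraph above makes for any pair of 8-bit character sets.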
$m->to_char16( $u8 )
Maps a single 8-bit character code to a 16-bit code. If the 8-bit character is unmapped then the constant NOCHAR is returned. The default is not used and the callback method is not invoked.
$m->to_char8( $u16 )
Maps a single 16-bit character code to an 8-bit code. If the 16-bit character is unmapped then the constant NOCHAR is returned. The default is not used and the callback method is not invoked.
The following callback methods are available. You can override these methods by creating a subclass of Unicode::Map8.
$m->unmapped_to8
When mapping to an 8-bit character string and there is no mapping defined (and no default either), this method is called as the last resort. It is called with a single integer argument, which is the code of the unmapped 16-bit character. It is expected to return a string that will be incorporated in the 8-bit string. The default version of this method always returns an empty string.
Example:
package MyMapper;
@ISA=qw(Unicode::Map8);

sub unmapped_to8
{
    my($self, $code) = @_;
    require Unicode::CharName;
    "<" . Unicode::CharName::uname($code) . ">";
}
$m->unmapped_to16
Likewise, when mapping to a 16-bit character string and no mapping is defined, this method is called. It should return a 16-bit string with the bytes in network byte order. The default version of this method always returns an empty string.
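A matching override for the 16-bit direction might look like the sketch below. The package name ReplacementMapper is hypothetical, and the assumption that the method receives the unmapped 8-bit code as its argument is made by analogy with unmapped_to8; the text above does not spell it out.

```perl
use strict;
use warnings;

package ReplacementMapper;          # hypothetical subclass name
our @ISA = qw(Unicode::Map8);       # Unicode::Map8 must be loaded before objects are created

# Assumed to receive the unmapped 8-bit code, by analogy with unmapped_to8.
sub unmapped_to16 {
    my ($self, $code) = @_;
    # Return U+FFFD REPLACEMENT CHARACTER as two bytes in network byte order.
    return pack("n", 0xFFFD);
}

package main;
```

Using pack with the "n" template keeps the returned string in network byte order, as the description of this callback requires.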