Tamil Converters
- Description
- Details
- Downloads
- Change Log
- Bugs
The packages for Version 2.7 omitted several scripts and had a Makefile that
contained several errors. Version 2.7.1 corrects these errors. There are no changes
in the programs themselves.
This package contains programs for converting among several encodings and
transliterations of Tamil. Except where another encoding is specified,
all programs use UTF-8 Unicode.
- tscii2uni - Converts the TSCII encoding to Unicode.
- i2u8Tamil - Converts the ISCII encoding to Unicode.
- u82iTamil - Converts Unicode to the ISCII encoding.
- Tamil2IPA - Converts Tamil script to the International Phonetic Alphabet.
- IPA2Tamil - Converts International Phonetic Alphabet or Colloquial Tamil romanization
  to Tamil script.
- iscii2itransTamil - Converts the ISCII encoding to the ITRANS pure ASCII romanization.
- itrans2isciiTamil - Converts the ITRANS pure ASCII romanization to the ISCII encoding.
- UnicodeName2Tamil - Converts Unicode character names enclosed in angle brackets
  (as used in POSIX locale source files) to actual Tamil characters.
- Tamil2UnicodeName - Converts Tamil script to Unicode character names enclosed in angle
  brackets (as used in POSIX locale source files).
- Koln2Tamil - Converts the Köln romanization used by the Institute of Indology and
  Tamil Studies at the University of Cologne to Tamil script in UTF-8 Unicode.
  The program is structured to be run either independently or as an xlit plugin.
- Tamil2Koln - Converts Tamil script in UTF-8 Unicode to the Köln romanization used by
  the Institute of Indology and Tamil Studies at the University of Cologne.
  The program is structured to be run either independently or as an xlit plugin.
- Tamil2Indicist - Converts Tamil script to indicist (ISO 15919) transliteration.
  The script may be run either stand-alone or as a plugin to xlit.
- CT2IPA - Converts Colloquial Tamil romanization to the International Phonetic Alphabet.
- Penn2IPA - Converts the pure ASCII romanization used by the University of Pennsylvania
  South Asia Language Resource Center and various other organizations to the
  International Phonetic Alphabet.
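To make the angle-bracket notation handled by UnicodeName2Tamil and Tamil2UnicodeName more concrete, here is a minimal Python sketch of such a round trip. It is not the code shipped in this package. It assumes the `<UXXXX>` hexadecimal form common in POSIX locale source files; the notation the actual programs read and write may differ. The function names `names_to_text` and `text_to_names` are hypothetical.

```python
import re

# Assumption: names take the <UXXXX> form (hex code point in angle brackets).
_UNAME = re.compile(r"<U([0-9A-Fa-f]{4,8})>")

# The Unicode Tamil block spans U+0B80 through U+0BFF.
TAMIL_FIRST, TAMIL_LAST = 0x0B80, 0x0BFF


def names_to_text(s: str) -> str:
    """Replace each <UXXXX> name with the character it denotes."""
    return _UNAME.sub(lambda m: chr(int(m.group(1), 16)), s)


def text_to_names(s: str) -> str:
    """Replace each character in the Tamil block with its <UXXXX> name."""
    out = []
    for ch in s:
        cp = ord(ch)
        if TAMIL_FIRST <= cp <= TAMIL_LAST:
            out.append(f"<U{cp:04X}>")
        else:
            out.append(ch)  # leave non-Tamil characters untouched
    return "".join(out)


if __name__ == "__main__":
    names = "<U0BA4><U0BAE><U0BBF><U0BB4><U0BCD>"
    text = names_to_text(names)
    print(text)
    print(text_to_names(text) == names)
```

Restricting `text_to_names` to the Tamil block keeps ASCII text readable in mixed input; a converter could equally well rewrite every non-ASCII character.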
By default, the ITRANS that these programs accept as input and generate as output
is extended to include codes for the Tamil digits. It also, by default, contains
HZ escapes (as defined in RFC 1843) that delimit the Tamil portion, which allows
mixed Tamil and ASCII text to be processed.
Both the extensions and the use of HZ escapes can be disabled by command-line switches.
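The following Python sketch illustrates how HZ escapes can separate the Tamil portion of a mixed text from the plain ASCII around it. It is not taken from this package. It relies only on the delimiters defined in RFC 1843: `~{` opens the escaped portion, `~}` closes it, and `~~` stands for a literal tilde in ASCII mode. How the converters themselves treat `~~` is not stated here, so that part is an assumption, and the function name `split_hz` is hypothetical.

```python
def split_hz(text):
    """Split text into (is_tamil, chunk) pairs at the HZ delimiters ~{ and ~}."""
    segments = []
    buf = []
    in_tamil = False
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if not in_tamil and pair == "~{":
            # Start of the escaped (Tamil) portion: flush pending ASCII text.
            if buf:
                segments.append((False, "".join(buf)))
                buf = []
            in_tamil = True
            i += 2
        elif in_tamil and pair == "~}":
            # End of the escaped portion: flush pending romanized text.
            if buf:
                segments.append((True, "".join(buf)))
                buf = []
            in_tamil = False
            i += 2
        elif not in_tamil and pair == "~~":
            # Assumption: ~~ encodes a literal tilde in ASCII mode (RFC 1843).
            buf.append("~")
            i += 2
        else:
            buf.append(text[i])
            i += 1
    if buf:
        segments.append((in_tamil, "".join(buf)))
    return segments


if __name__ == "__main__":
    # "tamiz" is only a placeholder for romanized Tamil, not verified ITRANS.
    for is_tamil, chunk in split_hz("Hello ~{tamiz~} world"):
        print("TAMIL" if is_tamil else "ASCII", repr(chunk))
```

A converter built along these lines would transliterate only the chunks flagged as Tamil and pass the ASCII chunks through unchanged.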
Language | C, Tcl, Python |
Dependencies | None |
Environment | POSIX |
Current version | 2.7.1 |
Last modified | 2007-05-27 |
License | GNU General Public License |
This software was written for the Linguistic Data Consortium
with funding provided by the US Department of Defense.
The C programs should compile and run in any POSIX-compliant environment.
The Tcl programs require only the basic Tcl installation. They should run
anywhere that Tcl is available.
The Python program should run on any machine that supports Python, meaning
pretty much anywhere.
If you do not already have Tcl on your system, the easiest
way to obtain it is probably to install the
ActiveTcl
distribution from ActiveState.
Don't be concerned by the fact that ActiveState is a commercial outfit.
The Tcl/Tk distribution that they provide is free as in both beer and speech.
They make their money selling services and programming tools. The ActiveTcl
distribution is currently available for: GNU/Linux, HP-UX, AIX, Solaris, Mac OS X,
and MS Windows.
For FreeBSD, Tcl is available at:
http://www.freshports.org/lang/tcl84/
Downloads
tamilconverters-2.7.1.tgz
tamilconverters-2.7.1.tar.bz2
tamilconverters-2.7.1.zip
If you would like to be notified of new releases, subscribe to tamilconverters at Freshmeat.
Change Log
2.7.1
- Errors in the Makefile have been corrected so that automated installation should now work properly.
- Several scripts inadvertently omitted from the package are now included.
2.7
- Adds Koln2Tamil and Tamil2Koln, which convert between the romanization used by
the Institute of Indology and
Tamil Studies at the University of Cologne and Tamil script in Unicode. Both may be
run either stand-alone or as plugins to
xlit.
- Adds Tamil2Indicist, which converts Tamil script to Indicist (ISO 15919) transliteration.
The script may be run either stand-alone or as a plugin to
xlit.
2.6
- Adds Penn2IPA, which converts from the pure ASCII romanization used by the University
of Pennsylvania South Asia Language Resource Center and various other organizations to IPA
(from which it can in turn be converted to Tamil script if so desired by using IPA2Tamil).
2.5
- Adds several transliteration programs to the package:
  - Tamil2IPA - Tamil script to International Phonetic Alphabet
  - IPA2Tamil - International Phonetic Alphabet or Colloquial Tamil romanization to Tamil script
  - CT2IPA - Colloquial Tamil romanization to International Phonetic Alphabet
  - UnicodeName2Tamil - Unicode character names enclosed in angle brackets
    (as used in POSIX locale source files) to actual Tamil characters
  - Tamil2UnicodeName - Tamil script to Unicode character names enclosed in angle brackets
2.4
- Two bugs in tscii2uni were fixed.
- An option has been added to tscii2uni that enables use of the forthcoming Unicode codepoint for Tamil Digit Zero.
- GNU autoconfiguration has been set up.
- All programs now have the usual -h and -v command-line options.
- All programs now report counts of characters converted etc.
Bugs
No bugs are known.
Back to Bill Poser's software page.