Transliteration Principles

Transliteration is the process of reading and pronouncing the words and sentences of one language using the letters and special symbols of another language. It is therefore meant to preserve the sounds of the syllables in words. Transliteration is helpful when one does not know the script of a language but can nevertheless speak and understand it.
Sometimes phonetic symbols are used in place of the normal Roman letters. Phonetic symbols are basically the letters of the Roman alphabet with special marks known as diacritic marks. Here are some examples of transliteration using symbols from the phonetic alphabet. In the second set of aksharas shown below, one sees the use of special symbols from the ASCII character set in place of diacritics.

The primary difficulty in data entry of the phonetic symbols is that there is no provision to input the symbols directly from a standard ASCII keyboard. Desktop publishing and word processing programs provide means by which the glyph code of the symbol is entered using the numeric keypad. While this is acceptable, it is not a natural approach.

Transliteration methods which use only the displayable ASCII symbols do not run into this problem, since the ASCII letters can be typed in directly. A special computer program is, however, required to interpret the input string and produce the Indian language display or printout. This is precisely what the currently popular transliteration schemes attempt. Schemes such as ITRANS, RIT and ADHAWIN use only the standard displayable ASCII letters and symbols to transliterate the text. These schemes allow multiple representations for certain syllables and long vowels, but the processing program handles this well.
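To make the idea of a processing program concrete, here is a minimal sketch in Python of how ASCII input might be interpreted to produce Devanagari output. The mapping tables are a tiny illustrative subset, loosely modelled on ITRANS-style conventions; they are assumptions made for this example and not the actual table of ITRANS or any other scheme. The function name `transliterate` is likewise hypothetical. The sketch shows two things: longest-match tokenising of the input, and the acceptance of more than one spelling for a long vowel.

```python
# Illustrative subset only. These tables are assumptions for the example,
# loosely modelled on ITRANS-style conventions, not a complete scheme.
CONSONANTS = {
    "k": "क", "g": "ग", "t": "त", "n": "न",
    "m": "म", "y": "य", "r": "र", "l": "ल",
}

# Each vowel maps to (independent form, dependent sign after a consonant).
# Long vowels accept two spellings, e.g. "aa" and "A".
VOWELS = {
    "a": ("अ", ""),
    "aa": ("आ", "ा"), "A": ("आ", "ा"),
    "i": ("इ", "ि"),
    "ii": ("ई", "ी"), "I": ("ई", "ी"),
    "u": ("उ", "ु"),
    "uu": ("ऊ", "ू"), "U": ("ऊ", "ू"),
    "e": ("ए", "े"),
    "o": ("ओ", "ो"),
}

VIRAMA = "\u094d"  # the halanth sign, marks a consonant without a vowel

# Try longer tokens first so that "aa" is not read as "a" followed by "a".
TOKENS = sorted(list(CONSONANTS) + list(VOWELS), key=len, reverse=True)


def transliterate(text):
    """Convert ASCII transliterated text to Devanagari (sketch only)."""
    out = []
    pending = False  # True while the last consonant still awaits a vowel
    i = 0
    while i < len(text):
        token = next((t for t in TOKENS if text.startswith(t, i)), None)
        if token is None:
            # Unknown character (space, punctuation): close any bare
            # consonant with a halanth and copy the character through.
            if pending:
                out.append(VIRAMA)
                pending = False
            out.append(text[i])
            i += 1
            continue
        if token in CONSONANTS:
            if pending:
                out.append(VIRAMA)  # consonant cluster: join with halanth
            out.append(CONSONANTS[token])
            pending = True
        else:
            independent, sign = VOWELS[token]
            out.append(sign if pending else independent)
            pending = False
        i += len(token)
    if pending:
        out.append(VIRAMA)  # word ends in a bare consonant
    return "".join(out)


if __name__ == "__main__":
    for word in ("yoga", "yogA", "yogaa", "mantra"):
        print(word, "->", transliterate(word))
```

In this sketch "yogA" and "yogaa" yield the same output, which is one way a processing program can absorb the multiple representations mentioned above. A real scheme needs a far larger table and rules for conjuncts, nasal signs and script-specific exceptions.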
While transliteration-based data input is very useful, one must remember that the schemes themselves vary, even for a given language. The consequence is that the data entry procedure changes depending on the scheme and, worse still, a given transliterated string will produce different outputs for different languages/scripts. Take for instance the word 'yoga'. The transliterated data input for this string using the "ITRANS" scheme is "yogA". However, when you use this string to get an output in Tamil using other schemes, you will get a different result.

String Processing using transliterated text

One useful feature of the transliterated representation of Indian language strings is that conventional string processing programs may be used to process the text. However, applications such as sorting will produce erroneous results, as the sorting orders of the aksharas and of the Roman letters are quite different. Many string processing applications, such as processing a sentence, may nevertheless work properly, so long as the input strings do not contain special characters which are needed for transliteration but which cause confusion if they happen to be the delimiters fixed for the parsing routines.

Transliteration features in the IIT Madras Software

The Multilingual software from IIT Madras has incorporated features to help deal with transliterated text. The multilingual editor has a data entry method that allows transliterated text to be typed in directly and the text viewed in local scripts. A .llf file is also automatically created by the editor. Those familiar with ITRANS-based input will find this feature helpful. We also have some utilities for viewing and converting transliterated text. Additional information about the utilities is available.
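The sorting problem noted under String Processing can be illustrated with a short Python sketch. A plain sort orders transliterated words by ASCII code, whereas a sort that respects the aksharas must first split each word into transliteration units and rank those units in akshara order. The list `AKSHARA_ORDER` below is an abbreviated, assumed ordering written with doubled letters for long vowels; it is not taken from any particular scheme, and the helper `akshara_key` is a hypothetical name.

```python
# Abbreviated akshara order in a doubled-letter transliteration.
# This list is an assumption for illustration, not a complete scheme.
AKSHARA_ORDER = [
    "a", "aa", "i", "ii", "u", "uu", "e", "ai", "o", "au",
    "k", "kh", "g", "gh", "ch", "j", "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m", "y", "r", "l", "v", "s", "h",
]
RANK = {unit: position for position, unit in enumerate(AKSHARA_ORDER)}


def akshara_key(word):
    """Return a sort key that ranks transliteration units in akshara order."""
    key = []
    i = 0
    while i < len(word):
        # Prefer two-letter units ("kh", "aa") over single letters.
        for size in (2, 1):
            unit = word[i:i + size]
            if unit in RANK:
                key.append(RANK[unit])
                i += len(unit)
                break
        else:
            # Unknown characters sort after every known unit.
            key.append(len(RANK) + ord(word[i]))
            i += 1
    return key


words = ["gita", "kala", "khaga", "indu", "aasana"]

ascii_sorted = sorted(words)                     # conventional string sort
akshara_sorted = sorted(words, key=akshara_key)  # akshara-aware sort

if __name__ == "__main__":
    print("ASCII order:  ", ascii_sorted)
    print("Akshara order:", akshara_sorted)
```

With these five words the conventional sort places "gita" before "kala" because the letter g precedes k in ASCII, while the akshara-aware key places it last because the consonant ga follows ka and kha. The same tokenising step also shows why two-letter units are ambiguous in ASCII schemes: the sketch always reads "th" as one unit, which may not be what the writer intended.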
The primary aim of transliteration is to provide an alternate means of reading text using a different script. Transliteration is meant to preserve the sounds. In practice, the same word may be written differently in different scripts because of the local conventions employed for pronouncing the aksharas. In the northern scripts, the halanth is quite frequently omitted, but a reader will understand that what is shown is really a generic consonant. The Southern scripts are explicit in the use of the halanth. In India, English is also transliterated into the different Indian scripts, with amusing results! Finding exactly matching aksharas is difficult for some of the vowels and a few consonants. A word in English, transliterated into, say, Hindi or Bengali, would have changed considerably when transliterated again into another script (typically one of the Southern scripts). Some of the problems encountered in practice are covered in a separate page titled "Vagaries of Transliteration".