Multilingual Systems

A single TrueType font for all the Indian scripts

  The discussions relating to Unicode for Indian languages have given us some insight into the problems of rendering Unicode text strings. As already observed, current thinking (as of this writing) is to use OpenType fonts to render the text. OpenType fonts may appear to offer all the advantages one looks for in rendering a syllable whose Unicode representation is a multibyte sequence. Unfortunately, the use of OpenType fonts requires that the application take responsibility for rendering the text, a requirement which is very difficult to satisfy.

  The variations that result when different applications render the same text string differently are too painful to accept, since they make electronic text processing quite complicated. It is unlikely that a single OpenType standard will emerge that will handle a given script satisfactorily.

   Unicode provides a range of code values for user-defined codes, and this range can be utilized for creating fonts that will adequately handle the scripts and also cater to a rich set of samyuktakshars. The discussion on fonts has given us an idea of how a normal TrueType font can be used for Indian scripts. The only requirement here is that the rendering program should be able to handle zero-width glyphs properly. This way, one can compose a samyuktakshar from overlapped glyphs whose index values are obtained through a lookup table.
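The lookup-table idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name, the function name and, above all, the glyph offsets are hypothetical and are not taken from the actual font. The sketch shows a syllable, given as standard Unicode text, being mapped to a sequence of user-defined code points that a renderer would draw on top of one another.

```python
# Base of the user-defined range used for the font.
PUA_BASE = 0xE000

# Hypothetical lookup table: syllable -> glyph offsets within the font.
# The offsets below are invented for illustration only.
SYLLABLE_TO_GLYPHS = {
    "\u0915": [0x41],                    # KA: a single glyph suffices
    "\u0915\u094D\u0937": [0x41, 0x90],  # KA + VIRAMA + SSA: base glyph
                                         # plus an overlapping zero-width glyph
}

def compose(syllable, table=SYLLABLE_TO_GLYPHS, base=PUA_BASE):
    """Return the string of user-defined code points for one syllable."""
    offsets = table[syllable]  # KeyError if the syllable is not in the table
    return "".join(chr(base + offset) for offset in offsets)
```

Calling compose("\u0915\u094D\u0937") returns a two-character string; with a font of this kind, the second character would be a zero-width glyph drawn over the first.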

   SDL has indeed generated one such font consisting of glyphs for all ten scripts (Devanagari, Tamil, Telugu, Kannada, Malayalam, Oriya, Bengali, Gujarati, Punjabi and IPA). This font has about 240 glyphs devoted to each script, located in the region E000-E9FF. It is a TrueType font which is rendered properly on Windows systems (XP/2000). Under Linux as well as on a Mac, the operating system's font rendering module does not seem to recognize the user-defined area as a valid range.

  The IITM-designed common Unicode font can be used with browsers that handle UTF-8. The font is about 800 kilobytes in size and can be freely downloaded from this site. A sample UTF-8 file containing the text of Vande Mataram in all the scripts is available for checking the correctness of the rendered text.

Download the common iitmuni.ttf font  (handles all the ten scripts)

The text of Vande Mataram in all the scripts, served as a Unicode page (UTF-8 format). Your browser should show the text automatically if it is set to use user-specified fonts. Otherwise, manually set the encoding to UTF-8 from the View menu.
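Before blaming the browser for a badly rendered page, it can help to confirm that the file really is valid UTF-8 and that its characters fall in the range the font covers. The following is a minimal sketch of such a check; the function names are hypothetical, and only the range E000-E9FF comes from the description of the font.

```python
# Range of code points occupied by the font's glyphs.
FONT_FIRST, FONT_LAST = 0xE000, 0xE9FF

def in_font_range(ch):
    """True if the character lies in the font's user-defined range."""
    return FONT_FIRST <= ord(ch) <= FONT_LAST

def check_utf8(data):
    """Decode UTF-8 bytes; return (characters in range, total characters)."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid input
    inside = sum(1 for ch in text if in_font_range(ch))
    return inside, len(text)

# Two user-defined code points followed by an ASCII space.
sample = "\ue041\ue090 ".encode("utf-8")
```

Every code point in E000-E9FF occupies three bytes in UTF-8, so the sample above is seven bytes long, and check_utf8(sample) reports two of its three characters as lying in the font's range.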


Features

Single font with a provision for up to 255 glyphs for each script. The current implementation has about 190 for each script, matching the index values for the Latin-1 (8859-1) character encoding.

Punctuation can be made uniform across the scripts.

Syllables are rendered by overlapping zero-width glyphs.

For most scripts, a single glyph works for the vowels and consonants, and multiple glyphs are required only for complex samyuktakshars. For instance, some aksharas in Telugu or Kannada may require about six to eight glyphs, with a maximum of three of them overlapping.

Glyphs are located in the user defined range E000-E9FF.
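The range E000-E9FF holds 2560 code points, which is exactly ten blocks of 256. A plausible reading, offered here as an assumption rather than a documented fact, is that each script occupies one 256-slot block and that the low byte of a code point is the Latin-1 style glyph index. The sketch below works under that assumption; the order of the blocks is also assumed (it simply follows the order in which the scripts are named on this page) and the function names are hypothetical.

```python
PUA_START = 0xE000  # first code point used by the font
PUA_END = 0xE9FF    # last code point used by the font
BLOCK_SIZE = 0x100  # assumption: one 256-slot block per script

# Assumption: block order follows the order in which the scripts are named.
SCRIPTS = ["Devanagari", "Tamil", "Telugu", "Kannada", "Malayalam",
           "Oriya", "Bengali", "Gujarati", "Punjabi", "IPA"]

def glyph_codepoint(script, index):
    """Code point of glyph `index` (0..255) in the block of `script`."""
    if not 0 <= index < BLOCK_SIZE:
        raise ValueError("glyph index must be in 0..255")
    return PUA_START + SCRIPTS.index(script) * BLOCK_SIZE + index

def script_of(codepoint):
    """Inverse mapping: return (script name, glyph index) for a code point."""
    if not PUA_START <= codepoint <= PUA_END:
        raise ValueError("code point outside E000-E9FF")
    block, index = divmod(codepoint - PUA_START, BLOCK_SIZE)
    return SCRIPTS[block], index
```

Under these assumptions, glyph index 0x41 would sit at E041 in the first block and at E141 in the second, which is what makes uniform punctuation across scripts straightforward: the same index can carry the same mark in every block.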