The discussions relating to Unicode for Indian languages have given us
some insight into the problems of rendering Unicode text strings. As
already observed, current thinking (as of this writing) is to use OpenType
fonts to render the text. OpenType fonts may appear to offer all the
advantages one looks for in rendering a syllable whose Unicode
representation spans several code values. Unfortunately, the use of
OpenType fonts requires the application to take responsibility for
rendering the text, a requirement that is very difficult to satisfy.
When different applications render the same text string differently, the
resulting variations are too painful to accept, since electronic
processing of the text (text processing) becomes quite complicated.
It is unlikely that a single OpenType standard will emerge that will
handle a given script satisfactorily.
Unicode provides a range of code values for user-defined codes (the
Private Use Area), and this range can be utilized for creating fonts that
adequately handle the scripts and also cater to a rich set of
samyuktakshars. The discussion on fonts has given us an idea of how a
normal TrueType font can be used for Indian scripts. The only requirement
here is that the rendering program should be able to handle zero-width
glyphs properly. This way, one can compose a samyuktakshar from overlapped
glyphs whose index values are obtained through a lookup table.
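The lookup-table idea can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the table entries, the glyph code values and the function name `to_glyphs` are hypothetical placeholders, since the actual glyph indices of any particular font are not given here. The sketch converts Unicode text into user-defined glyph codes by greedy longest match, so that a consonant-virama-consonant sequence can map to one conjunct glyph while a consonant with a vowel sign maps to a base glyph followed by an overlay glyph.

```python
# Hypothetical lookup table: keys are Unicode sequences, values are lists of
# glyph codes in the user-defined range. The glyph codes are placeholders and
# are NOT the indices of any real font.
GLYPH_TABLE = {
    "\u0915": ["\uE010"],                  # KA -> one base glyph
    "\u0915\u093F": ["\uE010", "\uE0A0"],  # KA + vowel sign I -> base + overlay glyph
    "\u0915\u094D\u0937": ["\uE0C5"],      # KA + VIRAMA + SSA -> one conjunct glyph
}

MAX_KEY_LEN = max(len(key) for key in GLYPH_TABLE)


def to_glyphs(text):
    """Convert Unicode text to glyph codes using greedy longest match."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, then shorter ones.
        for n in range(min(MAX_KEY_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in GLYPH_TABLE:
                out.extend(GLYPH_TABLE[chunk])
                i += n
                break
        else:
            # No table entry: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this table, `to_glyphs("\u0915\u094D\u0937")` yields the single placeholder conjunct glyph, while characters without an entry are left as they are. A real table would need one entry per syllable or conjunct that the font supports.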
SDL has indeed generated one such font consisting of glyphs for all ten
scripts (Devanagari, Tamil, Telugu, Kannada, Malayalam, Oriya, Bengali,
Gujarati, Punjabi and IPA). This font has about 240 glyphs devoted to each
script, located in the region E000-E9FF. It is a TrueType font that is
rendered properly on Windows systems (XP/2000). Under Linux as well as on
a Mac, the operating system's font rendering module does not seem to
recognize the user-defined area as a valid range.
The IITM-designed common Unicode font can be used with browsers that
handle UTF-8. The font is about 800 kilobytes in size and can be freely
downloaded from this site. A sample UTF-8 file containing the text of
Vande Mataram in all the scripts is available for checking the correctness
of the rendered text.
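As a rough aid for checking such a file, the following Python sketch tests whether characters fall in the E000-E9FF region named above and shows how they look as UTF-8 bytes. The region E000-E9FF lies inside Unicode's Private Use Area (U+E000 to U+F8FF), and every code value in it takes three bytes in UTF-8. The function names are illustrative assumptions, not part of any tool distributed with the font.

```python
# Private Use Area of the Basic Multilingual Plane (from the Unicode standard).
PUA_START, PUA_END = 0xE000, 0xF8FF
# Region that the text above gives for the common font's glyphs.
FONT_START, FONT_END = 0xE000, 0xE9FF


def in_font_region(ch):
    """Return True if the character lies in the font's E000-E9FF region."""
    return FONT_START <= ord(ch) <= FONT_END


def utf8_hex(ch):
    """Return the UTF-8 encoding of one character as a hex string."""
    return ch.encode("utf-8").hex()


def count_font_region_chars(data):
    """Decode UTF-8 bytes and count the characters in the font's region."""
    text = data.decode("utf-8")
    return sum(1 for ch in text if in_font_region(ch))


print(utf8_hex("\uE000"))  # first code value of the region
print(utf8_hex("\uE9FF"))  # last code value of the region
```

If a downloaded sample file decodes cleanly as UTF-8 and `count_font_region_chars` reports a nonzero count, the file is using the user-defined codes, and the text will display correctly only where a font covering that region is installed and the renderer accepts the range.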
Download the common iitmuni.ttf font (handles all ten scripts)
The text of Vande Mataram in all the scripts, served as a Unicode page
(UTF-8 format). Your browser should show the text automatically if it is
set to use user-specified fonts. Otherwise, manually set the encoding to
UTF-8 from the View menu.