A tutorial on Fonts for Indian Languages: Section-3
Problems in designing fonts for Indian languages
The display and processing of Indian language text confronts us with the
following issues. None of them has a simple solution.
We cannot have standard fonts
for each script as the number of glyphs required to form the aksharas
properly is many more than 256 for almost all the languages, except Tamil.
Any attempt at standardization will necessarily have to compromise on the
approach to displaying the conjuncts. Worse still, this will have to be
done individually for each script.
It may be possible to agree
on a compromise set of glyphs for each script and request font designers
to place them at specified locations in the fonts. This way, some uniformity
may be possible in displaying the text for that script. However, this may
not be feasible in practice since printing requirements are fairly stringent
on the conjuncts and require that many additional glyphs be included.
An eight bit representation
of the consonants and vowels may be feasible but one will have to live
with multibyte codes for the aksharas. ISCII was an early attempt at achieving
some uniformity but this was not sustainable across all the Indian languages.
An eight bit representation will permit the use of an eight bit font and
this will be useful since most applications today can handle eight bit
fonts. However, coding only the basic vowels and consonants will restrict
the display. A separate mechanism is required to go from a multibyte representation
of a syllable to its shape. In reality, this mechanism will be extremely
complex.
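The step from a multibyte syllable code to its shape can be pictured as a table lookup. The Python sketch below is only an illustration: the byte values, the glyph positions and the name `glyphs_for_akshara` are all hypothetical and do not correspond to ISCII or to any real font. It tries the longest byte sequence first, so that a conjunct with its own glyph is preferred, and otherwise assembles the shape from smaller pieces.

```python
# Hypothetical 8-bit codes for a few consonants and signs (not ISCII values).
KA, TA, SSA = 0x15, 0x24, 0x37
HALANT = 0x4D      # joins two consonants into a conjunct
MATRA_AA = 0x3E    # a vowel sign attached to a consonant

# Hypothetical font table: byte sequence -> glyph positions in an 8-bit font.
GLYPH_TABLE = {
    bytes([KA]): [101],
    bytes([TA]): [102],
    bytes([SSA]): [103],
    bytes([HALANT]): [120],
    bytes([MATRA_AA]): [130],
    bytes([KA, HALANT, SSA]): [200],  # conjunct that has a dedicated glyph
}


def glyphs_for_akshara(codes, table=GLYPH_TABLE):
    """Map the multibyte code of one akshara to a list of glyph positions."""
    glyphs = []
    i = 0
    while i < len(codes):
        # Try the longest remaining byte sequence first, then shorter ones.
        for length in range(len(codes) - i, 0, -1):
            key = bytes(codes[i:i + length])
            if key in table:
                glyphs.extend(table[key])
                i += length
                break
        else:
            raise KeyError("no glyph for code 0x%02X" % codes[i])
    return glyphs


# A conjunct with a dedicated glyph, followed by a matra.
print(glyphs_for_akshara(bytes([KA, HALANT, SSA, MATRA_AA])))
# A conjunct without a dedicated glyph is built from its parts.
print(glyphs_for_akshara(bytes([KA, HALANT, TA])))
```

Even this toy version needs a table entry for every conjunct that deserves its own shape. A practical mechanism would also have to deal with matters the sketch ignores, such as matras that are drawn before the consonant and matra shapes that depend on the width of the akshara.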
We need to support automatic
transliteration to permit language independent information to be read in
any script. Here one is confronted with the identification of a global
set of aksharas. Fonts for different languages may require different numbers
of glyphs, so it will not be possible to achieve transliteration merely by
changing the font face.
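To make the obstacle concrete, here is a hedged Python sketch of the simplest kind of transliteration: shifting each character by a fixed offset from one script to another. It assumes the Unicode code charts, in which the Indic scripts occupy parallel ranges of 128 positions each; this is a choice made for illustration and is not the scheme this tutorial proposes. The names `BLOCK_START` and `transliterate` are hypothetical.

```python
import unicodedata

# Start of each script's 128-position range in the Unicode code charts.
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
}
BLOCK_SIZE = 0x80


def transliterate(text, source, target, missing="?"):
    """Shift every character of `source` script to the same slot in `target`."""
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            candidate = chr(cp - src + dst)
            # An unassigned slot means the target script has no counterpart.
            if unicodedata.name(candidate, None) is None:
                out.append(missing)
            else:
                out.append(candidate)
        else:
            out.append(ch)  # leave characters outside the source range alone
    return "".join(out)


ka, kha = "\u0915", "\u0916"  # Devanagari letters KA and KHA
print(transliterate(ka, "devanagari", "telugu"))
print(transliterate(kha, "devanagari", "tamil"))
```

The letter KA has a counterpart at the same offset in Telugu, but KHA has none in Tamil, so the second call yields the placeholder. The sets of letters are not the same from script to script, which is why a common set of aksharas has to be identified before transliteration can work.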
A one to one mapping between
an akshara and a glyph is ruled out if eight bit fonts are used. As of
now, 16 bit fonts are not handled properly in most computer systems.
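The arithmetic behind this limit is simple and can be checked with a few lines of Python. The helper names `bits_needed` and `fits_eight_bit_font` are hypothetical, and the glyph count of 300 is used only as an illustrative figure.

```python
def bits_needed(glyph_count):
    """Smallest number of bits that can index `glyph_count` distinct glyphs."""
    if glyph_count < 1:
        raise ValueError("glyph_count must be at least 1")
    return max(1, (glyph_count - 1).bit_length())


def fits_eight_bit_font(glyph_count):
    """An eight bit font has 2**8 = 256 addressable positions."""
    return glyph_count <= 2 ** 8


for count in (256, 300):
    print(count, bits_needed(count), fits_eight_bit_font(count))
```

A set of 300 shapes already needs nine bits per code, so it cannot be addressed one to one by a single byte.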
There is no way of knowing how
many aksharas are actually needed in a language, since new samyuktaksharas
may be formed and the set of syllables currently in use may grow over time.
It may be possible to design a font with about 300 glyphs in order to render
most of the samyuktaksharas but not many applications can work with such
fonts. Applications handling Unicode may be able to do this but Unicode
for Indian languages also implies a multibyte scheme. Special OpenType
fonts have been recommended for handling variable length codes but this
does not work well in practice. Thus Unicode fonts will not really solve
the problem.
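The multibyte nature of Unicode for a syllable can be seen directly. The short Python sketch below takes the Devanagari conjunct commonly written "ksha" and reports how it is stored; the function name `describe` is hypothetical, while the code points are those of the Unicode Devanagari chart.

```python
import unicodedata

# One visible syllable shape, stored as three code points: KA + VIRAMA + SSA.
KSSA = "\u0915\u094D\u0937"


def describe(akshara):
    """Report the code points and encoded sizes of one akshara."""
    return {
        "code_points": [
            "U+%04X %s" % (ord(c), unicodedata.name(c)) for c in akshara
        ],
        "utf16_bytes": len(akshara.encode("utf-16-le")),
        "utf8_bytes": len(akshara.encode("utf-8")),
    }


info = describe(KSSA)
for line in info["code_points"]:
    print(line)
print(info["utf16_bytes"], "bytes in UTF-16,", info["utf8_bytes"], "bytes in UTF-8")
```

A single shape on the screen therefore corresponds to a string of variable length, and something has to map that string to the right glyph.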
It is not merely the aksharas
we must display. For educational needs, we also must display the symbols
for the matras along with special letters and punctuation. These should
also be coded into the character set and explicit glyphs for these must
be provided. Due to the variable physical width of the aksharas themselves,
more than one representation for a matra may be needed (see the glyphs
of the Xdvng font).
Unicode Fonts for Indian languages
Unicode has emerged
as a meaningful standard for multilingual work, and developers all over
the world are taking advantage of the help provided by operating systems
to work with 16 bit codes. Unicode does offer solutions
for scripts which are based on syllables, but the solutions depend on the
availability of APIs to deal with Unicode strings. Unfortunately, support
for Unicode in respect of Indian languages has not been incorporated to
a satisfactory extent in most systems. The issues involved
in using Unicode for computing with Indian languages are quite complex
and cannot be discussed in this tutorial. This web site has a comprehensive
set of pages devoted to this subject, where Unicode
for Indian languages is discussed in detail. The fonts
issue is included there as an important topic for discussion.
In summary
The phonetic base of the Indian
languages allows sounds to be displayed using specific scripts. However,
the shape for a sound may have to be built from many glyphs if the akshara
is a conjunct.
On account of this, it is
not possible to define a character set and an accompanying encoding scheme
that would apply to all the scripts uniformly.
This problem needs to be
examined seriously if we are to develop a standard scheme
for displaying Indian scripts using eight bit fonts.
Unicode does not specify
a meaningful character set for any of the Indian languages. A font for
use with Unicode has to provide additional flexibility, so that
a Unicode string for a syllable can be mapped to the required shape of the syllable.
Conventional fonts will not be able to satisfy this requirement. OpenType
fonts are offered as the solution but these will not really provide an
answer.