The most important point to observe here is that the scripts of all Indian
languages can be displayed comfortably with about 150 to 230 glyphs in the
font for each script, and that data entry consists of specifying the glyph
codes for each akshara that should appear on the printed page. A set of 230
glyphs is feasible only on platforms which correctly understand the font
encoding. The Macintosh, for instance, has always featured fonts with close
to 250 glyphs and has been a favourite for DTP work. Microsoft Windows fonts
allow about 210 glyphs, still considered a rich set when compared to
ISO 8859-1 fonts, which support only about 190.

Applications which rely on the availability of specific (and well designed)
eight-bit fonts for printing Indian language documents use a text
representation that relates to the glyphs in the font and not to the vowels
and consonants of the language.

Typesetting packages based on TeX use a simple ASCII-based input document
consisting of macros for each akshara or, in some cases, just plain glyph
codes. Data entry is often effected using a transliteration scheme which
allows the user to enter the text using ASCII equivalents (names) for the
aksharas; a special preprocessing program then converts this input into a
proper TeX document. TeX fonts have the advantage that almost all of the
256 glyphs in the font may be output, which is not realizable with the
fonts used in word processors. Typesetting with TeX can produce documents
of superior quality, with carefully controlled line spacing, but the
process is very time consuming and a uniform approach across all Indian
languages cannot be taken. Besides, typesetting with TeX does not permit a
"WYSIWYG" user interface. TeX documents are usually viewed or printed using
PostScript.
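
The preprocessing step described above can be sketched in a few lines of
Python. This is only a minimal illustration under stated assumptions: the
ASCII names ("ka", "kha" and so on) and the macro names (\ka, \kha and so
on) are invented for the example and do not belong to any particular TeX
package. Each macro is followed by an empty group so that TeX does not
absorb the letters that come after it.

```python
# Hypothetical table: ASCII names for a few aksharas -> TeX macro names.
# Both the names and the macros are invented for illustration only.
AKSHARA_MACROS = {
    "ka": r"\ka",
    "kha": r"\kha",
    "ga": r"\ga",
    "ma": r"\ma",
    "la": r"\la",
}


def to_tex(text: str) -> str:
    """Convert ASCII akshara names to macros using greedy longest match."""
    # Try longer names first so that "kha" is not read as "k" + "ha" or "ka".
    names = sorted(AKSHARA_MACROS, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for name in names:
            if text.startswith(name, i):
                out.append(AKSHARA_MACROS[name] + "{}")
                i += len(name)
                break
        else:
            # Spaces, punctuation and unknown input pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_tex("kamala"))  # \ka{}\ma{}\la{}
```

A real scheme would need a far larger table and rules for conjuncts and
vowel signs, which is part of why the process is time consuming.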

Typesetting using DTP packages which make use of standard fonts (typically
TrueType fonts under Windows) relies on data entry based on glyph codes. In
some cases, the software permits data entry through macros, so that one need
not worry about glyph codes and may concentrate on the aksharas to be input.
Most DTP packages require that specific fonts be used, and one often runs
into the problem of not being able to produce certain conjuncts, either
because the corresponding glyphs are absent from the font or because of
inherent limitations in the macros.

In either case, whether a typesetting program or a word processor is used,
the internal representation of the text is in a form specific to that
program and varies widely. Worse still, the internal representation
includes extensive formatting information, which makes it quite difficult,
if not impossible, to discern the Indian language text in the document.
Thus the most important limitation of the typesetting and word processing
software used in the preparation of Indian language documents is its
inability to provide an internal representation suited to effective
electronic processing of the text, consistent with the syllabic writing
systems followed for the languages.

Secondly, since these packages are designed to run on specific platforms,
moving documents across platforms becomes quite difficult. Even today,
the full power of TeX is realized only on Unix systems, and the useful
features of Word or WordPerfect may be utilized only under Microsoft Windows
and, to a limited extent, on the Mac.

It must, however, be admitted that more than 90% of the requirements for
Indian language applications relate to plain publishing, and the available
packages seem to meet these requirements adequately.

We might mention here that almost all of the packages which permit
multilingual printing do not allow text entered in one script to be printed
automatically in other scripts, as they do not support correct
transliteration. Invariably the same text has to be entered a second time
(in a different script). Such multilingual presentation of the same text
through transliteration is very useful when common information is sent
across different states of the country.
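
To give a feel for what automatic printing in another script would involve,
here is a minimal Python sketch. It uses a technique of our own choosing and
not one provided by the packages discussed: it relies on the fact that the
Unicode blocks for the major Indian scripts are laid out in parallel, so
that a letter can often be moved to another script by shifting its code
point. The function name shift_script is hypothetical. The result is only a
rough approximation of transliteration, since not every position is
assigned in every script (Tamil, for example, has no aspirated consonants).

```python
# Starting code points of some Unicode blocks for Indian scripts.
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
}
BLOCK_SIZE = 0x80  # each of these blocks spans 128 code points


def shift_script(text: str, source: str, target: str) -> str:
    """Move characters of the source block to the same offset in the target."""
    src = BLOCK_START[source]
    dst = BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            # Same offset within the block, different script.
            out.append(chr(cp - src + dst))
        else:
            # Leave spaces, digits, Latin text and so on untouched.
            out.append(ch)
    return "".join(out)


# "kamala" typed once in Devanagari (KA, MA, LA), shown in Telugu.
devanagari = "\u0915\u092E\u0932"
print(shift_script(devanagari, "devanagari", "telugu"))
```

A usable system would have to check that each shifted code point is
actually assigned in the target script and substitute a sensible
equivalent when it is not.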

Problems faced with Indian language fonts

DTP and typesetting packages rely on a specific set of fonts to produce a
printout. Often the glyphs in the font do not accommodate the
representations of some conjuncts, specifically those with four consonants
in them. The non-availability of glyphs for specific conjuncts frequently
limits what can be printed. Further, even standard fonts used under Windows
may not be rendered properly under Unix, owing to variations in the font
encodings supported on different systems. In most fonts, the first
thirty-two glyphs (codes 0 through 31) are not rendered, and often the 32
locations from 128 through 159 are not rendered either.
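
The arithmetic behind these limits is easy to check. The short Python
sketch below counts the glyph locations left in an eight-bit font once the
two ranges mentioned above are excluded. The function name usable_slots is
ours, and the default ranges are simply the ones stated in the text; a given
system may block more or fewer locations.

```python
def usable_slots(unrendered_ranges=((0, 31), (128, 159))):
    """Return the 8-bit glyph codes that fall outside the unrendered ranges."""
    blocked = set()
    for low, high in unrendered_ranges:
        blocked.update(range(low, high + 1))  # ranges are inclusive
    return [code for code in range(256) if code not in blocked]


slots = usable_slots()
print(len(slots))  # 192
```

The count of 192 is in line with the figure of about 190 glyphs quoted
earlier for ISO 8859-1 fonts.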

The variation in the minimum number of glyphs required to form an akshara
in a specific script creates a basic problem for Indian language text. Any
internal representation which relies on glyph codes will not work across
languages, and unless font glyph locations are standardized for a script,
software to render the text will become font dependent. The aksharas of
Indian scripts also vary significantly in width, from very narrow to very
wide. Designing glyphs to accommodate such widely varying widths, when the
aksharas must moreover be formed from multiple glyphs, is a tricky job.
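
The font dependence of a glyph-code representation can be seen in a small
Python sketch. Everything in it is hypothetical: FONT_A and FONT_B stand for
two imaginary eight-bit fonts, and the glyph codes assigned to each akshara
are invented. The point is only that the same three aksharas produce
different byte strings of different lengths, depending on the font.

```python
# Hypothetical glyph tables for two imaginary 8-bit fonts. Font A has a
# ready-made glyph for the conjunct "ksha"; font B builds it from three.
FONT_A = {"ka": [0x6B], "ki": [0x69, 0x6B], "ksha": [0xE1]}
FONT_B = {"ka": [0xC4], "ki": [0xC4, 0xD2], "ksha": [0xC4, 0xF0, 0xD5]}


def encode(aksharas, font):
    """Store a list of aksharas as the glyph codes of one particular font."""
    codes = []
    for akshara in aksharas:
        codes.extend(font[akshara])
    return bytes(codes)


text = ["ka", "ksha", "ki"]
stored_a = encode(text, FONT_A)
stored_b = encode(text, FONT_B)

print(len(stored_a), len(stored_b))  # 4 6
print(stored_a == stored_b)          # False
```

A document stored as stored_a is meaningful only to software that knows the
glyph table of font A, which is the dependence described above.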

Among the font families, Metafont offers the maximum flexibility to
accommodate the requirements of Indian scripts. The Metafont design by
Charles Wikner is thorough in its rendering of more than a thousand
conjuncts of Sanskrit. Unfortunately, the fonts used under Windows and Unix
restrict the number of glyphs, and hence the rich set of conjuncts
supported by Wikner's Metafont cannot be realized in practice except
through TeX, which does not admit of WYSIWYG features.

Unicode Fonts for Indian Languages

In recent years, Unicode has gained significance for multilingual work, and
applications supporting Unicode for Indian languages have been created
(mostly by Microsoft). Yet, because the Unicode string for a syllable is of
variable length, conventional fonts cannot be used. Special OpenType fonts
are required, and these are rather difficult to design. An application
supporting Unicode must also build into itself the rules for generating
the shape of each syllable. Since these shapes are a choice of the font
designer, consistent displays are difficult to get. It is well known that
different applications render the same Unicode text in different ways. This
goes against the principle that text representation should not be tied to
display requirements if text is to be handled uniformly across
applications.
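
The variable length of a syllable in Unicode can be seen directly with a
minimal Python sketch. The three Devanagari syllables are examples of our
own choosing, and the helper code_points is a hypothetical name; the
character names come from Python's standard unicodedata module.

```python
import unicodedata

# Three Devanagari syllables, each a single akshara on the printed page.
SYLLABLES = {
    "ka": "\u0915",
    "kti": "\u0915\u094D\u0924\u093F",
    "stri": "\u0938\u094D\u0924\u094D\u0930\u0940",
}


def code_points(syllable: str):
    """List the code points of a syllable in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in syllable]


for name, syllable in SYLLABLES.items():
    print(name, len(syllable), code_points(syllable))
    for ch in syllable:
        print("   ", unicodedata.name(ch))
```

Here "ka" takes one code point, "kti" four and "stri" six, yet each is
expected to appear as one shape. Producing that shape is left to the
rendering rules and the font, which is where the inconsistencies arise.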