Tamil Discussion archive
A proposal for a font encoding scheme for Tamil
Dear Sujatha:
You may recall that early this year there were extensive discussions in
tamil.net on possible standards for font encoding and keyboard layouts.
Even though I am not a computer scientist in any sense, I participated
in many of these discussions, giving viewpoints that have evolved over
the last three years of my interest in Tamil computing. The topics
discussed were: 7-bit (128 characters) vs. 8-bit (256 characters)
encoding schemes; keeping or leaving out old style Tamil characters
such as the forward kokki in lai/Nai, Ra, ..; the possible limitations
to kerning in point-of-sale systems; and how a limited character choice
can impose reforms in the way Tamil is currently written. Standards for
the transliterated/romanized form of writing Tamil were not discussed,
even though, in my opinion, this issue must be discussed concurrently
with font encoding for Tamil scripts. Many of these points are
indicated/summarized in my presentation at the recent TamilNet'97
Conference in Singapore.
In the tamil.net discussions and in my presentation at Singapore, I
suggested a possible font encoding scheme that is modelled on the
ISO 8859-X schemes currently used widely for handling the European
languages. Such a scheme will be easily understood by all in the
computing world, even if they are not conversant with Tamil. It can be
implemented at very short notice and can co-exist happily even with a
possible Unicode standard.
After giving due consideration to the discussions held in the tamil.net
email discussion group and also to the views expressed publicly or
privately during the Singapore Conference, I would like to make the
following propositions for possible character choices for a Tamil font
encoding scheme in an 8-bit encoding scenario. After presenting the
character choices as such, I elaborate a bit on the motivations behind
this choice. I think this character set is a reasonable, viable one,
acceptable to the majority of Tamils irrespective of the nature of the
Tamil DTP software/font being used. I would much appreciate it if you
could bring this proposal to the attention of the Tamilnadu Advisory
Committee. I humbly request the distinguished members of the Committee
to consider this proposal in their deliberations. It may not be the
perfect one for the committee to adopt as such, but it can certainly
serve as the starting point for further refinement if necessary.
Needless to say, I am at the disposal of the committee for any
clarifications or follow-up.
A PROPOSAL FOR A POSSIBLE FONT ENCODING SCHEME FOR TAMIL
(you need to have the anjal/inaimathi font to see the Tamil letters in
Tamil!!)
SCHEME:
8-BIT (256 CHARACTER SLOTS) with the standard roman characters
occupying the first 128 slots as in Latin-1 or the lower ASCII scheme.
The scheme is modelled on the ISO 8859-X schemes currently in use
(such as 8859-1/Latin-1, 8859-2/Latin-2, ..).
I leave open the issue of the actual assignment of Tamil characters to
the upper ASCII slots (128-255) for the moment.
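
To make the split concrete, here is a minimal Python sketch of how
software could tell the two halves of such an 8-bit table apart. It
assumes nothing about which Tamil character sits in which upper slot
(that is left open above); the names slot_region and split_runs are
made up for this illustration.

```python
ASCII_LIMIT = 128   # slots 0-127: standard roman characters
TABLE_SIZE = 256    # an 8-bit scheme has 256 slots in total


def slot_region(byte_value: int) -> str:
    """Return which half of the 8-bit table a slot number belongs to."""
    if not 0 <= byte_value < TABLE_SIZE:
        raise ValueError("not an 8-bit slot: %r" % byte_value)
    return "roman" if byte_value < ASCII_LIMIT else "tamil"


def split_runs(data: bytes) -> list:
    """Group consecutive bytes of a text into roman / tamil runs."""
    runs = []
    for b in data:
        region = slot_region(b)
        if runs and runs[-1][0] == region:
            # Same half as the previous byte: extend the current run.
            runs[-1] = (region, runs[-1][1] + bytes([b]))
        else:
            # The text switched halves: start a new run.
            runs.append((region, bytes([b])))
    return runs
```

A text that mixes roman letters and upper-half bytes thus separates
cleanly without any escape sequences, which is the property the
ISO 8859-X family relies on.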
CHARACTER CHOICES
vowels: 12
( )
consonants: 18
( )
modifiers: 10
(virama dot, O, O , O , O, O, O, O, O,
and the kokki/ hook for old style lai/Lai/Nai/nai )
unique uyirmeis
akaram eRRiya uyir 18
( )
aakara varisai 3 ( old style Raa, Naa, naa )
ikara varisai 1 ( )
iikaara varisai 1 ( )
ukara varisai 16 (ngu, NYu are omitted )
uukara varisai 16 (ngU, NYU are omitted )
aikaara varisai: 0
(use and modifier for old style lai, Nai, nai and Lai !! )
grantha: 6 ( , , , , , sri )
Note: for all grantha ones use the modifiers to get ikara,
iikara, ukara. varisais !!
and
diacritical markers: 4 (two dashes one above and one below the
character, two dots one above, one below)
total: 105
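
As a quick check of the arithmetic, the counts listed above can be
added up in a few lines of Python. The dictionary keys are only labels
for this illustration; the numbers are the ones given in the list.

```python
# Character counts taken from the list above.
counts = {
    "vowels": 12,
    "consonants": 18,
    "modifiers": 10,
    "akaram uyirmeis": 18,
    "aakara varisai": 3,
    "ikara varisai": 1,
    "iikaara varisai": 1,
    "ukara varisai": 16,
    "uukara varisai": 16,
    "aikaara varisai": 0,
    "grantha": 6,
    "diacritical markers": 4,
}

UPPER_SLOTS = 256 - 128            # slots 128-255 of an 8-bit table

total = sum(counts.values())       # 105, matching the stated total
free = UPPER_SLOTS - total         # 23 upper slots left unassigned

print(total, free)
```

The 23 unassigned slots are counted before setting aside any upper
slots that cause trouble on a particular platform.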
FEATURES OF THIS PROPOSED CHARACTER SET
The scheme accommodates almost all the points raised in the
tamil.net discussions. In addition, it has the following key features:
a) In the present proposal, kerning is invoked ONLY FOR TWO SERIES:
ikara varisai using modifier and iikaara varisai using modifier.
All other unique Tamil characters (uyir, uyirmeis) are kept as such!
b) It has all key grantha characters. It is proposed to use modifiers
to get all the required compound ones in the currently used form.
c) It has provisions to get old style lai/Lai/Nai/nai and also old
style Raa, Naa and naa.
d) It has the four diacritical markers required along with roman
letters to write transliterated Tamil in the Library of Congress
transliteration scheme.
e) It still has more than a dozen (12) empty slots, which can hold
Tamil numerals or be left empty for future revisions (preferred choice).
(On Windows there have been problems using characters placed
at 14-144, 160, double quotes, bullets etc...)
Muthu Nedumaran in his proposals prefers to keep as many of the
uyirmeis as such, due to difficulties in implementing kerning on
simple LCD displays/point-of-sale systems and the need for high quality
production of Tamil texts comparable to current printing.
In the present proposal, kerning is invoked ONLY FOR TWO SERIES:
ikara varisai using modifier and iikaara varisai using modifier.
(All other unique Tamil characters (uyir, uyirmeis) are kept as such!)
Since it is a right-end modifier, there should not be any problem in
implementation. Secondly, the demanding requirements of professional
printing houses can be readily met by storing high quality versions
of the entire uyirmeis in the software and calling them during the
printing process. In fact many of the Tamil DTP software packages (incl.
those that use romanized/transliterated input) are of the "interpreted"
type, where a given sequence of typed characters is replaced by the
equivalent Tamil characters. Even the displays of LCD screens in
point-of-sale systems are not permanent. The screen is constantly
re-written and so the complex Tamil characters can be called and
displayed, as is currently done for many South Asian languages.
Many of the computer professionals I talked to confirm that this
is indeed feasible with present-day technology.
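
The "interpreted" replacement described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the token names ("ka", "i", "ki", ...) and the table PRECOMPOSED are
hypothetical stand-ins for stored glyphs, not part of the proposal or of
any existing DTP package.

```python
# Hypothetical table: (base consonant, right-end modifier) -> name of a
# stored high quality glyph for the combined uyirmei.
PRECOMPOSED = {
    ("ka", "i"): "ki",
    ("ka", "ii"): "kii",
    ("ta", "i"): "ti",
    ("ta", "ii"): "tii",
}


def compose(tokens):
    """Replace each base + modifier pair by its stored combined glyph."""
    out = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PRECOMPOSED:
            # The modifier follows its base, so one token of lookahead
            # is enough to decide on the replacement.
            out.append(PRECOMPOSED[pair])
            i += 2
        else:
            # No stored combination: pass the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out
```

For example, compose(["ka", "i", "ma", "ta", "ii"]) gives
["ki", "ma", "tii"]. Because the modifier sits to the right of its
base, the replacement never has to go back and change something that
has already been emitted.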
In short, I do not see any serious problems in delivering high quality
output using the above character choices in a font encoding scheme.
I may also add that the scheme proposed above is very
similar to Dr. Nandasara's proposals for Tamil presented at the
recent TamilNet'97 conference held in Singapore.
With best regards,
Kalyan
(K. Kalyanasundaram)
--
*******************************************************************
Dr. K. Kalyanasundaram, |
Institute of Physical Chemistry, | Tel: 41-21-693 3622 (off)
Swiss Federal Inst. of Technology | Fax: 41-21-693 4111
CH-1015 Lausanne, Switzerland | Email:kalyan@igcsun3.epfl.ch
*******************************************************************