Tamil Discussion archive


[WMASTERS] 8-bit scheme-version 1.4






Dear Friends:
Muthu and I feel that we have discussed the glyph choices
(Tamil glyphs, granthas, Tamil numerals, ORNL, ...) enough
that we can now go ahead and discuss the possible slot
assignments.
Based on the email exchanges that I had with Muthu, I have
made a new gif labelled version 1.4 and this is now
available under the URL:
http://www.geocities.com/Athens/5180/charset14.gif

I would like to cite the following points to start the
slot assignment discussions:

i) The ordering used now is: Tamil numerals, modifier glyphs
(to generate the Akara, ikara, Ikara, ukara, Ukara, ekara, Ekara
and aikara varisais), akaram eRRiya mei (incl. the corresponding
granthas), sri, ti, tI, ukara varisai, Ukara varisai and
the meis. ngu, ngU, nyu and nyU are back in their normal
places in the ordering, even though this means that we are
using up four slots that could have served other glyphs. It
was felt that we should keep them there to have some
consistency in the entire scheme.
(If keeping the order of the modifier glyphs intact is more
important, they can be placed first, followed by the Tamil
numerals!)
I have implemented Muthu's suggestions on possible
ordering, with some very minor changes. He has already
carried out some successful preliminary "sorting" trials
and is happy with this ordering, so he is optimistic
about the practical viability of the scheme.

ii) To facilitate easy use in HTML/web pages, slots
169 (copyright sign) and 174 (registered sign) are kept as
in Latin-1 schemes. Slot 183 (small bullet) is also kept the
same way.

iii) The following slots are identified as troublesome
by Muthu, Ravi, Srinivasan and others: 145-151 and 160.
Slot 160 (non-breaking space in ANSI) has been left unused,
as is the case in most 8-bit font schemes.
It has been noted that many DTP applications
(incl. Word/WordPerfect, ClarisDraw/CorelDRAW)
replace plain straight quotes ('), (") with the
corresponding curly ones as a default option.
In the ANSI set, the single and double curly quotes are at
slots 145-148.
So that one does not have to find and switch off this
default option, it would be better to place the
curly quotes at these ANSI slots.
The above constraints (ii) and (iii) have caused some
discontinuity in the ordering listed.
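
The constraints in (ii) and (iii) can be checked with a short
Python sketch. It assumes that "ANSI" here means the Windows-1252
code page (Python's "cp1252" codec) and that the Tamil and grantha
glyphs go in the upper half (128-255) of the table; the names
RESERVED, ansi_char and free_upper_slots are made up for this
illustration and are not part of the scheme. Only 145-148 of the
troublesome range 145-151 are treated as fixed, since those are
the only ones the text above pins to their ANSI meaning.

```python
# Slots that, per points (ii) and (iii), keep their ANSI meaning.
RESERVED = {
    145: "left single curly quote",
    146: "right single curly quote",
    147: "left double curly quote",
    148: "right double curly quote",
    160: "non-breaking space",
    169: "copyright sign",
    174: "registered sign",
    183: "small bullet (middle dot)",
}


def ansi_char(slot: int) -> str:
    """Return the character Windows-1252 places at an 8-bit slot."""
    return bytes([slot]).decode("cp1252")


def free_upper_slots(reserved=RESERVED):
    """Upper-half slots (128-255) not pinned to an ANSI character."""
    return [s for s in range(128, 256) if s not in reserved]


if __name__ == "__main__":
    for slot in sorted(RESERVED):
        print(slot, repr(ansi_char(slot)), RESERVED[slot])
    print(len(free_upper_slots()), "upper-half slots remain")
```

Under these assumptions, the eight pinned slots are what break
the otherwise continuous run of Tamil glyphs in the table.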

As you can see we have more or less used up all the slots.

If the above ordering and the selection of glyphs are
acceptable, we can leave out further discussion of ORNL and
diacritical markers. These special requirements have to be
met by dedicated software. The design of such software
can be a topic for a different thread if there is some interest.

We still see encoding scheme 1.4 as a minimal
collection of glyphs with which we can give unambiguous
character definitions for the entire alphabet of 250+ characters.
It is an encoding scheme that can be used to make an 8-bit,
self-standing, general-interest font that will function
without a hitch even on primitive machines.

Early this year we discussed extensively the various keyboard
input options and the associated output features.
The simplest is the "direct output" scenario, with a one-to-one
correspondence between keys and glyphs: what you get is what
you type. The Tamil typewriter keyboard and its variations
work that way.
The other approach is "interpreted output", linked to
"phonetic" and "romanized/transliterated" input processes.
Here the software "interprets" the keystrokes and their
relative sequence and comes up with the equivalent Tamil
characters. The Anjal and Kanian packages are typical
examples.
So for input, one can key in directly
using all the glyphs in the present scheme, or use just uyirs
and meis to get the uyirmeis, or type in romanized format.
It is a question of personal taste/preference.
I thought of repeating these rather obvious points
so that we can all proceed along the same route.
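
To make the "interpreted output" idea concrete, here is a minimal
Python sketch of how a program might turn romanized keystrokes
into uyirs, meis and uyirmeis. The key map is hypothetical and
covers only a few letters; it is not the mapping used by Anjal or
Kanian. For convenience the output uses Unicode Tamil code points
rather than the slots of scheme 1.4, which are defined only in
the gif.

```python
PULLI = "\u0bcd"  # dot that marks a pure consonant (mei)

# Hypothetical romanization table; real packages define their own.
CONSONANTS = {"k": "\u0b95", "t": "\u0ba4", "m": "\u0bae", "l": "\u0bb2"}

# roman vowel -> (independent uyir, dependent vowel sign)
VOWELS = {
    "a": ("\u0b85", ""),        # inherent 'a' needs no sign
    "A": ("\u0b86", "\u0bbe"),
    "i": ("\u0b87", "\u0bbf"),
    "I": ("\u0b88", "\u0bc0"),
    "u": ("\u0b89", "\u0bc1"),
    "U": ("\u0b8a", "\u0bc2"),
}


def interpret(keys: str) -> str:
    """Map a romanized keystroke sequence to Tamil characters."""
    out = []
    i = 0
    while i < len(keys):
        ch = keys[i]
        nxt = keys[i + 1] if i + 1 < len(keys) else ""
        if ch in CONSONANTS and nxt in VOWELS:
            # consonant + vowel -> uyirmei
            out.append(CONSONANTS[ch] + VOWELS[nxt][1])
            i += 2
        elif ch in CONSONANTS:
            # consonant with no vowel after it -> mei
            out.append(CONSONANTS[ch] + PULLI)
            i += 1
        elif ch in VOWELS:
            # vowel on its own -> uyir
            out.append(VOWELS[ch][0])
            i += 1
        else:
            # anything unmapped passes through unchanged
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(interpret("ammA"))
```

A "direct output" keyboard needs no such step: each key simply
emits the glyph at one slot.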

Feel free to post your comments.
If there is general consensus on the above slot assignments,
we can then take up i) a character correspondence table/map
from the present scheme to the current Unicode 2.0 scheme
for Tamil (I already posted one such table last week);
ii) possible revisions of the ISCII and Unicode Tamil schemes,
........

anbudan,
Kalyan

PS: To represent its content more explicitly, I have simply
labelled the gif as "A possible 8-bit encoding scheme
(roman-tamil-grantha), version 1.4".

--
*******************************************************************
Dr. K. Kalyanasundaram,            |
Institute of Physical Chemistry,   | Tel: 41-21-693 3622 (off)
Swiss Federal Inst. of Technology  | Fax: 41-21-693 4111
CH-1015 Lausanne, Switzerland      | Email:kalyan@igcsun3.epfl.ch
*******************************************************************

