Tools Developed So Far

In our work we found that data entry in Devanagari is not a trivial job. Supporting work aimed at improving it is given below:

A lot of Sanskrit data is available digitally in either ISCII or Unicode encoding. Given below are utilities for converting text between the two encodings.

ISCII: The Indian Standard Code for Information Interchange (ISCII) is the character code for Indian languages whose scripts originate from the Brahmi script. ISCII was developed by a standardization committee under the Department of Electronics during 1986-88 and adopted by the Bureau of Indian Standards (BIS) in 1991. Unlike Unicode, ISCII is an 8-bit encoding that uses escape sequences to announce the particular Indic script represented by the coded character sequence that follows. The ISCII standard is IS 13194:1991, available from the BIS offices.
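As a rough illustration of what an ISCII to Unicode conversion involves, the sketch below decodes a handful of Devanagari bytes. It is not one of the utilities mentioned above. The table is a deliberately tiny subset, the byte values (including the attribute byte 0xEF that announces a script switch) are stated from memory and should be checked against IS 13194:1991, and the names ISCII_TO_UNICODE, ATR and iscii_to_unicode are hypothetical. A real converter also has to handle nukta combinations, the INV character and other special sequences, which are ignored here.

```python
# Illustrative subset of an ISCII (Devanagari) -> Unicode table.
# Assumption: byte values follow the IS 13194:1991 layout; verify before use.
ISCII_TO_UNICODE = {
    0xB3: "\u0915",  # DEVANAGARI LETTER KA
    0xCC: "\u092E",  # DEVANAGARI LETTER MA
    0xDA: "\u093E",  # DEVANAGARI VOWEL SIGN AA
    0xE8: "\u094D",  # halant (virama)
    0xEA: "\u0964",  # danda
}

ATR = 0xEF  # assumed attribute byte; the next byte names the script


def iscii_to_unicode(data: bytes) -> str:
    """Decode ISCII bytes using the partial table above."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == ATR:
            # Skip the escape sequence: ATR plus the script code that follows.
            # This sketch does not actually switch tables.
            i += 2
            continue
        if b < 0x80:
            out.append(chr(b))  # lower half is plain ASCII
        else:
            # Unknown bytes become U+FFFD (replacement character).
            out.append(ISCII_TO_UNICODE.get(b, "\ufffd"))
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(iscii_to_unicode(bytes([0xB3, 0xDA, 0xCC])))
```

Using an explicit lookup table rather than a fixed arithmetic offset is intentional: the ISCII and Unicode orderings agree over long runs of letters but not everywhere, so a table is the safer design.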

UNICODE: Unicode is designed to be a multilingual encoding that requires no escape sequences or switching between scripts. For any given Indic script, the consonant and vowel letter codes of Unicode are based on ISCII.
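One visible consequence of the ISCII-based layout is that corresponding letters sit at the same offset inside each Indic block of Unicode. The sketch below moves a Devanagari character to the same slot in another script's block. The function name same_slot and the BLOCK_START table are hypothetical helpers, and the approach assumes the target slot is actually assigned, which does not hold for every letter in every script.

```python
import unicodedata

# Starting code point of some Indic blocks in Unicode (each spans 0x80 slots).
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gujarati": 0x0A80,
    "telugu": 0x0C00,
}


def same_slot(ch: str, target: str) -> str:
    """Return the character at the same block offset in the target script."""
    offset = ord(ch) - BLOCK_START["devanagari"]
    if not 0 <= offset < 0x80:
        raise ValueError("expected a character from the Devanagari block")
    return chr(BLOCK_START[target] + offset)


if __name__ == "__main__":
    ka = "\u0915"  # DEVANAGARI LETTER KA
    for script in ("bengali", "gujarati", "telugu"):
        mapped = same_slot(ka, script)
        # Print the code point and its official Unicode name.
        print(f"U+{ord(mapped):04X}", unicodedata.name(mapped))
```

No escape sequence is needed at any point: the script is implied by the code point itself, so text in several scripts can be mixed freely in one string.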


Linguistic Tools and Resources

The following URL gives a plethora of tools and achievements in the field of Indian language processing:

SIL International has developed more than 60 pieces of software to support language processing tasks, most of which are available to the public as free downloads.

DESIKA: This software incorporates language generation and analysis modules for plain and accented written Sanskrit texts. It is based on the principles of ancient Indian sciences. The analysis module, a general-purpose Sanskrit parser, is currently being extended to handle the dissolution and identification of compound and combined word forms. The software can also analyse Vedic (scriptural) texts.



