
Indian Script Code for Information Interchange

This text from Wikipedia is available under the Creative Commons Attribution-ShareAlike License, additional terms may apply. See Terms of Use for details. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.

Indian Script Code for Information Interchange (ISCII) is a coding scheme for representing various writing systems of India. It encodes the main Indic scripts and a Roman transliteration. The supported scripts are: Assamese, Bengali, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. ISCII does not encode the writing systems of India based on Arabic, but its writing system switching codes nonetheless provide for Kashmiri, Sindhi, Urdu, Persian, Pashto and Arabic. The Arabic-based writing systems were subsequently encoded in the PASCII encoding.

The Brahmi-derived writing systems are mostly rather similar in structure, but have different letter shapes. So ISCII encodes letters with the same phonetic value at the same codepoint, overlaying the various scripts. For example, the ISCII codes 0xB3 0xDB represent [ki]. This will be rendered as कि in Devanagari, as ਕਿ in Gurmukhi, and as கி in Tamil. The writing system can be selected in rich text by markup or in plain text by means of the ATR code described below.
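The overlay can be illustrated with a minimal Python sketch. It assumes that a letter sits at the same offset in each script's Unicode block, which holds for the [ki] example but not necessarily for every character. The names ISCII_TO_DEVANAGARI, BLOCK_BASE and render are hypothetical, and the mapping holds only the two bytes of the example.

```python
# ISCII byte -> Devanagari code point, for the two bytes of the [ki] example only.
ISCII_TO_DEVANAGARI = {0xB3: 0x0915, 0xDB: 0x093F}

# First code point of each script's Unicode block.
BLOCK_BASE = {"devanagari": 0x0900, "gurmukhi": 0x0A00, "tamil": 0x0B80}


def render(data: bytes, script: str) -> str:
    """Decode ISCII bytes as the given script (illustrative, not a full decoder)."""
    # Assumption: the target letter sits at the same offset in its own block.
    shift = BLOCK_BASE[script] - BLOCK_BASE["devanagari"]
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))  # lower half is plain ASCII
        else:
            out.append(chr(ISCII_TO_DEVANAGARI[b] + shift))
    return "".join(out)


for name in BLOCK_BASE:
    print(name, render(b"\xb3\xdb", name))
```

The same two bytes produce three different strings, one per script, which is why the active writing system has to be known before the text can be displayed.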

One motivation for the use of a single encoding is the idea that it will allow easy transliteration from one writing system to another. However, there are enough incompatibilities between the scripts that this does not work well in practice.

ISCII is a stateful 8-bit encoding. The lower 128 codepoints are plain ASCII; the upper 128 codepoints are ISCII-specific. In addition to the codepoints representing characters, ISCII makes use of a codepoint with the mnemonic ATR, which indicates that the following byte contains one of two kinds of information. One set of values changes the writing system until the next writing system indicator or end-of-line. Another set of values selects display modes, such as bold and italic. ISCII does not provide a means of indicating the default writing system.
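The statefulness can be sketched in Python as a pass that splits a byte stream into runs, each tagged with the writing system in force. This is an illustration only: the function name split_script_runs is hypothetical, the set of ATR values that switch the writing system is supplied by the caller because it is not listed here, and any other ATR value is assumed to be a display mode and is skipped.

```python
ATR = 0xEF  # the ATR codepoint (decimal 239 in the codepage table)


def split_script_runs(data, script_codes, default=None):
    """Split ISCII bytes into (script_code, run) pairs.

    script_codes: ATR argument values that switch the writing system
    (caller-supplied assumption). The script resets to `default` at
    end-of-line, since ISCII itself defines no default writing system.
    """
    runs = []
    current = default
    buf = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == ATR and i + 1 < len(data):
            arg = data[i + 1]
            if arg in script_codes:
                if buf:  # close the run written in the previous script
                    runs.append((current, bytes(buf)))
                    buf.clear()
                current = arg
            # Any other value is treated as a display mode and ignored.
            i += 2
            continue
        buf.append(b)
        if b == 0x0A:  # end-of-line ends the scope of the script switch
            runs.append((current, bytes(buf)))
            buf.clear()
            current = default
        i += 1
    if buf:
        runs.append((current, bytes(buf)))
    return runs
```

For example, with a placeholder switch value of 0x4B, split_script_runs(b"\xb3\xdb\xef\x4b\xb3\xdb\n\xb3", {0x4B}) yields three runs: the first with no known script, the second tagged 0x4B up to the newline, and the third untagged again.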

ISCII has not been widely used outside of certain government institutions and has now been rendered largely obsolete by Unicode. Unicode uses a separate block for each Indic writing system, and largely preserves the ISCII layout within each block.

Codepage layout

The following table shows the character set for Devanagari. The code sets for Assamese, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu are similar, with each Devanagari form replaced by the equivalent form in each writing system. Each character is shown with its decimal code and its Unicode equivalent.

ISCII Devanagari
	—0	—1	—2	—3	—4	—5	—6	—7	—8	—9	—A	—B	—C	—D	—E	—F
0−	NUL 0000 0	SOH 0001 1	STX 0002 2	ETX 0003 3	EOT 0004 4	ENQ 0005 5	ACK 0006 6	BEL 0007 7	BS 0008 8	HT 0009 9	LF 000A 10	VT 000B 11	FF 000C 12	CR 000D 13	SO 000E 14	SI 000F 15
1−	DLE 0010 16	DC1 0011 17	DC2 0012 18	DC3 0013 19	DC4 0014 20	NAK 0015 21	SYN 0016 22	ETB 0017 23	CAN 0018 24	EM 0019 25	SUB 001A 26	ESC 001B 27	FS 001C 28	GS 001D 29	RS 001E 30	US 001F 31
2−	SP 0020 32	! 0021 33	" 0022 34	# 0023 35	$ 0024 36	% 0025 37	& 0026 38	' 0027 39	( 0028 40	) 0029 41	* 002A 42	+ 002B 43	, 002C 44	- 002D 45	. 002E 46	/ 002F 47
3−	0 0030 48	1 0031 49	2 0032 50	3 0033 51	4 0034 52	5 0035 53	6 0036 54	7 0037 55	8 0038 56	9 0039 57	: 003A 58	; 003B 59	< 003C 60	= 003D 61	> 003E 62	? 003F 63
4−	@ 0040 64	A 0041 65	B 0042 66	C 0043 67	D 0044 68	E 0045 69	F 0046 70	G 0047 71	H 0048 72	I 0049 73	J 004A 74	K 004B 75	L 004C 76	M 004D 77	N 004E 78	O 004F 79
5−	P 0050 80	Q 0051 81	R 0052 82	S 0053 83	T 0054 84	U 0055 85	V 0056 86	W 0057 87	X 0058 88	Y 0059 89	Z 005A 90	[ 005B 91	\ 005C 92	] 005D 93	^ 005E 94	_ 005F 95
6−	` 0060 96	a 0061 97	b 0062 98	c 0063 99	d 0064 100	e 0065 101	f 0066 102	g 0067 103	h 0068 104	i 0069 105	j 006A 106	k 006B 107	l 006C 108	m 006D 109	n 006E 110	o 006F 111
7−	p 0070 112	q 0071 113	r 0072 114	s 0073 115	t 0074 116	u 0075 117	v 0076 118	w 0077 119	x 0078 120	y 0079 121	z 007A 122	{ 007B 123	| 007C 124	} 007D 125	~ 007E 126	DEL 007F 127
8−
9−
A−	ॐ 0950 160	ँ 0901 161	ं 0902 162	ः 0903 163	अ 0905 164	आ 0906 165	इ 0907 166	ई 0908 167	उ 0909 168	ऊ 090A 169	ऋ 090B 170	ऎ 090E 171	ए 090F 172	ऐ 0910 173	ऍ 090D 174	ऒ 0912 175
B−	ओ 0913 176	औ 0914 177	ऑ 0911 178	क 0915 179	ख 0916 180	ग 0917 181	घ 0918 182	ङ 0919 183	च 091A 184	छ 091B 185	ज 091C 186	झ 091D 187	ञ 091E 188	ट 091F 189	ठ 0920 190	ड 0921 191
C−	ढ 0922 192	ण 0923 193	त 0924 194	थ 0925 195	द 0926 196	ध 0927 197	न 0928 198	ऩ 0929 199	प 092A 200	फ 092B 201	ब 092C 202	भ 092D 203	म 092E 204	य 092F 205	य़ 095F 206	र 0930 207
D−	ऱ 0931 208	ल 0932 209	ळ 0933 210	ऴ 0934 211	व 0935 212	श 0936 213	ष 0937 214	स 0938 215	ह 0939 216	INV 217	ा 093E 218	ि 093F 219	ी 0940 220	ु 0941 221	ू 0942 222	ृ 0943 223
E−	ॆ 0946 224	े 0947 225	ै 0948 226	ॅ 0945 227	ॊ 094A 228	ो 094B 229	ौ 094C 230	ॉ 0949 231	् 094D 232	़ 093C 233	ऽ 093D 234					ATR 239
F−	EXT 240	० 0966 241	१ 0967 242	२ 0968 243	३ 0969 244	४ 096A 245	५ 096B 246	६ 096C 247	७ 096D 248	८ 096E 249	९ 096F 250

The nukta (़, code 233) is written after a base character to create a number of characters which have precomposed forms in Unicode, as well as a number of rarer characters which do not exist in the main ISCII set, such as the Sanskrit character ॠ.

ISCII code point	Original character	Character with nukta	Unicode code point
A6 (166)	इ	ऌ	090C
A7 (167)	ई	ॡ	0961
AA (170)	ऋ	ॠ	0960
B3 (179)	क	क़	0958
B4 (180)	ख	ख़	0959
B5 (181)	ग	ग़	095A
BA (186)	ज	ज़	095B
BF (191)	ड	ड़	095C
C0 (192)	ढ	ढ़	095D
C9 (201)	फ	फ़	095E
DB (219)	ि	ॢ	0962
DC (220)	ी	ॣ	0963
DF (223)	ृ	ॄ	0944
EA (234)	ऽ	।	0964
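As a rough illustration of how a decoder might apply this table, the Python sketch below folds a base byte followed by the nukta byte into the single Unicode character listed above. The names WITH_NUKTA and fold_nukta are hypothetical, and only a few rows of the table are included.

```python
NUKTA = 0xE9  # the nukta codepoint (decimal 233 in the codepage table)

# A few rows of the table: base ISCII byte -> Unicode character for base + nukta.
WITH_NUKTA = {
    0xAA: "\u0960",
    0xB3: "\u0958",
    0xB4: "\u0959",
    0xB5: "\u095A",
    0xBA: "\u095B",
    0xBF: "\u095C",
    0xC0: "\u095D",
    0xC9: "\u095E",
}


def fold_nukta(data: bytes) -> list:
    """Return a str for each folded base+nukta pair and an int for every other byte."""
    out = []
    i = 0
    while i < len(data):
        followed_by_nukta = i + 1 < len(data) and data[i + 1] == NUKTA
        if followed_by_nukta and data[i] in WITH_NUKTA:
            out.append(WITH_NUKTA[data[i]])
            i += 2
        else:
            out.append(data[i])  # left for the ordinary table lookup
            i += 1
    return out
```

Bytes that are not part of a listed pair are passed through unchanged, so this step can run before the ordinary byte-to-character lookup.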
