Hindustani orthography

Hindustani (Standard Hindi and Urdu) has been written in several scripts. Most Hindi texts are written in the Devanagari script, which is derived from the Brāhmī script of ancient India. Most Urdu texts are written in the Urdu alphabet, which is derived from the Perso-Arabic script. In recent years the Roman alphabet has also been used for both languages for technological or internationalization reasons.

Devanagari script

The Devanagari script is an abugida: each written consonant carries an inherent default vowel, namely a schwa. In certain contexts, such as at the end of words, these schwas are deleted in correct Hindi pronunciation, a phenomenon called schwa syncope. A consonant followed by any other vowel is marked with a diacritic placed around the consonant. The script is written from left to right, with a top bar connecting the letters.
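The inherent-vowel behaviour can be illustrated with a minimal Python sketch. The tables below cover only a handful of letters, and the function name literal_transliteration and the romanized values are assumptions made for this example, not a standard scheme. The sketch produces a letter-by-letter rendering and deliberately ignores schwa deletion.

```python
# Illustrative subset only: a few consonants and dependent vowel signs.
CONSONANTS = {"क": "k", "च": "c", "द": "d", "न": "n", "म": "m", "र": "r", "व": "v"}
MATRAS = {
    "\u093e": "ā",   # DEVANAGARI VOWEL SIGN AA
    "\u093f": "i",   # DEVANAGARI VOWEL SIGN I
    "\u0940": "ī",   # DEVANAGARI VOWEL SIGN II
    "\u0941": "u",   # DEVANAGARI VOWEL SIGN U
    "\u0942": "ū",   # DEVANAGARI VOWEL SIGN UU
    "\u0947": "e",   # DEVANAGARI VOWEL SIGN E
    "\u0948": "ai",  # DEVANAGARI VOWEL SIGN AI
    "\u094b": "o",   # DEVANAGARI VOWEL SIGN O
    "\u094c": "au",  # DEVANAGARI VOWEL SIGN AU
}
VIRAMA = "\u094d"    # explicitly suppresses the inherent vowel
INHERENT_VOWEL = "a"  # the schwa, written here as 'a'


def literal_transliteration(word: str) -> str:
    """Render a word letter by letter, without applying schwa deletion."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch not in CONSONANTS:
            raise ValueError(f"character {ch!r} is outside this toy table")
        out.append(CONSONANTS[ch])
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if nxt in MATRAS:
            # A vowel sign replaces the inherent vowel.
            out.append(MATRAS[nxt])
            i += 2
        elif nxt == VIRAMA:
            # A virama removes the inherent vowel entirely.
            i += 2
        else:
            # A bare consonant keeps its inherent schwa.
            out.append(INHERENT_VOWEL)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for w in ["राम", "वेद", "रचना"]:
        print(w, literal_transliteration(w))
```

A bare consonant letter comes out with a trailing "a", which is why a literal rendering of a word can differ from its spoken Hindi form.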

a ā i ī u ū e ai o au
ख़ ग़    
k x ɡ ɠ ɣ   ɡʱ   ŋ
c   ɟ ʄ z   ɟʱ   ɲ
  ड़   ढ़
ʈ ʈʰ   ɖ ɗ ɽ   ɖʱ ɽʱ ɳ
t   d     n
फ़ ॿ    
p f b ɓ     m
j r l ʋ  
ɕ ʂ s h  

Schwa deletion

The schwa (अ or 'ə', sometimes written as 'a') implicit in each consonant of the Devanagari script is "obligatorily deleted" in Hindi at the end of words and in certain other contexts. This phenomenon has been termed the "schwa syncope rule" or the "schwa deletion rule" of Hindi. One formalization of this rule has been summarized as ə → ∅ / VC_CV. In other words, when a consonant preceded by a vowel is followed by a consonant that is itself followed by a vowel, the schwa inherent in the first consonant is deleted. However, this formalization is inexact and incomplete: it sometimes deletes a schwa that should be kept and sometimes fails to delete one that should be dropped. Schwa deletion is computationally important because handling it correctly is essential to building text-to-speech software for Hindi.

As a result of schwa syncope, the correct Hindi pronunciation of many words differs from that expected from a literal rendering of Devanagari. For instance, राम is Rām (incorrect: Rāma), रचना is Rachnā (incorrect: Rachanā), वेद is Véd (incorrect: Véda) and नमकीन is Namkeen (incorrect: Namakeen).
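As a rough illustration, the VC_CV formalization plus word-final deletion can be sketched in Python. This is a minimal sketch under stated assumptions, not a complete Hindi rule: the input is a literal romanization in which every character is one segment ('a' is the schwa, 'c' stands for the sound written "ch" above, and the diphthongs ai and au are left out), and the function name delete_schwas is hypothetical. Because the formalization itself is inexact, the sketch inherits its errors.

```python
import unicodedata

SCHWA = "a"
# Assumed vowel inventory for the toy romanization (one character each).
VOWELS = set(unicodedata.normalize("NFC", "aāiīuūeo"))


def is_vowel(seg: str) -> bool:
    return seg in VOWELS


def delete_schwas(literal: str) -> str:
    """Apply word-final schwa deletion, then the VC_CV rule, right to left."""
    segs = list(unicodedata.normalize("NFC", literal))

    # Word-final schwa after a consonant is dropped. Words of only two
    # segments (for example "na") are left alone in this sketch.
    if len(segs) > 2 and segs[-1] == SCHWA and not is_vowel(segs[-2]):
        segs.pop()

    # Medial rule: delete a schwa in the context V C _ C V.
    i = len(segs) - 3
    while i >= 2:
        if (
            segs[i] == SCHWA
            and is_vowel(segs[i - 2])
            and not is_vowel(segs[i - 1])
            and not is_vowel(segs[i + 1])
            and is_vowel(segs[i + 2])
        ):
            del segs[i]
        i -= 1
    return "".join(segs)


if __name__ == "__main__":
    for w in ["rāma", "racanā", "veda", "namakīna"]:
        print(w, "->", delete_schwas(w))
```

The word-final step runs first on purpose: once the final schwa is gone, the last consonant is no longer followed by a vowel, so the schwa before it stays outside the VC_CV context.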

Perso-Arabic script

The Perso-Arabic script is an extension of the Arabic alphabet. It is written from right to left, and most letters connect to each other. As a result, a letter takes different forms depending on its position in a word, although the forms generally resemble each other. Most vowels are omitted in normal texts, although they may be written for disambiguation or pedagogical purposes. Urdu is primarily written in a calligraphic style of the script called Nasta'liq.

جھ ڄ ج پ ث ٺ ٽ ٿ ت ڀ ٻ ب ا
ɟʱ ʄ ɟ p s ʈʰ ʈ t ɓ b *
ڙ ر ذ ڍ ڊ ڏ ڌ د خ ح ڇ چ ڃ
ɽ r z ɖʱ ɖ ɗ d x h c ɲ
ڪ ق ڦ ف غ ع ظ ط ض ص ش س ز
k q f ɣ z t z s  ? s z
  ي ه و ڻ ن م ل ڱ گھ ڳ گ ک
* h * ɳ n m l ŋ ɡʱ ɠ ɡ

Romanized Hindustani

The Latin alphabet has been used to write Hindustani for technological or internationalization reasons. Roman Urdu uses the Basic Latin alphabet. It is most commonly used by young native speakers in technological applications such as chat, email and SMS.

The ITRANS, ISCII, IAST, and Harvard-Kyoto romanization schemes have been used primarily by non-native speakers who are more familiar with the Latin alphabet.


