Hindi–Urdu phonology

This text from Wikipedia is available under the Creative Commons Attribution-ShareAlike License, additional terms may apply. See Terms of Use for details. Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.

Modern Standard Hindi is an official language of India, while Urdu is the national language of Pakistan as well as a scheduled language in India. The two are often held to be separate languages on the basis of higher vocabulary choice (which affects mutual intelligibility) and of cultural orientation; on a linguistic basis, however, they are two standardized registers of a single subdialect, the Khari boli dialect of Delhi, and together form a pluricentric language. In keeping with this linguistic analysis, Hindi and Urdu share a single descriptive phonology page, with attention paid to phonological variations between the two registers, and associated dialects, wherever they arise.


Figure: The oral vowel phonemes of Hindi according to Ohala (1999:102).

Hindi/Urdu natively possesses a symmetrical ten-vowel system. The vowels [ə], [ɪ], [ʊ] are always short, while the vowels [aː, iː, uː, eː, oː, ɛː, ɔː] are always considered long (but see the details below). Among the close vowels, what in Sanskrit are thought to have been primarily distinctions of vowel length (that is, /i ~ iː/ and /u ~ uː/) have become in Hindi/Urdu distinctions of quality, or of length accompanied by quality (that is, /ɪ ~ iː/ and /ʊ ~ uː/). The historical opposition of length in the close vowels has been neutralized in word-final position; for example, the Sanskrit loans śakti (शक्ति — شکتی 'energy') and vastu (वस्तु — وستو 'item') are /ʃəkt̪i/ and /ʋəst̪u/, not /ʃəkt̪ɪ/ and /ʋəst̪ʊ/.

The vowel represented graphically as ऐ (Romanized as ai) has been variously transcribed as [ɛ] or [æ]. Among sources for this article, Ohala (1999) uses [ɛ], while Shapiro (2003:258) and Masica (1991:110) use [æ]. Furthermore, an eleventh vowel /æ/ is found in English loanwords, such as /bæʈ/ ('bat'). Hereafter, ऐ will be represented as [ɛ] to distinguish it from this separate /æ/. The open central vowel is transcribed in IPA as either [aː] or [ɑː]. Despite these differences in transcription, the Hindi-Urdu vowel system is quite similar to that of English, in contrast to the consonants.

The standard educated Delhi pronunciations [ɛ, ɔ] have common diphthongal realizations, ranging from [əɪ] to [ɑɪ] and from [əu] to [ɑu], respectively, in eastern Hindi dialects and many non-standard western dialects. In addition, [ɛ] occurs as a conditioned allophone of /ə/ (schwa) in proximity to /h/, but only when the /h/ is flanked on both sides by schwas.

For example, in /kəh(ə)naː/ (कहना — کہنا 'to say'), the /h/ is surrounded on both sides by schwa, hence both the schwas will become fronted to short [ɛ], giving the pronunciation [kɛh(ɛ)naː]. Syncopation of phonemic middle schwa can further occur to give [kɛh.naː]. The fronting also occurs in word-final /h/, presumably because a lone consonant carries an unpronounced schwa. Hence, /kəh(ə)/ (कह — کہ 'say!') becomes [kɛh] in actual pronunciation. However, the fronting of schwa does not occur in words with a schwa only on one side of the /h/ such as /kəhaːniː/ (कहानी — کہانی 'a story') or /baːhər/ (बाहर — باہر 'outside').
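The fronting rule just described can be sketched in code. The following is a minimal illustration, not a full phonological model: it assumes the word is already split into a list of phoneme strings, and the function name `front_schwas` is hypothetical. Word-final /h/ is treated as if it were followed by a schwa, following the explanation given above.

```python
def front_schwas(phonemes: list[str]) -> list[str]:
    """Front /ə/ to [ɛ] next to /h/ when the /h/ is flanked by schwas.

    Assumption: `phonemes` is a pre-segmented word, one phoneme per item.
    A word-final /h/ counts as having a (silent) schwa after it.
    """
    out = list(phonemes)
    for i, seg in enumerate(phonemes):
        if seg not in ("h", "ɦ"):
            continue
        schwa_before = i > 0 and phonemes[i - 1] == "ə"
        word_final = i == len(phonemes) - 1
        schwa_after = not word_final and phonemes[i + 1] == "ə"
        if schwa_before and (schwa_after or word_final):
            out[i - 1] = "ɛ"
            if schwa_after:
                out[i + 1] = "ɛ"
    return out


# /kəhənaː/ 'to say': both schwas are fronted.
print(front_schwas(["k", "ə", "h", "ə", "n", "aː"]))
# /kəhaːniː/ 'a story': only one schwa next to /h/, so nothing changes.
print(front_schwas(["k", "ə", "h", "aː", "n", "iː"]))
```

The sketch does not model the optional syncope of the middle schwa; that would be a separate step applied to the output.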

As in French, there are nasalized vowels in Hindi-Urdu. There is disagreement over the nature of nasalization (setting aside the English-loaned /æ/, which is not nasalized). Masica (1991:117) presents four differing viewpoints:

  1. there are no *[ẽ] and *[õ], possibly because of the effect of nasalization on vowel quality;
  2. there is phonemic nasalization of all vowels;
  3. all vowel nasalization is predictable (i.e. allophonic);
  4. nasalized long vowel phonemes (/ĩː ẽː ɛ̃ː ɑ̃ː ɔ̃ː õː ũː/) occur word-finally and before voiceless stops; instances of nasalized short vowels ([ɪ̃ ə̃ ʊ̃]) and of nasalized long vowels before voiced stops (the latter presumably because of a deleted nasal consonant) are allophonic.

Masica supports the last of these views.


Hindi/Urdu has a core set of 28 consonants inherited from earlier Indo-Aryan. Supplementing these are 2 consonants that are internal developments in specific word-medial contexts, and 7 consonants originally found in loan words, whose expression is dependent on factors such as status (class, education, etc.) and cultural register (Modern Standard Hindi vs Urdu).

Most native consonants may occur geminate (doubled in length; exceptions are /bʱ, ɽ, ɽʱ, ɦ/). Geminate consonants are always medial and preceded by one of the interior vowels (that is, /ə/, /ɪ/, or /ʊ/). They all occur monomorphemically except [ʃː], which occurs only in a few Sanskrit loans where a morpheme boundary could be posited in between (i.e. /nɪʃ + ʃil/ for [nɪʃːil] 'without shame').

For the English speaker, a notable feature of the Hindi/Urdu consonants is that there is a four-way distinction of phonation among plosives, rather than the two-way distinction found in English. The phonations are:

  1. tenuis, as /p/, which is like ‹p› in English spin
  2. voiced, as /b/, which is like ‹b› in English bin
  3. aspirated, as /pʰ/, which is like ‹p› in English pin, and
  4. murmured, as /bʱ/.

The last is commonly called "voiced aspirate", though Shapiro (2003:260) notes that,

"Evidence from experimental phonetics, however, has demonstrated that the two types of sounds involve two distinct types of voicing and release mechanisms. The series of so-called voice aspirates should now properly be considered to involve the voicing mechanism of murmur, in which the air flow passes through an aperture between the arytenoid cartilages, as opposed to passing between the ligamental vocal bands."

The murmured consonants are a fairly faithful preservation of sounds inherited from Proto-Indo-European; the distinction was lost in all branches of the Indo-European family except Indo-Aryan. In the IPA, the five murmured consonants /bʱ, d̪ʱ, ɖʱ, dʒʱ, ɡʱ/ can also be transcribed as /b̤/, /d̪̤/, /ɖ̈/, /dʒ̈/ and /ɡ̈/ respectively.
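The four-way phonation contrast can be laid out as a small data structure. This is only a sketch: the names `BASES`, `phonation_series` and `INVENTORY` are hypothetical, and including the affricates alongside the plosives is an assumption made so that the five murmured consonants mentioned above all appear.

```python
# (voiceless, voiced) base symbols for each series, as transcribed in this article.
# Assumption: the affricates pattern with the plosives for phonation.
BASES = {
    "bilabial": ("p", "b"),
    "dental": ("t̪", "d̪"),
    "retroflex": ("ʈ", "ɖ"),
    "palatal affricate": ("tʃ", "dʒ"),
    "velar": ("k", "ɡ"),
}


def phonation_series(voiceless: str, voiced: str) -> dict[str, str]:
    """Build the four phonation types from one voiceless/voiced pair."""
    return {
        "tenuis": voiceless,
        "voiced": voiced,
        "aspirated": voiceless + "ʰ",  # aspiration mark on the voiceless base
        "murmured": voiced + "ʱ",      # murmur mark on the voiced base
    }


INVENTORY = {place: phonation_series(*pair) for place, pair in BASES.items()}

# The five murmured consonants, one per series.
murmured = [series["murmured"] for series in INVENTORY.values()]
print(murmured)
```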

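Places of articulation (columns): Bilabial, Labio-dental, Dental/Alveolar, Retroflex, Post-alv./Palatal, Velar, Uvular, Glottal
Nasal: bilabial m; dental/alveolar n; retroflex (ɳ)
Plosive: bilabial p pʰ b bʱ; dental/alveolar t̪ t̪ʰ d̪ d̪ʱ; retroflex ʈ ʈʰ ɖ ɖʱ; velar k kʰ ɡ ɡʱ; uvular (q)
Affricate: post-alv./palatal tʃ tʃʰ dʒ dʒʱ
Fricative: labio-dental f; dental/alveolar s z; post-alv./palatal ʃ; velar (x) (ɣ); glottal ɦ
Tap or Flap: dental/alveolar ɾ; retroflex (ɽ)
Approximant: labio-dental ʋ; dental/alveolar l; post-alv./palatal j
Table: Consonants of Hindi and Urdu. Marginal and non-universal phonemes are in parentheses.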

Stops in final position are not released; /ʋ/ varies freely as [v], and can also be pronounced [w]; /ɾ/ can surface as a trill [r], and geminate /ɾː/ is always a trill, e.g. [zəɾaː] (ज़रा — زرا 'little') versus well-trilled [zəraː] (ज़र्रा — ذرّہ 'dust'). The palatal and velar nasals [ɲ, ŋ] occur only in consonant clusters, where each nasal is followed by a homorganic stop, as an allophone of a nasal vowel followed by a stop, and in Sanskrit loanwords. There are murmured sonorants, [lʱ, ɾʱ, mʱ, nʱ], but these are considered to be consonant clusters with /ɦ/ in the analysis adopted by Ohala (1999).

The palatal affricates and sibilant are variously classified by linguists as palatal or post-alveolar or palato-alveolar, hence the sound represented by grapheme श can be transcribed as [ʃ] or [ɕ], and the grapheme च can be transcribed as [tʃ], [cɕ], [tɕ] or even plosive [c]. However, in this article, the sounds are transcribed as [ʃ] and [tʃ] respectively. The fricative /h/ in Hindi-Urdu is typically voiced (as [ɦ]), especially when surrounded by vowels, but there is no phonemic difference between this voiceless fricative and its voiced counterpart (Hindi-Urdu's ancestor Sanskrit has such a phonemic distinction).

Hindi-Urdu also has a phonemic contrast between the dental plosives and the so-called retroflex plosives. The dental plosives are pure dentals: the tongue tip must make firm contact with the front teeth, and there is no alveolar articulation as in the /t/ and /d/ of English. The retroflex series is not purely retroflex; it actually has an apico-postalveolar (also described as apico-pre-palatal) articulation, and sometimes, in words such as /ʈuːʈaː/ (टूटा — ٹُوٹا 'broken'), it even becomes alveolar.

 External borrowing

Loanwords from Sanskrit reintroduced /ɳ/ into formal Modern Standard Hindi. In casual speech it is usually replaced by /n/. It does not occur initially and has a nasalized flap [ɽ̃] as a common allophone.

Loanwords from Persian (including some words which Persian itself borrowed from Arabic or Turkish) introduced five consonants, /f, z, q, x, ɣ/. Being Persian in origin, these are seen as a defining feature of Urdu, although these sounds officially exist in Hindi and modified Devanagari characters are available to represent them. Among these, /f, z/, also found in English and Portuguese loanwords, are now considered well-established in Hindi; indeed, /f/ appears to be encroaching upon and replacing /pʰ/ even in native (non-Persian, non-English) Hindi words.

The other three Persian loans, /q, x, ɣ/, are still considered to fall under the domain of Urdu, and are also used by many Hindi speakers; however, some Hindi speakers assimilate these sounds to /k, kʰ, ɡ/ respectively. The sibilant /ʃ/ is found in loanwords from all sources (English, Persian, Sanskrit) and is well-established. The failure to maintain /f, z, ʃ/ by some Hindi speakers (often non-urban speakers who confuse them with /pʰ, dʒ, s/) is considered nonstandard. Yet these same speakers, having a Sanskritic education, may hyperformally uphold /ɳ/ and [ʂ]. In contrast, for native speakers of Urdu, the maintenance of /f, z, ʃ/ is not commensurate with education and sophistication, but is characteristic of all social levels.
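The substitutions described above amount to a lookup table. The sketch below encodes them; the names `MERGERS` and `assimilate` are hypothetical, and the input is assumed to be a pre-segmented list of phonemes rather than real transcribed speech.

```python
# Loan consonant -> native consonant it may be replaced by, per the text above.
MERGERS = {
    "q": "k", "x": "kʰ", "ɣ": "ɡ",   # Persian loans assimilated by some Hindi speakers
    "f": "pʰ", "z": "dʒ", "ʃ": "s",  # replacements considered nonstandard
}


def assimilate(segments, merged=("q", "x", "ɣ")):
    """Replace each loan consonant listed in `merged` with its native counterpart.

    `merged` selects which contrasts a given (hypothetical) speaker gives up;
    the default models a speaker who assimilates only /q, x, ɣ/.
    """
    return [MERGERS[s] if s in merged else s for s in segments]


# A made-up segment string, used only to show the two settings.
print(assimilate(["x", "aː", "z"]))                  # only /x/ changes
print(assimilate(["x", "aː", "z"], merged=MERGERS))  # /z/ changes as well
```

Passing a different `merged` set is one way to model the speaker-to-speaker variation the text describes, without claiming that any particular combination is attested.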

The plosives /ɖ, ɖʱ/ are realized as such initially, when geminated, and postnasally, and as flapped allophones [ɽ, ɽʱ] intervocalically, finally, and before or after other consonants. However, the adoption of English loans with alveolar stops, which are identified with Hindi/Urdu retroflex rather than dental stops (cf. "bat" above), has led to the emergence of minimal environments (e.g. intervocalic and final [ɖ]), thus conferring marginal phonemic status on the flaps.
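For native vocabulary, the distribution of stops and flaps can be sketched as a rule over a segment list. This is an illustration under stated assumptions: geminates are written with a trailing length mark (for example "ɖː"), only plain nasal consonants count as a postnasal environment, and a postnasal or initial position takes priority when several environments overlap. The function name `realize_retroflex` is hypothetical, and English loans, which keep [ɖ] everywhere, are outside its scope.

```python
NASALS = {"m", "n", "ɳ", "ɲ", "ŋ"}
FLAP_OF = {"ɖ": "ɽ", "ɖʱ": "ɽʱ"}


def realize_retroflex(segments):
    """Map /ɖ, ɖʱ/ to their stop or flap allophones by environment."""
    out = []
    for i, seg in enumerate(segments):
        base = seg.rstrip("ː")
        if base not in FLAP_OF:
            out.append(seg)
            continue
        initial = i == 0
        geminate = seg.endswith("ː")
        postnasal = i > 0 and segments[i - 1] in NASALS
        if initial or geminate or postnasal:
            out.append(seg)            # stays a stop
        else:
            out.append(FLAP_OF[base])  # intervocalic, final, or next to a consonant
    return out


# Stop kept word-initially, flap used before another consonant.
print(realize_retroflex(["ɖ", "aː", "l"]))
print(realize_retroflex(["l", "ə", "ɖ", "k", "aː"]))
```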

English, Sanskrit, Arabic, and to a lesser extent Persian are the main sources from which Hindi/Urdu draws its higher, learned terms, and they provide loanwords with a rich array of consonant clusters. The introduction of these clusters into the language in fact contravenes a historical tendency within its native core vocabulary to eliminate clusters through processes such as cluster reduction and epenthesis. Schmidt (2003:293) lists distinctively Sanskrit/Hindi biconsonantal clusters of initial /kɾ, kʃ, st̪, sʋ, ʃɾ, sn, nj/ and final /t̪ʋ, ʃʋ, nj, lj, ɾʋ, dʒj, ɾj/, and distinctively Perso-Arabic/Urdu biconsonantal clusters of final /ft̪, ɾf, mt̪, mɾ, ms, kl, t̪l, bl, sl, t̪m, lm, ɦm, ɦɾ/.

 Suprasegmental features

Hindi-Urdu has a stress accent, but it is not as important as in English. To predict stress placement, the concept of syllable weight is needed:

  • A light syllable (one mora) ends in short vowel /ə, ɪ, ʊ/: V
  • A heavy syllable (two moras) ends in a long vowel /aː, iː, uː, eː, ɛː, oː, ɔː/ or in a short vowel and a consonant: VV, VC
  • An extra-heavy syllable (three moras) ends in a long vowel and a consonant, or a short vowel and two consonants: VVC, VCC

Stress falls on the heaviest syllable of the word and, in the event of a tie, on the last such syllable. However, the final mora of the word is ignored when making this assignment (Hussein 1997). Equivalently, the final syllable is stressed either if it is extra-heavy and there is no other extra-heavy syllable in the word, or if it is heavy and there is no other heavy or extra-heavy syllable in the word. For example, with the ignored mora in parentheses (Hayes 1995:276ff):

aːs.ˈmaːn.dʒaː(h) ~ ˈaːs.mãː.dʒaː(h)
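The weight-based stress rule can be sketched as follows. This is a rough illustration rather than a tested phonological tool. It assumes the word is given in IPA with syllables separated by ".", without stress marks or parentheses, with one vowel per syllable and with the length mark used only on vowels. Weights are capped at three moras because the text defines no heavier class. The names `syllable_moras` and `stressed_syllable` are hypothetical.

```python
import unicodedata

VOWELS = set("əɪʊaieouɛɔɑæ")
LENGTH_MARK = "ː"


def syllable_moras(syllable: str) -> int:
    """Count moras: 1 or 2 for the vowel, plus 1 per coda consonant (max 3)."""
    # Decompose so a nasalized vowel becomes base letter + combining tilde.
    chars = unicodedata.normalize("NFD", syllable)
    nucleus = next(i for i, ch in enumerate(chars) if ch in VOWELS)
    vowel_moras, coda, previous = 1, 0, ""
    for ch in chars[nucleus + 1:]:
        if ch == LENGTH_MARK:
            if coda == 0:
                vowel_moras = 2  # length mark directly on the vowel
            continue
        if unicodedata.category(ch) in ("Mn", "Lm"):
            continue  # diacritics and aspiration/murmur marks are not segments
        if ch in "ʃʒ" and previous in ("t", "d"):
            previous = ch
            continue  # second half of an affricate, already counted
        coda += 1
        previous = ch
    return min(vowel_moras + coda, 3)


def stressed_syllable(word: str) -> int:
    """Return the 0-based index of the stressed syllable."""
    weights = [syllable_moras(s) for s in word.split(".")]
    weights[-1] -= 1  # the word-final mora is ignored
    heaviest = max(weights)
    # Ties go to the last syllable that reaches the maximum weight.
    return max(i for i, w in enumerate(weights) if w == heaviest)


print(stressed_syllable("aːs.maːn.dʒaːh"))  # second syllable (index 1)
print(stressed_syllable("aːs.mãː.dʒaːh"))   # first syllable (index 0)
```

In the first form, the two initial syllables are both extra-heavy and the tie goes to the later one. In the nasalized variant, the middle syllable loses its coda consonant and drops to heavy, so stress moves to the first syllable.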

Content words in Hindustani normally begin on a low pitch, followed by a rise in pitch. Strictly speaking, Hindi-Urdu, like most other Indian languages, is a syllable-timed language. The schwa /ə/ has a strong tendency to be deleted (syncopated) if its syllable is unaccented.

