
Arabic script in Unicode


Many scripts in Unicode, such as Arabic, have orthographic rules that require certain sequences of letterforms to be joined into ligatures. For comparison, the common ampersand (&) used in English developed from a ligature in which the handwritten Latin letters e and t (spelling et, Latin for and) were combined.[1] The rules governing ligature formation in Arabic can be quite complex, requiring special script-shaping technologies such as the Arabic Calligraphic Engine by Thomas Milo's DecoType.[2]

As of Unicode 15.1, the Arabic script is contained in the following blocks:[3]

  • Arabic (0600–06FF, 256 characters)
  • Arabic Supplement (0750–077F, 48 characters)
  • Arabic Extended-B (0870–089F, 41 characters)
  • Arabic Extended-A (08A0–08FF, 96 characters)
  • Arabic Presentation Forms-A (FB50–FDFF, 631 characters)
  • Arabic Presentation Forms-B (FE70–FEFF, 141 characters)
  • Rumi Numeral Symbols (10E60–10E7F, 31 characters)
  • Arabic Extended-C (10EC0–10EFF, 3 characters)
  • Indic Siyaq Numbers (1EC70–1ECBF, 68 characters)
  • Ottoman Siyaq Numbers (1ED00–1ED4F, 61 characters)
  • Arabic Mathematical Alphabetic Symbols (1EE00–1EEFF, 143 characters)
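
The block list above can be turned into a small lookup table. The following is a minimal sketch, not an official API: the names `ARABIC_BLOCKS` and `arabic_block` are hypothetical, and the ranges are copied from the list above. A block range also covers unassigned code points, so a match does not imply that the code point is an assigned character.

```python
# Block ranges copied from the list above (Unicode 15.1).
# ARABIC_BLOCKS and arabic_block are illustrative names, not a library API.
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x0870, 0x089F, "Arabic Extended-B"),
    (0x08A0, 0x08FF, "Arabic Extended-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
    (0x10E60, 0x10E7F, "Rumi Numeral Symbols"),
    (0x10EC0, 0x10EFF, "Arabic Extended-C"),
    (0x1EC70, 0x1ECBF, "Indic Siyaq Numbers"),
    (0x1ED00, 0x1ED4F, "Ottoman Siyaq Numbers"),
    (0x1EE00, 0x1EEFF, "Arabic Mathematical Alphabetic Symbols"),
]


def arabic_block(ch):
    """Return the name of the Arabic-related block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    for ch in ["\u0628", "\uFEFB", "\U0001EE00", "A"]:
        print(f"U+{ord(ch):04X}", arabic_block(ch))
```

A linear scan is enough for eleven ranges; a larger table would more likely use a sorted list with binary search.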

The basic Arabic range encodes the standard letters, the most common diacritics, and the Arabic-Indic digits, but does not encode contextual forms; U+0621–U+0652 is based directly on ISO 8859-6. The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Extended-B and Arabic Extended-A ranges encode additional Qur'anic annotations and letter variants used for various non-Arabic languages. The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics, and more contextual letter forms. The presentation forms are present only for compatibility with older standards, and are not currently needed for coding text.[4] The Arabic Mathematical Alphabetic Symbols block encodes characters used in Arabic mathematical expressions. The Indic Siyaq Numbers block contains a specialized subset of Arabic script that was in use for accounting in India under the Mughal Empire by the 17th century and remained so through the middle of the 20th century.[5][6] The Ottoman Siyaq Numbers block contains a specialized subset of Arabic script, also known as Siyakat numbers, used for accounting in Ottoman Turkish documents.[6]
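
Because the presentation forms exist only for compatibility, text that contains them can usually be mapped back to the basic letters. The sketch below uses Python's standard `unicodedata` module and NFKC normalization for this; the helper name `to_base_letters` is hypothetical, and NFKC also rewrites other compatibility characters, so it is shown here as one possible approach rather than a recommended pipeline.

```python
import unicodedata


def to_base_letters(text):
    """Fold compatibility characters, including Arabic presentation forms,
    to their base letters using NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # U+FEFB is a lam-alef ligature from Arabic Presentation Forms-B.
    ligature = "\uFEFB"
    print(unicodedata.name(ligature))
    # After folding, the ligature becomes the two basic letters lam and alef.
    for ch in to_base_letters(ligature):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Going the other way, from base letters to the correct contextual glyphs, is the job of the shaping engine and font rather than of the encoded text.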

  1. "What is the origin of the ampersand (&)?"
  2. "Biography: Thomas Milo - DecoType". unicode.org.
  3. "UAX #24: Script data file". Unicode Character Database. The Unicode Consortium.
  4. "Section 9.2: Arabic, Arabic Presentation Forms-B" (PDF). The Unicode Standard. The Unicode Consortium. September 2022.
  5. Pandey, Anshuman (2015-11-05). "L2/15-121R2: Proposal to Encode Indic Siyaq Numbers" (PDF).
  6. "Chapter 22: Symbols". The Unicode Standard, Version 15.0 (PDF). Mountain View, CA: Unicode, Inc. September 2022. ISBN 978-1-936213-32-0.
