Unicode
Alias(es)
  • Universal Coded Character Set (UCS)
  • ISO/IEC 10646
Language(s): See list of scripts
Standard: Unicode Standard
Encoding formats
  • UTF-8
  • UTF-16
  • GB18030
  • UTF-32, BOCU, SCSU (uncommon)
  • UTF-7, UTF-1 (obsolete)
Preceded by
  • ISO/IEC 8859
  • various others

Unicode, formally The Unicode Standard,[note 1] is a text encoding standard maintained by the Unicode Consortium and designed to support the use of text written in all of the world's major writing systems. Version 15.1 of the standard[A] defines 149,813 characters[3] and 161 scripts used in various ordinary, literary, academic, and technical contexts.

Many common characters, including numerals, punctuation, and other symbols, are unified within the standard and are not treated as specific to any given writing system. Unicode encodes thousands of emoji, whose continued development the Consortium conducts as part of the standard.[4] The widespread adoption of Unicode was also in large part responsible for the initial popularization of emoji outside of Japan. Unicode is ultimately capable of encoding more than 1.1 million characters.
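The "more than 1.1 million" figure can be checked with a little arithmetic. The sketch below assumes facts not stated in this passage: that the code space runs from U+0000 to U+10FFFF, organized as 17 planes of 65,536 code points each, and that the 2,048 surrogate code points (U+D800 to U+DFFF) are reserved for UTF-16 and cannot be assigned to characters.

```python
import sys

# Assumed layout of the Unicode code space (not stated in the passage above).
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000           # 65,536
SURROGATES = 0xDFFF - 0xD800 + 1          # 2,048 code points reserved for UTF-16

total_code_points = PLANES * CODE_POINTS_PER_PLANE
scalar_values = total_code_points - SURROGATES

print(f"{total_code_points:,}")           # 1,114,112
print(f"{scalar_values:,}")               # 1,112,064

# Python exposes the highest code point it supports, which is the top of the range.
print(hex(sys.maxunicode))                # 0x10ffff
```

Not every remaining code point is available for ordinary characters (some are private-use or permanently reserved), so the second number is best read as an upper bound.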

Unicode has largely supplanted the previous environment of myriad incompatible character sets, each used within different locales and on different computer architectures. Unicode is used to encode the vast majority of text on the Internet, including most web pages, and Unicode support has become a common consideration in contemporary software development.

The Unicode character repertoire is synchronized with ISO/IEC 10646, the two being code-for-code identical. However, The Unicode Standard is more than just a repertoire within which characters are assigned. To aid developers and designers, the standard also provides charts and reference data, as well as annexes that explain concepts germane to various scripts and give guidance for their implementation. Topics covered by these annexes include character normalization, character composition and decomposition, collation, and directionality.[5]
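Two of the annex topics, normalization and directionality, can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch rather than anything prescribed by the standard; the sample characters (é and the Hebrew letter alef) are chosen here purely for illustration.

```python
import unicodedata

composed = "\u00e9"      # é as one precomposed code point
decomposed = "e\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# The two strings render alike but differ code point by code point.
print(composed == decomposed)                                 # False

# Normalization maps both spellings to a single agreed form.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True  (composition)
print(unicodedata.normalize("NFD", composed) == decomposed)   # True  (decomposition)

# Each character also carries a bidirectional class used for text layout.
print(unicodedata.bidirectional("A"))        # L  (left-to-right)
print(unicodedata.bidirectional("\u05d0"))   # R  (right-to-left, Hebrew alef)
```

Comparing or searching text without normalizing first is a common source of bugs, which is one reason the standard documents these forms alongside the character repertoire.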

Unicode text is processed and stored as binary data using one of several encodings, which define how to translate the standard's abstracted codes for characters into sequences of bytes. The Unicode Standard itself defines three encodings: UTF-8, UTF-16, and UTF-32, though several others exist. Of these, UTF-8 is the most widely used by a large margin, in part due to its backward compatibility with ASCII.
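To make the difference between the three encodings concrete, the following sketch encodes a few sample characters with Python's built-in codecs. The helper name `encoded_forms` and the sample characters are illustrative choices, not part of the standard; the big-endian variants are used so that no byte order mark is prepended.

```python
def encoded_forms(ch):
    """Return the bytes of one character in UTF-8, UTF-16, and UTF-32 (big-endian)."""
    return {name: ch.encode(name) for name in ("utf-8", "utf-16-be", "utf-32-be")}

for ch in ("A", "\u20ac", "\U0001f600"):   # A, euro sign, grinning face emoji
    forms = encoded_forms(ch)
    print(f"U+{ord(ch):04X}",
          "| utf-8:", forms["utf-8"].hex(" "),
          "| utf-16:", forms["utf-16-be"].hex(" "),
          "| utf-32:", forms["utf-32-be"].hex(" "))

# ASCII compatibility: an ASCII character has the same single byte in UTF-8.
print("A".encode("utf-8") == "A".encode("ascii"))   # True
```

UTF-8 uses one to four bytes per character, UTF-16 uses two or four, and UTF-32 always uses four, which is why the byte sequences grow at different rates across the three samples.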

  1. ^ "Unicode Technical Report #28: Unicode 3.2". Unicode Consortium. 2002-03-27. Retrieved 2022-06-23.
  2. ^ Jenkins, John H. (2021-08-26). "Unicode Standard Annex #45: U-source Ideographs". Unicode Consortium. Retrieved 2022-06-23. 2.2 The Source Field
  3. ^ "Unicode Character Count V15.1". Unicode. Archived from the original on 2023-10-09. Retrieved 2023-09-12.
  4. ^ "Emoji Counts, v15.1". Unicode. Archived from the original on 2023-09-28. Retrieved 2023-09-12.
  5. ^ "The Unicode Standard: A Technical Introduction". Retrieved 2010-03-16.


