Universal Coded Character Set information


Universal Coded Character Set
Alias(es): UCS, Unicode
Language(s): International
Standard: ISO/IEC 10646
Encoding formats: UTF-8, UTF-16, GB 18030 (less common: UTF-32, BOCU, SCSU, UTF-7)
Preceded by: ISO/IEC 8859, ISO/IEC 2022, various others

The Universal Coded Character Set (UCS, Unicode) is a standard set of characters defined by the international standard ISO/IEC 10646, Information technology — Universal Coded Character Set (UCS) (plus amendments to that standard). It is the basis of many character encodings and continues to grow as characters from previously unrepresented writing systems are added.

The UCS has over 1.1 million possible code points available for allocation, but only the first 65,536, which make up the Basic Multilingual Plane (BMP), had entered common use before 2000. This began to change when the People's Republic of China (PRC) ruled in 2006 that all software sold in its jurisdiction must support GB 18030, which required software intended for sale in the PRC to move beyond the BMP.
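The figures above can be checked with a little arithmetic. The following is a minimal sketch, assuming the usual layout of 17 planes of 65,536 code points each (the plane count is not stated in the text above); the names `plane_of` and `in_bmp` are hypothetical helpers chosen for illustration.

```python
# Assumption: the codespace is laid out as 17 planes of 2**16 code points.
PLANE_SIZE = 0x10000          # 65,536 code points per plane
PLANE_COUNT = 17              # plane 0 is the BMP
MAX_CODE_POINT = PLANE_SIZE * PLANE_COUNT - 1   # 0x10FFFF


def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"outside the codespace: {code_point:#x}")
    return code_point >> 16


def in_bmp(code_point: int) -> bool:
    """True if code_point lies in the Basic Multilingual Plane."""
    return plane_of(code_point) == 0


if __name__ == "__main__":
    print(PLANE_SIZE * PLANE_COUNT)   # total code points in the codespace
    print(in_bmp(ord("A")))           # a Latin letter sits in the BMP
    print(plane_of(0x1F600))          # an emoji code point beyond the BMP
```

Software that assumes every character fits in 16 bits handles only the code points for which `in_bmp` returns true, which is the limitation the GB 18030 requirement exposed.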

The system deliberately leaves many code points not assigned to characters, even in the BMP. It does this to allow for future expansion or to minimise conflicts with other encoding forms.

The original edition of the UCS defined UTF-16, an extension of UCS-2, to represent code points outside the BMP. A range of code points in the S (Special) Zone of the BMP remains unassigned to characters. UCS-2 disallows use of code values for these code points, but UTF-16 allows their use in pairs. Unicode also adopted UTF-16, but in Unicode terminology the high-half zone elements are called "high surrogates" and the low-half zone elements are called "low surrogates".
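How a pair of reserved code values represents one code point outside the BMP can be sketched as follows. This is an illustrative sketch, not text from the standard: the ranges D800 to DBFF (high) and DC00 to DFFF (low) are the commonly documented surrogate ranges and are not given in the text above, and the function names are hypothetical.

```python
# Assumed surrogate ranges: high D800..DBFF, low DC00..DFFF.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a code point above the BMP into (high, low) 16-bit units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points beyond the BMP need a pair")
    offset = code_point - 0x10000           # 20-bit value
    high = HIGH_START + (offset >> 10)      # top 10 bits
    low = LOW_START + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    """Recombine a (high, low) pair into the original code point."""
    if not (HIGH_START <= high <= HIGH_END and LOW_START <= low <= LOW_END):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


if __name__ == "__main__":
    high, low = to_surrogates(0x1F600)
    print(f"{high:04X} {low:04X}")
    print(hex(from_surrogates(high, low)))
```

Each half carries 10 bits, so the pairs cover 2^20 = 1,048,576 code points, which is exactly the 16 planes beyond the BMP.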

Another encoding, UTF-32 (previously named UCS-4), uses four bytes (32 bits in total) to encode a single character of the codespace. UTF-32 therefore gives every code point a fixed-width binary representation for use in APIs and software applications.
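The fixed-width property can be illustrated with a short sketch. It assumes big-endian byte order without a byte order mark, which is one of several possible serializations; `utf32_be` and `code_point_at` are hypothetical helper names, and Python's built-in `utf-32-be` codec is used only as a cross-check.

```python
def utf32_be(text: str) -> bytes:
    """Encode each code point of text as exactly four big-endian bytes."""
    return b"".join(ord(ch).to_bytes(4, "big") for ch in text)


def code_point_at(data: bytes, index: int) -> int:
    """Read the index-th code point directly, without scanning earlier ones."""
    start = 4 * index
    if start < 0 or start + 4 > len(data):
        raise IndexError("code point index out of range")
    return int.from_bytes(data[start:start + 4], "big")


if __name__ == "__main__":
    encoded = utf32_be("A\U0001F600")
    print(len(encoded))                      # four bytes per code point
    print(hex(code_point_at(encoded, 1)))    # second code point, read directly
    print(encoded == "A\U0001F600".encode("utf-32-be"))
```

Because every code point occupies the same number of bytes, the n-th code point sits at byte offset 4n, at the cost of more storage than UTF-8 or UTF-16 for most text.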

22 Related for: Universal Coded Character Set information

Universal Coded Character Set

The Universal Coded Character Set (UCS, Unicode) is a standard set of characters defined by the international standard ISO/IEC 10646, Information technology...

Word Count : 1861

Universal Character Set characters

list of the characters in the Universal Coded Character Set. The Universal Coded Character Set, most commonly called the Universal Character Set (abbr. UCS...

Word Count : 6987

Character encoding

repertoire over time. A coded character set (CCS) is a function that maps characters to code points (each code point represents one character). For example, in...

Word Count : 3718

Null character

defined by the Baudot and ITA2 codes, ISO/IEC 646 (or ASCII), the C0 control code, the Universal Coded Character Set (or Unicode), and EBCDIC. It is...

Word Count : 959

EBCDIC

Extended Binary Coded Decimal Interchange Code (EBCDIC; /ˈɛbsɪdɪk/) is an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer...

Word Count : 2550

Plane

tool for shaping wood Plane (Unicode), in the Universal Coded Character Set, a continuous group of 2^16 code points Plane, part of a telecommunications network...

Word Count : 307

Universal Product Code

The Universal Product Code (UPC or UPC code) is a barcode symbology that is used worldwide for tracking trade items in stores. The chosen symbology has...

Word Count : 5358

Theban alphabet

created a draft proposal for adding the Theban alphabet to the Universal Coded Character Set/Unicode. "Theban alphabet". Omniglot. Retrieved March 6, 2023...

Word Count : 804

ISO 5428

5428:1984, Greek alphabet coded character set for bibliographic information interchange, is an ISO standard for an 8-bit character encoding for the modern...

Word Count : 136

Private Use Areas

the SMP. GB/T 20542-2006 ("Tibetan Coded Character Set Extension A") and GB/T 22238-2008 ("Tibetan Coded Character Set Extension B") are Chinese national...

Word Count : 2996

Universal code

items Universal code (typography), a standard set of characters in typography Universal code (cartography), another term for the Natural Area Code, a geocode...

Word Count : 173

List of Unicode characters

character reference refers to a character by its Universal Character Set/Unicode code point, and a character entity reference refers to a character by...

Word Count : 1827

ZX Spectrum character set

ZX Spectrum character set is the variant of ASCII used in the ZX Spectrum family computers. It is based on ASCII-1967 but the characters ^, ` and DEL...

Word Count : 1331

ASCII

National Standard for Information Systems — Coded Character Sets — 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII), ANSI...

Word Count : 8053

ArmSCII

7-bit encoding, from which the encoding and mapping to the UCS (Universal Coded Character Set (ISO/IEC 10646) and Unicode standards) were also derived a few...

Word Count : 2195

Unicode Consortium

ISBN 978-0-321-18578-5. Comparison of Unicode encodings Universal Character Set characters Universal Coded Character Set "Tax Exempt Organization Search". Internal...

Word Count : 1363

Han unification

Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters....

Word Count : 6316

Unicode

Unicode Standard: Unicode and the ISO's Universal Coded Character Set (UCS) use identical character names and code points. However, the Unicode versions...

Word Count : 10732

Allegro Common Lisp

external text encodings and provides string and character types based on Universal Coded Character Set 2 (UCS-2). Allegro CL can be used with and without...

Word Count : 546

Emoji

of emoji to be used across all platforms in the country. The Universal Coded Character Set (Unicode), controlled by the Unicode Consortium and ISO/IEC...

Word Count : 9878

Cell Broadcast

maximum message length of 1395 characters in the Latin alphabet, and 615 characters in Universal Coded Character Set (UCS-2) encoding in order to support...

Word Count : 2230

JIS X 0208

identify characters without relying on their codes. The names of characters are coordinated with other character set standards, notably the Universal Coded Character...

Word Count : 13276
