Universal Coded Character Set information

Universal Coded Character Set
Alias(es)	UCS, Unicode
Language(s)	International
Standard	ISO/IEC 10646
Encoding formats	UTF-8, UTF-16, GB 18030; Less common: UTF-32, BOCU, SCSU, UTF-7
Preceded by	ISO/IEC 8859, ISO/IEC 2022, various others

The Universal Coded Character Set (UCS, Unicode) is a standard set of characters defined by the international standard ISO/IEC 10646, Information technology — Universal Coded Character Set (UCS) (plus amendments to that standard). It is the basis of many character encodings and continues to grow as characters from previously unrepresented writing systems are added.

The UCS has over 1.1 million possible code points available for allocation (1,114,112 in total), but only the first 65,536, which make up the Basic Multilingual Plane (BMP), had entered into common use before 2000. This situation began changing when the People's Republic of China (PRC) ruled in 2006 that all software sold in its jurisdiction would have to support GB 18030, which required software intended for sale in the PRC to move beyond the BMP.
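The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names plane_of and in_bmp are hypothetical, and the code assumes the usual layout of 17 planes of 65,536 code points each. The final lines use Python's built-in gb18030 codec to show that a character outside the BMP survives a round trip through GB 18030.

```python
PLANE_SIZE = 0x10000                       # 65,536 code points per plane
PLANE_COUNT = 17                           # planes 0 through 16
CODESPACE_SIZE = PLANE_SIZE * PLANE_COUNT  # 1,114,112 code points
MAX_CODE_POINT = CODESPACE_SIZE - 1        # U+10FFFF


def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) containing a code point (hypothetical helper)."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"outside the codespace: {code_point:#x}")
    return code_point >> 16


def in_bmp(code_point: int) -> bool:
    """True when the code point lies in plane 0, the Basic Multilingual Plane."""
    return plane_of(code_point) == 0


# A character beyond the BMP (U+20000, plane 2) round-trips through GB 18030.
sample = "\U00020000"
encoded = sample.encode("gb18030")
roundtrip = encoded.decode("gb18030")
```

For example, plane_of(ord("A")) is 0, while plane_of(0x1F600) is 1, so the latter code point cannot be represented by software that handles only the BMP.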

The system deliberately leaves many code points not assigned to characters, even in the BMP. It does this to allow for future expansion or to minimise conflicts with other encoding forms.

The original edition of the UCS defined UTF-16, an extension of UCS-2, to represent code points outside the BMP. A range of code points in the S (Special) Zone of the BMP remains unassigned to characters. UCS-2 disallows use of code values for these code points, but UTF-16 allows their use in pairs. Unicode also adopted UTF-16, but in Unicode terminology the high-half zone elements are called "high surrogates" and the low-half zone elements are called "low surrogates".
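The pairing mechanism can be illustrated with a minimal Python sketch. The function names to_utf16_units and from_surrogates are hypothetical, chosen for this example; the ranges used are the standard surrogate ranges, U+D800 to U+DBFF for high surrogates and U+DC00 to U+DFFF for low surrogates.

```python
HIGH_START, HIGH_END = 0xD800, 0xDBFF   # high surrogates (high-half zone)
LOW_START, LOW_END = 0xDC00, 0xDFFF     # low surrogates (low-half zone)


def to_utf16_units(code_point: int) -> list[int]:
    """Return the UTF-16 code units for one code point (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the codespace")
    if HIGH_START <= code_point <= LOW_END:
        raise ValueError("surrogate code points do not encode characters")
    if code_point < 0x10000:
        return [code_point]              # BMP: a single 16-bit unit
    offset = code_point - 0x10000        # 20-bit value split into two halves
    return [HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)]


def from_surrogates(high: int, low: int) -> int:
    """Combine a high and a low surrogate back into one code point."""
    if not (HIGH_START <= high <= HIGH_END and LOW_START <= low <= LOW_END):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


units = to_utf16_units(0x1F600)          # [0xD83D, 0xDE00]
```

Each surrogate carries 10 bits of the offset from U+10000, so a pair covers exactly the 16 planes above the BMP.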

Another encoding, UTF-32 (previously named UCS-4), uses four bytes (32 bits in total) to encode each code point of the codespace. UTF-32 thereby permits a fixed-width binary representation of every code point in APIs and software applications.
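As a rough illustration of this fixed-width property, the following Python sketch writes each code point as a four-byte big-endian integer. The function name utf32_be is hypothetical, and the choice of big-endian byte order without a byte order mark is an assumption made for the example.

```python
def utf32_be(text: str) -> bytes:
    """Encode text as big-endian UTF-32 without a byte order mark (hypothetical helper)."""
    # Every code point, inside or outside the BMP, occupies exactly four bytes.
    return b"".join(ord(ch).to_bytes(4, "big") for ch in text)


data = utf32_be("A\U0001F600")   # one BMP character, one supplementary character
```

Because every code point has the same width, the n-th code point of a UTF-32 buffer starts at byte offset 4 × n, at the cost of more storage than UTF-8 or UTF-16 for most text.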
