Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized.
The Unicode Standard defines a codespace: a sequence of integers called code points in the range from to , notated according to the standard as U+0000
–U+10FFFF
.The codespace is a systematic, architecture-independent representation of The Unicode Standard; actual text is processed as binary data via one of several Unicode encodings, such as UTF-8
In this normative notation, the two-character prefix U+
always precedes a written code point, and the code points themselves are written as hexadecimal numbers. At least four hexadecimal digits are always written, with leading zeros prepended as needed. For example, the code point U+00F7
which represents the division sign ÷
is padded with two leading zeros.
There are a total of valid code points within the codespace.