Unicode

Apr 20, 2023

# Code point

In character encoding terminology, a code point, codepoint or code position is a numerical value that maps to a specific character. Code points usually represent a single grapheme—usually a letter, digit, punctuation mark, or whitespace—but sometimes represent symbols, control characters, or formatting.
Code point - Wikipedia

# UTFs

Each Unicode code point can be expressed in several different formats. These formats are called Unicode transformation formats (UTFs). For example, the letter M is the Unicode code point U+004D. In UTF-8, this code point is represented as X'4D’. In UTF-16, this code point can be represented as X'004D'.

UTF-8 is a transmission format

# Links

Java use unicode in String object.