In bf(C++) ordinary string literals are defined as NTBSs (null-terminated byte
strings).  Prefixing such a literal with tt(L) (e.g., tt(L"hello")) defines a
tt(wchar_t) string literal.

bf(C++) also supports 8, 16 and 32 bit i(Unicode) encoded
strings. For the latter two encodings two data types are available:
tt(char16_t) and tt(char32_t), storing, respectively, a ti(UTF-16) and a
ti(UTF-32) code unit.

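A tt(char) value can hold a single UTF-8 code unit. Characters outside of the
7-bit ASCII range are encoded in UTF-8 as sequences of two to four such code
units. If individual elements must be able to hold values exceeding the range
of a tt(char), wider types (like tt(char16_t) or tt(char32_t)) should be used.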

String literals for the various types of unicode encodings (and associated
variables) can be defined as follows:
        verb(
    char     utf_8[] = u8"This is UTF-8 encoded.";
    char16_t utf16[] = u"This is UTF-16 encoded.";
    char32_t utf32[] = U"This is UTF-32 encoded.";
        )
    Alternatively, unicode characters may be specified using the ti(\u) escape
sequence, followed by four hexadecimal digits (or tt(\U), followed by eight
hexadecimal digits). Depending on the type of the unicode variable (or
constant) the specified code point is stored in its tt(UTF-8), tt(UTF-16) or
tt(UTF-32) encoding. E.g.,
        verb(
    char     utf_8[] = u8"\u2018";
    char16_t utf16[] = u"\u2018";
    char32_t utf32[] = U"\u2018";
        )
    Unicode string literals are normally delimited by double quotes, but the
tt(u8), tt(u) and tt(U) prefixes can also be combined with raw string
literals (e.g., tt(uR"(text)")).