cppannotations/annotations/yo/first/rawstring.yo
2017-06-03 09:20:33 +02:00

Standard series of ASCII characters (a.k.a. emi(C strings)) are delimited by
double quotes, supporting
hi(escape sequence) escape sequences like tt(\n, \\) and tt(\"), and ending
in 0-bytes. Such series of ASCII-characters are commonly known as
em(null-terminated byte strings) (singular: emi(NTBS), plural: em(NTBSs)).
bf(C)'s NTBS is the foundation upon which an enormous amount of code has
been built.
In some cases it is attractive to be able to avoid having to use escape
sequences (e.g., in the context of XML). bf(C++) allows this using
hi(raw string literal) em(raw string literals).
Raw string literals start with an tt(R), followed by a double quote,
optionally followed by a label (a sequence of at most 16 characters, not
containing blanks, backslashes, or parentheses), followed by tt(OPENPAR). The
raw string ends at the closing parenthesis tt(CLOSEPAR), followed by the label
(if specified when starting the raw string literal), which is in turn followed
by a double quote. Here are some examples:
verb(
R"(A Raw \ "String")"
R"delimiter(Another \ Raw "(String))delimiter"
)
In the first case, everything between tt("OPENPAR) and tt(CLOSEPAR") is
part of the string. Escape sequences aren't supported so the text tt(\ ")
within the first raw string literal defines three characters: a backslash, a
blank character and a double quote. The second example shows a raw string
defined between the markers tt("delimiter+OPENPAR) and tt(CLOSEPARdelimiter").
Raw string literals come in very handy when long, complex ASCII-character
sequences (e.g., usage-info or long HTML sequences) are used. In the end they
are just that: long NTBSs. Those long raw string literals should be separated
from the code that uses them, thus maintaining the readability of the using
code.
As an illustration: the bf(bisonc++) parser generator supports an option
tt(--prompt). When specified, the code generated by bf(bisonc++) inserts
prompting code when debugging is requested. Directly inserting the raw string
literal into the function processing the prompting code results in code that
is very hard to read:
verb(
    void prompt(ostream &out)
    {
        if (d_genDebug)
            out << (d_options.prompt() ? R"(
            if (d_debug__)
            {
                s_out__ << "\n================\n"
                           "? " << dflush__;
                std::string s;
                getline(std::cin, s);
            }
    )" : R"(
            if (d_debug__)
                s_out__ << '\n';
    )"
            ) << '\n';
    }
)
Readability is greatly enhanced by defining the raw string literals as named
NTBSs, defined in the source file's anonymous namespace (cf. chapter
ref(NAMESPACE)):
verb(
    namespace {
        char const *noPrompt =
    R"(
            if (d_debug__)
                s_out__ << '\n';
    )";

        char const *doPrompt =
    R"(
            if (d_debug__)
            {
                s_out__ << "\n================\n"
                           "? " << dflush__;
                std::string s;
                getline(std::cin, s);
            }
    )";
    } // anonymous namespace

    void prompt(ostream &out)
    {
        if (d_genDebug)
            out << (d_options.prompt() ? doPrompt : noPrompt) << '\n';
    }
)