utf-8 string literal
Sascha Schwarz
2016-03-14 12:49:01 UTC
{ edited to shorten lines to ~70 characters. -mod }

Hello all.

Recently we were discussing whether the following snippet is
guaranteed to compile on all conforming platforms.

int main() {
    // wikipedia's example from https://en.wikipedia.org/wiki/UTF-8
    constexpr const char euro[] = u8"\u20ac";
    static_assert(
        sizeof euro == 4
        && euro[0] == static_cast<const char>(0b11100010)
        && euro[1] == static_cast<const char>(0b10000010)
        && euro[2] == static_cast<const char>(0b10101100),
        "Not utf-8.");
}

Looking at 2.3 (Basic charset) and 2.14.5 (String literals) we _think_
so, but are not sure.

This came up whilst implementing Adobe's glyphlist in C++.
See https://github.com/adobe-type-tools/agl-aglfn
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Öö Tiib
2016-03-14 23:10:29 UTC
Post by Sascha Schwarz
Looking at 2.3 (Basic charset) and 2.14.5 (String literals) we _think_
so, but are not sure.
Can you elaborate what makes you unsure?
Sascha Schwarz
2016-03-15 14:45:35 UTC
Post by Öö Tiib
Can you elaborate what makes you unsure?
It comes down to the difference between "\u20ac" and u8"\u20ac".

My understanding is that, whilst there is no guarantee about the
encoding of the former, the latter is encoded as UTF-8, so the
static_assert() holds.
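
To make the difference concrete, here is a minimal sketch that copies
the code units of both literals into a vector so they can be compared
at run time. The helper names code_units, euro_utf8 and euro_ordinary
are made up for this example. The element type is deduced rather than
spelled as char, on the assumption that the code should also compile
where u8 literals have type char8_t (C++20 and later).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies every code unit of a string literal,
// including the terminating null, into a vector of unsigned char.
// CharT is deduced, so this accepts both char and char8_t arrays.
template <typename CharT, std::size_t N>
std::vector<unsigned char> code_units(const CharT (&literal)[N]) {
    std::vector<unsigned char> out;
    out.reserve(N);
    for (std::size_t i = 0; i < N; ++i) {
        out.push_back(static_cast<unsigned char>(literal[i]));
    }
    return out;
}

// u8 literal: the standard requires UTF-8 encoding here.
inline std::vector<unsigned char> euro_utf8() {
    return code_units(u8"\u20ac");
}

// Ordinary literal: encoded in the execution character set, which is
// implementation-defined, so the resulting bytes are not portable.
inline std::vector<unsigned char> euro_ordinary() {
    return code_units("\u20ac");
}
```

euro_utf8() should give the four values 0xE2 0x82 0xAC 0x00 on any
conforming implementation, which is what the static_assert() above
checks. The result of euro_ordinary() depends on the compiler and its
settings; it may happen to match, but nothing guarantees it.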