Re: wtf is happening here @ bitwise comparison
James Kanze <james.kanze@gmail.com> wrote in news:c90c658c-5efb-4703-b2dc-aea046bb534d@c2g2000yqc.googlegroups.com:
On Dec 23, 9:03 am, Paavo Helde <myfirstn...@osa.pri.ee> wrote:
tschmittldk <tschmitt...@googlemail.com> wrote in news:9139bceb-5be4-4653-8b82-d92663dd1...@l32g2000yqc.googlegroups.com:
On 22 Dez., 19:45, tschmittldk <tschmitt...@googlemail.com> wrote:
Okay, thanks for all your answers. I'll try it tomorrow and
post the code then (I left my notebook in my student
flat...). But it seems clearer to me now, thanks!
Okay, now here's the code:
void codevert(char *ArrayToTransform)
{
    int j = 0;
    char *ptr = ArrayToTransform;
    while (*ptr != '\0') {
        if((*ptr & 0xC0) > 0xbf)
        {
            if(*ptr == '\xc3')
                simplifier_correct(3, ptr++);
            else if(*ptr == '\xc4')
                simplifier_correct(3, ptr++);
            else if(*ptr == '\xc4')
                simplifier_correct(3, ptr++);
            else
                std::cout << "E01";
        }
        ptr++;
    }
}
This is all very brittle.
Yes, but not for the reasons you imply. It's brittle because
it only handles a very small subset of UTF-8. But presumably,
the poster knows that, and accepts that any but a few specific
two-byte sequences will result in "E01". Not to mention the
typo: the last two else-if branches test exactly the same thing.
There's nothing brittle about it at the C++ level.
*ptr is char, which is most probably a signed
type and can be negative.
And is probably 8 bits.
(*ptr & 0xC0) is int and appears to be positive
Not only appears to be: is.
The intermediate values will be unexpected, of course, but the
final result should be correct. (The expression *ptr might be
negative.)
So, if *ptr is negative, one has here a bitwise AND of a negative and a
positive number. The result is positive only because the sign bit is
cleared by the AND operation; this was probably unintentional on the
writer's part, and is what I called "by chance". There are also several guidelines
along the lines "Use bitwise operators only on unsigned operands", see
e.g.
https://www.securecoding.cert.org/confluence/display/cplusplus/INT13-CPP.+Use+bitwise+operators+only+on+unsigned+operands
As far as I see, both operands in this example are promoted to int, and
the AND operation is applied to the signed int operands, one of which is
negative. If you say this works correctly with signed magnitude and
one's complement representations, then I am quite ready to believe that.
However, it does not seem immediately clear to me. Especially with one's
complement, the negative number should be the bitwise inversion of all
bits, so the result of '\xc3' & 0xc3 should be zero at first glance.
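To make the promotion concrete, here is a small sketch of what the masked test does with a negative char. It assumes an 8-bit char and a two's complement int, and it uses signed char explicitly so that the behaviour does not depend on whether plain char is signed. The function names are made up for illustration and are not from the OP's code.

```cpp
#include <cassert>

// Mask as in the OP's test: c is promoted to a (possibly negative) int
// before the AND is applied.
int mask_signed(signed char c)
{
    return c & 0xC0;
}

// Same mask, but the byte is converted to unsigned char first, so the
// AND only ever sees a value in the range 0..255.
int mask_unsigned(signed char c)
{
    return static_cast<unsigned char>(c) & 0xC0;
}

void demo_mask()
{
    // Assumption: this conversion yields a negative value (two's complement).
    signed char lead = static_cast<signed char>(0xC3);
    assert(lead < 0);
    assert(mask_signed(lead) == 0xC0);    // the mask clears the sign bits
    assert(mask_unsigned(lead) == 0xC0);  // same result, no negative operand
}
```

Under those assumptions both variants give 0xC0, so the OP's comparison against 0xbf succeeds either way. The unsigned variant simply avoids having to reason about the representation of negative values at all.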
and of the desired value even if *ptr is negative, this is
more by chance and not very portable.
Could you name an architecture where it wouldn't work?
Sorry, no.
[...]
0xbf is int and positive, '\xc3' is char and
negative.
And?
It would be just confusing to operate with positive values on one line of
the code and with negative values on another. One must also take care to
not make any mistakes like
if(*ptr == 0xc3)
I'm sure you would never make such a mistake, but I'm not so sure about
OP (or myself).
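A minimal sketch of why that comparison goes wrong, assuming an 8-bit char that is signed and a two's complement int. Again signed char is used explicitly so the result does not depend on the platform's plain char, and the helper names are hypothetical.

```cpp
#include <cassert>

// What `*ptr == 0xc3` amounts to when the char is signed: the byte is
// promoted to int first and keeps its negative value.
bool equals_int_literal(signed char c)
{
    int promoted = c;          // the promotion the comparison applies implicitly
    return promoted == 0xC3;   // 0xC3 is the int 195
}

// Converting to unsigned char first gives a value in 0..255, which can
// be compared against 0xC3 directly.
bool equals_as_byte(signed char c)
{
    return static_cast<unsigned char>(c) == 0xC3;
}

void demo_compare()
{
    // Assumption: this conversion yields a negative value (two's complement).
    signed char lead = static_cast<signed char>(0xC3);
    assert(!equals_int_literal(lead));  // negative int never equals 195
    assert(equals_as_byte(lead));       // 195 == 195
}
```

Comparing against the character literal '\xc3', as the posted code does, sidesteps the problem because both sides have the same type, but it is easy to slip into the integer form.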
Cheers
Paavo