Re: Are _T() and TEXT() macros equivalent?

From: "Doug Harrison [MVP]" <dsh@mvps.org>
Newsgroups: microsoft.public.vc.mfc
Date: Sun, 15 Apr 2007 13:27:55 -0500
Message-ID: <edo423t729vfn8im94p09v7kljs04kfgbl@4ax.com>

On Sat, 14 Apr 2007 15:50:53 GMT, "David Ching" <dc@remove-this.dcsoft.com>
wrote:

> "Mihai N." <nmihai_year_2000@yahoo.com> wrote in message
> news:Xns991223EABEAMihaiN@207.46.248.16...
>
>>> To say that it is FUNDAMENTAL to the
>>> language that sizeof(char) == 1?
>>
>> When something is part of the standard, is used (as an implicit
>> assumption) by every single application developed in that language for
>> the last 30 years, and every single such application will break if this
>> changes, then yes, it is fundamental.
>
> Err, no. HelloWorld.cpp does not make use of the fact that sizeof(char) ==
> 1. That's my point. Many modern apps are Unicode native and wouldn't make
> this assumption either.

Your argument would be better served by listing the sorts of things that
would break rather than holding "Hello, world" up as something that would
not break. I don't know about everyone, but I don't get a lot of mileage
out of "Hello, world".

It also doesn't help to mock and deny facts, and I've previously stated
several facts in this thread. It is a fact that the definition of the
language equates the terms "byte" and "char", defines sizeof(char) == 1,
and measures object size in terms of bytes (chars). It doesn't get any more
"fundamental" than this, and like I said several messages ago, "char is the
fundamental unit of addressing in C and C++." I don't know if you've
actually stated it as such, but what you are proposing is to introduce a
new type "byte" to liberate "char" from its duties as the quantum of object
size and representation. If you are at all serious about convincing people
who know the language, you should:

1. Go through the standard and list the sections that use char as byte.

2. Collect and present the ways programs make use of (1).

3. Describe the transformations that would be necessary to fix (2) after
changing (1).

4. Describe the difficulty and degree of automation possible to implement
(3).

5. Show why it's worth it to require updating 35+ years of C and 20+ years
of C++ code to conform to the new language.

6. Present your proposal in groups such as comp.std.c++, where you will
find many more language experts than you will here.
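
As a concrete picture of the facts stated above (byte and char are the same
thing, sizeof(char) == 1, and object size is measured in chars), here is a
minimal sketch. The helper name object_bytes is hypothetical and exists only
for illustration; the static_asserts restate guarantees of the language.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

// Guaranteed by the language definition: sizes are counted in chars.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(unsigned char) == 1, "likewise for unsigned char");
static_assert(CHAR_BIT >= 8, "a byte (char) has at least 8 bits");

// Hypothetical helper: copy the object representation of x into out,
// treating the object as an array of sizeof(T) unsigned chars.
// Returns the number of bytes (chars) copied.
template <class T>
std::size_t object_bytes(const T& x, unsigned char* out, std::size_t cap)
{
   assert(cap >= sizeof(T));
   std::memcpy(out, &x, sizeof(T));
   return sizeof(T);
}
```

Every use of sizeof, memcpy, or an unsigned char buffer in this style leans
on char being the unit of object size, which is the kind of thing item (2)
would have to catalog.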

Myself, I'd prefer the language to have separate byte and char types, but I
also know this would not be an easy thing to change. Here, I'll give a
brief example using my numbering system above:

2. Problem

char* strdup(const char* s)
{
   size_t len = strlen(s);
   char* res = (char*) malloc(len+1);
   return strcpy(res, s);
}

3. Fix

char* strdup(const char* s)
{
   size_t len = strlen(s);
   char* res = (char*) malloc((len+1)*sizeof(char));
   return strcpy(res, s);
}

4. To automate this, a program would have to be written that recognizes
that malloc is being used to allocate space for a char array. Moreover, it
would have to recognize this across function calls, translation units, and
even libraries. This program will not be written, so it will be up to
people to do this by hand.
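
To illustrate why recognizing this across function calls is hard, here is a
rough sketch in which the size is computed in one function and used in
another. The names space_for and copy_string are hypothetical, made up for
this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper, possibly living in another translation unit or
// library. Today "number of chars" and "number of bytes" are the same
// thing, so nothing in the signature says which one the result means.
std::size_t space_for(const char* s)
{
   return std::strlen(s) + 1;   // chars, including the terminating nul
}

// Hypothetical caller. A tool looking only at this function sees
// malloc(n) with an opaque n; it has to trace n back to space_for()
// to learn that n counts chars and would need scaling by sizeof(char).
char* copy_string(const char* s)
{
   assert(s != NULL);
   std::size_t n = space_for(s);
   char* res = (char*) std::malloc(n);
   if (res == NULL)
      return NULL;
   return std::strcpy(res, s);
}
```

The caller owns the result and releases it with free(). The point is only
that the "chars or bytes?" question is answered far away from the malloc
call that depends on it.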

Another example concerns any function that takes a void* and writes data as
bytes:

2. Problem

void write(const void* buf, size_t n)
{
   const unsigned char* p = (const unsigned char*) buf;
   const unsigned char* pEnd = p+n;
   while (p != pEnd)
      write(*p++);
}

3. Fix (partial)

void write(const byte* buf, size_t n)
{
   const byte* p = buf;
   const byte* pEnd = p+n;
   while (p != pEnd)
      write(*p++);
}

4. All calls must cast to byte* instead of relying on the standard
conversion to void*. This could be automated. However, if the buffer is a
char array and n is its length, that will have to be fixed as in the previous
example, and that's not easy to automate. More generally, any call that
does not amount to write(&x, sizeof(x)) is problematic. Again, people would
have to vet code line by line to make this change.
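
A small sketch of the two call shapes, using today's void* interface. The
names write_bytes, write_int, write_text, and sink are hypothetical
stand-ins, not anything from a real library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

std::vector<unsigned char> sink;   // hypothetical destination for the bytes

// Appends n bytes starting at buf to the sink.
void write_bytes(const void* buf, std::size_t n)
{
   assert(buf != nullptr || n == 0);
   const unsigned char* p = (const unsigned char*) buf;
   sink.insert(sink.end(), p, p + n);
}

// Safe shape: the count comes from sizeof, which is in bytes by definition.
void write_int(int x)
{
   write_bytes(&x, sizeof(x));
}

// Problematic shape: the count comes from strlen, which is in chars.
// It is correct only as long as one char is one byte.
void write_text(const char* s)
{
   write_bytes(s, std::strlen(s));
}
```

Both calls look the same to the compiler. Only a person reading write_text
knows that its count is a character count that would need scaling.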

Note that part (2) of both examples I presented would compile OK under your
hypothetical new language, and they would both lead to buffer overruns.
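
The overrun can be approximated in today's language by using char16_t as a
stand-in for a char that is wider than a byte. This is only a sketch under
that assumption; wide_char and the function names are hypothetical, and the
functions compute sizes rather than actually overrunning anything.

```cpp
#include <cassert>
#include <cstddef>

typedef char16_t wide_char;   // stand-in for the hypothetical wider char

// Length in characters, like strlen.
std::size_t wide_len(const wide_char* s)
{
   assert(s != nullptr);
   std::size_t n = 0;
   while (s[n] != 0)
      ++n;
   return n;
}

// Bytes the unfixed strdup would request: malloc(len+1).
std::size_t bytes_requested_unfixed(const wide_char* s)
{
   return wide_len(s) + 1;
}

// Bytes the copy actually needs: (len+1)*sizeof(char).
std::size_t bytes_needed(const wide_char* s)
{
   return (wide_len(s) + 1) * sizeof(wide_char);
}
```

On a typical platform with 8-bit bytes, sizeof(char16_t) is 2, so for u"abc"
the unfixed code requests 4 bytes while the copy writes 8.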

No doubt, this is just scratching the surface. It's what I could think of
immediately without really trying. To do a reasonably thorough job of part
(2), you would need to pose the question to a great number of people (easy)
and get them to think about it long and hard and answer you (not as easy).
You'd also need to survey millions of lines of code written in every area
people use the language. I haven't done that, of course, but I would have
to conclude, based on my knowledge of the language and experience using it,
that you would be creating a brand new language. I say this because
existing code would not be portable to it, and no one would find it
worthwhile to update their code to use it, because using char as byte and
wchar_t or even TCHAR as "character" works well enough.

--
Doug Harrison
Visual C++ MVP