Re: Type punning question
On Jun 18, 5:13 pm, "Alf P. Steinbach" <al...@start.no> wrote:
* James Kanze:
On Jun 17, 9:49 pm, "Alf P. Steinbach" <al...@start.no> wrote:
* Travis Vitek:
I'm maintaining some code that uses the SunOS priocntl
system call [*], and this has led me to some questions about
type punning. To avoid a long-winded explanation, I'm just
going to post some source and then ask my questions.
Why are you posting in a C++ group?
This is C code.
I've got a lot of similar stuff in my C++. (I code for
Posix based machines, too.)
You can save quite a bit of work by simply not adding the
complexities of the lowest abstraction level possible. ;-)
I'm not sure what you're actually trying to say, but the point
is: Posix has some fairly strange interfaces, which aren't
really very typesafe---the type of the out parameter of many
functions depends on arguments to the function, or in this case,
other results of the function. Sometimes, as with ioctl,
they'll use varargs, other times, it will be a void*, or, like
here, they'll declare some "untyped" buffer. When using these
interfaces, you do end up having to do some strange casts.
[...]
On some machines breaking aliasing rules can cause a hardware
exception, on some machines and OSes that exception is caught
by the OS and fixed up, but at the cost of inefficient access.
And on most, if not all machines, compilers will assume that
lvalue expressions with different types can't alias the same
memory, with certain exceptions (most notably: lvalues with char
or unsigned char).
I'm not sure your statement is completely correct. C99
reportedly has a strict aliasing requirement, that lvalues of
different types can't refer to the same (or overlapping)
location. AFAIK C++ doesn't.
Well, it's not where you'd reasonably look for it, but §3.10/15
places a lot of restrictions on what user code can do. It
doesn't speak in terms of aliasing; however, you can only access
an "object" (i.e. a sequence of bytes) through an lvalue
expression with either the correct type, or char or unsigned
char. (Other places state you cannot access memory unless there
is an "object" there, again with the exception of accessing it
as char or unsigned char.)
[...]
2. why does changing pc_clinfo to array of char eliminate the
warning? i.e., does that make the cast 'safe'?
Not sure why it eliminates the warning, although the fact that
you can convert any POD value to a sequence of chars and back
is probably involved in making the compiler not see the
problem. It doesn't make the cast safe.
It eliminates the warning, because the compiler doesn't see a
possible aliasing that it doesn't take into consideration. The
standard requires compilers to consider possible aliasing when
char and unsigned char are involved.
I suspect that you're trying to describe the correct
explanation, but I don't quite understand it (that is, I don't
quite understand what's going on in the compiler's "mind"
here, in order to produce or not produce the warning).
The requirements in §3.10/15. "If a program attempts to access
the stored value of an object through an lvalue of other than
one of the following types the behavior is undefined[...]". The
list of allowed types includes char and unsigned char,
regardless of the type of the object, so the compiler must
assume the lvalues of type char or unsigned char might alias the
object, regardless of its type. On the other hand, if the
object has type int, it cannot be legally accessed through an
lvalue of type rtinfo_t (or whatever his type was), and vice
versa, so the compiler can assume that there is no aliasing
between lvalues of type rtinfo_t and int.
Sometimes, at least. C++ allows memory for an object to be
reused, and both C and C++ allow unions (which comes down to
basically the same thing). At any given time, the memory has a
specific type, and can only be accessed as that type (or as char
or unsigned char), but that type can change in time. G++
definitely doesn't take this possibility into consideration,
however, and in its response to a DR, the C committee seems to
indicate that in the case of unions, it really only meant it to
work if all of the accesses were through the union type.
(Which of course makes it hard for us, who can't read minds, and
can only base what we think on what the standard actually says.)
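A minimal sketch of how the char exception in §3.10/15 can be put to use: rather than casting the address of the untyped buffer, copy its bytes into an object of the intended type. The names DemoRtInfo, readRtInfo and writeRtInfo are hypothetical, made up for illustration; they are not part of the Solaris API, and the real rtinfo_t has its own layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a class-specific record; NOT the real rtinfo_t.
struct DemoRtInfo {
    short maxPriority;
};

// Copy the bytes of an untyped buffer into a properly typed object.
// memcpy accesses the storage as unsigned char, which the aliasing
// rule permits regardless of the buffer's declared type.
DemoRtInfo readRtInfo(const char* buffer, std::size_t size)
{
    assert(size >= sizeof(DemoRtInfo));
    DemoRtInfo result;
    std::memcpy(&result, buffer, sizeof result);
    return result;
}

// Store a record into an untyped buffer the same way.
void writeRtInfo(char* buffer, std::size_t size, const DemoRtInfo& value)
{
    assert(size >= sizeof(DemoRtInfo));
    std::memcpy(buffer, &value, sizeof value);
}
```

The copy costs a few bytes of traffic, but the compiler never sees two lvalues of unrelated types naming the same memory, so there is nothing for it to get wrong.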
3. what techniques are there for avoiding problems like this?
If my understanding of the standard is correct, the above
does indeed invoke undefined behavior, but I'm not certain
how exactly to best work around the issue and to avoid
dangerous punning in the future.
C++ code would go like this (off the cuff) -- look ma, no casts! :-)
    struct RtInfo
    {
        short maxPriority;
        char filler[32 - sizeof(short)];
    };

    struct PcInfo
    {
        int id;
        char className[16];
        RtInfo rtInfo;
    };

    int main ()
    {
        PcInfo pc;
        // assume a call to priocntl() will initialize pc with
        // something like this...
        // pc.className [0] = 'R';
        // pc.className [1] = 'T';
        // pc.className [2] = 0;
        //if (-1L == priocntl (idtype_t(), id_t(), PC_GETCLINFO, &pc))
        //    return -1;
        return (0 == pc.rtInfo.maxPriority);
    }
Except, of course, that he can't redefine the types, since
they're part of a system API.
The above doesn't redefine the types.
It's passing an incorrect type to priocntl. That's undefined
behavior.
It simply replaces a silly processing sequence
[type A -> call -> cast to B* -> type B]
that's so roundabout and redundant that it causes a warning,
with
[type B -> call]
Note that the called routine is variadic (or at least is
documented as such).
Yes. Which means that the compiler can't detect the
error---it's undefined behavior, rather than a required compile
time diagnostic. The specifications of the function say that if
the cmd argument (the third argument) is PC_GETCLINFO (his
case), he must pass a pcinfo_t* as fourth argument. Your PcInfo
is *not* a pcinfo_t, so passing its address is undefined
behavior.
Practically, of course, it's going to work (because Posix
requires all pointers to have the same size and representation)
*IF* your PcInfo has exactly the same size and layout as the
pcinfo_t. But the size and layout of a pcinfo_t isn't
specified, and in fact could easily change when e.g. moving from
32 bits to 64. Not using the "official" structure is formally
illegal, and practically unmaintainable.
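To make the "same size and layout" condition concrete, here is a rough sketch of compile-time checks one could put next to such a look-alike struct. The type demo_pcinfo_t is a hypothetical stand-in whose fields and sizes are assumed for illustration; real code would include the system header and compare against the actual pcinfo_t. The checks need C++11 for static_assert, and they only catch layout drift: they do not make passing the wrong type formally valid.

```cpp
#include <cstddef>

// Hypothetical stand-in for the system's pcinfo_t; the field names
// and sizes here are assumptions, not taken from the Solaris headers.
struct demo_pcinfo_t {
    int  pc_cid;
    char pc_clname[16];
    int  pc_clinfo[32 / sizeof(int)];
};

// The look-alike types from the example under discussion.
struct RtInfo {
    short maxPriority;
    char  filler[32 - sizeof(short)];
};

struct PcInfo {
    int    id;
    char   className[16];
    RtInfo rtInfo;
};

// Fail the build as soon as the two layouts stop agreeing,
// e.g. after a move from a 32-bit to a 64-bit target.
static_assert(sizeof(PcInfo) == sizeof(demo_pcinfo_t),
              "PcInfo and pcinfo_t differ in size");
static_assert(offsetof(PcInfo, className) == offsetof(demo_pcinfo_t, pc_clname),
              "class name field is at a different offset");
static_assert(offsetof(PcInfo, rtInfo) == offsetof(demo_pcinfo_t, pc_clinfo),
              "class-specific data is at a different offset");
```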
[...]
Conceptually, the system request is returning a polymorphic
object; if all of the code were in C++, including the system API
(and we had garbage collection---not just over the process, but
at the system level, across system and process boundaries), the
function would return a pointer to a base type, and he'd use
dynamic_cast to get at the additional data in the derived type.
No.
With a proper design there'd be separate routines with
differently typed arguments instead of a single routine with
subfunctions chosen by an argument.
That's also an alternative. Probably a better one, most of the
time---in this case, both solutions would seem appropriate:
different functions for different commands, definitely, but for
the PC_GETCLINFO command, the actual type of some of the data
returned depends on other data returned, not on the command.
As the system APIs (at least those of both Posix and
Windows) are defined in a very primitive C, however, such
elegant options aren't available, and we have to play such
funny games.
No, that's no excuse; see right above.
Right above has undefined behavior, and doesn't really solve the
problem.
If I were writing a C++ wrapper for the function, I think
I'd define the class hierarchy (including a derived class
for the error cases), and return an std::auto_ptr to an
instance of it. But I don't expect to see anything that
evolved in Posix or Windows anytime soon.
IMHO, don't force unrelated types into a hierarchy.
The types are very much related. They have a common base class,
with the id and the classname. Depending on the class, the
specific attributes will vary.
Instead, I think it best to define each "logical" routine
(sub-function) as a distinct routine.
That allows static type checking.
Agreed, but... Sun has already defined the routine, as part of
Solaris, so you can't choose another design. And even if you
split out each command into a different function (which is how
Sun should have designed it), you still have the fact that the
class will dynamically depend on the cid passed into the
function, and that the data concerning the scheduling policy
depends on the class.
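A rough sketch of what such a wrapper could look like, with the class chosen at run time from the class name in the raw data. Everything here is hypothetical: RawClassInfo stands in for whatever the system call fills in, and the class names and makeClassInfo are invented for the example. It uses std::unique_ptr where the discussion mentions std::auto_ptr, since auto_ptr has since been removed from the language.

```cpp
#include <algorithm>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical raw record, standing in for what the system call returns.
struct RawClassInfo {
    int  id;
    char name[16];
    char data[32];   // class-specific bytes; their meaning depends on name
};

// Common base: the id and class name that every scheduling class has.
struct ClassInfo {
    int         id;
    std::string name;
    ClassInfo(int i, const std::string& n) : id(i), name(n) {}
    virtual ~ClassInfo() {}
};

// Attributes specific to the (assumed) "RT" class.
struct RtClassInfo : ClassInfo {
    short maxPriority;
    RtClassInfo(int i, const std::string& n, short p)
        : ClassInfo(i, n), maxPriority(p) {}
};

// Fallback for classes this wrapper does not know about.
struct UnknownClassInfo : ClassInfo {
    UnknownClassInfo(int i, const std::string& n) : ClassInfo(i, n) {}
};

// Build the right derived object from the raw record.
std::unique_ptr<ClassInfo> makeClassInfo(const RawClassInfo& raw)
{
    // The name may fill the whole array, so do not rely on a terminator.
    const char* end = std::find(raw.name, raw.name + sizeof raw.name, '\0');
    std::string name(raw.name, end);
    if (name == "RT") {
        short maxPriority;
        // Byte-wise copy instead of a cast, to stay within the aliasing rules.
        std::memcpy(&maxPriority, raw.data, sizeof maxPriority);
        return std::unique_ptr<ClassInfo>(
            new RtClassInfo(raw.id, name, maxPriority));
    }
    return std::unique_ptr<ClassInfo>(new UnknownClassInfo(raw.id, name));
}
```

A caller would then use dynamic_cast<RtClassInfo*>(info.get()) and test the result for null before touching maxPriority, which is exactly the dynamic dependence on the class described above.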
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34