Re: Header File Clutter
On Jan 15, 12:50 am, Dietmar Kuehl <dietmar.ku...@gmail.com> wrote:
Welcome back to news. You can't imagine how happy I am that you
can participate again.
On Jan 14, 7:59 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
No doubt your measurements (or even more likely the test cases)
are highly flawed. But since you haven't presented them how can
we know?
I did a simple test on UNIXes which would at worst fail on the
wrong side (i.e. make compilers appear as if they always include
files although they actually only include header files once if
there is an include guard): I used a named pipe for a header file
with a program writing a header with an include guard and a report
of the file having been written. Together with a source file which
just includes the header twice this quite nicely shows which
compilers don't touch the file again.
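The sources themselves are elided in this post, so what follows is only a rough sketch of how such a probe could be written on a POSIX system, not the program that was actually used. The names `guarded_header`, `serve_header`, `PROBE_H` and `probe` are made up for illustration.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>
#include <string>

// Header text wrapped in a conventional include guard.
// (The declaration inside is a hypothetical placeholder.)
std::string guarded_header(const std::string& guard) {
    return "#ifndef " + guard + "\n#define " + guard + "\n"
           "extern int probe;\n#endif\n";
}

// Create a FIFO at `path` and write the header into it each time a
// reader (the compiler) opens it, reporting every write on stderr.
// Returns the number of opens served, or -1 if the FIFO cannot be created.
int serve_header(const char* path, int max_opens) {
    if (mkfifo(path, 0600) != 0) return -1;
    const std::string text = guarded_header("PROBE_H");
    int served = 0;
    while (served < max_opens) {
        int fd = open(path, O_WRONLY);  // blocks until a reader opens the FIFO
        if (fd < 0) break;
        // The text is far smaller than a pipe buffer, so a single write
        // is assumed to be enough here.
        ssize_t n = write(fd, text.data(), text.size());
        close(fd);
        if (n < 0) break;
        ++served;
        std::fprintf(stderr, "header written (%d)\n", served);
    }
    unlink(path);
    return served;
}
```

With something like `serve_header("probe.h", 2)` running, compiling a source file that includes `"probe.h"` twice should produce two reports if the compiler reopens the file. If the compiler opens it only once, the second `open` blocks and the program has to be interrupted. Note that the writer may reopen the FIFO while the first reader still has it open, so a sketch like this can over-report, which is the "wrong side" failure mentioned above.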
It's nice that someone has the time to do this kind of work.
(My measurements didn't actually test whether the header was
opened twice. They just compared compile times for specific
cases. Since that's what my client was interested in.)
Here are the sources:
[...]
I tried this with SUN's CC, IBM's xlC, HP's aCC, Apple's
clang++, and g++ (these are the compilers I currently have
easy access to). CC includes the file twice,
That's interesting. I've spoken with Steve Clamage concerning
this (many, many years ago); at the time, he told me that Sun CC
didn't implement it because they found that it didn't make a
difference. On our Sun systems (Sun OS 4 back then), it did make a
difference for us. I suspect that the difference depends on
the network---on my Sparc at home (with a local disk), I
couldn't detect a difference, but on the systems at work (with
most of the include files remotely mounted on a relatively
overloaded network), I certainly could.
More recent measurements seemed to indicate that Sun CC had
added this optimization, but it's possible that this was just
because the more modern systems had more memory, and were
caching the network accesses better.
(Of course, if this is the case, this would disprove Walter's
claim that the parsing itself was the cause of the slowing down.
Parsing C++ enough to find a corresponding #endif is *not*
significant. Actually opening and reading the file over the
network might be, depending on network latency.)
clang++, xlC
and g++ include the file just once, and HP's aCC fails
if the file is a named pipe (it seems to open the file and
then probably does an fstat() which would yield the file size
being 0; it can be seen that it opens the file but then fails
with an error message saying that it can't open the file).
More generally, you've encountered the problem "what does `same
file' mean". Is a named pipe the same file when it is opened
the second time? (With network mounted drives, of course, you
don't need such exotic and unrealistic cases.)
Verifying that the compilers actually do the right thing, i.e.
commenting out the #define, reveals that clang++ includes the
file only once no matter what, which is clearly wrong, and xlC
and g++ include the file twice, as would be expected.
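To make the expected behaviour concrete, here is a minimal sketch of the bookkeeping a preprocessor could use: remember the macro that wraps a whole header, and skip a later include only while that macro is still defined. This is an illustration under simplifying assumptions (no comments, no line continuations, no `# ifndef` spacing), not the implementation of any compiler named above. The function names are hypothetical.

```cpp
#include <set>
#include <sstream>
#include <string>

// Returns NAME if the whole text is wrapped in "#ifndef NAME ... #endif"
// with nothing but blank lines outside the pair; otherwise returns "".
std::string controlling_macro(const std::string& text) {
    std::istringstream in(text);
    std::string line, macro;
    int depth = 0;
    bool closed = false;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string first;
        if (!(words >> first)) continue;   // blank line
        if (closed) return "";             // content after the final #endif
        if (depth == 0) {                  // first real line must open the guard
            if (first != "#ifndef" || !(words >> macro)) return "";
            depth = 1;
        } else if (first == "#if" || first == "#ifdef" || first == "#ifndef") {
            ++depth;                       // nested conditional
        } else if (depth == 1 && (first == "#else" || first == "#elif")) {
            return "";                     // guard has an alternative branch
        } else if (first == "#endif" && --depth == 0) {
            closed = true;                 // guard closed; only blanks may follow
        }
    }
    return closed ? macro : "";
}

// A later include may be skipped only if the guard macro is defined now.
bool can_skip_reinclude(const std::string& macro,
                        const std::set<std::string>& defined) {
    return !macro.empty() && defined.count(macro) != 0;
}
```

With the #define commented out, the guard macro never enters the set of defined macros, so `can_skip_reinclude` returns false and the file has to be read again, which matches what xlC and g++ were observed to do.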
[...]
My little statistic above yields this result: the majority
of the contemporary compilers available to me obviously
implement the logic. One might (it may detect the named pipe),
but I would guess it doesn't, and one I can't tell with my test.
I seem to recall, however, that the guys who implemented aCC's
front end mentioned that it does implement the include guard
logic (and they are certainly smart enough to come up with the
technique I outlined above).
My guess is that they implement it if they consider that it
makes a difference. It's possible, however, that they didn't
implement it because on the systems they use, it didn't make a
difference, whereas on some customer sites it might. In
particular, it likely will make a difference for anyone using
ClearCase.
Where I currently work, we mainly use Windows (with VS 2005),
but also support Solaris (on Sparc, with Sun CC) and Linux (on
Intel 32 bit, with g++ 4.4.2). Our code uses the Lakos
technique, because it makes a significant difference with VS.
It makes no difference in compile times on the other systems,
but I'm not sure if this is related---the environments on the
other systems are so different otherwise, I'm not sure that any
comparisons are valid.
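For readers who haven't seen it, the Lakos technique puts a redundant guard test around the #include itself, so the preprocessor never has to open the file a second time. A minimal single-file sketch, using a standard header so that it compiles on its own; the guard name `INCLUDED_VECTOR` and the function `count_items` are made up for illustration, and in real code the guard is normally the one defined inside the included header:

```cpp
#include <cstddef>

// First request: the guard is not yet defined, so the header is read.
#ifndef INCLUDED_VECTOR
#include <vector>
#define INCLUDED_VECTOR
#endif

// Second request (e.g. from another header): the test fails, so the
// #include line is never reached and no file is opened.
#ifndef INCLUDED_VECTOR
#include <vector>
#define INCLUDED_VECTOR
#endif

// Trivial use of the header, to show that it was included.
inline std::size_t count_items(const std::vector<int>& items) {
    return items.size();
}
```

The cost is extra clutter at every include site, which only pays off where reopening the file is expensive.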
--
James Kanze