Re: Some errors in MIT's intro C++ course
"Joshua Maurice" <joshuamaurice@gmail.com>
How do you calculate a "chance" of what harm the effect of a bug can be?
Please admit you really cannot.
The exception handler at the upper level is hardly any wiser than the
designer and design translator who were already proven wrong by the assert
violation that was triggered.
This is an argument saying that fault isolation isn't possible.
You know the old tale about Baron Munchhausen, who was sinking in a swamp,
but then pulled himself out by his hair.
Fault isolation is like that. In reality you can do it only the
Archimedean way, using some fixed point. Otherwise it's just the Baron's
tale.
I think that's silly.
Is it really? I'd classify that line of thought as the usual 'wishful
thinking' -- you expect the solution to be present, so it is talked
into the world. Words are easy to mince.
I think that fault isolation is practical, and the
only way to achieve fault tolerance and robustness in an application.
It is, but just stating that will not create its foundation anywhere. You may
have it, sometimes you can build it -- but it is serious business.
Perhaps your implicit argument is that fault isolation must be done at
the "process level" for any language, but I disagree.
Process level is one possible 'fixed point', provided your environment
implements process handling that way -- i.e., your 'process' runs in a proper
sandbox (say, "user space") while some other part of the system runs
separated (say, "kernel space"), and nothing done in the former can mess up
the latter. That includes I/O operations, processor exceptions, resource
exhaustion, etc. -- the supervisory system shall be able to recover from
anything.
Even that is not easy to accomplish, even with all the support built into
today's microprocessors. But most OSes aim for exactly that, so you can use
the fruits of that gigantic effort.
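One fruit you can pick directly: a minimal sketch, assuming a POSIX system
(the 256 MB figure is illustrative, not a recommendation), of asking the
kernel to contain one form of resource exhaustion:

#include <sys/resource.h>

int cap_memory()
{
    /* Cap this process's address space: allocations past the cap
       fail (malloc returns NULL, new throws bad_alloc) instead of
       letting one runaway component drag the whole machine down. */
    rlimit cap;
    cap.rlim_cur = 256u << 20;          /* soft limit, 256 MB */
    cap.rlim_max = 256u << 20;          /* hard limit, 256 MB */
    return setrlimit(RLIMIT_AS, &cap);  /* 0 on success */
}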
More reliable systems use separation of even more subsystems.
I do not claim that isolation within a process is impossible in the first
place, but it has a similar set of requirements and far less support. So it
will rarely be practical, if you stick to the true meaning of robustness
rather than just wishing it in.
Also, one must calculate the chance that one part of the application
can harm another part of the application.
I have heard about calculating chances in too many sad conversations. All of
them were really just empty claims, playing Russian roulette with
customers/users. Until I observe something better, I stick to binary: can or
cannot violate. Where it can violate, I count it as a 100% chance and act
accordingly. Too bad others do not -- the world seems to go Terry
Pratchett's way (in Discworld, one-in-a-million chances happen nine times
out of ten...).
It's part of the design of robustness.
Robustness is always a measure of degree. There are no
absolutes. For example, do I need separate processes? Perhaps I'm
concerned that the processes may interact with the OS in a way that
will kill the entire computer, such as allocating too much memory,
screwing with resources, exploiting a bug to get root and doing "bad
things" (tm). So, perhaps I need separate physical machines. Maybe I
even need an uninterruptible power supply. These are very important
questions. As an extreme, perhaps I need to shield my hardware against
EMPs or cosmic rays.
Yes, that is how you build the threat model. Not the other way around, as
usually happens ("who on earth will enter 8000 characters in the password
field?", "access to that variable happens rarely, no way will that race
condition manifest", etc.).
Indeed. I said it depends on the particulars, and for example if you
have a JNI library written in C++, then you would lose the Java
guarantees of which I spoke. As an example, the JVM is not magic. It's
written in C or C++ or something like it, so all of the same problems
are possible. It's just that the JVM is far better tested and reviewed
than your program will be, so the odds of a bug in the JVM breaking
your program are rather small. Again, it's all about odds.
I wouldn't bet on that last one: my programs tend to run 24/7 for a decade
with about one defect reported in the period, while JVMs are full of
frightening fixes, and their stability has never impressed me.
But that was not really what I was talking about. For the scope of this
discussion we can assume the JVM works perfectly to its specification, and
just ask whether it is okay for a faulting Java program to throw an
exception at the fault-detecting spot, instead of halting or jumping
directly to the monitor.
I don't think so. For a moment let's even set aside the concerns about
building the in-process monitor. Suppose we have it, sitting at the top
end of the exception chain, and once reached it can magically discard all
the bad stuff and resume some healthy execution.
What can happen in between? Two things, off the top of my head:
1. code running from finally{} blocks
2. code running from catch{} blocks
Say your program writes some output and uses temporary files; it keeps a
neat list of them, and on the normative path all the temporary files are
removed when it finishes.
If it detects some problem (not a fault, just a recoverable condition like
access denied or disk full), it throws an exception, cleanup is done on the
way up in finally blocks, and at the top an exception handler tells the
user that another try is due. The state stays happy and clean.
Now suppose there is a fault in the program and the state is messed up. It
is detected in some assert -- and you throw an exception there too, to be
caught even higher up than the previous one. The finally blocks run and
process their job list -- which can be messed up to any degree, so they may
delete your whole disk including mounts. Or maybe just the input files
instead of the temporaries, whatever. Not my idea of robustness, or of
containment.
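The same shape in C++, as a minimal sketch with a destructor standing in
for finally{} (all names illustrative):

#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

struct TempFiles {
    std::vector<std::string> names;   // the "neat list"
    ~TempFiles() {                    // the finally{} analog: runs
        for (const auto& n : names)   // during unwinding too, and
            std::remove(n.c_str());   // trusts the list blindly
    }
};

void produce_output(TempFiles& tmp)
{
    // ... suppose an earlier bug already scribbled over tmp.names ...
    bool invariant_holds = false;     // the assert-style check fires
    if (!invariant_holds)
        throw std::logic_error("state is broken");
}

Throwing instead of halting means ~TempFiles() still runs on the way up --
deleting whatever the corrupted list now names, possibly the input files
rather than the temporaries.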
For the other, is there anything to say? Mis-processing exceptions is not
so different from other kinds of bugs: execution continues in the bad state.
I would direct you to my post in comp.lang.c++.moderated:
http://groups.google.com/group/comp.lang.c++.moderated/msg/dacba7e87ded4dd7
In short: we're out to make money, so we have to make trade-offs, such
as trade-offs between developer time and robustness.
I like that article. But I would not merge the different subjects: it is
originally about correct programs with different performance.
Certainly quality is also a 'trade-off' if you weigh the costs of
problem-detection methods. And we know well that software production is
still in its wild-west era: no enforced quality controls, 'provided as is'
license nonsense, and so on. And customers buy, often even demand, the
crapware.
My practical view is that there is indeed a limit, with diminishing returns
on detection -- a few problems will remain in the coding, and some
interactions will go unaddressed in the design. But the usual practices stop
quality control several orders of magnitude below that point, and replace
actual quality with talk, with made-up chances, or with delusions of
robustness without any kind of proof that it is there.
Of course, it's all a matter of degree. Process boundaries are only
good to a degree. Separate physical hardware gives better fault
isolation than separate processes under Linux. In the end, as you say,
good design is required.
In the beginning, rather. So what are we talking about? Why muddy the
water? State can be correct/incorrect/unknown; discovering a violation puts
you where?
Just like if you discover a bug in a C++ process, what makes you think
that the state of the OS is known so that you can rollover to a backup
process? What makes you think that the entire computer isn't entirely
borked? Because it's not likely.
Actually it does not make me think just like that, and when I'm asked about
'likely', I stick rather to the raw threat model -- what actions can or
cannot happen, and with what consequences. Where it is important, checking
or providing kernel code is due, or suggesting external measures/better
isolation.
Certainly we're not clairvoyant, so bugs not yet uncovered are not
considered. But that leads far away from the original point; instead of
relativisation, please defend the in-process containment idea -- if you
mean it in "general".
I just wanted to make the observations:
1- Fault tolerance requires fault isolation.
2- Generally one cannot reliably isolate faults inside of a C++
process. Fault isolation must be at the process level for C++.
provided the system sandboxes the process -- and the process concept applies
in the first place...
I'm sorry. I cannot understand what you are trying to say here. Could
you phrase it in another way please?
C++ is used for many things, not only in unix/win32-like environments. In
embedded work you often have no OS at all, so it's up to you to build the
"fixed point", inside or outside; for the hosted case, see the sketch below.
If you're trying to argue that fault isolation can only be done at the
process level (or higher), then I disagree, and you have made no
attempt to argue this position besides saying "no it's not".
Maybe so. In my view, if someone claims to have a perpetuum mobile, it is
his job to prove it -- just as I want to see the old Baron actually lift
himself out of the swamp.
OTOH we may not really be in total disagreement. A Java system can
(probably) be configured so that the executing code is fixed, with an
internal monitor that uses no state at all -- and is thus protected from
in-process problems. Starting from there it may be possible to build a
better one, with some protected state too.
I am still sceptical that when such a thing is talked about, it is actually
built around my requirements for a "trusted computing base". Please don't
take it personally; I have just encountered too much 'smoke and mirrors'
stuff, especially with Java-like systems where security was simply talked
in, referring to non-existent or irrelevant things.
C-like systems are at least famous for being able to corrupt anything, so
there we can skip the idle discussion. :-)
After some replies, I made one final claim:
3- It's much easier to get more reliable fault isolation inside of a
single process in other languages, like Java, as opposed to C++.
You meant to say: it is way easier to delude yourself into thinking that is
true, just because other languages have less UB and no trivial means to
corrupt memory, like a buffer overrun or access to a freed object.
Yes. I do mean that. (Well, except for the deluded part.) Because it's
harder for a programmer bug to corrupt arbitrary memory, there is more
fault isolation between different parts inside of a single Java process.
The crucial question is whether that 'more' isolation is 'enough' isolation
for the purpose. That is where I stay sceptical. You know, a beer bottle
is way less fragile than an egg, but the difference is irrelevant if you
drop them from the roof onto concrete. Or even from a meter's height.
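To make the contrast concrete, a hedged C++ sketch (names illustrative):
the overrun below is undefined behaviour and may silently corrupt the
neighbouring field, while the equivalent Java access would throw
ArrayIndexOutOfBoundsException at the faulting statement, keeping the
damage local:

#include <cstdio>

struct Account {
    int  history[4];
    long balance;                    // innocent bystander in memory
};

int main()
{
    Account a = { {0, 0, 0, 0}, 1000 };
    for (int i = 0; i <= 4; ++i)     // off-by-one: writes history[4]
        a.history[i] = -1;           // UB: may scribble over balance
    std::printf("%ld\n", a.balance); // anything may come out here
    return 0;
}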
And to point out again, the original idea was to introduce yet another gap
between the supposedly safe area and the point of fault discovery.