Re: problem with CMutex and CSingleLock - WAIT_ABANDONED
Dh wrote:
Now the question is, how can the Mutex get abandoned?
When a thread holding the mutex (which, given that the mutex is a named
mutex, could be part of any process) terminates without first releasing
the mutex.
Yes, this is the behaviour noticed in the test programs (also run as
services) that we wrote. But, in the production code, this is not the case.
Are you sure that no thread has terminated? If WAIT_ABANDONED is being
returned by WaitForSingleObject, either you've found a fairly serious OS
bug, or the mutex really is being abandoned by some other code -
remember that a named mutex can be locked from literally any thread of
any process (subject to sufficient permissions). What is the name of
your named mutex?
The destructor of the CSingleLock is invoked each time and the return
value of CSingleLock::Unlock() is also 1 [double locking does occur in
the code].
Check for WAIT_ABANDONED after you lock each time, and if it happens,
try to lock again. Also, make sure your mutex names are unique to your
service, for obvious reasons - conceivably it could be a usage of your
named mutex from somewhere else that is the problem.
The behaviour noticed is that the thread which got WAIT_ABANDONED while
trying to lock appears to have got the lock, since I see other threads
waiting to acquire it. Also, when the deadlock occurs, Process Explorer
shows that the thread that got WAIT_ABANDONED is still holding on to the
lock.
Yes, looking at the docs again, it appears that WAIT_ABANDONED does mean
you got the lock.
Will try your suggestion to do a double lock in case of encountering a
WAIT_ABANDONED.
My suggestion is definitely wrong!
Ok, I've looked a bit more closely, and you've hit a VC6 MFC problem in
addition to the general problem of the mystery WAIT_ABANDONED: Lock
should probably return TRUE when WAIT_ABANDONED is returned (since the
mutex is actually locked by the caller in this case), but it in fact returns
FALSE, and hence the destructor of the lock doesn't attempt to release
the mutex. I note this has been changed since VC6, and it returns TRUE
for WAIT_ABANDONED under VC8.
But, if WAIT_ABANDONED is returned, you have such a serious problem that
you should probably exit immediately anyway, since you are probably
dealing with corrupt data (an alternative would be to restart the service).
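
One way to sidestep the VC6 behaviour might be to call the Win32 API
directly instead of going through CSingleLock. The following is only a
rough sketch under that assumption: MutexGuard, ownsMutex and dataSuspect
are names made up for illustration and are not part of MFC or Win32. The
non-Windows branch exists only so the classification logic compiles
elsewhere; its constants mirror the documented Win32 values.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the classification logic compiles off Windows; the values
// match the documented Win32 constants.
typedef unsigned long DWORD;
const DWORD WAIT_OBJECT_0  = 0x00000000UL;
const DWORD WAIT_ABANDONED = 0x00000080UL;
const DWORD WAIT_TIMEOUT   = 0x00000102UL;
const DWORD WAIT_FAILED    = 0xFFFFFFFFUL;
#endif

// True when the calling thread owns the mutex after the wait.
// WAIT_ABANDONED grants ownership just as WAIT_OBJECT_0 does.
inline bool ownsMutex(DWORD waitResult) {
    return waitResult == WAIT_OBJECT_0 || waitResult == WAIT_ABANDONED;
}

// True when the previous owner terminated without releasing the mutex,
// so the data it protects may have been left half-updated.
inline bool dataSuspect(DWORD waitResult) {
    return waitResult == WAIT_ABANDONED;
}

#ifdef _WIN32
// Hypothetical RAII guard (not an MFC class). Unlike the VC6 CSingleLock
// behaviour described above, it releases the mutex whenever the wait
// actually granted ownership, including the WAIT_ABANDONED case.
class MutexGuard {
public:
    MutexGuard(HANDLE mutex, DWORD timeoutMs)
        : mutex_(mutex), result_(WaitForSingleObject(mutex, timeoutMs)) {}

    ~MutexGuard() {
        if (ownsMutex(result_)) {
            ReleaseMutex(mutex_);
        }
    }

    bool owns() const { return ownsMutex(result_); }
    bool abandoned() const { return dataSuspect(result_); }

private:
    MutexGuard(const MutexGuard&);             // non-copyable
    MutexGuard& operator=(const MutexGuard&);  // (declared, never defined)

    HANDLE mutex_;
    DWORD result_;
};
#endif
```

A caller would construct the guard with something like
MutexGuard guard(hMutex, INFINITE) and check guard.abandoned() straight
away; if it is true, the sensible reaction is probably to log it and shut
down or restart the service rather than carry on with the protected data.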
Tom