[SDL] atomic ops emulation race condition, plus my opinion [Re: SDL atomic operations part #2...]

Martin name.changed.by.editors at online.de
Sun Aug 23 17:30:19 PDT 2009


Hi Bob,

I think there is a race condition in the mutex-based version of privateWaitLock 
that is used by the emulation code in the dummy module (and copies thereof). 
This is the code currently in SVN (/trunk/SDL/src/atomic/dummy/SDL_atomic.c, 
Rev. 4611):

static __inline__ void
privateWaitLock()
{
    if(NULL == lock)
    {
       lock = SDL_CreateMutex();
       if (NULL == lock)
       {
	 SDL_SetError("SDL_atomic.c: can't create a mutex");
	 return;
       }
    }

    if (-1 == SDL_LockMutex(lock))
    {
       SDL_SetError("SDL_atomic.c: can't lock mutex");
    }
}

The problem is that the mutex is initialized on demand, i. e. on the first call 
of the function. Since the purpose of atomic operations is to be used by several 
threads concurrently, it is perfectly possible to have several simultaneous 
"first calls" of privateWaitLock (through any of the emulated operations).

A scenario showing the race condition in detail, with the worst possible outcome:
No thread has called an emulated atomic operation yet, so the lock variable 
still is NULL.
Threads A and B have each just reached their first call to, for example, 
SDL_AtomicIncrementThenFetch8.
A enters privateWaitLock.
A tests if(NULL == lock), which is true, so it enters the respective block.
-- The following actions of B either happen in parallel to the actions of A (on 
a multi-core system) or after A has been preempted (on any type of system) right 
here. --
B enters privateWaitLock.
B tests if(NULL == lock), which still is true, so it enters the same block.
B executes SDL_CreateMutex.
B assigns the result to the lock pointer variable.
-- If B is preempted here, we can get away with just a resource leak. --
B executes SDL_LockMutex, which succeeds immediately (B is not blocked).
B returns from privateWaitLock and now reaches the statement (*ptr)+= 1; in 
SDL_AtomicIncrementThenFetch8.
-- Meanwhile, on a different core, or after preemption... --
A executes SDL_CreateMutex. Another mutex object is created.
A assigns the result to the lock pointer variable.
-- The mutex object that B created is now garbage. --
A executes SDL_LockMutex, which succeeds immediately (A is not blocked), because 
the mutex locked is a different one than that locked by B.
A returns from privateWaitLock and now reaches the statement (*ptr)+= 1;.
A and B could now mess up (*ptr). Both could get the same result from the 
addition, because as we all know, += is not atomic, which is why we need the mutex.
A executes privateUnlock, which unlocks the mutex pointed to by lock, that A had 
created and locked before.
A returns from SDL_AtomicIncrementThenFetch8.
-- Time for B again... --
B also executes privateUnlock, which attempts to unlock the mutex pointed to by 
lock. This is, however, the mutex created by A, which B had never locked. The 
operation will give an error - or maybe worse.

Possible solutions:
- Bad workaround: Require the user to call each atomic operation he intends to 
use at least once before the first call to SDL_CreateThread or whatever else he 
uses to create additional threads.
- Initialize the mutex on the first call of SDL_CreateThread. With atomic 
operations (ha ha), you might even be able to manage a reference count for it, 
to destroy it when all additional threads have ended. But that could be overkill.
- Include the mutex initialization in SDL_Init. This is what I'd prefer - if you 
stick with the mutex -, because it allows using the atomic ops even with non-SDL 
threads. You never know.


Now for my long-awaited opinion. ;-)

- Use the mutex emulation only as a very last resort. I'm quite sure that 
spinlocks (like in the current linux version of SDL_atomic.c) will perform much 
better for the use cases of SDL_atomic. Of course, the mutex may also be 
implemented to spin some time before actually blocking.
The fact that the mutex operations might fail is also a problem. The SDL_atomic 
functions don't return error codes, and they aren't really needed either. 
SDL_SetError alone won't be "heard". Abort?

- Make it possible to test at compile time, whether a particular operation is 
supported natively by the platform. This will allow to pick an atomic-optimized 
algorithm on platforms where it will really boost performance. Other platforms 
may be better off with an algorithm that is taylored directly to using a mutex 
or whatever else.

- Pointers: You should definitely have distinct functions for pointers and ints. 
You might still cast internally, if possible, but at least the user won't have to.

- Signedness: There are certainly many more use cases for atomic ops than I 
know, but so far I guess that signedness will rarely matter.
You could have your own set of typedefs for SDL_atomic (which could include the 
volatile modifier for convenience, BTW):
Type       Minimal range
atomic7    0..127 (0x7f)
atomic15   0..32767 (0x7fff)
atomic31   0..2147483647 (0x7fffffff)
atomic63   0..9223372036854775807 (0x7fffffffffffffff)
Whether those types have more (i. e. negative) values is unspecified, and the 
user must not rely on it.
For the cases where signedness does matter, you could again make it possible to 
check support at compile time.
Other idea: Make the interface signed, but internally cast or transform the 
value where the implementation is unsigned. A transformation would require that 
all simple reads and writes (including initialization) are done through 
SDL_atomic functions as well.

Martin


Bob Pendleton wrote:
> I have a couple of major question that I would like folks to get opinionated
> about. The main question is about atomic operations on pointers. I
> originally wrote the atomic ops so that they only worked on unsigned ints. I
> figured people would deal with pointers by casting them to unsigned of the
> right size. Mac OS X, and Windows provide atomic ops on signed ints and
> provided operations for pointers. On windows there is very little support
> for atomic ops on pointers, but there is a little.
> 
> Having looked at what the other guys do I rather like providing separate ops
> for ints and pointers. BUT, that would mean casting pointers to signed ints
> on some platforms. Casting pointers to signed ints makes me very very
> nervous. I spent too much of my youth working on machines where int and
> (void *) were not the same size.
> 
> The other thing is also a casting question. How do you feel about casting
> signed to unsigned, doing doing addition or subtraction, and then casting it
> back? Or, the reverse. We have the problem that some of the platforms
> provide ops for only signed ints and some for only unsigned ints.
> 
> Bob Pendleton





More information about the SDL mailing list