Thursday, January 20, 2005

Beware BeginInvoke

Updated 21/01: Included additional solution in download. See this entry.

After reading Matthew's blog entry in detail I have a few comments...

The code for this is available for download.

Is this really thread safe?

To answer this we must ask, what is the intended semantics? While there is little to go on, I have assumed that each click of the button should add the text, "plus a bit" to the end of the label. So clicking this button multiple times should result in that number of "plus a bit"s being added to the label.

Does this occur? What does the 'evil little daemon' have to say about this?

Download the code and give it a try... Click the button quickly 10 times, and watch the result.

It works... But is it safe?

The problem is that the evil little daemon is a slacker! He (or she) does not want to work all the time. So lets give him a hand.

Try inserting the line "System.Threading.Thread.Sleep(1000);" after the lock.

This puts the current thread (i.e. the worker) to sleep for 1000 milliseconds, giving the evil little daemon a chance to work his magic.

Run it again and see the results, clicking the button 10 times. In my case three "plus a bit"s were added to the label, with a little delay due to the sleep. Why? Race conditions, as I discussed earlier.

To view the problem for yourself, download the new version of the code and run it in the debugger. You can then view the changes that occur on the instance variable in the output window. You should notice that the value gets updated the correct number of times, but that the values are wrong (much like the account problem I discussed earlier).

Where is the problem? The problem exists because of the locking strategy. In this case the lock is acquired to read the value, but is then released allowing other threads to read and write to this value. This thread plans on reading the value, changing the value, and writing the value back (via the GUI thread). Because the lock is released after reading, it is possible that it has changed before the write occurs. As the semantics of our program require that none of these values get lost, this is not safe. So what can be done?

There are several potential approaches, so lets discuss them.

Solution 1
If the update can be performed on the worker thread then the update is simple. Move the updating of the value into the lock with the read. Now that is code is updating the instance variable we do not need to tell the GUI of the value of the string, the GUI can re-query for it. As a result we can use a standard method invoker and get the form to update itself.

The new code looks like this (I have changed "plus a bit" to " 1" as it is easier to count the correct number in the output)

lock( littleLockObject )
{
//read and update the string...
myUsefulString += " 1";
}

To make the test fairer this can be changed to:

lock( littleLockObject )
{
string copy;
copy = myUsefulString;
System.Threading.Thread.Sleep(1000);
myUsefulString = copy + " 1";
}

This way the sleep is between the read and the write, as it was before.

With this we can now consider the String to be our model and GUI our view. The GUI code only views the value, it doesn't update it. Ideally we could move this into another class, and then raise an even upon change etc...

When you run this you will see that all of the clicks get through to the label.

Solution 2
A much more complex solution is to remain with the updating of the instance variable from the GUI thread. In order to ensure that this works safely we must ensure that only one thread can be performing this operation at a time. We can do this by introducing a boolean variable that controls access to this functionality. The body of the method can then be guarded via a condition on this variable, see the code below:

public class Form1 : System.Windows.Forms.Form
{
private bool _MutexAvailable = true;

private void DoSomeWork()
{
string copy;
lock( littleLockObject )
{
while(false == _MutexAvailable)
Monitor.Wait( littleLockObject );

_MutexAvailable = false;
copy = myUsefulString;
}



So how does this work? The thread waits until _MutexAvailable is true, it then sets _MutexAvailable to be false and then processes the body of the method. No others can continue as they are blocked by the guard, i.e. they are waiting for _MutexAvailable to become true.

At the end of the method we now need to reset _MutexAvailable and wake any waiting threads. This must be done in a critical section to ensure safety. So the following code is added to the end of the method:

lock( littleLockObject )
{
_MutexAvailable = true;
Monitor.PulseAll( littleLockObject );
}

The PulseAll wakes the threads from the Wait at the top of the method.

Run the program again. Leave in the Sleep so that we are testing the same thing. Now all of the clicks result in an update of the label.

Is this safe? I am afraid that it is not...

Why?
Try adding a sleep to the UpdateTheUI method.

Running the program now causes incorrect updating again. The problem is now the use of BeginInvoke. With BeginInvoke a new thread is being used to update the value. This allows the worker to pulse another thread and have it read a stale value from the instance variable. The solution is to replace BeginInvoke with Invoke. It is critical that we wait at this point until after the instance variable is updated.

Running this again, with the two sleeps, and now it works correctly.

Solution 3
The problem with solution 2 is that it is full of thread management code that is not fun to play with (for most people). This can be extracted out into a utility class and can then be reused again and again. In this case the utility is called a Mutex. However, this is not the Mutex that comes with .NET, rather it is a lightweight Mutex.

The code now looks like this:

private void DoSomeWork()
{
string copy;

_MyMutex.Acquire();

try
{
//no need for lock, this is a reference to a immutable object
copy = myUsefulString;

System.Threading.Thread.Sleep(1000);

if( this.InvokeRequired )
{
Invoke( new StringMethodInvoker( UpdateTheUI ), new object[] { copy + " 1" } );
}
else
{
UpdateTheUI( copy + " plus a bit" );
}
}
finally
{
_MyMutex.Release();
}
}

The lock is no longer used as the Mutex provides us with the concurrency control.

Conclusion
Concurrency is difficult. There are no magic solution. BeginInvoke is a great tool, but you must take care if you want to ensure total safety.

Having said this, I cannot imagine the circumstances that would be required for this safety issue to raise itself without the inserted Sleep statements. Personally, however, I think you are best to try to make your application as safe as possible (i.e. live + safe).

No comments: