+1.916.577.1977 | Downloads | Buy | Register | Login
 Search  
Wednesday, August 20, 2008
Search Blogs
 

Available Blogs
 

Previous Blogs
 

Technorati
 
More blogs about coversant.

About Coversant
 

Concurrency Patterns and Anti-Patterns for Enumeration
 
Location: BlogsMullin' with Mullins    
Posted by: Chris Mullins 5/19/2007

Iterating over a collection in multithreaded code presents a problem. At the most basic level, the problem stems from the premise that while a thread is iterating over the collection, another thread may be adding or removing items from the same collection. When this happens, the runtime throws an Invalid Operation Exception with the description of 'The Collection was modified...".

Iterating over a collection is further complicated when you try to follow good Object Oriented techniques, including abstraction and encapsulation. For example, this means that your GUI code responsible for dynamically building a Context Menu is separate from the business entity that represents your collection of Contacts.

Additional complications arise when building frameworks. In the case of a Contact Collection inside a framework, this collection is likely to be used in a number of different projects and across a number of different use cases. Some of these projects will be single threaded WinForms apps, other will highly concurrent ASP.Net applications, and others will be multithreaded Service Applications.

Anti-Pattern: Make the Application Developer do the work

Many frameworks take a "place the burden on the application developer" approach, which means that locking and threaded operations are the responsibility of the application developer building the final application. This typically means the framework collection classes will look something like:

public class ContactManager
{
    private List<string> _contacts = new List<string>();
    private object _syncroot = new object();

    public object SyncRoot { get { return _syncroot; } }

    /// <remarks>
    /// Anyone iterating over this must do so inside
    /// a lock on SyncRoot
    /// </remarks>
    public List<string> Contacts
    {
        get { return _contacts; }
    }

    // Various Contact Operations Go here
    // - Add
    // - Delete
    // - Filter
}

In this code the collection itself is directly exposed, along with a locking primitive. To use this code, developers must know to use the lock primitive, and write their code to look like:

public void AddContactsToMenu()
{
    lock (_manager.SyncRoot)
    {
        foreach (string s in _manager.Contacts)
        {
            // Do something here
        }
    }
}

Although this works, the drawbacks are many. Every access to the Contact collection needs to be done within the context of the correct lock. The odds are very high (and I speak from personal experience as a Framework Architect) that developers will:

  1. Not use the lock construct at all.
  2. Use the lock construct incorrectly.
  3. Use the lock construct in some places, but forget all about it in others.
  4. Blame you for having a crappy Framework that has threading issues.

This pattern typically ends in frustration and failure for the application developer. They simply don't know enough about threading, and don't know enough about the framework to properly use the locking mechanisms exposed.

In general, Frameworks should ease the burden on application developers. Pushing thread management out of the framework and into the application is not a good way to go.

Anti-Pattern: Using ReadOnlyCollection to eliminate locks

A common solution that we used for some time was to allow application developers the ability to iterate over a ReadOnlyCollection that was populated from the real collection. The code that we wrote to do this typically looked like:

public ReadOnlyCollection<string> Contacts
{
    get
    {
        lock (_syncroot)
        {
            return new ReadOnlyCollection<string>(_contacts);
        }
    }
}

This approach looks and sounds great, and is conceptually sound. After all, if we only give the application developer a ReadOnly copy of the collection, they don't have to worry about the collection being changed out from under them.

Unfortunately, the .NET Framework version of the ReadOnlyCollection doesn't work this way. I wrote all about that in a previous blog.

If you end up taking this route, make sure the ReadOnly version of the collection really is ReadOnly. This means a fair bit of Cloning and Copying.

Anti-Pattern: Abandoning Object Oriented Principles

Another common solution is to start throwing everything into the framework, including GUI code. It's easy to envision a business class that has the collection, yet also has in it the GUI code to build a WinForms Context Menu, Populate a ListBox, or build an ASP.Net menu. This is especially true with frameworks to which application developers have source code access to.

Each time a developer needs something new, it's easy to add one more method into the factory: Populate a ListView, Build a tree, add items to a context menu. After all, the locking constructs are already there, all the relevant business objects are handy, and you can just create a method that takes a DataGridView as a parameter...

18 months and 300 methods (with at least 75% code duplication) later, the code is now a jumbled mess, with no sense of structure or ownership.

This, um, Organic approach to building a framework leads to chaos and darkness.

Pattern: Using Lambda Functions

After spending a few years going through the various (in hindsight) Anti-Patterns to solving the Collection Enumeration problem, I think I've finally settled on a solid approach. The requirements that the problem needs to solve is:

  1. Allow applications developers to enumerate over collections of items in the framework.
  2. Not require application developers to perform any explicit Thread or Locking operations.
  3. Participate in any locking mechanism required by the Framework, so that multi-threaded code can be written and leveraged.

The best solution to this overall problem seems to be using Lambda Functions - or in C# 2.0 terms, passing in an Action or a custom ActionWithState delegate. Using the same code as earlier, our Application code will typically end up looking like:

public void BuildMenu()
{
    _manager.EnumerateContacts(BuildMenuCallback);
}

private void BuildMenuCallback(string s)
{
    _myMenu.Add(s);
}

Alternatively, we could do this same thing with an Anonymous method. Using outer/captured variables for state, and doing lightweight operations, this is a great way to go.

public void BuildMenu()
{
    List<string> menu = new List<string>();
    _manager.EnumerateContacts(delegate(string s)
    {
        menu.Add(s);                
    });            
}

The key to this application code is the callback method. With the framework class controlling the callback, it's easy to participate in any locking infrastructure.

The Framework implementation of this code is straightforward:

public void EnumerateContacts(Action<string> contactAction)
{
    lock (_syncroot)
    {
        foreach (string s in _contacts)
        {
            contactAction(s);
        }
    }
}

One drawback to this approach is the difficulty of passing and managing state. In the anonymous method example above, I used a captured variable that was implicitly passed in for state. In general, I prefer explicit methods over anonymous, yet I still want to be able to pass in a state variable. For this, I borrow a page from the .Net Async Delegate pattern, and add on an Object parameter to the Action delegate.

public delegate void ActionWithState<T>(T obj, object state);

Now, in the framework code, I create another overload of the EnumerateContacts method:

public void EnumerateContacts(ActionWithState<string> contactAction, object state)
{
    lock (_syncroot)
    {
        foreach (string s in _contacts)
            contactAction(s, state);
    }
}

Now that we have state, it's easy for the client code to do more interesting things:

public void BuildMenu()
{
    List<string> menu = new List<string>();
    _manager.EnumerateContacts(BuildMenuCallback, menu);

    // Now we can actually build our menu right here...
}

private void BuildMenuCallback(string s, object state)
{
    List<string> menu = state as List<string>;
    if (s.StartsWith("C", StringComparison.OrdinalIgnoreCase))
        menu.Add(s);
}

Using C# 3.0, we could certainly get fancier with the delegates, and turn them into true Lambda functions, Predicate Functions, and other classic Functional Programming techniques. Even with C# 2.0 we could get most of the way there, but the syntax is a bit cumbersome.

There is one big drawback to this approach: If the application developer does 'significant' work in the callback function, the lock pattern will be held much longer than we would like. Fortunately, this is pretty easy to mitigate by telling application developers not to get too crazy in their callback methods. It's also easy to debug later if performance becomes an issue.

Conclusions

Overall, I've been pretty happy with this approach so far. It certainly beats the other approaches I've used - so much so that I now consider them to be Anti-Patterns.

While I'm sure this pattern isn't new, I haven't seen it before. There are a number of concurrency patterns discussed on the web, including Thread Safe Iterators, and a few other related examples. Certainly a close variation of this would be the Loop-Tiling pattern Joe Duffy talked about in his recent MSDN article.

I would love to hear suggestions and other variations to solving the underlying thread safe enumeration problem.

Copyright ©2007 Chris Mullins
Permalink |  Trackback

©2008 Coversant, Inc. | Privacy Policy | About Coversant | Contact Info