
Developer and designer of distributed software systems using the most current technologies and standards in order to exceed customer expectations. Specialties: Microsoft .NET, Java, web development (JavaScript) and user... Luis is a DZone MVB and is not an employee of DZone.

Overview Of The Task Parallel Library (TPL)

04.09.2013

Introduction

Remember when we had to spawn a separate thread to run a long-running operation without blocking the application until it completed? Well, time to rejoice; those days are long gone. Starting with version 4.0, the Microsoft .NET Framework ships a library built around the concept of “tasks”, and version 4.5 adds language support for it through the async and await keywords. This library is known as the Task Parallel Library, or TPL.

Tasks vs. Threads

In the good (annoying) old days we frequently had to spawn a separate thread to query the database without blocking the main application thread, so we could show a loading message to the user while the query ran and then process the results. This is a common scenario in desktop and mobile applications. There are several ways to spawn background threads (async delegates, background workers and such), but in the most basic and rudimentary fashion, things went a little something like this:

User user = null;
 
// Create background thread that will get the user from the repository.
Thread findUserThread = new Thread(() =>
{
    user = DataContext.Users.FindByName("luis.aguilar");
});
 
// Start background thread execution.
findUserThread.Start();
 
Console.WriteLine("Loading user..");
 
// Block current thread until background thread finishes assigning a
// value to the "user" variable.
findUserThread.Join();
 
// At this point the "user" variable contains the user instance loaded
// from the repository.
Console.WriteLine("User loaded. Name is " + user.Name);

This code is effective; it does what it has to do: load a user from a repository and show the loaded user’s name on the console. However, it sacrifices succinctness completely in order to initialize, run and join the background thread that loads the user asynchronously.

The Task Parallel Library introduces the concept of “tasks”. A task represents an operation to be run asynchronously, just like what we did above using “thread notation”. This means that we no longer speak in terms of threads but in terms of tasks, which lets us execute asynchronous operations with very little code (code that is also a lot easier to read and understand). Now, things have changed for good, like this:

Console.WriteLine("Loading user..");
 
// Create and start the task that will get the user from the repository.
var findUserTask = Task.Factory.StartNew(() => DataContext.Users.FindByName("luis.aguilar"));
 
// The task's Result property holds the result of the async operation. If
// the task has not finished, it will block the current thread until it does.
// Pretty much like the Thread.Join() method.
var user = findUserTask.Result;
 
Console.WriteLine("User loaded. Name is " + user.Name);

A lot better, huh? Of course it is. Now we have the result of the async operation strongly typed. It is pretty much like using async delegates, but without all the boilerplate code required to declare delegates, which is possible thanks to the power of C# lambda expressions and built-in delegates (Func, Action, Predicate, etc.).

Tasks have a property called Result. This property contains the value returned by the lambda expression we passed to the StartNew() method. What happens when we try to access this property while the task is still running? Well, the execution of the calling method is halted until the task finishes. This behavior is similar to Thread.Join() (line 16 of the first code example).
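To make the blocking behavior of Result concrete, here is a minimal sketch. The SlowAnswer() method is a hypothetical stand-in for the repository query (it just sleeps and returns a number), and the class name is an assumption made for illustration only.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ResultDemo
{
    // Hypothetical slow operation standing in for the repository query.
    public static int SlowAnswer()
    {
        Thread.Sleep(200);
        return 42;
    }

    public static void Main()
    {
        // The lambda returns an int, so StartNew() gives back a Task<int>.
        Task<int> task = Task.Factory.StartNew(() => SlowAnswer());

        var stopwatch = Stopwatch.StartNew();
        int answer = task.Result; // Blocks the calling thread until the task completes.
        stopwatch.Stop();

        Console.WriteLine("Answer: " + answer);
        Console.WriteLine("Blocked for about " + stopwatch.ElapsedMilliseconds + " ms");

        // The task is already complete, so reading Result again does not block.
        Console.WriteLine("Again: " + task.Result);
    }
}
```

The elapsed time printed will vary from run to run; the point is only that the first read of Result has to wait for the task, while later reads return the stored value.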

Task Continuations

OK, we now know how this whole thing about tasks goes. But let’s assume you don’t want to block the calling thread until the task finishes, and would rather have a second task run once the first one finishes and do something with its result. For such scenarios, we have task continuations.

The Task Parallel Library allows us to chain tasks together so they are executed one after another. Even better, the code to achieve this is completely fluent and readable.

Console.WriteLine("Loading user..");
 
// Create tasks to be executed in fluent manner.
Task.Factory
    .StartNew<User>(() => DataContext.Users.FindByName("luis.aguilar")) // First task.
    .ContinueWith(previousTask =>
    {
        // This will execute after the first task finishes. First task's result
        // is passed as the first argument of this lambda expression.
        var user = previousTask.Result;
 
        Console.WriteLine("User loaded. Name is " + user.Name);
    });
 
// Tasks will start running asynchronously. You can do more things here...

As readable as it gets, the previous code can be read as “Start a new task to find a user by name and continue by printing the user name on the console”. It is important to note that the first parameter of the ContinueWith() lambda is the previously executed task, which allows us to access its return value through its Result property.
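A continuation can also return a value of its own, in which case ContinueWith() yields a new Task<TResult> for the whole chain. The following is a small sketch of that idea; BuildGreetingAsync() and the greeting text are hypothetical names chosen for the example, not part of the article's repository code.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Chains two tasks and returns the task that represents the end of the chain.
    public static Task<string> BuildGreetingAsync(string userName)
    {
        return Task.Factory
            .StartNew(() => userName.ToUpperInvariant())            // First task: Task<string>.
            .ContinueWith(previous => "Hello, " + previous.Result); // Second task: Task<string>.
    }

    public static void Main()
    {
        Task<string> greeting = BuildGreetingAsync("luis");

        // The calling thread is free to do other work here...

        // Blocks only at this point, until the last task in the chain has finished.
        Console.WriteLine(greeting.Result);
    }
}
```

Inside the continuation, reading previous.Result does not block, because the continuation only runs after the previous task has completed.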

Async And Await

The Task Parallel Library means so much for the Microsoft .NET Framework that new keywords were added to the C# 5.0 and Visual Basic 11 language specifications to deal with asynchronous tasks. These new keywords are async and await.

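The async keyword is a method modifier that allows the method to use await and tells the compiler to rewrite it so it can be suspended and resumed. It does not by itself make the method run in parallel with its caller: the method runs synchronously on the calling thread until it reaches the first await on a task that has not finished yet. Then we have the await keyword, which suspends the method without blocking the thread, returns control to the caller, and resumes the method once the task completes. For tasks that return a value, await yields that value so it can be assigned to a local variable; for tasks with no return value, it simply waits for the task to finish.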

Here is how it works:

// 1. Awaiting a task with a result.
// Returning Task instead of void lets callers await the method and observe its exceptions.
async Task LoadAndPrintUserNameAsync()
{
    // Create and start the task, then suspend this method (without blocking the
    // caller) until it finishes; the result is assigned to a local variable.
    var user = await Task.Factory.StartNew<User>(() => DataContext.Users.FindByName("luis.aguilar"));
 
    // At this point we can use the loaded user.
    Console.WriteLine("User loaded. Name is " + user.Name);
}
 
// 2. Awaiting a task with no result.
async Task PrintRandomMessage()
{
    // Create and start the task, then suspend this method until it finishes.
    await Task.Factory.StartNew(() => Console.WriteLine("Not doing anything really."));
}
 
// 3. Usage:
void RunTasks()
{
    // Start loading the user; the call returns as soon as the method hits its await.
    LoadAndPrintUserNameAsync();
 
    // Do something else without waiting for the user to be loaded.
    PrintRandomMessage();
}
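The following sketch illustrates that await suspends the async method and hands control back to the caller instead of blocking it. It uses a TaskCompletionSource<int> so the example controls exactly when the awaited task finishes; DoubleAsync() and the class name are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Awaits the given task and doubles its result.
    public static async Task<int> DoubleAsync(Task<int> source)
    {
        // If source has not finished, the method is suspended here and the
        // caller immediately gets back an incomplete Task<int>.
        int value = await source;
        return value * 2;
    }

    public static void Main()
    {
        var completionSource = new TaskCompletionSource<int>();

        // The call returns right away even though the awaited task is not done.
        Task<int> doubled = DoubleAsync(completionSource.Task);
        Console.WriteLine("Completed before the result is set: " + doubled.IsCompleted);

        // Completing the awaited task lets the async method resume and finish.
        completionSource.SetResult(21);
        Console.WriteLine("Doubled value: " + doubled.Result);
    }
}
```

The first line printed reports False, since DoubleAsync() is still waiting for its input at that point; the second reports 42 once the input has been supplied.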

As you can see, asynchronous methods are now marked with a neat async modifier. As I mentioned before, that does not mean the whole method runs in a separate thread: the method runs on the caller's thread until its first await, and only the work handed to Task.Factory.StartNew() runs on a thread pool thread. It is important to clarify that an asynchronous method can await several tasks, one after another, and that the compiler wraps the method as a whole in a task object which completes when the method reaches its end. For the caller to get hold of that task, the method has to declare Task (or Task<T>) as its return type; async void is best left for event handlers.

For example, with LoadAndPrintUserNameAsync() declared as async Task, writing this:

Task loadAndPrintUserNameTask = LoadAndPrintUserNameAsync();

... is not equivalent to writing this:

var coldTask = new Task(() => Console.WriteLine("Not started yet."));

Remember that a task created with the Task constructor has not been started yet, and you need to call its Start() method to do so. The task returned by an async method is already running, and calling Start() on it throws an InvalidOperationException.
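The difference between a task built with the Task constructor and the task handed back by an async method can be checked with a short sketch like the one below. WaitForSignalAsync() and StartThrows() are hypothetical helpers written for this example; the sketch assumes a plain console application.

```csharp
using System;
using System.Threading.Tasks;

public static class HotColdDemo
{
    // An async method that simply waits for another task to complete.
    public static async Task WaitForSignalAsync(Task signal)
    {
        await signal;
    }

    // Returns true when calling Start() on the task is rejected.
    public static bool StartThrows(Task task)
    {
        try
        {
            task.Start();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // A constructor-created task does nothing until Start() is called.
        var cold = new Task(() => Console.WriteLine("Cold task body ran."));
        Console.WriteLine("Cold task status: " + cold.Status);
        cold.Start();
        cold.Wait();

        // The task returned by an async method is already in flight.
        var signal = new TaskCompletionSource<bool>();
        Task hot = WaitForSignalAsync(signal.Task);
        Console.WriteLine("Hot task status: " + hot.Status);
        Console.WriteLine("Start() rejected on hot task: " + StartThrows(hot));

        signal.SetResult(true);
        hot.Wait();
    }
}
```

The cold task reports the Created status until it is started, whereas the task from the async method never passes through that state, which is why Start() is rejected on it.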

Now, we can also create awaitable methods. These are methods that return a task, so the caller can use the await keyword on them.

async Task<User> LoadUserAsync()
{
    // Create and start the task, suspend until it finishes; then assign the result to a local variable.
    var user = await Task.Factory.StartNew<User>(() => DataContext.Users.FindByName("luis.aguilar"));
 
    // Return the loaded user. The compiler wraps it in a Task<User> automagically.
    return user;
}

All awaitable methods specify a task as their return type. Now, there are things we need to discuss in detail here. This method’s signature specifies that it has a return value of type Task<User>, but its body returns the loaded user instance instead (line 7). What is this? Well, the compiler wraps the returned user in a Task<User> for us. The method always returns that task; what changes is how the caller consumes it, and there are two scenarios.

The first scenario is calling it without await. In this case the caller gets the task instance itself, which is already running.

Task<User> loadUserTask = LoadUserAsync();
 
// The task has already been started, so Start() must not be called on it.
// Its outcome can be picked up later through await, ContinueWith() or Result.

The second scenario is calling it with await. In this case the calling method is suspended until the task finishes, and await then unwraps the result, which gets assigned to the specified local variable.

User user = await LoadUserAsync();
 
// The blocking counterpart is shown below. It produces the same value, but it
// halts the calling thread while waiting instead of releasing it:
User sameUser = LoadUserAsync().Result;

See? The method does not return two different types of value; it always returns a Task<User>, and await is what unwraps that task into a User. By the way, it is important to remember that any method which at any point awaits an asynchronous method by using the await keyword needs to be marked as async itself.
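Both ways of consuming an awaitable method can be put side by side in a small, self-contained sketch. The User class and the FindByName() helper below are hypothetical in-memory stand-ins for the article's repository, and Task.Run() is used as a shorthand for starting the background work.

```csharp
using System;
using System.Threading.Tasks;

public class User
{
    public string Name { get; set; }
}

public static class UnwrapDemo
{
    // Hypothetical stand-in for DataContext.Users.FindByName().
    public static User FindByName(string name)
    {
        return new User { Name = name };
    }

    // The declared return type is Task<User>, yet the body returns a User.
    public static async Task<User> LoadUserAsync()
    {
        User user = await Task.Run(() => FindByName("luis.aguilar"));
        return user;
    }

    public static async Task PrintUserAsync()
    {
        // Without await: we get the running Task<User> itself.
        Task<User> pending = LoadUserAsync();

        // With await: the task is unwrapped into the User it carries.
        User user = await pending;

        Console.WriteLine("User loaded. Name is " + user.Name);
    }

    public static void Main()
    {
        // Main cannot be async here, so it blocks until the async work is done.
        PrintUserAsync().Wait();
    }
}
```

Note how the type of the expression changes only because of await: LoadUserAsync() is a Task<User>, while await LoadUserAsync() is a User.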

Conclusion

This surely means something for the whole framework. Looks like Microsoft has taken care of asynchronous and parallel programming in its latest framework releases. Desktop and mobile application developers will surely love this feature, which significantly reduces boilerplate code and makes code more readable. We can all feel happy about our beloved framework moving forward the right way once again.

That’s all for now, folks. Stay tuned! ;)


Published at DZone with permission of Luis Aguilar, author and DZone MVB. (source)
