Performing Parallel Loops with Dependencies

The parallel loops so far in this chapter have been examples of what is known as embarrassingly parallel problems. This is an unfortunate name, but it means that the work being done is extremely easy to parallelize, because each data element can be processed independently of every other element; that allows the items to be processes in any order and with any degree of concurrency. In other words, these problems are perfect for demonstrating the new .NET parallel programming features, and they appear at one end of a spectrum of dependency between items.

At the other end of the spectrum are problems where there is a very high degree of dependency between data items, for example, generating Fibonacci numbers, where each item is the sum of the previous two items (with 0 and 1 as seed values), a problem that is best solved sequentially.

In the middle are problems where there is some degree of dependence between some items. Generally, such problems are best approached using the techniques from Chapters 3 and 4 to coordinate and synchronize a set of Tasks. There is one situation, however, where we can mix sequential and parallel loops to process data with a particular type of dependency.

The dependency I have in mind is where the first value in a region of data depends on the last value of the previous region, as illustrated in Figure 5-1. The nth region of data cannot be completed until the (n - 1) region has been processed, but the items within the region can be processed in parallel.

n-1m region if region n+1m region Figure 5-1. Chain dependency

Listing 5-13 demonstrates this situation. A series of bank account transactions must be processed in order to determine the final balance for each month of the year, where each month contains 10,000 transactions. To determine the balance for the end of one month, we need to know the balance of the previous month. Aside from that dependency, we are able to process a month's transactions concurrently.

Listing 5-13 solves the dependency issue by mixing sequential and parallel loops. The outer loop is sequential and iterates through each monthly region in turn. The inner loop is parallel and sums the individual transactions. The result of the parallel loop for one month becomes the seed value for the following month.

Listing 5-13. Mixing Synchronous and Parallel Loops using System;

using System.Threading.Tasks;

namespace Listing_13 {

class Transaction { public int Amount { get; set;

class Listing_13 {

static void Main(string[] args) {

// create a random number generator Random rnd = new Random();

// set the number of items created int itemsPerMonth = 100000;

// create the source data

Transaction[] sourceData = new Transaction[12 * itemsPerMonth]; for (int i = 0; i < 12 * itemsPerMonth; i++) {

sourceData[i] = new Transaction() { Amount = rnd.Next(-400, 500) };

// create the results array int[] monthlyBalances = new int[12];

for (int currentMonth = 0; currentMonth < 12; currentMonth++) { // perform the parallel loop on the current month's data Parallel.For(

currentMonth * itemsPerMonth, (currentMonth + 1) * itemsPerMonth, new ParallelOptions(), () => 0,

return tlsBalance += sourceData[index].Amount;

}, tlsBalance => monthlyBalances[currentMonth] += tlsBalance); // end of parallel for // add the previous month's balance if (currentMonth > 0) monthlyBalances[currentMonth] += monthlyBalances[currentMonth - l];

for (int i = 0; i < monthlyBalances.Length; i++) {

Console.WriteLine("Month {0} - Balance: {l}", i, monthlyBalances[i]);

// wait for input before exiting

Console.WriteLine("Press enter to finish");

Console.ReadLine();

By mixing sequential and parallel loops in this way, we get a good degree of concurrency and still benefit from the simplicity of using a Parallel.For() loop. You may be tempted to replace the sequential loop with synchronization to achieve a similar effect; see the "Common Problems and Their Causes" section later in this chapter for an often-encountered pitfall.

0 0

Post a comment