Tuesday, November 3, 2009

OPL: Task Parallelism, Recursive Splitting, & Discrete Event

Task Parallelism

When you first saw this pattern title, did you feel like you knew the subject beforehand? After reading the pattern, did you confirm your impression or did it surprise you? how?
+Yes & yes. It's basically a layer of abstraction on top of the implementation patterns. The Monte Carlo method is mentioned several times since it is a computational pattern that uses task parallelism.

In what category falls this pattern?
+Parallel algorithm strategy pattern. Although the different categorizations of the patterns on the parlab website can help group similar patterns, sometimes it feels like there's a little too much overlap or one to many layers of abstraction.

What platform do you have more experience programming parallel applications on? And what Task Management Mechanism? (Jobs over Networks, OS Processes, SW Task Queue, HW Task Queue)
+I have no experience writing any applications that use parallel programming to solve a problem. Here and there in industry and class projects, I'll make some async calls, but that's always just so the main thread of execution doesn't block when doing something like sending an email or having another single thread create a long-running report.

Do you think that the lack of new hardware is preventing us of reaching better run times?
+No. I don't think we utilize the number of cores in modern servers and PCs to their full extent. Language support that would make parallel programming accessible to the masses is the major obstacle right now.

Do you think that this patterns requires you to learn more about hardware platforms in order to make a correct implementation?
+No. You could implement this pattern in Erlang and never have to know a thing about the hardware.

Do you agree with the mapping the author made for the Monte Carlo example? Can you think of something else?
+Yes, the Monte Carlo VaR example seems like a set of independent tasks, and not a problem that is broken down into smaller recursive pieces, which would be the Recursive Splitting pattern, or small tasks performed in sequence, which would be the Pipeline pattern. After starting to read the recursive splitting paper where they say that recursive splitting is a specific form of task parallelism, I don't understand why there are so many different layers of abstraction inside the one Parlab category. What in here isn't considered task parallelism? A "sequential" pattern like pipline seems like the only one. If they could be arranged nicely enough, I think I would rather see the patterns presented here in a tree form showing which patterns are more specific versions of their parents.

Recursive Splitting

I thought this paper went into a good amount of detail, and I was very pleased by the fact that it showed how it was related to many of the other patterns & grouped those related pattern into higher-, same, and lower-level groupings.

How to control task granularity in this pattern?
+You just have to define a base case beyond which the problem is not split any further. The problem is sloved in a more straight-forward, usually sequential manner after crossing the threshold. The threshold is tied to both the characteristics of the problem domain, the data, and the hardware.

Can you think of any other popular algorithm (besides Selection Sort) which this pattern can not be applied to?
+An algorithm to find the median value in a set of numbers.

The authors mention the idea of composing the Data Parallelism pattern inside of the Recursive Splitting pattern. Can you think of other patterns composition?
+Task parallelism + pipeline.

The authors mention that the ideas behind the Fork/Join and Task-queue strategy patterns are essential to any developer who wants an efficient implementation of the Recursive Splitting pattern. Why?
+Although nascent, they are the most mature and tested implementation patterns for recursive splitting problems. Reinventing the wheel can lead you to make naïve mistakes. The work stealing algorithm in the fork/join pattern is especially novel and worth reusing.

Discrete Event

In the context of asynchronous communication of events, the authors mention two environments: message-passing and shared-memory. Which environment are you more familiarized with? Which one you like most?
+I'm not extremely familiar with either, but I'm a much bigger fan of message passing. Recently I've been coming across a lot of readings and OSS code that use message passing to build decoupled, reliable systems. Erlang is an example of using this pattern at a low level, and enterprise service buses are an example of doing it at a high level. In both instances, the systems are touted as being extremely reliable since components are isolated from each other. With shared memory, there is a greater probability of contention and especially corruption between different tasks and processes.

Do you agree with the authors on that, often, the best approach to deal with deadlocks is to use timeouts instead of accurate deadlock detection?
+I do agree. The timeout method requires little overhead/latency, and it's easy to roll your own implementation. Accurate deadlock checks will take more resources to execute & likely a library for the implementation that an everyday programmer would need to use.

No comments:

Post a Comment