Thursday, October 1, 2009

A Java Fork/Join Framework

I was scared at first by Alexander's first question that this paper would be obsolete, but I was pleasantly surprised to find that the concepts in the paper and the concepts in the reading on the new Java functionality that he provided complemented each other nicely. I think fork/join and other similar patterns like the ones outlined in the OPL pattern will be necessary in modern languages. As parallelizability across threads, cores, processors, & machines becomes the main mode of increasing performance and scalability, the standard mutex and monitor paradigms offered by most languages today are insufficient to prevent the average developer from running into severe problems with deadlock, starvation, and other concurrency-related problems.

I think that it is definitely a worth-while trade-off to present developers with canned approaches to concurrency problems that may not align perfectly with their given tasks. C++ developers were initially skeptical that managed languages like Java and C# could ever perform well, but advances in computer hardware, garbage collection, and virtualization technologies to name a few have overcome most performance problems to make managed languages perform beyond the needs of typical developers. I think something similar will happen with concurrency-related patterns & frameworks; the need to have absolute control will be overshadowed by advances in hardware and software technology so that the general patterns will work well enough.

I'm honestly not sure if I would use the fork/join API presented by the author because I skipped all the code samples. I was more interested in the underlying ideas of how fork/join frameworks work, especially novel ideas such as work stealing. Also, since Java will be coming out with support for fork/join, I didn't see any need to take a look at an interface that won't make it beyond research.

I have the feeling that faster interfaces for just about anything can be developed for C++ than a managed language like Java. I'm sure this rule-of-thumb breaks down for some specific scenarios, but it seems reasonable in general. However, the convenience offered by Java in terms of automatic garbage collection and portability can often outweigh the performance gains that C++ would give you. It depends if you need a bleedingly fast system like something that processes trades for the New York Stock Exchange, or if you just need to crunch a lot of numbers for business reporting purposes and a general fork/join implementation would give you some performance gains over a single-threaded app. But I'm a consultant; my standard answer to questions like this is always "It depends."

I wasn't exactly clear on how forking, dividing work up, breaking it down, and stealing all worked. I got the feeling that some of the large blocks that got stolen would be dependent on smaller tasks that weren't (e.g. one of the big tasks at the top is merge sorting all the data and one of the smaller tasks is directly sorting 5% of the total data), but that doesn't seem like it would work at all so my understanding is wrong somewhere. I'll have to re-read the paper and listen to the class discussion to get a feel for the pros and cons of this model.

As for patterns in the paper, I believe that fork/join is a pattern in and of itself, though that's a bit of a cop-out answer.

I found it odd that the fork/join pattern becomes so problematic when many of the workers empty out their own queues and can't find much work to steal. I was naively thinking that using a monitor to block them and then signaling the monitor when data is ready would work, but I'll have to think a little harder on why that won't work.

This paper was much better than reading about how Guardian did a bad job of naming things and allocating bits :)

1 comment:

  1. "using a monitor"

    It is that line of thinking that these newer concurrency frameworks are working to avoid. Most of all the classes in the currency packages work without locks or synchronization. Otherwise they would not be truely "concurrent" ;)