A few weeks ago, Anwar Ghuloum at the Research@Intel blog advised developers to start thinking about “tens, hundreds and thousands of cores”. In essence, the people who control processor development are telling us that the future of programming is going to scale out rather than up. To get an idea of the timeline, Intel’s CEO tells us to expect 80 cores by 2011. We can probably assume the number of cores on a processor is going to increase geometrically in the years that follow.
The catch is that we don’t really know how to program for this.
To take advantage of massively multicore architectures, we need to take our computing problems and decompose them into smaller pieces that can be farmed out to worker processes. This idea is what Google’s MapReduce is all about, and it is an approach that has worked well with certain categories of problems, such as bioinformatics data analysis or 3D rendering.
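To make the decomposition idea concrete, here is a minimal Python sketch of the split, farm out and combine pattern. It is not Google's MapReduce, just an illustration of the shape of the approach. The names `chunked`, `sum_of_squares` and `parallel_sum_of_squares` are made up for this example, and it uses a thread pool only to stay short; `ProcessPoolExecutor` offers the same interface with separate worker processes.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def chunked(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """The 'map' step: each worker handles one chunk on its own."""
    return sum(n * n for n in chunk)


def parallel_sum_of_squares(numbers, chunk_size=1000, workers=4):
    """Decompose the input, farm the chunks out, then combine the results."""
    chunks = chunked(numbers, chunk_size)
    # Assumption for this sketch: a thread pool stands in for the workers.
    # ProcessPoolExecutor has the same interface and uses real processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    # The 'reduce' step: fold the partial results into one answer.
    return reduce(lambda a, b: a + b, partials, 0)
```

The key property is that no chunk depends on any other, so the workers never need to coordinate until the final combine step.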
Programming Crisis
I see a programmer education crisis coming. Most programmers just are not trained to think about this kind of decomposition. They’ve learned procedural and object-oriented languages, and tend to think in a linear fashion. Furthermore, the programming languages and platforms they’ve worked on have ingrained in them the idea that multi-threaded programming is extremely tricky. And with mainstream platforms, it is.
A lot of the existing difficulty with well-known languages, such as Java, lies in managing shared state, that is, memory shared between different threads. Java has various synchronization mechanisms, but using them in heavily multithreaded environments can be challenging. I’ve seen unit tests pass or fail at random because of race conditions: the outcome depended on which thread happened to finish first.
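The shared state problem is not specific to Java. As a rough illustration in Python (the `Counter` class and `run_workers` helper are hypothetical names for this sketch), several threads update one counter, and a lock makes each update atomic. Without the lock, the read-modify-write in `increment` can interleave between threads and updates can be lost, which is exactly the kind of bug that only shows up some of the time.

```python
import threading


class Counter:
    """A counter shared between threads, guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run this read-modify-write.
        with self._lock:
            self.value += 1


def run_workers(counter, n_threads=8, n_increments=10000):
    """Start several threads that all increment the same counter."""
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter.value
```

With the lock in place, `run_workers(Counter(), 4, 1000)` always returns 4000, regardless of how the threads are scheduled.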
In the web world we’ve become rather fond of very high-level languages such as Python and Ruby. However, the runtime story gets even worse: most scripting language runtimes can’t make use of more than one processor at a time, meaning that the only way to parallelize work is to start multiple instances of the runtime. Besides being a hack, this makes communication between processes awkward (it usually relies on expensive data serialization), and creates the kind of issues that we’ve been trying to get away from with high-level languages: the developer has to deal with boilerplate computer science problems, rather than being allowed to concentrate exclusively on their business problems.
Concurrent Languages
One answer seems to be tech stacks that are designed for concurrency from the ground up. Enter Erlang, a language developed for telephony applications by Ericsson. Erlang handles concurrency very, very well. For example, it circumvents the shared state problem by not having any shared state. However, from the perspective of someone with a Java or Python background (e.g. me) it’s frickin’ weird. Some serious adjustments in the way problems are approached have to be made.
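To give a feel for the "no shared state" style without any Erlang, here is a loose Python imitation. It is only a sketch of the idea, not how Erlang processes actually work: a worker owns no shared data and communicates purely through messages on queues. The names `squarer` and `square_via_messages`, and the use of `None` as a stop signal, are assumptions made for this example.

```python
import queue
import threading


def squarer(inbox, outbox):
    """Worker that touches no shared data; it only receives and sends messages."""
    while True:
        message = inbox.get()
        if message is None:  # hypothetical stop signal for this sketch
            break
        outbox.put(message * message)


def square_via_messages(numbers):
    """Send each number to the worker as a message and collect the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=squarer, args=(inbox, outbox))
    worker.start()
    for n in numbers:
        inbox.put(n)
    inbox.put(None)  # tell the worker there is nothing more to do
    worker.join()
    # A single worker reading a FIFO queue replies in the order it was asked.
    return [outbox.get() for _ in numbers]
```

Because the worker never reads or writes a variable that another thread can change, there is nothing to lock.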
Here’s a really simple code snippet that generates the squares of a list of numbers. First, a naive implementation in Python:
    def squares(num_list):
        square_list = []
        for n in num_list:
            square_list.append(n * n)
        return square_list
A more compact version in Python, using a list comprehension:
    def squares(num_list):
        return [n * n for n in num_list]
Here’s the same code in Erlang:
    square([H|T]) -> [H*H | square(T)];
    square([]) -> [].
Assuming you don’t know Erlang, can you even tell what’s going on? You’re looking at a completely different paradigm. The left side of each function clause (to the left of the “->” marker) relies on pattern matching (sort of like regular expressions): the first clause matches a non-empty list and splits it into its head H and its tail T, and the second matches the empty list. On the right side, the implementation is recursion at work: the function squares the head and calls itself on the tail. Needless to say, this is a bit of a departure for many developers. I shudder to think of trying to take a department of developers who cut their teeth on Java and PHP, and train them up in Erlang.
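For readers more at home in Python, the recursive Erlang function above can be mirrored fairly closely. This is a teaching sketch rather than idiomatic Python: the list comprehension shown earlier is the better choice in practice, and Python's recursion limit makes this version unsuitable for long lists.

```python
def square(numbers):
    """Recursive squares, shaped like the two Erlang clauses."""
    if not numbers:
        # Corresponds to the clause that matches the empty list.
        return []
    # Corresponds to matching [H|T]: split the list into head and tail.
    head, *tail = numbers
    return [head * head] + square(tail)
```

Each branch of the `if` plays the role of one Erlang clause, and the `head, *tail` unpacking stands in for the pattern match.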
This is why I’m so interested in projects like Reia. In an attempt to fix the “impedance mismatch” between the coming massively multicore future and today’s programming skills, their goal is to make a high-level Python/Ruby-like language that compiles to bytecode that will run on the Erlang VM. It’s at a very early stage, and I’m not sure if it will ever gain critical mass, but this is a problem that needs solving, and I’m interested in any attempts.
Check out http://www.multicoreinfo.com for a large collection of multicore resources.