Monday, June 09, 2008

The New York Times Plays Techno Coyote

An American military supercomputer, assembled from components originally designed for video game machines, has reached a long-sought-after computing milestone by processing more than 1.026 quadrillion calculations per second.

The new machine is more than twice as fast as the previous fastest supercomputer, the I.B.M. BlueGene/L, which is based at Lawrence Livermore National Laboratory in California.

The new $133 million supercomputer, called Roadrunner in a reference to the state bird of New Mexico, was devised and built by engineers and scientists at I.B.M. and Los Alamos National Laboratory, based in Los Alamos, N.M. It will be used principally to solve classified military problems to ensure that the nation’s stockpile of nuclear weapons will continue to work correctly as they age. The Roadrunner will simulate the behavior of the weapons in the first fraction of a second during an explosion.

Before it is placed in a classified environment, it will also be used to explore scientific problems like climate change. The greater speed of the Roadrunner will make it possible for scientists to test global climate models with higher accuracy.

To put the performance of the machine in perspective, Thomas P. D’Agostino, the administrator of the National Nuclear Security Administration, said that if all six billion people on earth used hand calculators and performed calculations 24 hours a day and seven days a week, it would take them 46 years to do what the Roadrunner can in one day.

The machine is an unusual blend of chips used in consumer products and advanced parallel computing technologies. The lessons that computer scientists learn by making it calculate even faster are seen as essential to the future of both personal and mobile consumer computing.

The high-performance computing goal, known as a petaflop — one thousand trillion calculations per second — has long been viewed as a crucial milestone by military, technical and scientific organizations in the United States, as well as a growing group including Japan, China and the European Union. All view supercomputing technology as a symbol of national economic competitiveness.

[...]

The Roadrunner is based on a radical design that includes 12,960 chips that are an improved version of an I.B.M. Cell microprocessor, a parallel processing chip originally created for Sony’s PlayStation 3 video-game machine. The Sony chips are used as accelerators, or turbochargers, for portions of calculations.

The Roadrunner also includes a smaller number of more conventional Opteron processors, made by Advanced Micro Devices, which are already widely used in corporate servers.

“Roadrunner tells us about what will happen in the next decade,” said Horst Simon, associate laboratory director for computer science at the Lawrence Berkeley National Laboratory. “Technology is coming from the consumer electronics market and the innovation is happening first in terms of cellphones and embedded electronics.”

The innovations flowing from this generation of high-speed computers will most likely result from the way computer scientists manage the complexity of the system’s hardware.

Roadrunner, which consumes roughly three megawatts of power, or about the power required by a large suburban shopping center, requires three separate programming tools because it has three types of processors. Programmers have to figure out how to keep all of the 116,640 processor cores in the machine occupied simultaneously in order for it to run effectively.
First, YARGH!

Second, hi, Horst! How come they didn't quote Kathy?

Third, a few small explanations and clarifications are in order:

The supercomputer that they talk about is emphatically not military. It lives and sucks down power at Los Alamos National Laboratory, and it is paid for by the Department of Energy. While LANL lives in a different part of DOE, we're decidedly civilians. Yes, there will be classified research done on that machine. In fact, that's its main job: simulations of nuclear weapons. Yes, quite soon it will be behind the fence and unusable by the regular science crowd. Until then, however, it will be used for a number of unclassified simulations. Most notably, climate sims will be run on the system.


(a Roadrunner node. image courtesy LANL)

Secondly, this system is what we call a 'hybrid': commodity parts intermixed with custom components. Sometimes this is accomplished by modifying the original parts, as was done on the Cray T3E (mod'ed Alpha processors). Sometimes it's done by hanging relatively off-the-shelf nodes off a custom interconnect (Cray XT4). Other times it's a mix-and-match of both, or custom nodes running a semi-specialized OS, which is what was done with Roadrunner. Truthfully, though, Infiniband is almost COTS these days; the scale is what makes the difference here. It's almost OMG, it's the Cloverfield computing platform. However, in the future the custom side of things may become more and more dominant, or so sayeth John et al, amen.
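To make the hybrid idea concrete, here's a toy sketch of the offload pattern: a general-purpose "host" processor (the Opteron side) does the orchestration, while compute-heavy inner loops get shipped off in chunks to an "accelerator" (the Cell side). Every name here is illustrative, not IBM's actual API, and the arithmetic is a stand-in for a real kernel.

```python
def host_step(state):
    """Light bookkeeping the Opteron-style host handles itself."""
    return [x + 1 for x in state]

def accelerator_kernel(chunk):
    """Heavy arithmetic offloaded to the Cell-style accelerator."""
    return [x * x for x in chunk]

def simulate(state, chunk_size=4):
    """One 'timestep': host prep, then chunked offload to accelerators."""
    state = host_step(state)
    out = []
    # The host splits the work into chunks and hands each to an
    # accelerator -- on the real machine this is a DMA transfer, and the
    # chunks run in parallel; here they run one after another.
    for i in range(0, len(state), chunk_size):
        out.extend(accelerator_kernel(state[i:i + chunk_size]))
    return out

print(simulate([1, 2, 3, 4, 5, 6, 7, 8]))
# → [4, 9, 16, 25, 36, 49, 64, 81]
```

The whole programming headache lives in that chunking step: deciding what stays on the host, what goes to the accelerator, and how to keep the transfers from eating the speedup.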

Cluster is a word that gets me hot under the collar, but we'll save that for another time. Roadrunner does appear to qualify. *mutters* another time...another time...

Now as for the comments above about the increased speed. Oy. A speed increase on the computer doesn't always translate into a linear speedup for the codes. There are a lot of factors that govern the speed of scientific programs. Memory is the big killer these days on most platforms: COTS memory cannot keep up with the CPU, so the CPU sits idle for really nontrivial amounts of time waiting on memory fetches. This happens a lot. If the memory subsystem could be improved, this would be fixed.

The second reason is a double-trouble coding problem: first, algorithms have to be found that fit the platform, and second, compilers are often less than good at intelligently dividing up the code to decide what runs where and when. In fact, truly intelligent compilers tend not to exist at all. Scientists, even the code-monkey scientists who run simulations, often don't have time to sit there and hand-tune the code the way they did ten or fifteen years ago. Back then they often hired specialists to do just that, because they lacked the time or the knowledge. That may be what is needed again now, because the codes increasingly demand specialized knowledge of the processors they'll run on: the Cell is by most accounts a nontrivial beast to code for.
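The "memory is the killer" point can be sketched with back-of-envelope arithmetic. The numbers below are rough, assumed 2008-era figures (not measurements of Roadrunner or any specific chip); the point is the ratio, not the absolute values. For a streaming update like a[i] = b[i] + s*c[i], the arithmetic per byte moved is so low that memory bandwidth, not peak flops, sets the ceiling.

```python
# Assumed, illustrative hardware numbers -- not measured values.
peak_flops = 10e9        # ~10 GFLOP/s per core, assumed
mem_bandwidth = 10e9     # ~10 GB/s sustained memory bandwidth, assumed

# Stream-style update a[i] = b[i] + s * c[i] on 8-byte doubles:
flops_per_elem = 2                  # one multiply, one add
bytes_per_elem = 3 * 8              # read b and c, write a

intensity = flops_per_elem / bytes_per_elem          # flops per byte moved
achievable = min(peak_flops, mem_bandwidth * intensity)

print(f"arithmetic intensity: {intensity:.3f} flops/byte")
print(f"achievable: {achievable / 1e9:.2f} GFLOP/s of "
      f"{peak_flops / 1e9:.0f} GFLOP/s peak")
# With these assumed numbers the loop tops out well under 1 GFLOP/s:
# bandwidth, not arithmetic, is the ceiling, and the CPU idles.
```

Swap in whatever peak and bandwidth numbers you like; for low-intensity kernels the conclusion barely budges.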

Another bit is the talk of the 'resurgence of American supercomputing.' Gag me. They act as though the US was so trumped, and was going to stay that way, because of the Japanese Earth Simulator. The fact is that this is not the first time it has happened. Lest we forget, the Japanese built an uber'puter that sat at the top of the charts for a looooong time in the 1990s. Every ten or so years they do this. They spend waaaaaaaaaaay more money on a single computer than the US is willing to: the Earth Simulator cost over $750 million for a 5x speedup over a $30 million supercomputer of the time. Hmmm. Doesn't sound terribly economically bright. Especially since within a couple of years the ES was overshadowed by much, much cheaper systems.
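Running the cost-effectiveness gripe through the numbers given above ($750 million for 5x the speed of a $30 million machine) makes the point plainly; this is just arithmetic on the figures already in the text, not sourced pricing data.

```python
# Figures from the text above: Earth Simulator vs. a contemporary machine.
es_cost, es_speedup = 750e6, 5.0          # ~$750M, ~5x faster
baseline_cost, baseline_speed = 30e6, 1.0  # ~$30M baseline

es_cost_per_perf = es_cost / es_speedup              # $150M per perf unit
baseline_cost_per_perf = baseline_cost / baseline_speed  # $30M per perf unit

ratio = es_cost_per_perf / baseline_cost_per_perf
print(f"Earth Simulator paid about {ratio:.0f}x more per unit of performance")
# → about 5x more dollars per unit of performance
```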

Anyways, I'm sicker than I thought, so I'll wrap this up because my brain's in a near meltdown. The thing that angers me about the NYT article is that they use the wrong analogies (turbochargers?! the Cells are the whole point; they're the engine, and the Opterons are the sidekick) and seem to just not understand the stuff they're writing about. I really wish they could get an intelligent popularizer: someone who actually was or is an HPC guy/gal and speaks plain English. At least they didn't say "supercomputer in a box..."

Whew.
