Where there used to be 11 towers, each about 2 meters (6 feet) tall and glimmering with subcutaneous green and orange lights, there are now 29. Tall, black, almost daunting, they march across the floor, two hedgerows of one-and-a-half-ton units with an aggregate peak speed of 2 teraflops: NCAR's latest bid for the processing power needed by the research community to answer its burning scientific questions.
The blackforest expansion is part of the new Advanced Research Computing System, a collection of IBM equipment that will be significantly upgraded over the next several years (see IBM.) But while the first installation alone doubles computational power at NCAR, many users will find that ARCS means business as usual, with a shorter wait in the job queues.
|The arrival of 13 new IBM SP towers at the NCAR Mesa Laboratory on 5 October kept movers and SCD staff busy. (Photos by Carlye Calvin and Lynda Lester.)|
Thus, users who have been computing on blackforest will find the same operating system, the same node configuration, the same batch system (LoadLeveler), and the same industry-standard debugger (TotalView). Usage of the new system will be split 50/50 between community computing needs and the Climate Simulation Laboratory, the same ratio as before the upgrade. But whereas blackforest has been saturated for some time, there will be plenty of room on the larger ARCS system.
To make certain that users' scientific needs were adequately reflected in ARCS specifications, SCD invited representatives from each NCAR division to serve as full partners on the technical committee for the request for proposals (RFP). The committee evaluated the needs of the UCAR community and the Climate Simulation Lab, forged them into explicit requirements, and assembled a benchmark suite that included tests of representative models run at NCAR (see sidebar). The review team included two members of the community, James Kinter (Center for Ocean-Land-Atmosphere Studies) and Albert Semtner (Naval Postgraduate School), plus James Hack of NCAR's Climate and Global Dynamics Division.
The collaborative effort continued during the evaluation and selection process, ensuring that the new equipment would serve a wide range of users.
|The ARCS system includes 29 six-foot (two-meter) towers. (Photo by Brian Bevirt.)|
This makes the old-fashioned vector supercomputers, which were reliable standbys in the 1980s and early 1990s, prohibitively expensive in the new millennium. And even though NCAR codes tend to get only 5 to 10% efficiency out of microprocessors, off-the-shelf systems are so much cheaper than vector supercomputers that, in terms of price-performance ratio, they win.
"When it comes right down to it, I see the low efficiency on these microprocessor systems as being a land of opportunitythere's so much potential," says Thomas Engel, a high-performance computing specialist in SCD. "By paying attention to cache optimization and so forth, you can double the efficiency of certain applications."
SCD software engineers Richard Loft, Stephen Thomas, and John Dennis recently developed a scalable, three-dimensional dynamical core for climate models. Based on spectral elements, it achieves a higher percentage of peak on microprocessors.
Their code attained 15.7% efficiency (127 gigaflops) at NCAR using 134 4-processor IBM SP nodes, and 16.1% (370 gigaflops) at the National Energy Research Scientific Computing Center using 128 16-processor IBM SP nodes.
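For readers who want to check the arithmetic behind "percent of peak," the short sketch below backs out the peak speed implied by each run from the figures quoted above (sustained gigaflops, efficiency, and node counts). The function names are illustrative only, and the implied peaks are derived purely from the quoted numbers; they are not published hardware specifications.

```python
# A minimal sketch of the "percent of peak" arithmetic, using only the
# figures quoted in the article. Function names are illustrative assumptions.

def implied_peak_gflops(sustained_gflops, efficiency_percent):
    """Peak speed implied by a sustained rate and a percent-of-peak figure."""
    return sustained_gflops / (efficiency_percent / 100.0)


def implied_peak_per_processor(sustained_gflops, efficiency_percent,
                               nodes, procs_per_node):
    """Implied peak per processor, given the node configuration of the run."""
    processors = nodes * procs_per_node
    return implied_peak_gflops(sustained_gflops, efficiency_percent) / processors


# (site, sustained gigaflops, efficiency %, nodes, processors per node)
runs = [
    ("NCAR", 127.0, 15.7, 134, 4),
    ("NERSC", 370.0, 16.1, 128, 16),
]

for site, sustained, eff, nodes, ppn in runs:
    total = implied_peak_gflops(sustained, eff)
    per_proc = implied_peak_per_processor(sustained, eff, nodes, ppn)
    print(f"{site}: {nodes * ppn} processors, "
          f"implied peak {total:.0f} gigaflops "
          f"({per_proc:.2f} gigaflops per processor)")
```

The same relationship runs the other way: dividing a sustained rate by a machine's peak gives the efficiency figure that the article quotes.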
Although the learning curve in the transition from vector to distributed machines is similar to what programmers dealt with 25 years ago when vector hardware was introduced, today's users face an added layer of complexity. Codes must not only be rewritten for cache optimization but also distributed across nodes tied together by high-performance crossbar switches.
Certainly it's a challenge, but real work is getting done. "We can be nostalgic, but we'll go where the opportunities are," says Vincent Wayland, a software engineer in NCAR's Climate and Global Dynamics Division. "We've done good science on these microprocessors. In the past two years, we've completed over 3,700 years of climate simulations on blackforest."
Meanwhile, as part of the ARCS contract, IBM and SCD have agreed to work together to improve the user environment and support services for the UCAR community and the Climate Simulation Lab. IBM will offer training in advanced programming, performance analysis, and tuning techniques as well as specialized training tailored to user needs. IBM will also provide two on-site applications specialists and is committed to a more efficient process for reporting and resolving compiler and tools problems.
"ARCS is a world-class system that exceeds the articulated purpose and goals of the RFP," says Al Kellie. As such, he adds, its extended capability (speed) and capacity (size) will allow users to be leading-edge participants in the discovery and exploration of new environmental science.
Edited by Bob Henson
Prepared for the Web by Jacque Marshall
Last revised: Thu Dec 20 16:42:17 MST 2001