November 1998

Parallel power: MMM enters the big leagues with Compaq/DEC deal

Modelers in the Mesoscale and Microscale Meteorology Division can't be blamed for gorging on their smorgasbord of new parallel computers. MMM is accustomed to squeezing ever-more-sophisticated numerical models onto limited computer space. Now, thanks to a three-year project centered on a major hardware loan, the division has acquired seven high-end multiprocessor servers and 42 workstations from Compaq Computer Corporation. Compaq, the world's second-largest computer manufacturer, earlier this year acquired Digital Equipment Corporation (DEC), the company with which the deal began.

The team behind MMM's computer bonanza: (left to right) Dave Gill, Pat Waukau, Bob Gall, Bill Kuo, and Jordan Powers. (Photo by Carlye Calvin.)

The hardware, which arrived in October, was acquired through an arrangement with iMSC, a computing consulting firm based in Colorado Springs. iMSC is purchasing the hardware from Compaq and loaning it to MMM in return for testing and evaluation, modeling-system porting and parallelization, and demonstrations. To support its purchase, iMSC teamed with six in-state investors to form the Advance Research Alliance (ARA). ARA purchased the machines, which list at about $8.7 million, at a substantial discount. MMM will use the machines primarily for development of the MM5 community mesoscale weather-forecast model and its adjoint version, described below.

At specified times during the three-year period, iMSC will upgrade the hardware (through steps such as chip replacements) and ARA will sell the used components. In 2001, ARA will sell the entire set of computers. When the loan is complete, ARA will have profited from its sales, MMM will have made some otherwise impossible scientific leaps, and Compaq will have obtained prestige and proof of performance for its high-end product line.

Hitting the ground running

"This will have a huge impact on the work we're doing on data assimilation," says Bob Gall, MMM director. Although in the works for 18 months, the deal was finalized so quickly that the machines arrived with little fanfare. Nonetheless, Bob notes, "it's amazing how quickly the users discovered them." Within a day of the first machine's start-up, some MMM modelers were already running full simulations on it.

At its theoretical aggregate peak, the army of Alpha workstations and AlphaServers could deliver a stunning 118 gigaflops (118 billion floating-point operations per second). In practice, however, each of the single-processor workstations--which now have the same 600 MHz chips as the multiprocessor AlphaServers--runs the MM5 code at approximately 200 megaflops. This translates to an aggregate of about 4 gigaflops on the new workstations alone. However, the MM5 and MM5 adjoint development teams are looking to advance their science through the larger, distributed-shared memory (DSM) AlphaServers, the largest of which is expected to sustain 5 gigaflops. All of this computing power equates to an order-of-magnitude increase in MMM computing capacity (measured by adding SCD allocations and MMM's previous machines).

AlphaServers are the flagship products in Compaq's high-performance computing division. The first incarnations of these were on the market in 1991, and iMSC has one of the first ever shipped. Although recently retired due to a lightning strike, this machine was able to run the same MMM code as NCAR's new Alphas, albeit much more slowly. "This long-term compatibility is the main reason why we can upgrade the systems at NCAR so easily over the next three years without hurting productivity," says iMSC vice president Madeline Chen.

What's in the deal?

The total MMM package includes the following AlphaServer units: four 4-processor 4100 systems, one 8-processor 8400 shared-memory system, one 16-processor 4100 DSM system, and one 32-processor 4100 DSM system. The processors are currently 600 MHz Alphas, but these will be swapped as Compaq's state-of-the-art replacements become available. All of these machines are composed of building blocks of 4-processor shared-memory nodes, with the nodes connected in a distributed-memory architecture. MMM software engineer John Michalakes, a long-term visitor from Argonne National Laboratory, will be working on a distributed-memory parallel version of the MM5 code and, soon, the MM5 adjoint version.

The adjoint model can be used to study the MM5's sensitivity and ingest nontraditional observations through variational data assimilation (which aims to optimize the initial conditions from which a model begins its forecasts). A prime benefit of the adjoint version is that it allows the MM5 to ingest many more kinds of data than have historically been possible. In addition to standard meteorological variables such as temperature and wind, the adjoint can assimilate data from the much-anticipated Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC). This network of satellites, slated for launch in 2002, is designed to yield global measurements of the atmosphere. (See UCAR, Taiwan join forces to launch COSMIC.)

Until now, rapid progress in the use of the adjoint model has been impeded by cost. It must be run iteratively, with each cycle repeating CPU-intensive mathematical algorithms and input/output-intensive code. A single data-assimilation run using the adjoint can require up to 200 times as much processor time as a standard MM5 run. "The jump in our computing capacity will relieve this bottleneck significantly," says MMM project scientist Jordan Powers.
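To make the cost of that iteration concrete, here is a minimal sketch of variational assimilation on a toy problem. It is not MM5 or its adjoint: the one-variable model `x[k+1] = a * x[k]`, the function names, the step size, and the iteration count are all assumptions chosen for illustration. The point it shows is that every cycle needs one forward run plus one backward (adjoint) run before the initial condition can be nudged.

```python
# Toy variational data assimilation (illustrative only; not the MM5 adjoint).
# Assumed model: x[k+1] = a * x[k], with one observation per time step.

def forward(x0, a, n_steps):
    """Run the toy model forward and return the full trajectory."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(a * traj[-1])
    return traj

def adjoint_gradient(traj, obs, a):
    """Backward sweep: gradient of J = 0.5 * sum((x_k - y_k)**2) w.r.t. x0."""
    lam = 0.0
    for x_k, y_k in zip(reversed(traj), reversed(obs)):
        # Add the misfit at this time, then carry it one step back in time.
        lam = (x_k - y_k) + a * lam
    return lam

def assimilate(obs, a, first_guess, step=0.1, n_iter=50):
    """Adjust the initial condition by gradient descent on the misfit J."""
    x0 = first_guess
    model_runs = 0
    for _ in range(n_iter):
        traj = forward(x0, a, len(obs) - 1)     # one forward model run
        grad = adjoint_gradient(traj, obs, a)   # one adjoint model run
        x0 -= step * grad
        model_runs += 2
    return x0, model_runs

A = 0.9          # hypothetical model coefficient
TRUE_X0 = 2.0    # hypothetical "true" initial state used to make observations
observations = forward(TRUE_X0, A, 10)
estimate, runs = assimilate(observations, A, first_guess=0.5)
print(f"estimated initial state: {estimate:.6f} after {runs} model runs")
```

Even in this toy setting, recovering the initial state takes many times the work of a single forecast, which is the same kind of multiplier, at far larger scale, that makes adjoint runs so demanding.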

During the three years of the NCAR loan, iMSC will use a similar batch of Compaq machines to refine a version of MM5 that is friendly to nonscientists and runs on the Windows NT operating system rather than UNIX. This version will be used by decision makers in business and industry, emergency managers in government, and even students in high school. Meanwhile, MMM plans to refine a parallelized version of MM5 suitable for a purely distributed-memory environment. Through all these developments, the MM5 will continue to be supported by NCAR as a public-domain resource, available free of user fees.

Who's who

The principal investigators for the three-year project are Bill Kuo (who heads MMM's Mesoscale Prediction Group) and Jordan. Instrumental in the deal has been MMM software engineer Dave Gill, who negotiated extensively with iMSC and kept things moving during the deal's 18-month evolution. Other key players are Paul Chen, iMSC president and chief executive officer; Pat Waukau, MMM system administrator; and Dick Foster, associate technical director of Compaq's High Performance Technical Computing Group. Foster will be visiting NCAR about one week per month to provide support and to help with the planned processor upgrades. He shared his Alpha expertise with MMM staff during a program at the Foothills Lab on 28 October; a morning of technical discussions was followed by an afternoon reception and ribbon-cutting ceremony.

"Compaq jumped at this chance," says Dave, "because we convinced them that there's a wide user community that's exposed to MM5 through workshops and tutorials." There are about 500 users of MM5 worldwide. Aside from the dedicated AlphaServers, the Compaq deal has also enhanced the MMM classroom used for model tutorials and other workshops. The classroom has 14 new workstations, each packing a 600 MHz processor, 0.5 gigabytes of RAM, and a 17-inch monitor.

Though the hardware loan seems like a can't-lose situation, the principals at MMM are sobered by the charge to achieve new levels of model performance and accelerated advances in development. "Of course, we could lose a lot of reputation if we don't produce the deliverables," says Bob. So far, so good, he adds, noting that the group is way ahead of schedule. •BH


Edited by Bob Henson, bhenson@ucar.edu

Prepared for the Web by Jacque Marshall