CRAYS, CLUSTERS, AND A CROWDED HOUSE: A LOOK AT SCD'S NEW HARDWARE

Bob Niffenegger is definitely relieved. The manager of operations in the Scientific Computing Division's Operations Section is good-humored by nature, but the trials of installing a supercomputer could rattle anyone. And SCD has installed not one, but as many as seven major computers (depending on which systems one considers major). "It's too good to be true how trouble-free the hardware installation was. Something has to be wrong," Bob laughs.

Another relieved man is Gene Schumacher, the group head for supercomputer systems in SCD's High Performance Systems Section, who shepherded the software installation and integration of six new Crays into the network of NCAR Cray computers. "It's been a grueling summer for my group and the Cray analysts and engineers that we work with--lots of long hours and long days. Now that the hectic pace has let up some, we're starting to tidy things up."

The new installations give NCAR's computing power a major boost. Allocations of computer time to the university community have risen 20%, and in-house scientists are enjoying similar increases. Beyond their sheer number-crunching ability, the machines--including a massively parallel system and a workstation cluster--expand the range of computers on hand in SCD.

As usual, the driver is the continual demand from scientists both within and without UCAR for increased computer time, speed, and storage capacity. Over the past couple of years, the supercomputing world has been gravitating toward two newer technologies: massively parallel systems, in which a task can be subdivided and sent to many processors at once; and clusters, which distribute work among entire machines rather than among processors. SCD opted to explore both of these new directions.

The installation action began in March and the pace is only now beginning to ease. New hardware notwithstanding, SCD already has its eyes on the next generation of machines.
"Leading-edge computing is often a prerequisite to leading-edge science," says SCD director Bill Buzbee. Other research centers in atmospheric science are forging ahead with their own acquisitions. Instead of massively parallel machines, the emphasis seems to be on ever-larger, moderately parallel versions of the traditional supercomputer. Several centers--England's Hadley Center and Meteorology Office, the Max Planck Institute in Germany, and the U.S. National Meteorological Center (which becomes the National Centers for Environmental Prediction on 1 October)--are acquiring 16-processor CRAY C-90 machines. These have roughly five times the capability of NCAR's CRAY Y-MP 8/864. SCD's immediate strategy for keeping up with the Joneses will be to upgrade existing machines and to dedicate an entire supercomputer to climate modeling. The Model Evaluation Consortium for Climate Assessment (MECCA) began using a dedicated CRAY Y-MP 2 in 1991 and reduced the time needed to complete large simulations by as much as sixfold. Now, with MECCA having run its course, SCD is upgrading that Y-MP 2 to a Y-MP 8 this fall and devoting it entirely to coupled climate model runs that will connect earth, ocean, ice, and atmosphere. This will be the Climate Simulation Laboratory (CSL), NCAR's computing contribution to the Climate Modeling, Analysis, and Prediction Program (CMAP) of the Global Change Research Program. "We expect that the CSL should be capable of completing several 100- year simulations [with all four physical components linked] in a calendar year," says Bill. "You can count on the fingers of one hand the number of air-ocean-land coupled simulations that have been performed out to 100 years." What's on the wish list for SCD? Bill wouldn't complain if a next- generation supercomputer were dropped (gently) into the operations room sometime in the next year or so. Fully configured, such a machine should offer at least 24 processors and 15 times the power of the Y-MP8/T3D combo. 
A bonus is that supercomputers are relatively easy to program. Massively parallel machines often require a major investment in converting established models to a wholly different environment.

There's always next year, and the next order-of-magnitude improvement. In the meantime, SCD will have its hands full managing several times its former load of computers with roughly the same number of support staff. So far, so good, reports Bob, who gives credit beyond SCD for the virtually flawless installations this summer. "The Facilities Support People were on top of it all along. They did a superb job." --BH

**********************************************

WHAT'S WHAT IN THE SCD OPERATIONS ROOM

CM-5 littlebear
This 32-node computer (named littlebear, in the tradition of naming SCD machines after high peaks in the Colorado Rockies) arrived from the Thinking Machines Corporation in April 1993. It became available for users to conduct parallel experiments later in the year and has since been linked to the CRAY-3 (see below). The CM-5 is being used for turbulence modeling, ocean modeling, and climate simulations.

CRAY-3 graywolf
The sleek CRAY-3, boasting gallium arsenide circuits and ethereal beauty, came to NCAR last year on loan from the Cray Computer Corporation for testing and experimentation. It returned to Colorado Springs this summer for refinement and is expected to begin a second stay in SCD in the near future. To accommodate its return--and the presence of more machines than ever in the newly crowded operations room--SCD and FSS have been working to make the room self-sufficient in its cooling capacity. A new cooling tower and chiller installed in SCD 26-29 August make the room independent from the rest of the Mesa Lab (and give FSS a chance to remodel the lab's main cooling system without having to bring all of SCD's computers down).

CRAY T3D
NCAR's most extensive foray to date into massively parallel computing began in July as SCD acquired a CRAY T3D system with 64 processing elements. The computer was attached to antero, SCD's recently upgraded Cray (see below). (The T3D has no informal name because it is networked via antero.) SCD staff have been working with Cray Research to gain experience in T3D programming environments. A class offered 19-23 September by Cray Research will introduce the first wave of users at NCAR to T3D protocol.

CRAY Y-MP 2/216 antero
The Y-MP has undergone one metamorphosis already this year, with another to go. In May the two-processor Cray was upgraded to a Y-MP 5; in October it will be upgraded to an eight-processor machine that will be functionally the same as a Y-MP 8/864. Already, this Cray serves as the front end for the massively parallel T3D machine, and their combined capabilities will increase substantially after this fall's upgrade. The system is being used for long-running coupled simulations of climate.

EL Cluster echo, monarch, st-elmo, alpine
In March came NCAR's first CRAY EL 92; on its heels in June and July came three EL 98s. Each of these is an entry-level supercomputer, compatible with bigger Crays yet standing less than five feet tall. An EL processor is about one-fifth as fast as a processor on the CRAY Y-MP 8/864. To simplify maintenance and maximize usage, the four ELs will be joined as a Cray cluster, with two-thirds of their time devoted to climate simulations and the other third allocated to the user community at large.

IBM RS/6000 Cluster arapahoe, comanche, navaho
A set of four IBM RS/6000 model 550 workstations was first installed in SCD in 1992, with a fifth model 550, chief, serving as the cluster's front-end machine.
Two members of the cluster (arapahoe and comanche) were upgraded to model 590s in May and opened to the general user community in August; the third cluster member, navaho, is being upgraded to a 590 and dedicated to modeling work of the Climate and Global Dynamics Division. The fourth of the model 550s has been broken off for other tasks. With processor speed comparable to the ELs but less hard-disk storage, these machines are primarily used for relatively short jobs of less than five hours. A model 990, now called wildhorse, will be added to the cluster in a few weeks.