Fall 2000

New high-performance computing plan signals a cultural (r)evolution

by Carol Rasmussen and Bob Henson

The IBM SP, acquired last year, is an important part of NCAR's supercomputing strategies. (Photos by Carlye Calvin.)

NCAR and UCAR have submitted a plan to NSF that will reshape both NCAR's computing and its institutional culture in the coming years. A Strategic Plan for High Performance Simulation encompasses not only how NCAR's modeling efforts can meet the coming challenges but also how to create what is being called an "end-to-end simulation environment" within NCAR and UCAR.

The plan grew from the work of an NSF-appointed Code Assessment Panel that visited NCAR last summer. The panel, whose members were computer science specialists, was asked by NSF to review NCAR's key models from a software standpoint. Their report, issued in August 1999, stressed the need for change. As early as 1997, NCAR had begun a shift from vector machines toward distributed-memory, shared-multiprocessor (SMP) architectures with the introduction of a 64-processor Hewlett-Packard cluster, and a 128-processor SGI Origin 2000 went into use in 1998. A much larger SMP machine, an IBM SP, was installed in August 1999, just after the NSF review was complete. While acknowledging the promise of the new IBM, the review panel asserted that NCAR would need to modify its model development strategies in order to remain a leader in its field.

Steven Hammond.

"There was a genuine interest on the part of the leadership here in responding [to the report] in a positive way," says John Michalakes, a visiting computer scientist from Argonne National Laboratory who's been working at NCAR for over a year. Robert Serafin (then the director of NCAR) asked Steven Hammond, manager of the Computational Science Section of NCAR's Scientific Computing Division (SCD), to chair a committee that would prepare a strategic plan for high-performance simulation. Timothy Killeen joined the process shortly after becoming NCAR's director-designate.

Instead of looking at each point in the modeling process separately, the committee studied the entire computing environment "end to end," says Hammond. "There are some fundamental changes [proposed] in the plan."

What will change

"To some extent, the computational aspects of our models have been an afterthought," says Hammond. "The emphasis has been more on the phenomenological." That may have been natural in the early years of earth system modeling, when there were legions of scientific issues to be worked out before viable codes could be produced. But today's huge, multicomponent models like the Community Climate System Model (CCSM) can't be developed or tailored to run optimally by a couple of scientists. The NSF panel proposed—and the NCAR committee agreed—that teams of scientists and software engineers are needed for model creation and development, from beginning to end.

Killeen points out, "During the time that NCAR has existed and models were being developed, computer science has grown out of its infancy to be a mature science. There's great strength now in theory, practice, applications, quality of service, networking, bandwidth utilization, and also on the details of how software gets developed and can be made agile. So you could say that now is the time to do this."

The new emphasis on computation will bring a greater "level of formalism in our modeling activities, consistent with [how we develop] field programs or observational programs," says Hammond. "There haven't typically been design reviews for our software. A lot of things that are part of the systematic process of software development in the commercial sector would be very beneficial to software projects conducted here."

Equal to the scientific challenges ahead is the challenge of creating a social environment where the new interdisciplinary teams can thrive. "But we're good at putting together large teams with a shared vision," says Killeen. "A buy-in from the big community is a social organization feat that Maurice [Blackmon, director of NCAR's Climate and Global Dynamics Division] and his people have already accomplished" in developing the CCSM.

Another question is how to divide labor between the atmospheric scientists and computer scientists. One way is to confine each group's main concern to its own layer in the simulation code. Scientists, concerned primarily with algorithms for dynamics and physics, are able to work within one layer of the software hierarchy to code these in a standardized, platform-independent form. This leaves parallelism and other computational concerns to an implementation layer tailored to the machine at hand—primarily the domain of the computational specialists. The Weather Research and Forecasting Model (WRF; see p. 6) is being built in this way, with a mediation layer in between.
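The layering described above can be illustrated with a minimal sketch. This is not WRF code; every name below (relax_toward_mean, run_serial, run_threaded, step) is hypothetical and chosen only for illustration. The science routine knows nothing about how it is executed, the two drivers stand in for a machine-specific implementation layer, and a thin mediation function connects them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Column = List[float]
Kernel = Callable[[Column], Column]
Driver = Callable[[Kernel, List[Column]], List[Column]]


# Model layer (hypothetical): platform-independent "science" code.
def relax_toward_mean(column: Column, rate: float = 0.5) -> Column:
    """Toy physics: nudge each value in one column toward the column mean."""
    mean = sum(column) / len(column)
    return [v + rate * (mean - v) for v in column]


# Implementation layer (hypothetical): interchangeable execution strategies.
def run_serial(kernel: Kernel, columns: List[Column]) -> List[Column]:
    """Apply the kernel to each column, one after another."""
    return [kernel(c) for c in columns]


def run_threaded(kernel: Kernel, columns: List[Column]) -> List[Column]:
    """Apply the kernel to the columns using a small thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(kernel, columns))


# Mediation layer (hypothetical): hands the science kernel to a driver.
def step(kernel: Kernel, columns: List[Column],
         driver: Driver = run_serial) -> List[Column]:
    """Advance every column by one step with whichever driver is supplied."""
    return driver(kernel, columns)


if __name__ == "__main__":
    grid = [[0.0, 2.0], [1.0, 3.0, 5.0]]
    print(step(relax_toward_mean, grid))                 # serial driver
    print(step(relax_toward_mean, grid, run_threaded))   # threaded driver
```

In this sketch, swapping run_serial for run_threaded changes how the work is scheduled without touching the science routine, which is the division of labor the paragraph describes.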

Frameworks (reusable collections of code) are being explored as a way to streamline the creation of these model implementation layers. NCAR has submitted a three-year grant proposal to lead the development of an earth system model framework that could be used for multiple models, in collaboration with NASA, Argonne, the National Centers for Environmental Prediction, Los Alamos National Laboratory, the University of Michigan, the Massachusetts Institute of Technology, and NOAA's Geophysical Fluid Dynamics Laboratory.

NCAR staff had an opportunity to discuss these and other issues related to the new plan at a workshop on 31 August. Upcoming workshops are planned to involve university scientists and other community members.

Getting more computer scientists on board

Tim Killeen.

Computer scientists will be more involved at NCAR than ever before. "We're working through a strategy of how to make that happen," says Killeen. His office is hiring a distinguished visiting computer scientist for an initial term of about one year to help with that process.

Killeen believes that the problems we'll be offering to computer scientists are right up their alley. "You have this inhomogeneous set of providers and software; different pieces of the coupled model have very different requirements of memory; [there are issues of] storage, computational efficiency, parallelizability, swapping in and out of memory . . . that's what turns [computer scientists] on. They write papers about how to do that."

Michalakes and NCAR scientist Joseph Klemp work together in exactly the kind of team that's called for in the NCAR plan: the WRF model development group. Michalakes says that for him, "The attraction of the WRF project is that there's a genuine partnership between the scientific members of the team and the software engineering members." Klemp also sees collaboration as its own reward, and he adds that "a sincere interest in the scientific goals of the organization" is likely to attract the right people. Both warn, however, that part of the reason their team works is that "those who wanted to get involved did, and the structure was imposed later," as Michalakes puts it. Efforts that are organized from the top down may meet more pitfalls, they note.

The bottom line

It's estimated that the software engineering changes outlined in the NCAR plan would cost several million dollars a year over five years, and there's no slack in NCAR's budget for it. On the contrary, NCAR modelers are already stretched trying to carry out the community-service aspects of their models without eroding the organization's basic-research agenda, says Klemp. "Somehow we need a mechanism to support these models as facility resources."

Many NCAR and UCAR groups are seeking grants to help in the transition. Proposals are due by December for NSF's Information Technology Research program, which is offering grants ranging from single-investigator projects (total budgets below $500,000 each) to group projects (up to $5 million). The topics include complex geophysical coding, data assimilation, collaboratories, and accessible visualization tools. Large institutions are eligible to submit a single proposal for a large ITR grant (total budgets up to $15 million). Killeen has asked SCD to lead the coordination of the development of such a proposal for NCAR. "It'll probably build upon the themes laid out in the HP strategic plan as well as other initiatives under way at NCAR," says SCD director Al Kellie. "Some early thoughts are that NCAR could really serve the geoscience community if we could achieve much better efficiencies for our applications on highly parallel, microprocessor-based systems. We need to crack some of the barriers that have been in the way of using these machines." He believes that one of the keys to crafting the proposal will be "to seek strong partnerships with universities and potentially other centers."

Killeen notes that NCAR will also need more flexibility within its core funding. "There definitely has to be a lot of permeability at the boundaries among the divisions, and there already is. We need structures that facilitate cross-divisional and cross-disciplinary interactions," such as those in place in NCAR's Environmental and Societal Impacts Group and Advanced Study Program.

Only the first step

The simulation plan is a crucial piece of the growth he envisions for NCAR, says Killeen, but it's not the whole thing. "It's a first, earnest step toward something that is more comprehensive yet: a plan for a knowledge system approach. What do I mean by that? It's where scientific simulation is part of [an environment that also encompasses] learning modules, data access, visualization, a general workspace, collaborative tools that support the acquisition and dissemination of knowledge about the earth system. That ties in with the whole information technology revolution, where NCAR just has to be."

Killeen believes that these changes come at an opportune time. "There is a special opportunity now to help define the national agenda in [our] areas [of expertise] and help define the connections between environmental science and information sciences. We have to think hard and get our story straight so we can demonstrate continued leadership."

Killeen has already established the overall theme of the upcoming strategic plan: NCAR as integrator. "Even a national center as well endowed in terms of people and materials as NCAR cannot handle it all, and shouldn't anyway," he says. "Our role is to be a player and often a leader in the development of new science, and I think that [role] requires NCAR to put together consortia and then to learn how to collaborate most efficiently with its partners."

Next-generation CCSM slated for 2001

Even as NCAR's approach to modeling is being examined, model development continues. The next generation of the Community Climate System Model, developed and supported by NCAR's Climate and Global Dynamics Division, will be designed over the next 18 months, with support from the U.S. Department of Energy (DOE). "We're starting to bring the pieces together," says CGD senior scientist Byron Boville. "We are pushing to have a model at the end of the year which will be running on [NCAR's] IBM [distributed-memory computer, acquired last year]." Early next year CGD hopes to carry out a 1,000-year control run on the CCSM-2. It would be based on preindustrial conditions in order to determine the model's internal variability.

The Avant Garde Project, part of the DOE Accelerated Climate Prediction Initiative, involves about 15 people from NCAR and five DOE labs in creating the next-generation CCSM-2. The project is merging the CCSM with the DOE-supported Parallel Climate Model, originally developed by NCAR's Warren Washington and Gerald Meehl. NCAR has worked with Argonne and Oak Ridge National Laboratories to create software engineering guidelines for the entire model: requirements documents, unit testers, validation code, and the like.

The newly released Community Climate System Model Plan 2000–2005 offers details on how these and other advances will be accomplished. It's available on the Web.



Edited by Carol Rasmussen, carolr@ucar.edu
Prepared for the Web by Jacque Marshall
Last revised: Wed Dec 13 17:24:16 MST 2000