

October 2000

Planning a new paradigm: the future of high-end modeling at NCAR

Strategizing on the Web

The following reports provide more detail on the issues covered in this article. An overall NCAR strategic plan is now under development and expected to be completed this winter. The contact is Bob Harriss, NCAR associate director for planning, ext. 8106, harriss@ucar.edu.

UCAR Information Technology Strategic Plan
September 1998

Toward a Robust, Agile, and Comprehensive Information Infrastructure for the Geosciences: A Strategic Plan for High Performance Simulation
May 2000

Community Climate System Model Plan 2000–2005
June 2000

Major cultural shifts don't come along very often. One is under way right now at UCAR and NCAR. The focus is on our large-scale computer models and how they're designed, developed, and maintained. Driven by changes in computer architecture and by the increasing complexity of the models, people throughout UCAR—scientists, software engineers, and managers—are rethinking how we ought to proceed. Like any paradigm shift, this one is producing both angst and excitement among those at the heart of our modeling activities.

The guidebooks for this process are several strategic plans released in the past few months and another soon to be completed (see sidebar). An internal workshop on high-performance (HP) simulation took place on 31 August at the Mesa Lab; a follow-up is being planned for the research community at large.

Even as our approach to modeling is being examined, the modeling continues. A wholly new mesoscale package will be released in October (the Weather Research and Forecast model, or WRF; see sidebar), and an upgrade to the Community Climate System Model is expected early next year (see below).

What's driving the action?

In the summer of 1999, an eight-member panel of computer-science specialists was asked by NSF to review NCAR's key models from a software standpoint. Their report, issued in August 1999, stressed the need for change. As early as 1997, NCAR had begun a shift from vector machines toward distributed-memory and shared-memory multiprocessor (SMP) architectures with the introduction of a 64-processor Hewlett-Packard cluster, and the Climate System Laboratory began using a 128-processor SGI Origin 2000 in June of 1998. A much larger IBM SP was installed for production use in August 1999, just after the NSF review was completed. While acknowledging the promise of the new IBM, the review panel asserted that NCAR would need to modify its model development strategies in order to remain a leader in the field.

Steve Hammond. (Photos by Carlye Calvin.)

A new direction began taking shape last October, when Bob Serafin (then the director of NCAR) asked Steve Hammond, manager of SCD's Computational Science Section, to chair a committee that would prepare a strategic plan for HP simulation. Tim Killeen joined the process shortly after becoming NCAR's director-designate, while he was still at the University of Michigan. Instead of looking at each point in the modeling process separately, the committee studied the entire computing environment "end to end," says Steve. "There are some fundamental changes [proposed] in the plan."

In the old days, a scientist could whip up the code for a model virtually solo and call on a software engineer when problems arose. There's still a time and a place for this approach, according to senior scientist Joe Klemp (MMM), another member of the committee. "Lots of times we build models, run them for a few weeks, and then throw them away," says Joe. However, it's clear that the major community models are now far beyond the scope of a single researcher. The NSF panel proposed—and the NCAR committee agreed to—a newly collaborative approach that involves teams of scientists and software engineers in each model's creation from beginning to end.

The NCAR plan, released in May, was a first step in establishing this new paradigm. The subject headings—"Computing Resources," "Software Tools, Frameworks, and Algorithms," and the like—imply as much concern with the software itself as with the science behind it. "To some extent the computational aspects of our models have been an afterthought," says Steve. "The emphasis has been more on the phenomenological." Steve says the committee called for a greater "level of formalism in our modeling activities, consistent with [how we develop] field programs or observational programs. There haven't typically been design reviews for our software. A lot of things that are part of the systematic process of software development in the commercial sector would be very beneficial to software projects conducted here."

Cecelia DeLuca.

In fact, the relevance of commercial software design was one of the hot topics at the staff workshop on 31 August. SCD software engineer Cecelia DeLuca pointed out that a popular framework used to describe the maturity of an organization's software procedures employs a five-level system, ranging from totally freewheeling (level 1) to exhaustively prescribed and documented (level 5). The midpoint, level 3, "means that the entire organization has a standard set of software engineering practices." Given the range of tasks carried out here, Cecelia argued that full standardization may not be feasible or appropriate ("for large-scale projects, level 2 would be nice"). She also emphasized that we're not the first institution to have our coding scrutinized closely. A major 1994 review of the U.S. Department of Defense called for improvements in such areas as software quality assurance and definition of requirements. Looking at the list, she pointed out, "These are some of the same issues we're facing now."

What's on the table?

A range of topics was covered in the strategic plan and at the workshop, including

  • Layered simulation software. One way to divide the labor between scientists and computer specialists is to confine their respective purviews to distinct layers in the simulation code. Scientists, concerned primarily with algorithms for dynamics and physics, work within one layer of the software hierarchy, coding these algorithms in a standardized, platform-independent form. That leaves parallelism and other computational concerns to an implementation layer tailored to the machine at hand—primarily the domain of the computational specialists. The WRF model is being built in this way, with a mediation layer in between (a brief sketch of the idea follows this list).

  • Easier access to data. "It should be very easy for people to compare observations and model data, but it's not," says software engineer Lawrence Buja (CGD). More metadata—"data about data"—is what's needed to help researchers comb through the vast archives at NCAR and elsewhere. A number of groups at NCAR and UCAR are exploring ways to create better metadata, especially for large data sets accessible through the World Wide Web (a second sketch after this list shows the kind of record involved).

  • Continued progress in visualization. The NSF review stressed NCAR's leadership role in visualizing large data sets. Considering the unique demands of earth system modeling, the HP strategic plan states that "visualization problems such as these are outside the scope of the commercial marketplace, and an extensive research and development program is required."

  • A higher profile for computer science at NCAR. The strategic plan calls for a "refocused view of computing professionals" and recommends their placement in the scientist job category in order to ensure compensation on a par with their modeling colleagues.
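
To make the layered approach concrete, here is a minimal sketch in Python with entirely hypothetical names (the models themselves are written in Fortran): the science-layer routine below knows nothing about how the grid is decomposed or exchanged among processors, while those concerns sit behind the small interface of an implementation-layer object that could be swapped for a parallel version.

    # Illustrative sketch only; hypothetical names, not actual NCAR model code.
    import numpy as np

    class SerialImplementation:
        """Implementation layer: owns data layout, decomposition, and parallelism.
        An MPI, threaded, or vector variant could replace this class without
        touching the science layer."""
        def local_patch(self, field):
            return field      # serial case: the whole grid is the local patch
        def exchange_halos(self, field):
            return field      # nothing to exchange in the serial case

    def advance_temperature(temp, heating_rate, dt, impl):
        """Science layer: platform-independent physics written against the
        implementation layer's small interface."""
        patch = impl.local_patch(temp)
        patch = patch + dt * heating_rate   # a deliberately trivial 'physics' step
        return impl.exchange_halos(patch)

    # Usage: one time step on a 10 x 10 grid.
    temp = advance_temperature(np.zeros((10, 10)), heating_rate=0.001, dt=60.0,
                               impl=SerialImplementation())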

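As a hypothetical illustration of what such metadata might look like, the record below (plain Python, with made-up field values) describes a single model data set in a form that a search tool could scan without opening the data themselves.

    # Hypothetical metadata record: "data about data" for one model output data set.
    dataset_metadata = {
        "title": "Monthly mean surface temperature",
        "variable": "TS",
        "units": "K",
        "grid": "T42 Gaussian",
        "time_range": ("1870-01", "1999-12"),
        "source": "climate model output",   # placeholder description
    }

    def matches(record, **criteria):
        """Keep records whose fields match every requested criterion."""
        return all(record.get(key) == value for key, value in criteria.items())

    print(matches(dataset_metadata, variable="TS", units="K"))   # prints True
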
What's next?

Al Kellie.

SCD director Al Kellie, who moderated the August workshop, says he is quite pleased with the interest it generated. "We had over 110 people to start the day off. The group discussions were highly productive." The next step, he says, is a continuing series of workshops and discussions to help make the new paradigm a reality. Already, UCAR and NCAR are teaming with several other institutions for HP simulation projects.

  • The next generation of CGD's Community Climate System Model (CCSM-2) will be designed with support from the U.S. Department of Energy (DOE) over the next 18 months. "We're starting to bring the pieces together," says CGD senior scientist Byron Boville. "We are pushing to have a model at the end of the year which will be running on the IBM." Early next year CGD hopes to carry out a 1,000-year control run on the CCSM-2. It would be based on preindustrial conditions in order to determine the model's internal variability.

    To create the CCSM-2, about 15 people from NCAR and five DOE labs are collaborating on the Avant Garde Project, part of DOE's Accelerated Climate Prediction Initiative. The project is merging the CCSM with the DOE-supported Parallel Climate Model, originally developed by CGD's Warren Washington and Jerry Meehl. NCAR has worked with Argonne and Oak Ridge National Laboratories to create software engineering guidelines for the entire model, including, for example, requirements documents, unit testers, and validation code.

  • SCD's new visualization laboratory, to be completed late this year, will house a node on the AccessGrid. Based at Argonne, this fast-growing network comprises several dozen institutions. The network includes large-format displays integrated with virtual meeting rooms that allow group-to-group communication. NSF has already conducted one of its program reviews using the AccessGrid. Closer to home, SCD's Don Middleton suggests that the high-speed connection between the Mesa and Foothills Labs be used as a testbed for new collaborative technologies that increased bandwidth on the 'Net might allow in coming years.

  • Frameworks—reusable collections of code—are being explored as a way to simplify and speed up the creation of model implementation layers. NCAR is about to submit a three-year proposal to lead the development of an earth system model framework that could be used for multiple models. The project's large group of collaborators includes NASA, Argonne National Laboratory, the National Centers for Environmental Prediction, Los Alamos National Laboratory, the University of Michigan, the Massachusetts Institute of Technology, and NOAA's Geophysical Fluid Dynamics Laboratory.

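To suggest how a framework divides reusable code from model-specific code, here is a minimal sketch in Python with hypothetical names (it is not the proposed earth system framework itself): each component implements a common interface, and the framework supplies the coupling and time-stepping machinery that any model built on it could share.

    # Illustrative framework sketch; hypothetical interface, not an actual NCAR product.

    class Component:
        """Interface that every model component (atmosphere, ocean, land, ...) implements."""
        def initialize(self): pass
        def run(self, step): pass
        def finalize(self): pass

    class ToyAtmosphere(Component):
        def initialize(self):
            self.surface_temp = 288.0      # stand-in state, in kelvins
        def run(self, step):
            self.surface_temp += 0.001     # stand-in physics
        def finalize(self):
            print("final surface temperature:", self.surface_temp)

    class Coupler:
        """Reusable framework code: drives any set of components through a run."""
        def __init__(self, components):
            self.components = components
        def run(self, nsteps):
            for c in self.components:
                c.initialize()
            for step in range(nsteps):
                for c in self.components:
                    c.run(step)
            for c in self.components:
                c.finalize()

    Coupler([ToyAtmosphere()]).run(nsteps=10)
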
Where's the funding?

How will the enthusiasm and brainstorming of recent months fare in a climate of relatively fixed budgets? Nobody has an easy answer. It's already a challenge to carry out the community-service aspects of our big models without cutting into research, says Joe. "Somehow we need a mechanism to support these models as facility resources." As for the retooling outlined in the HP strategic plan, one estimate is that the software engineering components could require several million dollars a year over five years.

Many NCAR and UCAR groups are seeking grants to help in the transition. Proposals are due by December for NSF's Information Technology Research (ITR) program, which is offering grants ranging from single-investigator projects (total budgets below $500,000 each) to group projects (up to $5 million) to large institutional proposals (up to $15 million over 5 years). The topics include complex geophysical coding, data assimilation, collaboratories, and accessible visualization tools. Cliff Jacobs, the NCAR program officer at NSF, encouraged attendees at the August workshop to apply for the ITR grants.

Tim Killeen has asked SCD to coordinate development of a large-institution proposal for NCAR. Al says, "It'll probably build upon the themes laid out in the HP strategic plan as well as other initiatives underway at NCAR. Some early thoughts are that NCAR could really serve the geosciences community if we could achieve much better efficiencies for our applications on highly parallel, microprocessor-based systems. We need to crack some of the barriers that have been in the way of using these machines. We're going to discuss this at the director's level and then go out, and one of the keys will be to seek strong partnerships with universities and potentially other centers."

According to Tim, NCAR will need more flexibility within its core funding as well. "There definitely has to be a lot of permeability at the boundaries among the divisions, and there already is. We need structures that facilitate cross-divisional and cross-disciplinary interactions," such as those in place in ESIG and ASP.

Last month Tim's office announced a new position that will bring in a distinguished visiting computer scientist for an initial term of roughly one year. Tim hopes that the inherent appeal of modeling the earth system will spur interest from top people in applying for this position and others that may follow. "If you think of it from the computer science point of view—someone doing research—what would bring them to the party? It would be a computer science challenge, not a geophysics challenge: How do you integrate the different functionalities [within an earth system model] in a way that maximally uses the available computing architecture?" Most current code runs at well below the ten percent efficiency level on today's highly parallel computers, says Tim, "so we're underutilizing the computational resources. That's what the strategic plan for scientific simulation is all about.

"Computer science has grown out of its infancy to be a mature science. Now's the time to make an attempt to do this. It's going to be hard, because a community climate model, for example, has players working at different paces. Now we're giving them the extra challenge of a more rigorous approach to the computer science, a more documented approach, and it's going to be more work. But I think challenging things are often more work, and it'll pay off in the long run."

• BH

WRF: A model to follow?

The WRF team includes (left to right) Shu-Hua Chen, principal implementer of model physics; overall coordinator Joe Klemp; Bill Skamarock, head of the working group for dynamic model numerics; and John Michalakes, head of the working group for software architecture, standards, and implementation. Not shown are Jimy Dudhia, head of the working group for workshops, model distribution, and community support; and Dave Gill, implementer of Web pages and real-data testing. (Photo by Carlye Calvin.)

A "bare bones" version of the Weather Research and Forecast model will shortly be released to a group of interested users. WRF (pronounced "worf") will offer resolution that's about an order of magnitude better than existing operational mesoscale models. "When we look down the road to greater computer power, we want to have horizontal grids of a couple kilometers so we can resolve small-scale weather features as they're evolving," says Joe Klemp, who is leading the development effort at MMM. WRF's other collaborators include NOAA's National Centers for Environmental Prediction (NCEP) and Forecast Systems Laboratory (FSL), the University of Oklahoma's Center for Analysis and Prediction of Storms, and the Air Force Weather Agency.

WRF has a three-layer structure. John Michalakes (MMM), a visiting computer scientist from Argonne National Laboratory working on WRF development, explains: A driver layer deals with computer architecture (and also such issues as managing nested grids) so that the user can run the model on distributed-memory, shared-memory, vector, or cluster machines without having to modify it. Theoretically, WRF's driver layer could be used for other models—including general circulation models. However, Michalakes points out that it would have to be modified to deal with, for example, spectral transforms and coupling among component models, since these features aren't yet part of WRF. The other main layer, the model dynamics and physics, is the only one that will be "visible" to a user. Joining the driver layer to the model layer is a mediation layer, which Michalakes describes as "a glue layer that has to know a little bit about both other layers so they can interact."

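A rough sketch of how such a three-layer division might look in code follows; it is written in Python with hypothetical names, not WRF's own Fortran, and is meant only to show which concerns live in which layer.

    # Hypothetical three-layer sketch; names are illustrative, not taken from WRF.

    def model_layer_physics(column, dt):
        """Model layer: dynamics and physics only; the part a scientist sees and edits."""
        column["theta"] += column["heating"] * dt
        return column

    def mediation_layer(patch, dt):
        """Mediation layer: 'glue' that knows a little about both neighbors,
        handing the driver's grid patch to the model layer column by column."""
        for column in patch:
            model_layer_physics(column, dt)

    def driver_layer(domain, dt, n_patches=4):
        """Driver layer: decomposition, nesting, and parallel dispatch live here.
        A distributed-memory or shared-memory version changes only this layer."""
        size = len(domain) // n_patches
        for i in range(n_patches):
            mediation_layer(domain[i * size:(i + 1) * size], dt)

    # Usage: a toy 'domain' of 16 grid columns advanced one 60-second step.
    domain = [{"theta": 300.0, "heating": 0.002} for _ in range(16)]
    driver_layer(domain, dt=60.0)
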
This structure gives WRF a flexibility that will be needed to serve both researchers and forecasters. "There was rapid recognition among all the participating organizations that there was value in developing a common modeling system," Klemp says. "With WRF, at least there's a potential for streamlining a lot of technology transfer."

Development of the model got started without a lot of WRF-specific funding. "We've been trying to forge ahead on the resources available," says Klemp, who adds that the development team, which includes software engineers and scientists, works together very well. "Our success is in developing a real team attitude. [The engineers] don't just tell us what to do and leave us to do it or not; there's a lot of going back and forth until we agree on the best way to do it." Michalakes concurs: "There's a joint appreciation, respect, and feeling of ownership by the respective members of the team."

• Carol Rasmussen

More details on WRF will appear in the fall issue of the UCAR Quarterly.



Edited by Bob Henson, bhenson@ucar.edu
Prepared for the Web by Jacque Marshall