Planning a new paradigm: the future of high-end modeling at NCAR
Major cultural shifts don't come along very often. One is under way
right now at UCAR and NCAR. The focus is on our large-scale computer
models and how they're designed, developed, and maintained. Driven by
changes in computer architecture and by the increasing complexity of the
models, people throughout UCAR (scientists, software engineers, and
managers) are rethinking how we ought to proceed. Like any paradigm
shift, this one is producing both angst and excitement among those at
the heart of our modeling activities.
The guidebooks for this process are several strategic plans released in
the past few months and another soon to be completed (see sidebar). An
internal workshop on high-performance (HP) simulation took place on 31
August at the Mesa Lab; a follow-up is being planned for the research
community at large.
Even as our approach to modeling is being examined, the modeling
continues. A wholly new mesoscale package will be released in October
(the Weather Research and Forecast model, or WRF; see sidebar), and an
upgrade to the Community Climate System Model is expected early next
year (see below).
What's driving the action?
In the summer of 1999, an eight-member panel of computer-science
specialists was asked by NSF to review NCAR's key models from a software
standpoint. Their report, issued in August 1999, stressed the need for
change. As early as 1997, NCAR had begun a shift from vector machines
toward distributed-memory clusters of shared-memory multiprocessors
(SMPs) with
the introduction of a 64-processor Hewlett-Packard cluster, and the
Climate System Laboratory began using a 128-processor SGI Origin 2000 in
June of 1998. A much larger IBM SP was installed for production use in
August 1999, just after the NSF review was completed. While
acknowledging the promise of the new IBM, the review panel asserted that
NCAR would need to modify its model development strategies in order to
remain a leader in the field.
Steve Hammond. (Photos by Carlye Calvin.)
A new direction began taking shape last October, when Bob Serafin (then
the director of NCAR) asked Steve Hammond, manager of SCD's
Computational Science Section, to chair a committee that would prepare a
strategic plan for HP simulation. Tim Killeen joined the process shortly
after becoming NCAR's director-designate, while he was still at the
University of Michigan. Instead of looking at each point in the modeling
process separately, the committee studied the entire computing
environment "end to end," says Steve. "There are some fundamental
changes [proposed] in the plan."
In the old days, a scientist could whip up the code for a model
virtually solo and call on a software engineer when problems arose.
There's still a time and a place for this approach, according to senior
scientist Joe Klemp (MMM), another member of the committee. "Lots of
times we build models, run them for a few weeks, and then throw them
away," says Joe. However, it's clear that the major community models are
now far beyond the scope of a single researcher. The NSF panel
proposed, and the NCAR committee agreed to, a new collaborative
approach that involves teams of scientists and software
engineers in each model's creation from beginning to end.
The NCAR plan, released in May, was a first step in establishing this
new paradigm. The subject headings ("Computing Resources," "Software
Tools, Frameworks, and Algorithms," and the like) imply as much
concern with the software itself as with the science behind it. "To some
extent the computational aspects of our models have been an
afterthought," says Steve. "The emphasis has been more on the
phenomenological." Steve says the committee called for a greater "level
of formalism in our modeling activities, consistent with [how we
develop] field programs or observational programs. There haven't
typically been design reviews for our software. A lot of things that are
part of the systematic process of software development in the commercial
sector would be very beneficial to software projects conducted here."
In fact, the relevance of commercial software design was one of the hot
topics at the staff workshop on 31 August. SCD software engineer Cecelia
DeLuca pointed out that a popular framework used to describe the
maturity of an organization's software procedures employs a five-level
system, ranging from totally freewheeling (level 1) to exhaustively
prescribed and documented (level 5). The midpoint, level 3, "means that
the entire organization has a standard set of software engineering
practices." Given the range of tasks carried out here, Cecelia argued
that full standardization may not be feasible or appropriate ("for
large-scale projects, level 2 would be nice"). She also emphasized that
we're not the first institution to have our coding scrutinized closely.
A 1994 review of software practices at the U.S. Department of Defense
called for
improvements in such areas as software quality assurance and definition
of requirements. Looking at the list, she pointed out, "These are some
of the same issues we're facing now."
What's on the table?
A range of topics was covered in the strategic plan and at the
workshop, including the following:
Layered simulation software. One way to divide labor among
scientists and computer specialists is to confine their respective
purviews to distinct layers in the simulation code. Scientists,
concerned primarily with algorithms for dynamics and physics, are able
to work within one layer of the software hierarchy to code these in a
standardized, platform-independent form. This leaves parallelism and
other computational concerns to an implementation layer tailored to the
machine at hand, which is primarily the domain of the computational
specialists. The WRF model is being built in this way, with a mediation
layer in between.
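To make the division concrete, here is a minimal sketch (in Python,
with hypothetical names; the actual community models are written in
Fortran) of how science code and machine-specific code can live in
separate layers:

    # Model layer: scientist-owned, platform-independent science code.
    def advance_physics(state, dt):
        # Pure algorithm; knows nothing about processors or memory layout.
        state["temperature"] = [t + dt * 0.01 for t in state["temperature"]]
        return state

    # Implementation layer: specialist-owned, tailored to the machine.
    def run_on_cluster(state, dt, steps, exchange_halos):
        # Owns decomposition and communication; calls the science code.
        for _ in range(steps):
            exchange_halos(state)   # e.g., message passing on a cluster
            state = advance_physics(state, dt)
        return state

    # Trivial run: a no-op halo exchange stands in for real communication.
    final = run_on_cluster({"temperature": [280.0, 281.0]}, dt=1.0, steps=2,
                           exchange_halos=lambda s: None)

Because the science layer never mentions the machine, it can move
unchanged to a new architecture; only the implementation layer needs
rewriting.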
Easier access to data. "It should be very easy for people to
compare observations and model data, but it's not," says software
engineer Lawrence Buja (CGD). More metadata ("data about data") is
what's needed to help researchers comb through the vast
archives at NCAR and elsewhere. A number of groups at NCAR and UCAR are
exploring ways to create better metadata, especially for large data sets
accessible through the World Wide Web.
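As a sketch of what richer metadata might look like (the field names
here are illustrative, not any existing standard), a record like this
attached to each archived data set would let a search tool match model
output against observations without ever opening the files:

    # Hypothetical metadata record ("data about data") for one data set.
    metadata = {
        "title": "Monthly mean surface temperature",
        "source": "climate model output",
        "variable": {"name": "TS", "units": "K"},
        "grid": "2.8 x 2.8 degree global",
        "time_range": ("1980-01", "1999-12"),
    }

    # A catalog of such records can be searched without touching the data.
    catalog = [metadata]
    hits = [m for m in catalog if m["variable"]["units"] == "K"]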
Continued progress in visualization. The NSF review stressed
NCAR's leadership role in visualizing large data sets. Considering the
unique demands of earth system modeling, the HP strategic plan states
that "visualization problems such as these are outside the scope of the
commercial marketplace, and an extensive research and development
program is required."
A higher profile for computer science at NCAR. The strategic
plan calls for a "refocused view of computing professionals" and
recommends their placement in the scientist job category in order to
ensure compensation on a par with their modeling colleagues.
SCD director Al Kellie, who moderated the August workshop, says he is
quite pleased with the interest it generated. "We had over 110 people to
start the day off. The group discussions were highly productive." The
next step, he says, is a continuing series of workshops and discussions
to help make the new paradigm a reality. Already, UCAR and NCAR are
teaming with several other institutions for HP simulation projects.
The next generation of CGD's Community Climate System Model (CCSM-2)
will be designed with support from the U.S. Department of Energy
(DOE) over the next 18 months. "We're starting to bring the pieces
together," says CGD senior scientist Byron Boville. "We are pushing to
have a model at the end of the year which will be running on the IBM."
Early next year CGD hopes to carry out a 1,000-year control run on the
CCSM-2. It would be based on preindustrial conditions in order to
determine the model's internal variability.
To create the CCSM-2, about 15 people from NCAR and five DOE labs are
collaborating on the Avant Garde Project, part of DOE's Accelerated
Climate Prediction Initiative. The project is merging the CCSM with the
DOE-supported Parallel Climate Model, originally developed by CGD's
Warren Washington and Jerry Meehl. NCAR has worked with Argonne and Oak
Ridge National Laboratories to create software engineering guidelines
for the entire model, including, for example, requirements documents,
unit testers, and validation code.
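As an illustration of the kind of unit test such guidelines call for
(a toy example in Python, not actual CCSM code), a test might verify
that a transport routine conserves mass:

    # Toy 1-D upwind advection step on a periodic domain.
    def advect(q, wind, dt, dx):
        c = wind * dt / dx
        return [q[i] - c * (q[i] - q[i - 1]) for i in range(len(q))]

    # Unit test: the total mass must be unchanged by one step.
    def test_mass_conservation():
        q = [0.0, 1.0, 2.0, 1.0, 0.0]
        q_new = advect(q, wind=1.0, dt=0.1, dx=1.0)
        assert abs(sum(q_new) - sum(q)) < 1e-12, "mass not conserved"

    test_mass_conservation()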
SCD's new visualization laboratory, to be completed late this year,
will house a node on the AccessGrid. Based at Argonne, this
fast-growing network
comprises several dozen institutions. The network includes large-format
displays integrated with virtual meeting rooms that allow group-to-group
communication. NSF has already conducted one of its program reviews
using the AccessGrid. Closer to home, SCD's Don Middleton suggests that
the high-speed connection between the Mesa and Foothills Labs be used as
a testbed for new collaborative technologies that increased bandwidth on
the 'Net might allow in coming years.
Frameworks (reusable collections of code) are being
explored as a way to simplify and speed up the creation of model
implementation layers. NCAR is about to submit a three-year proposal to
lead the development of an earth system model framework that could be
used for multiple models. The project's large group of collaborators
includes NASA, Argonne National Laboratory, the National Centers for
Environmental Prediction, Los Alamos National Laboratory, the University
of Michigan, the Massachusetts Institute of Technology, and NOAA's
Geophysical Fluid Dynamics Laboratory.
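One common pattern such a framework might standardize is a uniform
component interface, so that shared driver code can run and couple any
model that honors the contract. A minimal sketch follows (hypothetical
interface; the actual framework design is the subject of the proposal):

    # Hypothetical component contract a framework could impose.
    class ModelComponent:
        def initialize(self, config): raise NotImplementedError
        def run(self, n_steps): raise NotImplementedError
        def finalize(self): raise NotImplementedError

    # Any earth system component (atmosphere, ocean, land) fills it in.
    class ToyAtmosphere(ModelComponent):
        def initialize(self, config):
            self.steps_done = 0
        def run(self, n_steps):
            self.steps_done += n_steps   # real physics would go here
        def finalize(self):
            print("atmosphere ran", self.steps_done, "steps")

    # Framework-owned driver code works for any conforming component.
    component = ToyAtmosphere()
    component.initialize(config={})
    component.run(n_steps=10)
    component.finalize()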
Where's the funding?
How will the enthusiasm and brainstorming of recent months fare in a
climate of relatively fixed budgets? Nobody has an easy answer. It's
already a challenge to carry out the community-service aspects of our
big models without cutting into research, says Joe. "Somehow we need a
mechanism to support these models as facility resources." As for the
retooling outlined in the HP strategic plan, one estimate is that the
software engineering components could require several million dollars
in each of five years.
Many NCAR and UCAR groups are seeking grants to help in the transition.
Proposals are due by December for NSF's Information Technology Research
(ITR) program, which is offering grants ranging from single-investigator
projects (total budgets below $500,000 each) to group projects (up to $5
million) to large institutional proposals (up to $15 million over 5
years). The topics include complex geophysical coding, data
assimilation, collaboratories, and accessible visualization tools. Cliff
Jacobs, the NCAR program officer at NSF, encouraged attendees at the
August workshop to apply for the ITR grants.
Tim Killeen has asked SCD to coordinate development of a large-
institution proposal for NCAR. Al says, "It'll probably build upon the
themes laid out in the HP strategic plan as well as other initiatives
underway at NCAR. Some early thoughts are that NCAR could really serve
the geosciences community if we could achieve much better efficiencies
for our applications on highly parallel, microprocessor-based systems.
We need to crack some of the barriers that have been in the way of using
these machines. We're going to discuss this at the director's level and
then go out, and one of the keys will be to seek strong partnerships
with universities and potentially other centers."
According to Tim, NCAR will need more flexibility within its core
funding as well. "There definitely has to be a lot of permeability at
the boundaries among the divisions, and there already is. We need
structures that facilitate cross-divisional and cross-disciplinary
interactions," such as those in place in ESIG and ASP.
Last month Tim's office announced a new position that will bring in a
distinguished visiting computer scientist for an initial term of roughly
one year. Tim hopes that the inherent appeal of modeling the earth
system will spur interest from top people in applying for this position
and others that may follow. "If you think of it from the computer
science point of view (someone doing research), what would bring
them to the party? It would be a computer science challenge, not a
geophysics challenge: How do you integrate the different functionalities
[within an earth system model] in a way that maximally uses the
available computing architecture?" Most current code runs at well below
the ten percent efficiency level on today's highly parallel computers,
says Tim, "so we're underutilizing the computational resources. That's
what the strategic plan for scientific simulation is all about.
"Computer science has grown out of its infancy to be a mature science.
Now's the time to make an attempt to do this. It's going to be hard,
because a community climate model, for example, has players working at
different paces. Now we're giving them the extra challenge of a more
rigorous approach to the computer science, a more documented approach,
and it's going to be more work. But I think challenging things are often
more work, and it'll pay off in the long run."
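A back-of-envelope reading of that ten percent figure (illustrative
numbers, not measurements of any NCAR machine): efficiency is simply
sustained speed divided by aggregate peak speed.

    # If a 128-processor machine peaks at 1,000 Mflops per processor
    # but an application sustains 8,000 Mflops in aggregate:
    peak_mflops = 1000 * 128        # aggregate peak: 128,000 Mflops
    sustained_mflops = 8000
    efficiency = sustained_mflops / peak_mflops
    print(f"sustained/peak = {efficiency:.1%}")   # -> 6.2%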
WRF: A model to follow?
The WRF team includes (left to right) Shu-Hua Chen, principal
implementer of model physics; overall coordinator Joe Klemp; Bill
Skamarock, head of the working group for dynamic model numerics; and
John Michalakes, head of the working group for software architecture,
standards, and implementation. Not shown are Jimy Dudhia, head of the
working group for workshops, model distribution, and community support;
and Dave Gill, implementer of Web pages and real-data testing. (Photo
by Carlye Calvin.)
A "bare bones" version of the Weather Research and Forecast model will
shortly be released to a group of interested users. WRF (pronounced
"worf") will offer resolution that's about an order of magnitude better
than existing operational mesoscale models. "When we look down the road
to greater computer power, we want to have horizontal grids of a couple
kilometers so we can resolve small-scale weather features as they're
evolving," says Joe Klemp, who is leading the development effort at MMM.
WRF's other collaborators include NOAA's National Centers for
Environmental Prediction (NCEP) and Forecast Systems Laboratory (FSL),
the University of Oklahoma's Center for Analysis and Prediction of
Storms, and the Air Force Weather Agency.
WRF has a three-layer structure. John Michalakes (MMM), a visiting
computer scientist from Argonne National Laboratory working on WRF
development, explains: A driver layer deals with computer architecture
(and also such issues as managing nested grids) so that the user can run
the model on distributed-memory, shared-memory, vector, or cluster
machines without having to modify it. Theoretically, WRF's driver layer
could be used for other models, including general circulation
models. However, Michalakes points out that it would have to be modified
to deal with, for example, spectral transforms and coupling among
component models, since these features aren't yet part of WRF. The other
main layer, the model dynamics and physics, is the only one that will be
"visible" to a user. Joining the driver layer to the model layer is a
mediation layer, which Michalakes describes as "a glue layer that has to
know a little bit about both other layers so they can interact."
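A minimal sketch of how the three layers might call one another
(hypothetical Python names for illustration; WRF itself is written in
Fortran):

    # Model layer: dynamics and physics, the only part a user "sees."
    def solve_tile(tile, dt):
        tile["u"] = [u + 0.1 * dt for u in tile["u"]]

    # Mediation layer: the "glue" that knows a little about both
    # neighbors; it unpacks driver-owned structures into the plain
    # arrays the solver expects.
    def mediate(domain, dt):
        for tile in domain["tiles"]:
            solve_tile(tile, dt)

    # Driver layer: architecture, decomposition, and nest management.
    def driver(domain, dt, steps):
        for _ in range(steps):
            # halo exchange and nest handling would happen here
            mediate(domain, dt)

    driver({"tiles": [{"u": [0.0, 1.0]}]}, dt=0.5, steps=3)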
This structure gives WRF a flexibility that will be needed to serve both
researchers and forecasters. "There was rapid recognition among all the
participating organizations that there was value in developing a common
modeling system," Klemp says. "With WRF, at least there's a potential
for streamlining a lot of technology transfer."
Development of the model got started without a lot of WRF-specific
funding. "We've been trying to forge ahead on the resources available,"
says Klemp, who adds that the development team, which includes software
engineers and scientists, works together very well. "Our success is in
developing a real team attitude. [The engineers] don't just tell us what
to do and leave us to do it or not; there's a lot of going back and
forth until we agree on the best way to do it." Michalakes concurs:
"There's a joint appreciation, respect, and feeling of ownership by the
respective members of the team."
More details on WRF will appear in the
fall issue of the UCAR Quarterly.
Edited by Bob Henson.