by Bob Henson
Necessity has driven NCAR to consider a new high-performance computing data center, as the power needed to operate its computing systems is rapidly outgrowing the Mesa Lab's capacity. However, the occasion also presents an opportunity. In a recent workshop, geoscientists envisioned a new, high-tech, collaborative environment that would provide an innovative setting in which to address their toughest modeling problems.
NSF organized the workshop, which was hosted by NCAR on 25–27 September. The meeting drew more than 100 participants, including nearly a dozen officials from NSF's Geosciences Directorate and the Office of Cyberinfrastructure.
The need for advanced computational technologies in support of frontier research and education in the geosciences permeated the workshop's presentations and discussions. The vision that emerged is that of a distributed set of science and technology centers, or nodes, linked by ultrahigh-speed networks to form a powerful infrastructure supporting and coordinating the nation's research activities in the geosciences.
"This infrastructure would better enable interdisciplinary collaboration, and it could help educate students on advanced techniques in Earth system modeling, supercomputing, and visualization," said Richard Loft, director of technology development at NCAR's Computational and Information Systems Laboratory.
A centerpiece of the collaboratory vision is the call for a large computing system dedicated to the geosciences and capable of 500 to 1,000 trillion math operations per second. Such a system would likely have at least 100,000 processors, a huge challenge for programmers, but enough computing power to address such grand goals as modeling plate tectonics from first principles or running climate, ocean, and weather codes at unprecedented resolutions.
Loft noted that a precedent for this vision already exists. "The feasibility of the collaboratory has been demonstrated through such initiatives as the NSF-funded TeraGrid project," he pointed out. "The collaboratory would enable geo-specific computing, data, and observation systems in the public and private sectors to come together more effectively on the most challenging problems in geoscience."
Among the issues addressed at the workshop was how to deal with the enormous scale of computing that the new system would make possible. As one participant said, "It's not clear to me that anybody knows how to program 100,000 processors." Another added, "It takes a long time to develop codes; you have to start early."
Ensuring a steady source of support was another hot topic. Attendees stressed the need for multiyear funding that could keep initial deployments running while later systems are brought online.
"I think the key word is 'collaboratory,' " said Thomas "Zack" Powell (University of California at Berkeley) during the workshop's closing session. Others, including NCAR director Tim Killeen, echoed the need for distributed involvement. In his wrap-up remarks, Killeen emphasized the need for "more than petaflops" and the strong interest that both the geosciences community and NSF had expressed in the collaboratory concept. "This is significant for our professional union as well," said Killeen, who is currently serving as president of the American Geophysical Union (AGU).
Next steps in the process include:
• a workshop report to be issued in December and made available on the workshop Web site (see "On the Web");
• a town-hall meeting on the evening of 11 December at the annual AGU meeting in San Francisco; and
• the formation of a new GeoCollaboratory steering committee, spanning the geosciences, to work with NCAR and NSF on developing an implementation plan for the collaboratory.
In closing remarks, Tim Palmer (European Centre for Medium-Range Weather Forecasts) harked back to the roots of geoscience computing, including L.F. Richardson's idea in the 1920s of a "forecast factory" driven by hundreds of humans carrying out simultaneous calculations. "Richardson had the vision of a multiprocessor computing center for the geosciences. Of course, in his day, a processor was a human being," said Palmer.
"I'm really pleased to see the United States take the lead on this important initiative," Palmer added. "Geosciences worldwide will benefit from it."
As the collaboratory concept evolves, NCAR will continue to evaluate possible locations and structures for its new data center, according to Loft. The center may be located in Boulder (in partnership with the University of Colorado) or in Cheyenne (in partnership with the state of Wyoming and the University of Wyoming).