by Marijke Unger
Data sets on the order of terabytes (trillions
of bytes) aren’t easy to explore in an interactive
fashion without an armada of hardware and
software. But a new platform developed by NCAR
and two university partners offers users a comprehensive
desktop environment for analyzing massive data
sets, allowing them to see both the forest and
the trees.
VAPOR, the Visualization and Analysis
Platform for Ocean, Atmosphere, and Solar
Research, is an open-source software tool
developed by NCAR’s
Computational and Information Systems Laboratory
(CISL) in partnership with the University
of California at Davis and Ohio State University.
VAPOR is made possible through support from NSF’s
Information Technology Research for National Priorities
program.
Designed to render visualizations from
numerical data, VAPOR makes it possible for
researchers literally to see the meaning
in data sets that are often too large or complex
to analyze in any other way. Scientists are using
VAPOR to explore simulations on cubic grids as large
as 1,536³ grid points. That’s
3.6 billion grid points and about 15 gigabytes
of storage for a single variable.
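The arithmetic behind those figures can be checked with a short sketch. The article does not say how each value is stored; the 4 bytes per grid point used here (a single-precision float) is an assumption, and the function name is purely illustrative.

```python
# Back-of-the-envelope size of one variable on a cubic grid.
# Assumption (not stated in the article): each value is a 4-byte float.
def cube_storage(n, bytes_per_value=4):
    """Return (number of grid points, bytes needed) for an n x n x n grid."""
    points = n ** 3
    return points, points * bytes_per_value

points, size_bytes = cube_storage(1536)
print(f"{points:,} grid points")
print(f"{size_bytes / 1e9:.1f} GB (decimal), {size_bytes / 2**30:.1f} GiB (binary)")
```

Under that assumption, one variable at one time step comes to roughly 14.5 billion bytes, consistent with the "about 15 gigabytes" quoted above.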
The
four-step image below, produced by NCAR’s
Pablo Mininni and colleagues using VAPOR, shows
progressive zooms in a 3-D rendering of the intensity
of vorticity (circulation) in a very-high-resolution
simulation of hydrodynamic turbulence. The
image allows scientists to see vortex filaments
(elongated structures with strong vorticity) as
well as their clustering into larger-scale patterns.
Before the
advent of VAPOR, interactively exploring
such a large data set would require a “brute
force” approach, using parallel visualization
software and dozens of connected workstations.
Proceeding through multiple time steps would
demand a visual supercomputer composed of
hundreds of nodes and an equally substantial storage
system capable of delivering tens of gigabytes
of data per second. What makes VAPOR unique is
its capacity to render a big-picture view and explore
smaller areas of interest, all from the convenience
of a single desktop or laptop computer.
Analogous
to the way that Google Earth lets users zoom
in from a low-Earth orbit all the way down to their
driveways, VAPOR can generate a 3-D visualization
from an enormous collection of data, then
allow the user to zoom in on particular features
for closer study (see graphic). And VAPOR lets
researchers analyze high-resolution data at a distance,
without having to download it from the supercomputing
facilities where it is generated or stored.
“We wanted to create a tool that could be
used by scientists rather than visualization experts,” says
John Clyne, who leads CISL’s VAPOR efforts.
The idea, he adds, was for “an analysis tool
as opposed to a visualization tool, but one where
visualization was a significant part of the analysis
process.”
Getting a handle on turbulence
VAPOR’s roots lie in studies of turbulence,
which underlies phenomena as diverse as Earth’s
magnetosphere, salinity gradients that drive
ocean circulation, and daily weather. To understand
the dynamics of the atmosphere and oceans, the
Sun, and solar-terrestrial interactions, physical
scientists find it essential to investigate relevant
turbulent processes at a fundamental level—which
often takes massive simulations.
The Institute for
Mathematics Applied to Geosciences (IMAGe),
part of CISL, works extensively on turbulence problems.
The group created a large magnetohydrodynamic
(MHD) simulation
at very high resolution, using a code developed
and maintained at NCAR and run in collaboration
with several groups in the United States,
France, England, and Argentina. MHD simulations
are used to understand the dynamics of solar
magnetic fields, solar winds, Earth’s core,
and space weather. IMAGe’s simulation—the
largest numerical experiment of its kind—illustrated
for the first time the self-similar (fractal-style)
growth of maxima in the formation, rolling,
and stretching of vorticity (circulation) and current
sheets.

Front-page treatment: VAPOR-produced
imagery was featured on the 30 November 2007 cover of
Physical Review Letters and earlier this year in a 10th
Anniversary Highlights compilation from New Journal of
Physics. The PRL cover image portrays a wavy spatial distribution
of magnetic energy and field lines from magnetohydrodynamic
simulations carried out by Yannick Ponty of France’s
Côte d’Azur Observatory. For New Journal of
Physics, Mark Rast (University of Colorado at Boulder)
and NCAR’s Pablo Mininni worked with VAPOR’s
John Clyne and Alan Norton to produce the images at top
center, showing a downflowing circulation within turbulent
convection, and center right, depicting a snapshot of
pressure fluctuation within the head of a 3-D compressible
plume. (Cover images courtesy American Physical Society
and American Institute of Physics/Deutsche Physikalische
Gesellschaft.)
IMAGe’s Pablo Mininni used VAPOR to
look at areas of the MHD simulation in closer
detail. “We
were able to find structures that we wouldn’t
have found in any other way. One of the nicest
features was being able to navigate in an
interactive way and look at tiny structures
in this enormous cube, without having to wait a
long time to see what the image looks like. It
really facilitates the discovery process,” he
says.
At Oregon State University, William Smyth
and graduate student Satoshi Kimura used
VAPOR to study the role of turbulence in
the mixing of seawater. This crucial aspect
of ocean circulation and climate is complicated
by the fact that heat and salt diffuse at
very different rates. A glass half-filled
with cold water and half with warm water will mix
within a few minutes, notes Smyth. However, if
the same glass is filled with equal portions
of salty and fresh water, the salinity difference
will persist for many days because the salt
gradients diffuse very slowly.
The upshot:
in order to simulate ocean mixing, one must
use fine enough resolution to capture the millimeter-scale
salt gradients, while covering a volume of
water large enough for turbulence to develop. “This
computational problem was considered impossible
until last year,” says
Smyth. His and Mininni’s projects, plus several
others, were made possible through the Breakthrough
Science program, designed by NSF and NCAR
to facilitate scientific discovery through very
large allocations of resources associated with
the arrival of NCAR’s
blueice supercomputer in late 2006. (CISL
recently announced a similar program, Accelerated
Scientific Discovery, calling on the lab’s
latest supercomputer, bluefire, which is being
installed and tested this spring. Applications for
the program are being accepted through 21 May;
see “On the Web.”)
“With the supercomputers at NCAR, we can
do much more realistic simulations,” says
Smyth. However, he adds, analyzing such simulations
is no piece of cake. “You have to ask a
lot of questions, and it’s no good if it
takes months to get every answer. By then, you’ve
forgotten the question!”
One way that VAPOR makes large data sets
more tractable is through wavelet transforms—mathematical
transformations that facilitate representing the
data in a compact fashion. With VAPOR’s wavelet
transforms, according to Smyth, “the size
of the data set is reduced while keeping most of
the essential information, so you can get answers
on time scales of minutes, just like with workstation-sized
problems.”
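To give a feel for how a wavelet transform can shrink a data set while keeping most of its information, here is a minimal sketch using a one-dimensional Haar wavelet in plain Python. It is not VAPOR's implementation: the choice of the Haar wavelet, the function names, the synthetic sine-wave signal, and the "keep the 16 largest coefficients" rule are all assumptions made for illustration.

```python
import math

def haar_forward(signal):
    """Full 1-D orthonormal Haar decomposition; length must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (coarse view) and differences (fine detail).
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = avg + diff
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        out[:2 * n] = merged
        n *= 2
    return out

def compress(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

# Synthetic smooth signal standing in for one line of simulation output.
signal = [math.sin(2 * math.pi * i / 64) for i in range(64)]
coeffs = haar_forward(signal)
approx = haar_inverse(compress(coeffs, keep=16))
max_err = max(abs(a - b) for a, b in zip(signal, approx))
print(f"kept 16 of 64 coefficients, max error = {max_err:.3f}")
```

Because the coefficients are ordered from coarse to fine, the same representation also supports a quick low-resolution overview followed by refinement where more detail is wanted, which is the general idea behind exploring a large volume interactively.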
Onward to weather

VAPOR depicts cold air sweeping southward
across Georgia by tracing particle motion based
on output from the Weather Research and Forecasting
model. (Image courtesy VAPOR and Thara Prabhakaran,
University of Georgia.)
While the early versions of VAPOR targeted physicists
who study turbulence, the software is being
enhanced to meet the needs of the more general
Earth and space sciences community. For example,
researchers are developing capabilities for
use with the multiagency Weather Research and Forecasting
(WRF) modeling system, which is now used
widely at global prediction centers as well as
research labs. CISL has surveyed WRF users and
is planning to tailor several new features for
them, including two-dimensional data slices as
well as colored surfaces
that depict constant values of temperature
or other weather variables.
Weather modelers may
also find some of the turbulence-oriented
features of VAPOR useful, according to CISL’s
Alan Norton. For example, VAPOR can display
moving images of fluid motion using a technique
called Image-Based Flow Visualization. With this
tool, weather researchers have been able to observe
the multiplicity of vortices emerging along the
front edge of a line of severe thunderstorms.
Two
major new versions of VAPOR were completed
and released in the last year, and more than 4,000
copies of the software have been downloaded
by research groups worldwide. VAPOR-produced imagery
has already appeared in a number of scientific
publications and presentations.
VAPOR’s developers are looking ahead to future
editions. Preliminary research in more aggressive
data reduction techniques has been promising. Such
research is the cornerstone of another key area
of future development: preparing VAPOR for petascale
computing.
“VAPOR is breaking down some of the barriers
to large-scale science,” says Clyne, “making
it possible to explore vast data sets without the
need for Herculean computing resources.” ♦