
October 2001
UOP's newest program will help unite digital libraries
Ben Domenico and Dave Fulker. (Photo by Carlye Calvin.)
Countless science-oriented Web sites have blossomed on the open
frontier of cyberspace. Many of those serving as outposts of
science-education reform are being federated over the next few
years as part of an unusually grand experiment. They will still
operate as independent units but, like a United Sites of the Web,
they will be linked in a fashion that multiplies their power and
educational impact, creating the National Science Digital Library,
or NSDL.
Likely to be the largest and most heterogeneous science library
ever attempted, NSDL (scheduled to debut in the fall of
2002) will offer selected materials to students, teachers, and
professionals at all levels in science, mathematics, engineering,
and technology education (the areas denoted by "science" in
NSDL).
NSDL's core integration effort will be headquartered at UCAR as a
new part of the UCAR Office of Programs. It will be overseen by
Dave Fulker, long-time director of Unidata. Dave will continue to
lead Unidata on a half-time basis, with program manager Ben
Domenico (see sidebar) taking on a larger role in Unidata's
day-to-day operations. NSDL
expects to hire at least six new staffers in its office on the
third floor of FL4. The office's deputy director will be Kaye
Howe, a consultant in higher education and former vice chancellor
for academic services at CU-Boulder. "I am really looking forward
to this position: great people and a wonderful project," says
Howe.
The core integration effort involves two key institutional
partners, Cornell and Columbia Universities, as well as
two other parts of UCAR: the Digital Library for Earth System
Education (DLESE), whose Program Center is led by Mary Marlino,
and the Education and Outreach Program, led by Roberta Johnson.
According to Dave, the library's goal is to establish information
flows and an organizational architecture that will take it beyond
what one can now do with a Web-based search engine. "We think of
NSDL as an education layer over the Web." The Cornell team
estimates that by 2006 there may be a million users choosing
materials from ten million resources at many thousands of
independent sites. Effective characterization of these resources
through metadata (data about the data) will thus be essential.
The project will encourage users to customize the library for
their own needs. For example, Dave notes, "a teacher might combine
certain design elements, tools, and collections into a portal
appropriate for his or her eighth-grade astronomy unit." Using a
"one library, many portals" philosophy, independent
portals (sites oriented toward providing access to other
data-rich sites) will be built and supported.
The NSDL core team is also looking closely at copyright and
financial aspects of the library. Some parts of the collection
will likely be open to all, while others will be restricted to
paying users. The project expects to rely largely on institutional
licenses, which will enable students and educators to use most or
all of NSDL at no additional cost.
"In the long term, any science library worth its salt has to have
data," says Dave, "but data sets in a modern library are useless
without tools." With its expert-certified materials, powerful
indexing, and multiple interfaces, NSDL stands to provide far more
than the sum of the vast data holdings it will soon encompass.
Bob Henson
THREDDS: Helping users weave through data
Tucked in amid the flow charts and plans on Ben Domenico's
blackboard is a quote from the jazz great Thelonious Monk: "Simple
ain't easy." Although it may not be easy, Ben's goal is a simple
one: he wants to "provide the capabilities for scientific data on
the Web that we now have for multimedia documents."
Unidata's program manager is heading up a newly funded project to do
just that. It's called Thematic Real-time Environmental Data
Distributed Services. THREDDS has come to life with roughly $900,000 in
funding over the next two years from NSDL (see sidebar). THREDDS is
part of NSDL's collections effort (which itself is separate from
the core integration tasks that will be based in UOP), but it's a
key piece of the puzzle.
The focus of THREDDS is to improve how scientists, educators, and
students publish, find, and use data. The default practice for
many people in the atmospheric and related sciences is to contact
their colleagues about where to get data sets, then download and
process them using software on local computers or have them
delivered automatically in real time using the Unidata Internet
Data Distribution system. "The Unidata community uses powerful
analysis tools, [but] the data must reside on users' local
machines," says Ben. With THREDDS, researchers will still be able
to use the analysis tools on their own machines, but they'll have
the option of accessing data from a set of distributed servers.
THREDDS is a highly collaborative project with more than 20
participant institutions. For example, the Distributed
Oceanographic Data System (DODS) is a key component that allows
users to specify a data set in terms of a URL on a remote server
as if it were a file on a local computer. "DODS makes access
convenient once you know the URLs for the data sets of interest,"
says Ben, "but finding the data is not always easy." The metadata
at the heart of THREDDS will make complex data sets much easier to
find.
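To make the idea concrete, here is a minimal sketch of a metadata lookup that returns data set URLs. It is an illustration only, not THREDDS or DODS code: the record layout, the `find_datasets` helper, the keywords, and the example.org addresses are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """One hypothetical catalog entry: metadata plus the URL where the data lives."""
    title: str
    url: str
    keywords: set = field(default_factory=set)


def find_datasets(catalog, *terms):
    """Return records whose keywords include every search term (case-insensitive)."""
    wanted = {t.lower() for t in terms}
    return [r for r in catalog if wanted <= {k.lower() for k in r.keywords}]


# Illustrative entries only; the hosts and paths are invented.
CATALOG = [
    DatasetRecord("Surface station reports", "http://example.org/dods/surface",
                  {"weather", "station", "real-time"}),
    DatasetRecord("Satellite imagery", "http://example.org/dods/satellite",
                  {"weather", "satellite", "real-time"}),
    DatasetRecord("Seismic waveforms", "http://example.org/dods/seismic",
                  {"seismic"}),
]

if __name__ == "__main__":
    # Search by what the data is about, then hand the URL to an analysis tool.
    for record in find_datasets(CATALOG, "weather", "real-time"):
        print(record.title, "->", record.url)
```

In this sketch the user searches by topic rather than by file name, and the URL that comes back is what a DODS-style client would then open as if it were a local file.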
These data sets will range across and beyond the breadth of UCAR
science, says Ben: "anything from a single report at a weather
observation station, to a complete satellite picture, to seismic
data."
Since it's centered at Unidata, THREDDS will be able to call upon
that program's long history of innovation in providing
universities with data. The 12 data providers committed to
providing THREDDS services include
- NOAA's National Climatic Data Center, for climate data;
- the Incorporated Research Institutions for Seismology, for
seismic data;
- the Navy's Fleet Numerical Meteorology and Oceanography
Center, for oceanographic data; and
- NOAA's National Geophysical Data Center, for geophysical
data.
Testbed server implementations will be done on the SCD/Unidata
Community Data Portal, which captures nearly a gigabyte of data
each hour from the Internet Data Distribution system, and on a
satellite data server at the University of Wisconsin's Space
Science and Engineering Center.
THREDDS will incorporate a set of client applications for analysis
and display that will allow speedy and intuitive access to the
data. In addition to Unidata's MetApps team, groups within ATD and
SCD, as well as several organizations outside UCAR, are working on
a diverse array of potential client software (software linking
desktop computers to remote servers). Some of these clients will
be as simple as an existing browser that would allow data analysis
to be carried out on data-hosting servers. Other, "thicker"
clients would allow users to find, analyze, and display data from
the remote servers on their local machines. In either case, the
client would be capable of
- visualizing complex, multidimensional data;
- integrating and overlaying data from multiple sources; and
- gracefully handling spatial coordinate systems, measurable
quantities, units of measure, and sampling variations.
As THREDDS evolves, users will find themselves freed from the
arcane world of file formats and naming conventions. Instead
they'll be navigating through data almost as easily as a newbie
surfs the Web. It's an ambitious goal, says Ben, "but we think we
can make significant strides."
Bob Henson
Edited by David Hosansky,
hosansky@ucar.edu
Prepared for the Web by Jacque Marshall
Last revised: Thu Oct 25 11:18:36 MDT 2001