h1. NODC Slender Node Implementation

The goal of this project is to implement a GMN-based Slender Node to enable NODC content to appear within the DataONE federation.

GMN supports operation as a proxy, in which content is registered with the GMN instance but stored elsewhere. The GMN instance behaves like a normal Member Node within the DataONE ecosystem; the only major difference is that the MN.get() operation retrieves the content from the remote repository and passes it on to the caller. In this sense, GMN operates as a proxy service for the NODC repository.

!nodc_connector.png!

h2. The NODC Repository Service

The NODC connector uses the OGC CSW protocol exposed by the NODC repository:

http://data.nodc.noaa.gov/geoportal/csw

A typical request:

http://data.nodc.noaa.gov/geoportal/csw/discovery?Service=CSW&startposition=0&Request=GetRecords&resulttype=results&maxrecords=5&Version=2.0.2
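
The request above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name @build_getrecords_url@ and its arguments are hypothetical, while the parameter names and values are copied from the example URL.

```python
from urllib.parse import urlencode

# Endpoint taken from the example request above.
CSW_DISCOVERY_URL = "http://data.nodc.noaa.gov/geoportal/csw/discovery"


def build_getrecords_url(start_position=0, max_records=5):
    """Build a CSW GetRecords URL shaped like the example request.

    Hypothetical helper: only the parameters that appear in the example
    URL are included.
    """
    params = {
        "Service": "CSW",
        "startposition": start_position,
        "Request": "GetRecords",
        "resulttype": "results",
        "maxrecords": max_records,
        "Version": "2.0.2",
    }
    # urlencode keeps the insertion order of the dict.
    return CSW_DISCOVERY_URL + "?" + urlencode(params)


print(build_getrecords_url())
```

Paging through the catalogue would presumably mean calling the helper repeatedly with a larger @start_position@, though the paging behaviour of the NODC service is not described here.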

With response (one record shown; namespace declarations and element attributes are omitted):

<pre>
<csw:GetRecordsResponse>
<csw:SearchResults>
<csw:SummaryRecord>
*** <dc:identifier>gov.noaa.nodc:0118767</dc:identifier>
*** <dc:identifier>{4C51D1C4-A9DA-4CA0-898D-283483E24F67}</dc:identifier>
  <dc:title>Oceanographic and surface meteorological data collected
    from station wwef1 by Everglades National Park (ENP) and assembled
    by Southeast Coastal Ocean Observing Regional Association (SECOORA)
    in the Coastal Waters of Florida, Gulf of Mexico and North Atlantic
    Ocean from 2014-02-13 to 2015-04-30 (NODC Accession 0118767)
  </dc:title>
  <dc:type>downloadableData</dc:type>
  <dc:type>application</dc:type>
  <dc:type>downloadableData</dc:type>
  <dc:type>downloadableData</dc:type>
  <dc:type>application</dc:type>
  <dc:type>downloadableData</dc:type>
  <dc:type>downloadableData</dc:type>
  <dc:subject>oceanography</dc:subject>
  <dc:subject>oceans</dc:subject>
  <dc:subject>environment</dc:subject>
  <dc:subject>0118767</dc:subject>
  <dc:subject>118767</dc:subject>
  <dc:subject>WATER TEMPERATURE</dc:subject>
  <dc:subject>Temperature Sensors</dc:subject>
  <dc:subject>physical</dc:subject>
  <dc:subject>time series</dc:subject>
  <dc:subject>FIXED PLATFORM</dc:subject>
  <dc:subject>US DOI; NPS; Everglades National Park</dc:subject>
  <dc:subject>Southeast Coastal Ocean Observing Regional Association</dc:subject>
  <dc:subject>Integrated Ocean Observing System (IOOS)</dc:subject>
  <dc:subject>Integrated Ocean Observing System Data Assembly Centers Data Stewardship Program</dc:subject>
  <dc:subject>Coastal Waters of Florida</dc:subject>
  <dc:subject>Gulf of Mexico</dc:subject>
  <dc:subject>North Atlantic Ocean</dc:subject>
  <dc:subject>latitude</dc:subject>
  <dc:subject>longitude</dc:subject>
  <dc:subject>sea_water_temperature</dc:subject>
  <dc:subject>time</dc:subject>
  <dc:subject>North Atlantic Ocean</dc:subject>
  <dc:subject>North Atlantic Ocean</dc:subject>
*** <dct:modified>2015-06-04T20:25:40+00:00</dct:modified>
  <dct:abstract>NODC Accession 0118767 contains oceanographic and
    surface meteorological data in netCDF formatted files, which follow
    the Climate and Forecast metadata convention (CF) and the Attribute
    Convention for Data Discovery (ACDD). ENP collected the data from
    their in-situ moored station named wwef1 in the Coastal Waters of
    Florida, Gulf of Mexico and North Atlantic Ocean. SECOORA, which
    assembles data from ENP and other sub-regional coastal and ocean
    observing systems of the southeast United States, submitted the data
    to NODC as part of NODC's Integrated Ocean Observing System Data
    Assembly Centers (IOOS DACs) Data Stewardship Program. Each month,
    NODC adds to the Accession the data collected during the previous
    month.</dct:abstract>
  <dct:references>http://accession.nodc.noaa.gov/oas/118767</dct:references>
  <dct:references>http://data.nodc.noaa.gov/cgi-bin/gfx?id=gov.noaa.nodc:0118767</dct:references>
  <dct:references>http://data.nodc.noaa.gov/cgi-bin/gfx?id=gov.noaa.nodc:0118767</dct:references>
  <dct:references>http://data.nodc.noaa.gov/geoportal/csw?getxml=%7B4C51D1C4-A9DA-4CA0-898D-283483E24F67%7D</dct:references>
  <dct:references>http://data.nodc.noaa.gov/thredds/catalog/ioos/secoora/enp.wwef1.met/</dct:references>
*** <dct:references>ftp://ftp.nodc.noaa.gov/nodc/archive/arc0064/0118767/</dct:references>
  <dct:references>http://data.nodc.noaa.gov/opendap/ioos/secoora/enp.wwef1.met/</dct:references>
  <dct:references>http://accession.nodc.noaa.gov/download/118767</dct:references>
  <ows:WGS84BoundingBox>
    <ows:LowerCorner>-80.939 25.23</ows:LowerCorner>
    <ows:UpperCorner>-80.939 25.23</ows:UpperCorner>
  </ows:WGS84BoundingBox>
  <ows:BoundingBox>
    <ows:LowerCorner>-80.939 25.23</ows:LowerCorner>
    <ows:UpperCorner>-80.939 25.23</ows:UpperCorner>
  </ows:BoundingBox>
  <dct:date>2014-02-13Z</dct:date>
  <dct:date>2015-04-30Z</dct:date>
</csw:SummaryRecord>
</csw:SearchResults>
</csw:GetRecordsResponse>
</pre>

Records retrieved from the CSW instance give the locations of the science metadata and data objects, along with further descriptive information.

Each record presented by the CSW instance corresponds conceptually with a Data Package in DataONE.

Objects are accessible through HTTP (for science metadata) and FTP (for data objects) protocols.
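
As an illustration of how a record might be read, the sketch below extracts the identifiers, the modification time, and the HTTP and FTP locations from a single @csw:SummaryRecord@ using the standard library. The function name @summarize_record@ and the shape of the returned dictionary are hypothetical; the namespace URIs are the standard CSW 2.0.2 and Dublin Core ones, which the NODC service is assumed to use. The sample record is a trimmed copy of values from the example response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Standard namespace URIs (assumed to match what the NODC CSW returns).
NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
}


def summarize_record(record):
    """Collect identifiers, dct:modified and object URLs from one record element."""
    refs = [el.text.strip() for el in record.findall("dct:references", NS) if el.text]
    ids = [el.text.strip() for el in record.findall("dc:identifier", NS) if el.text]
    return {
        "identifiers": ids,
        "modified": record.findtext("dct:modified", default=None, namespaces=NS),
        # Science metadata is reached over HTTP, data objects over FTP.
        "http_urls": [u for u in refs if urlparse(u).scheme in ("http", "https")],
        "ftp_urls": [u for u in refs if urlparse(u).scheme == "ftp"],
    }


SAMPLE = """
<csw:SummaryRecord xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:dct="http://purl.org/dc/terms/">
  <dc:identifier>gov.noaa.nodc:0118767</dc:identifier>
  <dct:modified>2015-06-04T20:25:40+00:00</dct:modified>
  <dct:references>http://accession.nodc.noaa.gov/oas/118767</dct:references>
  <dct:references>ftp://ftp.nodc.noaa.gov/nodc/archive/arc0064/0118767/</dct:references>
</csw:SummaryRecord>
"""

info = summarize_record(ET.fromstring(SAMPLE))
print(info["identifiers"], info["modified"])
print(info["ftp_urls"])
```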

h3. FTP Content

Example: ftp://ftp.nodc.noaa.gov/nodc/archive/arc0064/0118767/

!nodc_ftp_ss.png!

1** 0118767
2** 14.14
3** 2015-06-05 07:33:15+00
mare
4** /nodc/archive/arc0064/0118767/14.14/
24
416077



5** NODC-Readme.txt
4ed87de71d73f3bf04f48b23591265ca
6*** 7f0f82e8500dd343d4c7e0ff4f7259d50c297a91
700575ea312083223095e1c8f7a54a1e7bcfc790146ed4ccb1bbad247a3b6aa3
c890984f8709860ae0958cee10e489b3aa5d5ddfbfde51da6d500a2609c91fb8724767494200ce974281f2799d256fa3
c65c6154c459e490bf5842423f38628ceda74e4c8101f32ab88273a0af018433241ca0327b8e1015bca7bed1de9dc12bf216d129cd8b900fb2fb547ac8559aaa
7*** regular
8*** text
-rw-rw-r--
14
root
root
9*** 2816
32768
8
10* 2013-12-08 14:53:02+00
11* 2015-06-05 07:33:14+00
2015-06-04 07:36:15+00
2015-06-05 07:33:15+00
266538

  1. Used internally by connector
  2. The version of the package, used internally
  3. Used to indicate to GMN when the package was last modified
  4. PID - Identifier of package
  5. Name of file - parsed to determine format ID
  6. sysmeta.checksum and algorithm
  7, 8. Used to assist with format ID determination
  9. sysmeta.size
  10. sysmeta.dateUploaded
  11. Connector tests if object has changed
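
The mapping from listing values to sysmeta.checksum, sysmeta.size and sysmeta.dateUploaded could be sketched as follows. This is only an illustration: the function names and dictionary keys are hypothetical and are not GMN's API, the checksum algorithm is inferred from the length of the hex digest, and timestamps are assumed to always carry an hour-only UTC offset such as @+00@, as in the example.

```python
from datetime import datetime

# Hex digest length -> algorithm name; the listing carries several digests per file.
DIGEST_LENGTHS = {32: "MD5", 40: "SHA-1", 64: "SHA-256", 96: "SHA-384", 128: "SHA-512"}


def checksum_algorithm(hex_digest):
    """Infer the algorithm from the digest length (raises KeyError if unknown)."""
    return DIGEST_LENGTHS[len(hex_digest)]


def parse_listing_time(text):
    """Parse 'YYYY-MM-DD HH:MM:SS+HH'.

    Assumption: the offset is always hour-only, so '00' minutes are appended
    to make it acceptable to the %z directive.
    """
    return datetime.strptime(text + "00", "%Y-%m-%d %H:%M:%S%z")


def sysmeta_fields(checksum, size, date_uploaded):
    """Return a plain dict of the system metadata values derived from the listing."""
    return {
        "checksum": checksum,
        "checksumAlgorithm": checksum_algorithm(checksum),
        "size": int(size),
        "dateUploaded": parse_listing_time(date_uploaded),
    }


# Values copied from the NODC-Readme.txt entry in the example listing.
fields = sysmeta_fields(
    "7f0f82e8500dd343d4c7e0ff4f7259d50c297a91",
    "2816",
    "2013-12-08 14:53:02+00",
)
print(fields["checksumAlgorithm"], fields["size"], fields["dateUploaded"].isoformat())
```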

PID for object = directory + filename

SID for object = directory - version + filename
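
The two rules above might be expressed as in the sketch below. The function names are hypothetical, and the sketch assumes that the version is always the final path component of the directory, as it is in the example listing (@14.14@).

```python
import posixpath


def object_pid(directory, filename):
    """PID = directory + filename."""
    return posixpath.join(directory, filename)


def object_sid(directory, filename):
    """SID = directory - version + filename.

    Assumption: the version is the last component of the directory path.
    """
    without_version = posixpath.dirname(directory.rstrip("/"))
    return posixpath.join(without_version, filename)


directory = "/nodc/archive/arc0064/0118767/14.14/"
pid = object_pid(directory, "NODC-Readme.txt")
sid = object_sid(directory, "NODC-Readme.txt")
print(pid)
print(sid)
```

Under these assumptions the PID changes with each new package version while the SID stays stable across versions.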

h2. The NODC Connector

The connector is to be developed in Python and will operate as a standalone tool that is integrated with an instance of GMN. The connector is responsible for discovering content on the CSW instance and ensuring that the content is accurately represented in the GMN instance. The connector is responsible for detecting new and changed content, generating system metadata, generating resource map documents, and presenting these along with object location URLs to the GMN instance.
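
Detection of new and changed content could, for example, compare the modification times reported by the CSW instance with those recorded at the previous run. The sketch below is one possible approach and not the connector's actual design; @classify_records@ and both input dictionaries (identifier mapped to an ISO 8601 @dct:modified@ string) are hypothetical.

```python
from datetime import datetime


def classify_records(remote, known):
    """Split remote identifiers into new and changed ones.

    remote, known: dicts mapping a record identifier to its dct:modified
    value as an ISO 8601 string (hypothetical bookkeeping format).
    Returns two sorted lists: (new, changed).
    """
    new, changed = [], []
    for identifier, modified in remote.items():
        if identifier not in known:
            new.append(identifier)
        elif datetime.fromisoformat(modified) > datetime.fromisoformat(known[identifier]):
            changed.append(identifier)
    return sorted(new), sorted(changed)


known = {"gov.noaa.nodc:0118767": "2015-05-04T20:25:40+00:00"}
remote = {
    "gov.noaa.nodc:0118767": "2015-06-04T20:25:40+00:00",  # modified since last run
    "gov.noaa.nodc:0000001": "2015-06-01T00:00:00+00:00",  # made-up identifier, not yet known
}
new, changed = classify_records(remote, known)
print(new, changed)
```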

h2. The GMN Instance

The GMN instance is responsible for operating as a DataONE Member Node, responding appropriately to updates from the connector, and ensuring clients are able to access the objects held by the NODC repository.

The MN.get() operation of the GMN instance will need to efficiently proxy both HTTP and FTP sources so that clients are able to access the content.
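
One way to proxy both protocols without holding a whole object in memory is to stream it in chunks. The sketch below is not GMN's implementation; it relies on the fact that @urllib.request.urlopen@ in the Python standard library handles @http@, @https@ and @ftp@ URLs, and the function name, chunk size and @opener@ parameter are assumptions made for illustration.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

ALLOWED_SCHEMES = {"http", "https", "ftp"}


def iter_remote_object(url, chunk_size=64 * 1024, opener=urlopen):
    """Yield the bytes of a remote object in chunks.

    opener is injectable so the function can be exercised without network
    access; by default it is urllib.request.urlopen.
    Note: as a generator, the scheme check runs on first iteration.
    """
    scheme = urlparse(url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError("unsupported URL scheme: %r" % scheme)
    with opener(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

A web framework's streaming response could then consume the generator and forward each chunk to the caller of MN.get().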

nodc_connector.png (40.2 KB) Dave Vieglais, 2015-06-05 12:58

nodc_ftp_ss.png (118 KB) Dave Vieglais, 2015-06-05 13:21
