Story #7170

Evaluate the feasibility of extracting provenance information from the journal.txt document

Added by Dave Vieglais almost 9 years ago. Updated almost 9 years ago.

Status: New
Priority: Normal
Assignee:
Target version: -
Start date:
Due date:
% Done: 0%
Story Points:
Sprint:

History

#1 Updated by Dave Vieglais almost 9 years ago

  • Assignee set to Dave Vieglais

Each entry on the FTP site contains a "journal.txt" document that describes the provenance of the package.

Unfortunately, it is written for human readers and is not in a machine-readable form.

The goal of this activity is to determine what useful provenance information, if any, can be extracted from the journal and associated information (e.g. manifest files).

Example journal file:

ftp://ftp.nodc.noaa.gov/nodc/archive/arc0064/0118783/14.14/about/journal.txt

#2 Updated by Dave Vieglais almost 9 years ago

This is likely to be a laborious free-text processing exercise. From John Relph:

No, the journal.txt is not available in any machine-readable format. We
have talked about implementing a system to enable that, but currently the
file is a mostly free-form file for the use of the human data content
manager to record actions performed and other information.

We will, at some point soon, start populating lineage sections in our ISO
metadata with information about the various revisions of the accessions.
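
A first feasibility pass over the free-form text could be sketched roughly as follows. This is only a minimal illustration, assuming that at least some journal lines carry an ISO-style date (YYYY-MM-DD); the real journal.txt layout has not been checked against this. The names `find_dated_lines`, `coverage`, and `Candidate` are hypothetical, and the sample text is invented, not taken from an actual journal.

```python
import re
from typing import NamedTuple

# Assumption: some journal lines contain an ISO-style date (YYYY-MM-DD).
# The actual journal.txt layout has not been verified.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")


class Candidate(NamedTuple):
    """A line that might describe a provenance event (hypothetical type)."""
    line_no: int
    date: str
    text: str


def find_dated_lines(journal_text: str) -> list[Candidate]:
    """Return lines that mention a date, as candidate provenance events."""
    candidates = []
    for line_no, line in enumerate(journal_text.splitlines(), start=1):
        match = DATE_RE.search(line)
        if match:
            candidates.append(Candidate(line_no, match.group(1), line.strip()))
    return candidates


def coverage(journal_text: str) -> float:
    """Fraction of non-blank lines that yielded a candidate event."""
    non_blank = [line for line in journal_text.splitlines() if line.strip()]
    if not non_blank:
        return 0.0
    return len(find_dated_lines(journal_text)) / len(non_blank)


# Invented sample text for illustration, NOT from a real journal.txt.
SAMPLE = """\
2014-05-01 received files from submitter
checked manifest against contents

2014-05-03 published version 1.1
"""

if __name__ == "__main__":
    for candidate in find_dated_lines(SAMPLE):
        print(candidate.line_no, candidate.date, candidate.text)
    print(f"coverage: {coverage(SAMPLE):.2f}")
```

Running something like this over a sample of journals and looking at the coverage figure would give a rough sense of how much of the text follows any recognizable pattern, before investing in heavier free-text processing.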
