Story #1378

Adding new object formats shouldn't require buildout

Added by Chris Jones about 10 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: Robert Waltz
Category: d1_schemas
Target version:
Start date: 2011-03-29
Due date:
% Done: 100%
Story Points:
Sprint:

Subtasks

Task #1451: Add an object format service to Metacat (Closed, Chris Jones)

Task #1452: Replace ObjectFormat calls with ObjectFormatService calls in Metacat (Closed, Chris Jones)

Task #1453: Proxy the Metacat ObjectFormatService as a D1 CN service (Closed, Robert Waltz)

Task #1454: Replicate state changes in the object format list across CNs (Closed, Chris Jones)

Task #1455: Update D1 libraries to use CN object format service (Closed, Robert Waltz)

Task #1464: Write unit tests for listFormats() and getFormat() (Closed, Chris Jones)

Task #1593: Add an ObjectFormatDisk Cache to d1_common_java (Closed, Chris Jones)

Task #1594: Add an ObjectFormatCache to d1_libclient_java (Closed, Chris Jones)

Task #1638: Migrate ObjectFormatService in Metacat to CNCoreImpl (Closed)


Related issues

Related to Infrastructure - Story #582: Implement an object format registry Closed 2010-10-06
Related to Infrastructure - Story #1038: need a stub ObjectFormat service at CN that can later be replaced Closed

History

#1 Updated by Chris Jones about 10 years ago

  • Category set to d1_schemas
  • Assignee set to Chris Jones

Adding a new object format (mime type) currently means modifying the dataoneTypes.xsd schema and regenerating classes from it, which requires a full buildout of the CNs.
Either we need an easier mechanism for adding new types, or the object format list needs to be exhaustive. All mime types registered at:
http://www.iana.org/assignments/media-types/index.html
could be listed in the XSD, which would dramatically reduce how often the XSD needs updating. Alternatively, an object format lookup service could be used to match file types.

#2 Updated by Chris Jones about 10 years ago

  • Target version changed from Sprint-2011.07 to Sprint-2011.10-Block.2

#3 Updated by Dave Vieglais about 10 years ago

  • Position deleted (34)
  • Position set to 1
  • Position changed from 1 to 170

#4 Updated by Chris Jones about 10 years ago

  • Status changed from New to In Progress

I looked at creating a dynamic enum in Java to reduce the amount of code that would need to change. Although it's possible, it is contrary to the spirit of enums (immutable) and may only work on the Sun JVM. Instead, all classes using object format will need to change to getter methods that access a backing properties file, and client code will need to query a REST service for object format information.
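To illustrate the getter-plus-properties-file idea, here is a minimal sketch in Python (used only for brevity; the affected classes are Java). The file layout (`<format id>.<field> = <value>`), the class name `ObjectFormatRegistry`, and the method names are assumptions for illustration, not the actual Metacat or D1 code.

```python
import io

# Hypothetical properties content; the key layout is an assumption.
SAMPLE_PROPERTIES = """\
# <format id>.<field> = <value>
EML_2_0_0.mime = application/xml
EML_2_0_0.namespace = eml://ecoinformatics.org/eml-2.0.0
TEXT_CSV.mime = text/csv
"""


def load_formats(stream):
    """Parse 'id.field = value' lines into {id: {field: value}}."""
    formats = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        # Split on the last dot so the field name is the final key component.
        fmt_id, _, field = key.strip().rpartition(".")
        formats.setdefault(fmt_id, {})[field] = value.strip()
    return formats


class ObjectFormatRegistry:
    """Getter-based lookup standing in for a compile-time enum."""

    def __init__(self, formats):
        self._formats = formats

    def get_format(self, fmt_id):
        # Unknown ids raise KeyError at run time rather than failing to compile.
        return self._formats[fmt_id]

    def list_formats(self):
        return sorted(self._formats)


registry = ObjectFormatRegistry(load_formats(io.StringIO(SAMPLE_PROPERTIES)))
```

With this shape, adding a format is an edit to the backing file rather than a schema change and rebuild.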

Proposal:

The following REST interface would provide object type information for named object types in the DataONE system:

GET: RETURNS
/types/object/ : xml or json representation of all d1 object formats
/types/object/{fmtid} : xml or json representation of the object format

A full response would be a hierarchical structure containing the D1 id of the object format and the important identifiers used for this object format in other systems (doctype, namespace, mime, puid, gfid, udfr).

Example 1 (Metadata from schema):

/types/object/EML_2_0_0/ :

doctype   :
namespace : eml://ecoinformatics.org/eml-2.0.0
mime      : application/xml
puid      : x-fmt/522
gfid      : fmt/eml-2.0.0
udfr      : fmt/eml-2.0.0

/types/object/EML_2_0_0/doctype : ""
/types/object/EML_2_0_0/namespace : "eml://ecoinformatics.org/eml-2.0.0"
/types/object/EML_2_0_0/mime : "application/xml"
/types/object/EML_2_0_0/puid : "x-fmt/522"
/types/object/EML_2_0_0/gfid : "fmt/eml-2.0.0"
/types/object/EML_2_0_0/udfr : "fmt/eml-2.0.0"

Example 2 (Metadata from DTD):

/types/object/FGDC_STD_001_1_1999/ :

doctype   : FGDC-STD-001.1-1999
namespace :
mime      : application/xml
puid      : x-fmt/319
gfid      : fmt/fgdc-std-001.1-1999
udfr      : fmt/fgdc-std-001.1-1999

/types/object/FGDC_STD_001_1_1999/doctype : "FGDC-STD-001.1-1999"
/types/object/FGDC_STD_001_1_1999/namespace : ""
/types/object/FGDC_STD_001_1_1999/mime : "application/xml"
/types/object/FGDC_STD_001_1_1999/puid : "x-fmt/319"
/types/object/FGDC_STD_001_1_1999/gfid : "fmt/fgdc-std-001.1-1999"
/types/object/FGDC_STD_001_1_1999/udfr : "fmt/fgdc-std-001.1-1999"

Example 3 (data):

/types/object/TEXT_CSV/ :

doctype   :
namespace :
mime      : text/csv
puid      : x-fmt/18
gfid      : fmt/text-csv
udfr      : fmt/text-csv

/types/object/TEXT_CSV/doctype : ""
/types/object/TEXT_CSV/namespace : ""
/types/object/TEXT_CSV/mime : "text/csv"
/types/object/TEXT_CSV/puid : "x-fmt/18"
/types/object/TEXT_CSV/gfid : "fmt/text-csv"
/types/object/TEXT_CSV/udfr : "fmt/text-csv"
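
As a rough sketch of how the proposed paths could resolve, the Python below maps a request path to the whole collection, one full record, or a single leaf value. The values are copied from Examples 1 and 2; the in-memory dict and the `resolve` function are hypothetical and only illustrate the URL hierarchy, not a real service implementation.

```python
# Values copied from Examples 1 and 2; the dict layout is an assumption.
FORMATS = {
    "EML_2_0_0": {
        "doctype": "",
        "namespace": "eml://ecoinformatics.org/eml-2.0.0",
        "mime": "application/xml",
        "puid": "x-fmt/522",
        "gfid": "fmt/eml-2.0.0",
        "udfr": "fmt/eml-2.0.0",
    },
    "FGDC_STD_001_1_1999": {
        "doctype": "FGDC-STD-001.1-1999",
        "namespace": "",
        "mime": "application/xml",
        "puid": "x-fmt/319",
        "gfid": "fmt/fgdc-std-001.1-1999",
        "udfr": "fmt/fgdc-std-001.1-1999",
    },
}

PREFIX = "/types/object/"


def resolve(path):
    """Return the collection, a full record, or one leaf value for a path."""
    if not path.startswith(PREFIX):
        raise KeyError(path)
    # Drop empty segments so a trailing slash is tolerated.
    parts = [p for p in path[len(PREFIX):].split("/") if p]
    if not parts:
        return FORMATS                # /types/object/
    record = FORMATS[parts[0]]
    if len(parts) == 1:
        return record                 # /types/object/{fmtid}
    if len(parts) == 2:
        return record[parts[1]]       # /types/object/{fmtid}/{field}
    raise KeyError(path)
```

Note that each leaf lookup here would be a separate HTTP call in a real client, which is the cost the hierarchy carries.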

#5 Updated by Chris Jones about 10 years ago

After discussing the interface definition with Matt, we decided to:

1) Make the collection more flat to be consistent with other D1 REST URLs (we don't need the /types/object hierarchy)
2) Change the collection name to /formats to coincide with the D1 ObjectFormat class
3) Keep the collection name plural
4) Only return full object format records, rather than individual identifier leaf nodes, to reduce calls to the CN service
5) Use the canonical identifier string as the D1 identifier string ()
6) Add a 'name' element with a human readable name for the object format
7) Set the content model of to ANY to allow for future hooks into other typing systems
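
One way to picture the effect of returning full records (point 4) is a client-side cache that fetches the list once and answers later lookups locally. This is a sketch only: `ObjectFormatCache` here is a hypothetical Python stand-in, `fetch_all_formats` fakes the CN call, and the record fields are assumed; the names are taken from the examples in this comment.

```python
def fetch_all_formats():
    """Stand-in for one call to the CN format list (hypothetical data source)."""
    return {
        "eml://ecoinformatics.org/eml-2.0.0": {
            "name": "Ecological Metadata Language, version 2.0.0",
        },
        "FGDC-STD-001.1-1999": {
            "name": "Federal Geographic Data Committee Content Standard, "
                    "version 001.1-1999",
        },
        "text/csv": {"name": "Comma Separated Values"},
    }


class ObjectFormatCache:
    """Fetch full records once, then serve lookups from memory."""

    def __init__(self, fetch_all):
        self._fetch_all = fetch_all   # callable returning {fmtid: record}
        self._records = None
        self.calls = 0                # number of remote fetches performed

    def get_format(self, fmtid):
        if self._records is None:
            # First lookup triggers the single remote fetch.
            self._records = self._fetch_all()
            self.calls += 1
        return self._records[fmtid]


cache = ObjectFormatCache(fetch_all_formats)
csv_name = cache.get_format("text/csv")["name"]
fgdc_name = cache.get_format("FGDC-STD-001.1-1999")["name"]
```

Both lookups above are served by a single fetch, which would not be possible if the service only exposed individual leaf nodes.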

So, the revised interface would be:

GET: RETURNS:
/formats : xml or json representation of all d1 object formats
/formats/{fmtid} : xml or json representation of the d1 object format

with the following examples:

GET: RETURNS:
/formats/eml://ecoinformatics.org/eml-2.0.0 :

Ecological Metadata Language, version 2.0.0

/formats/FGDC-STD-001.1-1999 :

Federal Geographic Data Committee Content Standard, version 001.1-1999

/formats/text/csv :

Comma Separated Values

The above will be added to the D1 architecture documents as the proposed interface.
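
Since the canonical identifiers contain "/" and ":" characters, a client would likely need to percent-encode the identifier when placing it in the URL path so that it is read as a single path segment. The comment above does not specify an encoding rule, so the sketch below is an assumption; `format_url` and `parse_format_url` are hypothetical helper names.

```python
from urllib.parse import quote, unquote

PREFIX = "/formats/"


def format_url(fmtid):
    """Build a /formats/{fmtid} path with the identifier as one segment."""
    # safe="" forces "/" and ":" to be percent-encoded as well.
    return PREFIX + quote(fmtid, safe="")


def parse_format_url(path):
    """Recover the canonical identifier from a /formats/{fmtid} path."""
    if not path.startswith(PREFIX):
        raise ValueError(path)
    return unquote(path[len(PREFIX):])


csv_path = format_url("text/csv")
eml_path = format_url("eml://ecoinformatics.org/eml-2.0.0")
```

Decoding with `parse_format_url` returns the original identifier, so the canonical string can still serve as the lookup key on the service side.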

#6 Updated by Matthew Jones about 10 years ago

  • Target version changed from Sprint-2011.10-Block.2 to Sprint-2011.11-Block.2
  • Position deleted (171)
  • Position set to 1

#7 Updated by Dave Vieglais about 10 years ago

  • Target version changed from Sprint-2011.11-Block.2 to Sprint-2011.12-Block.2
  • Position deleted (11)
  • Position set to 4

#8 Updated by Dave Vieglais about 10 years ago

  • Position set to 3
  • Target version changed from Sprint-2011.12-Block.2 to Sprint-2011.13-Block.2
  • Position deleted (13)

#9 Updated by Dave Vieglais about 10 years ago

  • Target version changed from Sprint-2011.13-Block.2 to Sprint-2011.14-Block.2
  • Position set to 2
  • Position deleted (6)

#10 Updated by Dave Vieglais about 10 years ago

  • Position deleted (24)
  • Target version changed from Sprint-2011.14-Block.2 to Sprint-2011.15-Block.2
  • Position set to 7

#11 Updated by Dave Vieglais almost 10 years ago

  • Position set to 4
  • Target version changed from Sprint-2011.15-Block.2 to Sprint-2011.17-Block.3
  • Position deleted (25)

#12 Updated by Chris Jones almost 10 years ago

  • Target version changed from Sprint-2011.17-Block.3 to Sprint-2011.18-Block.3
  • Position deleted (10)
  • Position set to 9

#13 Updated by Dave Vieglais almost 10 years ago

  • Target version changed from Sprint-2011.18-Block.3 to Sprint-2011.20-Block.3

#14 Updated by Dave Vieglais almost 10 years ago

  • Target version changed from Sprint-2011.20-Block.3 to Sprint-2011.21-Block.3
  • Position deleted (26)
  • Position set to 2

#15 Updated by Dave Vieglais almost 10 years ago

  • Target version changed from Sprint-2011.21-Block.3 to Sprint-2011.22-Block.3
  • Position deleted (12)
  • Position set to 2

#16 Updated by Chris Jones almost 10 years ago

  • Target version changed from Sprint-2011.22-Block.3 to Sprint-2011.23-Block.3
  • Position deleted (9)
  • Position set to 21

#17 Updated by Dave Vieglais almost 10 years ago

  • Target version changed from Sprint-2011.23-Block.3 to Sprint-2011.26-Block.4
  • Position deleted (23)
  • Position set to 13

#18 Updated by Rob Nahf over 9 years ago

  • Target version changed from Sprint-2011.26-Block.4 to Sprint-2011.33-Block.4
  • Milestone set to None

#19 Updated by Chris Jones over 9 years ago

  • Assignee changed from Chris Jones to Robert Waltz

Assigning this to Robert since the one remaining task is assigned to him.

#20 Updated by Dave Vieglais over 9 years ago

  • Target version changed from Sprint-2011.33-Block.4 to Sprint-2011.35-Block.5
  • Position deleted (42)
  • Position set to 3

#21 Updated by Robert Waltz over 9 years ago

  • Status changed from In Progress to Closed
