Story #882

Need mechanism to delete content from CN that belongs to an MN

Added by Dave Vieglais about 14 years ago. Updated over 12 years ago.

Status: Closed
Priority: Normal
Assignee: Ben Leinfelder
Category: d1_cn_service
Start date: 2012-04-30
Due date:
% Done: 100%
Story Points:
Sprint:

Description

Need a mechanism to remove content from the CNs that belongs to an MN, so that after a change in MN functionality (e.g. sysmeta generation) the erroneous or out-of-date content can be removed from the catalog. This would include removing it from Metacat and ensuring that the content is re-indexed so that it doesn't show up in searches.

This may initially consist of a set of tasks that can be manually executed to clear out content from an MN, but eventually it should be scripted so that an administrator can wipe MN content from a CN.


Subtasks

Task #2671: Document CN.delete() method (Closed, Dave Vieglais)

Task #2672: Document CN.archive() method (Closed, Dave Vieglais)

Task #2673: Document MN.archive() method (Closed, Dave Vieglais)

Task #2674: Implement MN.delete() (Closed, Ben Leinfelder)

Task #2675: Implement CN.archive() (Closed, Ben Leinfelder)

Task #2676: Implement CN.delete() (Closed, Ben Leinfelder)

Task #2677: Metacat needs to actually delete content not just leave it (Closed, Ben Leinfelder)

Task #2678: Wire up CN/MN.archive() method (Closed, Ben Leinfelder)

Task #2687: Include CN.archive() in cn-rest-proxy pass through to Metacat (Closed, Robert Waltz)


Related issues

Related to Infrastructure - Story #3036: Administrative expunge of an object and its replicas needs to be enabled Closed 2012-06-29

History

#1 Updated by Dave Vieglais about 14 years ago

Note that this applies only to test and development activity, not to operations.

#2 Updated by Robert Waltz about 14 years ago

  • Parent task set to #832

#3 Updated by Robert Waltz about 14 years ago

  • Milestone set to CCI-0.6

#4 Updated by Dave Vieglais about 14 years ago

  • Start date set to 2010-10-07
  • Tracker changed from Bug to Task

#5 Updated by Robert Waltz over 13 years ago

  • Milestone deleted (CCI-0.6)

#6 Updated by Dave Vieglais almost 13 years ago

  • Assignee changed from Robert Waltz to Ben Leinfelder
  • Milestone set to None

Requires some design work to lay out the workflow for deleting an object from the CN and ensuring that all replicas are also removed.

The delete() method is inadequate for this - the content needs to be purged rather than tagged as archived or obsoleted.

#7 Updated by Ben Leinfelder almost 13 years ago

There is an MN.delete() method with a note that we should determine what the semantics of this operation are.
For Metacat, "delete" does not remove any content and only prevents it from:
a) being updated by another revision,
b) having the same identifier reused, and
c) appearing in search results.
If you know the identifier (cited in a paper, say) you can always retrieve it. In DataONE we would set SystemMetadata.archived=true for these items and the change in SystemMetadata should be replicated up to the CN and propagated to all replicas on other MNs.

Reasons for a more forceful "delete" mechanism:
a) Inappropriate content (illegal, copyrighted, too large)
b) Mistake/testing
Since we NEVER want to reuse identifiers, we should maintain a SystemMetadata record for all deleted objects. I would vote to change the SystemMetadata.archived flag to an optional "status" indicator with initial possible values of "archived" and "deleted"; if omitted, it would indicate a normal/active object. The MN should propagate this SystemMetadata change to the CN, which would spread the word to the other MNs. I think the MNs could remove all trace of that object and rely on the CN to keep a record of the identifier having been used (so that it is not reused in the DataONE system). MNs holding a replica could also completely remove the object. This points to a need for MNs to have two methods:
MN.archive()
MN.delete()
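
For illustration only, the proposed optional "status" indicator could be modelled roughly as below. This is a minimal sketch, not DataONE or Metacat code: the names ObjectStatus, SystemMetadata (as a Python dataclass), is_searchable and is_retrievable are hypothetical, and the behaviour assumed is the one described above (archived objects stay retrievable by identifier but are hidden from search; deleted objects are purged but keep their record so the identifier is never reused).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ObjectStatus(Enum):
    """Hypothetical values for the proposed optional status indicator."""
    ARCHIVED = "archived"
    DELETED = "deleted"


@dataclass
class SystemMetadata:
    """Toy stand-in for a system metadata record (not the real schema)."""
    identifier: str
    # None means the status was omitted, i.e. a normal/active object.
    status: Optional[ObjectStatus] = None


def is_searchable(sysmeta: SystemMetadata) -> bool:
    # Only normal/active objects should show up in search results.
    return sysmeta.status is None


def is_retrievable(sysmeta: SystemMetadata) -> bool:
    # Archived objects can still be fetched by identifier; deleted ones cannot.
    return sysmeta.status is not ObjectStatus.DELETED


def identifier_is_taken(records: dict, identifier: str) -> bool:
    # A record is kept for every status, so the identifier is never reused.
    return identifier in records
```

Keeping the record for deleted objects is what lets the CN refuse a later attempt to reuse the identifier, even after the MNs have removed every trace of the object.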

#8 Updated by Chris Jones over 12 years ago

  • Target version set to Sprint-2012.17-Block.3.1

During our standup discussion on 04/23/2012, we decided to enable administrative delete() functionality by:

1) Renaming the current delete() method to archive(), and
2) Creating a new delete() method accessible only to administrative subjects

We had planned on changing the 'archived' flag in SystemMetadata to 'status' as Ben suggested, but a schema change is too late in the release cycle, and so we are keeping it the same.

The implementation of delete() needs to:

1) Remove the object from the CN (database and/or filesystem)
2) Mark the system metadata as 'archived' so it is not indexed
3) Iterate through the replica list and call MN.delete() for each replica
4) For each replica deletion, call MN.systemMetadataChanged() to update MN sysmeta

MN.delete() should likewise purge the object from the database/filesystem but keep the system metadata up-to-date with the CN copy.
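
The four steps above can be sketched with an in-memory model. This is only an illustration under stated assumptions, not the actual implementation: the classes SysMeta, MemberNode and CoordinatingNode, the replica_nodes field, and the subject_is_admin flag are all hypothetical stand-ins, and real storage, authentication and network calls are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class SysMeta:
    """Toy system metadata record (hypothetical, not the real schema)."""
    identifier: str
    archived: bool = False
    replica_nodes: list = field(default_factory=list)


class MemberNode:
    """In-memory stand-in for an MN."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.objects = {}   # pid -> object bytes
        self.sysmeta = {}   # pid -> SysMeta

    def delete(self, pid):
        # Purge the object itself; the system metadata record is kept.
        self.objects.pop(pid, None)

    def system_metadata_changed(self, pid, cn_sysmeta):
        # Refresh the local record so it matches the CN copy.
        self.sysmeta[pid] = SysMeta(
            cn_sysmeta.identifier,
            cn_sysmeta.archived,
            list(cn_sysmeta.replica_nodes),
        )


class CoordinatingNode:
    """In-memory stand-in for a CN."""

    def __init__(self, member_nodes):
        self.member_nodes = member_nodes  # node_id -> MemberNode
        self.objects = {}
        self.sysmeta = {}

    def delete(self, pid, subject_is_admin):
        # delete() is restricted to administrative subjects.
        if not subject_is_admin:
            raise PermissionError("delete() requires an administrative subject")
        sysmeta = self.sysmeta[pid]
        self.objects.pop(pid, None)              # 1) remove the object from the CN
        sysmeta.archived = True                  # 2) mark archived so it is not indexed
        for node_id in sysmeta.replica_nodes:    # 3) delete every replica
            mn = self.member_nodes[node_id]
            mn.delete(pid)
            mn.system_metadata_changed(pid, sysmeta)  # 4) sync MN sysmeta
        return pid
```

In this sketch the system metadata record survives on both the CN and the MNs, which matches the requirement that MN.delete() purges the object but keeps its system metadata in step with the CN copy.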

#9 Updated by Robert Waltz over 12 years ago

  • Parent task deleted (#832)
  • Tracker changed from Task to Story

#10 Updated by Robert Waltz over 12 years ago

  • Target version set to Sprint-2012.19-Block.3.2
  • Milestone changed from None to CCI-1.0.0

#11 Updated by Dave Vieglais over 12 years ago

  • Target version changed from Sprint-2012.19-Block.3.2 to Sprint-2012.21-Block.3.3
  • Position set to 1

#12 Updated by Jing Tao over 12 years ago

  • Position set to 3
  • Position deleted (5)

#13 Updated by Dave Vieglais over 12 years ago

  • Status changed from New to Closed
