Task #4064

Story #3875: Create a dashboard (version 1) for DataONE to provide high level overall system status

Report of downloads of DATA objects per month

Added by Robert Waltz over 8 years ago. Updated almost 6 years ago.

Status: Closed
Priority: Normal
Assignee: Robert Waltz
Category: d1_dashboard
Target version:
Start date: 2013-10-08
Due date:
% Done: 100%
Estimated time: 0.00 h
Milestone: None
Product Version: *
Story Points:
Sprint:

Description

Report on Number of GETs per month by object format per MN since DataONE production deployment

Report on Number of GETs per month per MN since DataONE production deployment.

Filter out CN activity

Make substitutions in the static report: replace null or "" with CLOAKN, and apply the appropriate substitutions for DAAC and Clearinghouse.
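The counting described above could be sketched roughly as follows. This is only an illustration: the record keys (`event`, `node_id`, `date_logged`), the `cn_nodes` set, and the `substitutions` mapping are hypothetical names, and it assumes that "CN activity" means records whose node identifier belongs to a Coordinating Node and that the CLOAKN substitution applies to the node identifier. The DAAC and Clearinghouse substitutions are not spelled out in this task, so they are left to the caller.

```python
from collections import Counter


def gets_per_month(records, cn_nodes, substitutions=None):
    """Count read events per (month, node label), skipping CN activity.

    records       -- iterable of dicts with hypothetical keys
                     'event', 'node_id' and 'date_logged' (ISO 8601 string)
    cn_nodes      -- set of node identifiers treated as CN activity (assumption)
    substitutions -- optional mapping of node identifier to report label
    """
    substitutions = substitutions or {}
    counts = Counter()
    for rec in records:
        if rec["event"] != "read":
            continue
        # A null or empty node identifier is reported as CLOAKN.
        node = rec.get("node_id") or "CLOAKN"
        if node in cn_nodes:
            continue
        label = substitutions.get(node, node)
        month = rec["date_logged"][:7]  # 'YYYY-MM'
        counts[(month, label)] += 1
    return counts
```

Grouping by object format as well would only require adding a format field to the key tuple.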


Subtasks

Task #4065: Report on Number of GETs per month by object format per MN (Closed, Robert Waltz)

Task #4066: Report on Number of GETs per month per MN (Closed, Robert Waltz)


Related issues

Blocks Infrastructure - Task #4120: Display total data downloads in the SummaryView header (New, 2013-10-28)

Associated revisions

Revision 03363014
Added by Robert Waltz over 8 years ago

refs #4064

Report of downloads per object format per month (data/metadata)

Revision 12674
Added by Robert Waltz over 8 years ago

refs #4064

Report of downloads per object format per month (data/metadata)

Revision 6653e837
Added by Robert Waltz over 8 years ago

refs #4064

Report of downloads per object format per month (data/metadata)

Revision 12684
Added by Robert Waltz over 8 years ago

refs #4064

Report of downloads per object format per month (data/metadata)

Revision 6a47bf22
Added by Robert Waltz over 8 years ago

refs #4064

Report of downloads per object format per month (data/metadata)
ignoring a jar file

Revision 12685
Added by Robert Waltz over 8 years ago

refs #4064

Report of downloads per object format per month (data/metadata)
ignoring a jar file

History

#1 Updated by Dave Vieglais over 8 years ago

Following discussion on how to integrate the report data into the dashboard, an alternative approach was identified that would provide the same capability but be more flexible for feeding the dashboard:

  1. Modify the reporting tool to generate a CSV file, one row per log record, with each row augmented to include size, formatId, and formatType. The CSV should be escaped according to the requirements of the solr.CSVRequestHandler described at http://wiki.apache.org/solr/UpdateCSV. This step could also obfuscate PII as appropriate.

  2. Create a new SOLR index using the log aggregation schema with the additional fields included. This SOLR index need not be running on the CNs, just at a location accessible by the dashboard client.

  3. Populate the new SOLR index with the CSV file. Word is that the CSV loader is very fast.

  4. Point the dashboard client at the new SOLR index. Summary reports (for Rebecca et al) could also be generated easily from this source.
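Step 1 of the approach above might look something like the sketch below. It is not the reporting tool's actual code: the column names other than size, formatId, and formatType, the `object_info` lookup, and the function name are all assumptions made for illustration. It relies on Python's standard `csv` module, which quotes fields containing commas or quotes and doubles embedded quotes; whether that matches every requirement of the Solr CSV handler should be checked against the Solr documentation. Loading the file into the index is not shown.

```python
import csv
import io

# Hypothetical column set: the log record fields plus the three added fields.
LOG_FIELDS = ["id", "dateLogged", "nodeId", "event", "pid"]
EXTRA_FIELDS = ["size", "formatId", "formatType"]


def write_augmented_csv(log_records, object_info, out):
    """Write one CSV row per log record, augmented with object details.

    log_records -- iterable of dicts keyed by LOG_FIELDS
    object_info -- dict mapping pid to a dict keyed by EXTRA_FIELDS
                   (an assumed lookup, e.g. built from system metadata)
    out         -- text file-like object to write to
    """
    writer = csv.DictWriter(out, fieldnames=LOG_FIELDS + EXTRA_FIELDS,
                            lineterminator="\n")
    writer.writeheader()
    for rec in log_records:
        info = object_info.get(rec["pid"], {})
        row = {name: rec.get(name, "") for name in LOG_FIELDS}
        # Unknown objects get empty values rather than being dropped.
        row.update({name: info.get(name, "") for name in EXTRA_FIELDS})
        writer.writerow(row)
```

Writing to any text stream keeps the sketch easy to try with `io.StringIO` before pointing it at a real file.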

#2 Updated by Robert Waltz over 8 years ago

  • Subject changed from Report of downloads per object format per month (data/metadata) to Report of downloads of DATA objects per month

The design I was given has always been 'READ' count and bytes for data objects only.

On Oct 14, 2013, at 2:25 PM, Waltz, Robert Patrick wrote:

I thought part of the issue was the number of GETs per month for all objects. formatType=DATA will cover only a fraction of the objects that may be interesting to report on. As has been said, EML may contain a dataset. Maybe we just need to combine METADATA + DATA and exclude the RESOURCE types, so that we don't have the breakdown by formatId and can remove the reports array?

Robert Patrick Waltz
Research Associate, Center for Information and Communication Studies
The University of Tennessee
5 James D Hoskins Library
1400 West Cumberland
Knoxville, TN 37996-0341
rwaltz@utk.edu
From: Skye Roseboom sroseboo@epscor.unm.edu
Sent: Monday, October 14, 2013 15:45
To: Waltz, Robert Patrick; Christopher Jones
Subject: Re: log aggregation

Hi Robert,

For the dashboard, we just need information about formatType=DATA as an aggregated report. The dashboard is not looking at formatId at all. Preference would be for a single report object per member node for the entire DATA document set.

Combining the time series information to contain both the count and the byte size sounds good to me.

-s
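As a rough illustration of the shape discussed in this exchange (one report object per member node for DATA objects, with a time series carrying both count and bytes), the sketch below builds such objects from augmented log records. The record keys and the output layout (`nodeId`, `series`, `month`, `count`, `bytes`) are assumptions for the example, not the format the dashboard actually consumes.

```python
from collections import defaultdict


def data_download_reports(records):
    """Build one report per member node covering reads of DATA objects.

    records -- iterable of dicts with hypothetical keys 'event', 'nodeId',
               'dateLogged' (ISO 8601 string), 'formatType' and 'size' (bytes)
    """
    # node -> month -> [read count, total bytes]
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for rec in records:
        if rec["event"] != "read" or rec["formatType"] != "DATA":
            continue
        cell = totals[rec["nodeId"]][rec["dateLogged"][:7]]
        cell[0] += 1
        cell[1] += rec["size"]
    return [
        {
            "nodeId": node,
            "formatType": "DATA",
            "series": [
                {"month": month, "count": count, "bytes": size}
                for month, (count, size) in sorted(months.items())
            ],
        }
        for node, months in sorted(totals.items())
    ]
```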

#3 Updated by Robert Waltz over 8 years ago

  • Description updated (diff)

#4 Updated by Robert Waltz over 8 years ago

  • Start date set to 2013-10-27
  • Due date set to 2013-11-09
  • Target version set to 2013.44-Block.6.1

#5 Updated by Chris Jones over 8 years ago

  • Due date changed from 2013-11-09 to 2014-02-15
  • Target version changed from 2013.44-Block.6.1 to 2014.6-Block.1.3

#6 Updated by Robert Waltz over 8 years ago

  • Target version changed from 2014.6-Block.1.3 to 2014.2-Block.1.1

#7 Updated by Robert Waltz over 8 years ago

  • Parent task set to #3875

#8 Updated by Skye Roseboom over 8 years ago

  • Target version changed from 2014.2-Block.1.1 to 2014.12-Block.2.2

#9 Updated by Skye Roseboom over 8 years ago

  • Target version changed from 2014.12-Block.2.2 to 2014.14-Block.2.3

#10 Updated by Robert Waltz almost 8 years ago

  • Product Version changed from 1.0.0 to *
  • Milestone changed from CCI-1.2 to None

#11 Updated by Robert Waltz almost 6 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 30 to 100
