DataONE Tasks: Issues | https://redmine.dataone.org/ | 2020-02-29T01:00:11Z | DataONE Tasks
Redmine CN REST - Bug #8860 (New): /token endpoint doesn't set a content-type and character encoding | https://redmine.dataone.org/issues/8860 | 2020-02-29T01:00:11Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>On Firefox only, requests to the /portal/token endpoint (i.e., the one MetacatUI and other clients use to fetch their auth tokens, like <a href="https://cn.dataone.org/portal/token">https://cn.dataone.org/portal/token</a>) result in errors in the browser console.</p>
<p>When you access the URL via an XHR request, you see:</p>
<blockquote>
<p>XML Parsing Error: syntax error<br>
Location: <a href="https://cn-stage.test.dataone.org/portal/token">https://cn-stage.test.dataone.org/portal/token</a><br>
Line Number 1, Column 1:</p>
</blockquote>
<p>When you access the URL directly in Firefox:</p>
<blockquote>
<p>The character encoding of the plain text document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the file needs to be declared in the transfer protocol or file needs to use a byte order mark as an encoding signature.</p>
</blockquote>
<p>I had a hunch that this error would go away if the response simply had the <code>Content-Type</code> header set to <code>text/plain; charset=utf-8</code> so I spun up <code>mitmproxy</code>, made that edit to the intercepted response, and saw that the error does go away.</p>
<p>I think we should modify the portal code to set the <code>Content-Type</code> header like above so the error goes away.</p>
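<p>As a rough illustration of the proposed fix, here is a minimal sketch of a handler that declares both the media type and the character encoding. It is written as a Python WSGI callable purely for illustration; the real portal is not implemented this way, and <code>token_app</code> and the placeholder token are hypothetical.</p>

```python
def token_app(environ, start_response):
    """Hypothetical WSGI handler standing in for the portal's /token endpoint."""
    token = "header.payload.signature"  # placeholder value, not a real token
    body = token.encode("utf-8")
    headers = [
        # Declaring the charset is what made the Firefox errors go away
        # when the response was edited in mitmproxy.
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

<p>Any equivalent way of setting the header in the portal's own stack should have the same effect on the browser.</p>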
Infrastructure - Bug #8822 (New): Account queries in STAGE-2 failing from web browser | https://redmine.dataone.org/issues/8822 | 2019-06-19T01:44:19Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>Steps to reproduce:</p>
<ol>
<li>Visit <a href="https://search-stage-2.test.dataone.org/">https://search-stage-2.test.dataone.org/</a></li>
<li>Log in</li>
<li>Navigate to <a href="https://search-stage-2.test.dataone.org/data">https://search-stage-2.test.dataone.org/data</a></li>
<li>Observe an HTTP 500 in the Network pane of whichever browser you're using for the request to <a href="https://search-stage-2.test.dataone.org/cn/v2/accounts/?query=%7BYOUR_DN%7D">https://search-stage-2.test.dataone.org/cn/v2/accounts/?query={YOUR_DN}</a></li>
</ol>
<p>Response body:</p>
<pre><?xml version="1.0" encoding="UTF-8"?>
<error detailCode="500" errorCode="500" name="ServiceFailure">
<description>Internal Server Error: The server encountered an unexpected condition which prevented it from fulfilling the request.</description>
</error>
</pre>
<p>Some notes:</p>
<ul>
<li>I notice this in multiple browsers</li>
<li>I notice this only when the request is issued from MetacatUI, not when I visit the URL in my browser or hit it with curl</li>
<li>I don't notice this error on search.dataone.org</li>
<li>MetacatUI on search.dataone.org doesn't even issue this request</li>
<li>MetacatUI on stage-2 is at v2.4.2 and on search is at 2.6.1</li>
</ul>
<p>It seems like a bug to me that we see a service failure. I'm not sure if this is a MetacatUI bug or an issue in the CN stack (e.g., Apache config or something), but I wanted to file it for someone to take a look.</p>
Infrastructure - Task #8817 (New): Configure sitemaps on the CN | https://redmine.dataone.org/issues/8817 | 2019-06-06T23:52:23Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>Support for sitemaps landed last fall in Metacat: <a href="https://github.com/NCEAS/metacat/pull/1283">https://github.com/NCEAS/metacat/pull/1283</a>. Sitemaps are good for users but especially for search engines, and DataONE's Search Catalog could benefit from having sitemaps enabled. A sitemap could help crawlers discover all of the datasets in DataONE. The CNs already run Metacat and should use Metacat's sitemap support to generate sitemaps for all content.</p>
<p>To enable sitemaps on the CNs, a few things seem to be needed. I'm not very familiar with how the CNs get built so I may be wrong or be missing things:</p>
<ul>
<li>Sitemaps rely on two properties in Metacat's <code>metacat.properties</code> file, which should have values: <code>sitemap.location.base=https://search.dataone.org/</code> and <code>sitemap.entry.base=https://search.dataone.org/view</code></li>
<li>The Apache config the CNs are built with needs to serve the <code>sitemap_index.xml</code> and individual sitemaps from the Tomcat webapps dir. Metacat generates sitemaps in the <code>sitemaps</code> subfolder (e.g., <code>/usr/lib/tomcat8/webapps/metacat/sitemaps</code>). A <code>Directory</code> directive should work so long as filesystem permissions are set up for Apache to see the files.</li>
<li>We need a robots.txt at search.dataone.org that points at the sitemap index file, <code>sitemap_index.xml</code>, which is the entry point to the individual sitemaps</li>
<li>Metacat generates sitemaps with a recurring job mechanism that's internal to Metacat. AFAIK this job isn't turned on when Tomcat loads Metacat and a request has to get sent to the admin API which turns this job on as a side-effect. We might want to change this to reduce maintenance burden or chance of having stale sitemaps</li>
</ul>
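<p>To make the robots.txt item concrete, here is a small sketch that builds the file's contents from the <code>sitemap.location.base</code> value. It assumes the index is served at that base plus <code>sitemap_index.xml</code>, which I haven't confirmed against the CN Apache config; the function name and the permissive <code>Allow</code> rule are illustrative only.</p>

```python
def robots_txt(location_base, index_name="sitemap_index.xml"):
    """Build robots.txt contents that advertise the sitemap index.

    Assumes (unverified) that the index lives directly under location_base.
    """
    base = location_base if location_base.endswith("/") else location_base + "/"
    lines = [
        "User-agent: *",
        "Allow: /",
        f"Sitemap: {base}{index_name}",
    ]
    return "\n".join(lines) + "\n"


print(robots_txt("https://search.dataone.org/"))
```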
<p>Dave nominated Jing for this work and has targeted this for the next CCI release. I'm not sure which that is so please select whichever one is appropriate.</p>
<p>Note this relates to <a href="https://redmine.dataone.org/issues/8693">https://redmine.dataone.org/issues/8693</a> which we've delayed because Google's crawler infrastructure has changed and DataONE is now visible to Google. Google staff have indicated we only need to send them a robots.txt that points to our sitemaps for them to begin crawling.</p>
Infrastructure - Task #8775 (In Progress): Make taxonomic rank fields in Solr index non-case-sens... | https://redmine.dataone.org/issues/8775 | 2019-03-11T22:39:44Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>In the current system, EML documents with taxonomic coverage get indexed into fields such as <code>species</code> if they contain XML such as:</p>
<pre>...snip
<taxonomicCoverage>
<taxonomicClassification>
<taxonRankName>Species</taxonRankName>
<taxonRankValue>Some species</taxonRankValue>
...snip
</pre>
<p>The field values are extracted using the XPath in <a href="https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml">https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/main/resources/application-context-eml-base.xml</a>:</p>
<pre>//taxonomicClassification/taxonRankValue[../taxonRankName="Species"]/text()
</pre>
<p>We ran into a case where the <code>taxonRankName</code> had been entered as 'species' instead of 'Species' and we decided that the XPath is too restrictive and that the strictness is needless and surprising. Relaxing the match should result in a slight but negligible decrease in performance. To do:</p>
<ul>
<li>Change all EML taxonomy fields to also match the lowercase form of each taxonomic rank</li>
<li>Check over other indexing field definitions related to taxonomy to make sure the above change is consistent</li>
</ul>
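<p>For illustration, here is a sketch of the relaxed matching in plain Python rather than in the index processor's XPath configuration. It ignores case entirely, which is slightly broader than "also match the lowercase form"; the sample XML and the <code>rank_values</code> helper are made up for this example.</p>

```python
import xml.etree.ElementTree as ET

# Made-up fragment shaped like the EML taxonomic coverage shown above.
EML_SNIPPET = """
<taxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>species</taxonRankName>
    <taxonRankValue>Some species</taxonRankValue>
  </taxonomicClassification>
  <taxonomicClassification>
    <taxonRankName>Genus</taxonRankName>
    <taxonRankValue>Some genus</taxonRankValue>
  </taxonomicClassification>
</taxonomicCoverage>
"""


def rank_values(xml_text, rank):
    """Return taxonRankValue texts whose taxonRankName equals rank, ignoring case."""
    root = ET.fromstring(xml_text)
    wanted = rank.lower()
    values = []
    for node in root.iter("taxonomicClassification"):
        name = node.findtext("taxonRankName", default="")
        value = node.findtext("taxonRankValue")
        if name.strip().lower() == wanted and value is not None:
            values.append(value)
    return values


print(rank_values(EML_SNIPPET, "Species"))
```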
Infrastructure - Decision #8765 (Closed): Consider changing how BaseSolrFieldXPathTest works | https://redmine.dataone.org/issues/8765 | 2019-02-13T00:36:55Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>Ran into a weird thing while expanding the <a href="https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/test/java/org/dataone/cn/index/SolrFieldXPathEmlTest.java">https://repository.dataone.org/software/cicore/trunk/cn/d1_cn_index_processor/src/test/java/org/dataone/cn/index/SolrFieldXPathEmlTest.java</a> to test an EML 2.2.0 doc.</p>
<p><code>SolrFieldXPathEmlTest</code> compares values extracted via various subprocessors to a set of expectations stored in a <code>HashMap<String, String>()</code> (of form <code><fieldName, expectedValue></code>). This data structure limits expectations to one value per field. Forever ago, in <code>r7985</code>,</p>
<pre>r7985 | sroseboo | 2012-03-23 12:12:53 -0800 (Fri, 23 Mar 2012) | 1 line
initial commit of search index support for parsing FGDC science metadata docs.
Index: src/test/java/org/dataone/cn/index/BaseSolrFieldXPathTest.java
</pre>
<p>support was added for testing multiple expectations for a single field by defining a convention of smushing multiple values into a single string, separated by two # characters (##). For example,</p>
<pre>eml210Expected.put("project", "Random Project Title##Another Random Project Title");
</pre>
<p>would test the <code>project</code> field for two values, "Random Project Title" and "Another Random Project Title", not a literal "Random Project Title##Another Random Project Title". I imagine this was picked because it's rare to see a ## in a metadata record, which seems reasonable.</p>
<p>I ran afoul of this today because I wanted to test an expectation for a field with a # in it and I couldn't because it was being split when I didn't want it to be. Why did a single # break things when the convention above is a double #? Because <code>BaseSolrFieldXPathTest.java</code> splits the expectation string using <code>StringUtils.split</code> like this:</p>
<pre>StringUtils.split(expectedForField, "##") // Where expectedForField might equal "Random Project Title##Another Random Project Title"
</pre>
<p>According to <a href="https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#split-java.lang.String-java.lang.String-">https://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/StringUtils.html#split-java.lang.String-java.lang.String-</a>, the second arg to <code>StringUtils.split</code>, <code>separatorChars</code>, is "the characters used as the delimiters, null splits on whitespace" which tells me we're using it wrong. I think the method we needed to use was <code>StringUtils.splitByWholeSeparator</code> which correctly splits only on double #.</p>
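<p>To show the difference without pulling in Commons Lang, here is a small Python emulation of the two behaviors. These helpers are stand-ins I wrote for this issue, not the Java methods themselves: the first treats every character in the separator string as its own delimiter (as the <code>separatorChars</code> description implies), the second splits only on the whole separator.</p>

```python
def split_on_any_char(text, separator_chars):
    """Split wherever any single character from separator_chars occurs."""
    tokens, current = [], []
    for ch in text:
        if ch in separator_chars:
            # Adjacent delimiters collapse, so no empty tokens are emitted.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def split_on_whole_separator(text, separator):
    """Split only where the complete separator string occurs."""
    return [token for token in text.split(separator) if token]


expected = "Issue #42 title##Another title"
print(split_on_any_char(expected, "##"))        # the lone # also splits
print(split_on_whole_separator(expected, "##"))  # only the ## splits
```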
<p>I could just change the line of code and move on but that breaks a ton (98) of tests. Before I did that, I wanted to ask what others thought. I see a few routes:</p>
<ol>
<li>Change the code to correctly split only on ## and not # <em>and</em> update all the tests I break</li>
<li>Do 1 <em>and</em> switch to a different, more future-proof separator. I suggest "&&" because that'd be invalid in XML outside a CDATA section.</li>
<li>Change how the expectations get tested so field expectations can have a one to many relationship. This'd take some time so I'd opt for (1) or (2) instead</li>
</ol>
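<p>For route (3), the rough idea is sketched below in Python rather than in the Java test code: expectations map each field to a list of values, so no separator convention is needed at all. The field names, values, and the <code>check_field</code> helper are illustrative, not taken from the test suite.</p>

```python
def check_field(expected, extracted, field):
    """Return True when a field's extracted values equal the expected ones, ignoring order."""
    return sorted(expected.get(field, [])) == sorted(extracted.get(field, []))


# One-to-many expectations: a literal # inside a value is no longer special.
expected = {
    "project": ["Random Project Title", "Another Random Project Title"],
    "title": ["Sample #1 results"],
}
extracted = {
    "project": ["Another Random Project Title", "Random Project Title"],
    "title": ["Sample #1 results"],
}

print(all(check_field(expected, extracted, field) for field in expected))
```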
<p>Any preferences out there?</p>
Infrastructure - Task #8755 (New): Expand EML indexing support for EML 2.2 | https://redmine.dataone.org/issues/8755 | 2018-12-19T00:55:12Z | Bryce Mecum (mecum@nceas.ucsb.edu)
Infrastructure - Task #8754 (New): Add EML 2.2 to CN formats list | https://redmine.dataone.org/issues/8754 | 2018-12-19T00:54:05Z | Bryce Mecum (mecum@nceas.ucsb.edu)
Infrastructure - Task #8753 (New): Add support for EML 2.2 (indexing, view) | https://redmine.dataone.org/issues/8753 | 2018-12-19T00:43:52Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>EML 2.2 is nearly released and, with it, come changes to indexing and the view service. The key points are that EML now supports Markdown and semantic annotations, but some more stuff was added too. See the in-progress What's New in EML 2.2.0 page <a href="https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/docs/eml-220info.md">https://github.com/NCEAS/eml/blob/BRANCH_EML_2_2/docs/eml-220info.md</a> for details.</p>
<p>I'll track these with sub-tasks but here's an overview:</p>
<p><strong>Formats</strong></p>
<p>We need to expand the list of formats to include EML 2.2.</p>
<p><strong>Indexing</strong></p>
<p>EML 2.2 requires adding new fields and changes to some existing fields. For example, abstracts in EML can now also include a <code>markdown</code> child element which we'll also want to store in Solr. We also want to add in support for semantic annotations and probably structured funding information since that's been requested so much.</p>
<p>Support for semantic annotations will likely use the same approach as our previous work on semantic search where incoming annotations were subject to materialization during the indexing process. To explain what that means, the idea is that, when an annotation for term X comes in, the index processor loads the relevant ontology, materializes the superclass hierarchy for the term, and stores the entire hierarchy in Solr. For EML records with multiple annotations, only the unique set needs to be stored.</p>
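<p>The materialization step can be sketched roughly as follows. The toy ontology, term labels, and function names here are invented for illustration; the real index processor would load an actual ontology rather than a hand-written dictionary.</p>

```python
# Toy ontology: each term maps to its direct superclasses (invented labels).
SUPERCLASSES = {
    "Carbon dioxide flux": ["Flux"],
    "Flux": ["Measurement type"],
    "Air temperature": ["Temperature"],
    "Temperature": ["Measurement type"],
    "Measurement type": [],
}


def materialize(term, superclasses):
    """Return the term followed by every superclass reachable from it."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(superclasses.get(current, []))
    return seen


def index_values(annotations, superclasses):
    """Collect the unique set of terms to store for a record's annotations."""
    unique = set()
    for term in annotations:
        unique.update(materialize(term, superclasses))
    return sorted(unique)


print(index_values(["Carbon dioxide flux", "Air temperature"], SUPERCLASSES))
```

<p>Note how "Measurement type" is reached from both annotations but stored once.</p>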
<p><strong>View service</strong></p>
<p>As mentioned above, EML 2.2 changes how some existing elements (e.g., abstract) work and adds some new ones (e.g., annotations). The existing EML 2 stylesheets will work for EML 2.2 because 2.2 is backwards compatible but we'll want to extend them to support what's new and changed in EML 2.2.</p>
<p>Of note: EML now supports storing Markdown where previously only EML TextType (basically DocBook) was allowed. Our plan on Metacat/MetacatUI to support this is to render the Markdown on the client side which does involve some security concerns (see <a href="https://github.com/NCEAS/metacatui/issues/860">https://github.com/NCEAS/metacatui/issues/860</a> for tracking).</p>
Infrastructure - Decision #8693 (In Progress): Support Google Dataset Search on search.dataone.or... | https://redmine.dataone.org/issues/8693 | 2018-09-07T00:16:59Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<a name="Background"></a>
<h2 >Background<a href="#Background" class="wiki-anchor">¶</a></h2>
<p>Yesterday, <a href="https://toolbox.google.com/datasetsearch" class="external">Google Dataset Search</a> launched. We previously attempted to make MetacatUI (and by extension, DataONE Search) compatible with it by <a href="https://github.com/NCEAS/metacatui/issues/482" class="external">injecting Schema.org JSON-LD into appropriate pages</a>. During development and testing, we checked our compatibility with the upcoming Google Dataset Search using Google's <a href="https://search.google.com/structured-data/testing-tool" class="external">Structured Data Testing Tool</a>. During development, this was all working fine and the feature appeared to be compatible but, after launching the feature on search.dataone.org, behavior changed on Google's end making it so Google no longer saw this JSON-LD. The reason for this is likely that, because MetacatUI follows a single page application architecture and we inject the JSON-LD on the client side, Google's JSON-LD crawler only saw what was sent from the server (a nearly empty index.html) and not our full page (with JSON-LD). I was able to test this theory and, while Google's crawler does execute JavaScript, it limits execution to roughly five seconds and MetacatUI <em>usually</em> doesn't finish injecting JSON-LD and rendering all content until after that timeout.</p>
<p>Potential paths forward to get DataONE Search compatible with Google's Dataset Search include (none of which are mutually exclusive):</p>
<ol>
<li>The assets that make up MetacatUI and the asset loading strategies could be optimized: <a href="https://github.com/NCEAS/metacatui/issues/224">https://github.com/NCEAS/metacatui/issues/224</a></li>
<li>Move the code (and any dependencies) that injects JSON-LD further up in the app boot so that Google sees it</li>
<li>Inject the appropriate JSON-LD on the server side to guarantee that Google sees it (originally Matt Jones' idea!)</li>
</ol>
<p>(1) is being worked on for sure, and (2) may not be needed if (1) is successful. I want to talk about option (3) because:</p>
<ul>
<li>It's a quicker solution (I already have something working) which would help get us involved in the project faster</li>
<li>It paves the way for future features and/or improvements to MetacatUI (we could be rendering more on the server side than just JSON-LD, like other metadata, more page content, etc)</li>
</ul>
<a name="What-I-did"></a>
<h2 >What I did<a href="#What-I-did" class="wiki-anchor">¶</a></h2>
<p>To test this idea, I modified a <a href="https://github.com/amoeba/backbone-pushstate-example" class="external">previous project</a> which is just a simple Node (Express.js) app that hosts MetacatUI by intercepting every request and serving the appropriate asset. It injects Schema.org JSON-LD, when appropriate, by querying the CN Solr index before sending MetacatUI's index.html to the client. <a href="https://github.com/amoeba/metacatui-ssr" class="external">Code is here</a> and it's deployed <a href="http://neutral-cat.nceas.ucsb.edu/" class="external">here</a>. View source on any /view/... pages and you'll see a minimal Schema.org/Dataset description in the head. More properties can be added later. I did it quick and dirty: The app pre-loads MetacatUI's index.html as a <code>String</code> at app boot and injects the JSON-LD into it. No templating language or other magic.</p>
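<p>The string-injection approach looks roughly like the sketch below. This is a Python rendition for illustration only, not the Node code linked above; the sample HTML, the dataset fields, and <code>inject_jsonld</code> are assumptions.</p>

```python
import json

# Stand-in for MetacatUI's index.html, loaded once at app boot.
INDEX_HTML = "<html><head><title>DataONE Search</title></head><body></body></html>"


def inject_jsonld(index_html, dataset):
    """Insert a JSON-LD script element just before the closing head tag."""
    # Escape "</" so a value cannot prematurely close the script element.
    payload = json.dumps(dataset).replace("</", "<\\/")
    script = f'<script type="application/ld+json">{payload}</script>'
    # If the page has no </head>, it is returned unchanged.
    return index_html.replace("</head>", script + "</head>", 1)


dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset title",
}
print(inject_jsonld(INDEX_HTML, dataset))
```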
<a name="Things-to-address"></a>
<h2 >Things to address<a href="#Things-to-address" class="wiki-anchor">¶</a></h2>
<ul>
<li>How do we feel about switching from hosting MetacatUI via Apache (simple, bullet proof) to a Node based deployment just to support this feature (new territory, at least for me)?</li>
<li>If we do switch, we'd want to make really sure the Node app doesn't have weird failure cases where it doesn't return index.html (e.g., when Solr is down, or slow). The app needs to return index.html (and every other static asset) on every request and do it very fast and we should decide what the cutoff is so that it doesn't hold up app boot if Solr is slow/down.</li>
<li>Can this type of deployment easily be integrated with CN buildouts? I've deployed Node apps before by fronting them with Apache/nginx (via reverse proxy) and then keeping the node process up with Upstart</li>
<li>Is this performant enough for DataONE? I think my implementation is non-blocking but I'm not a Node expert so we'd want to code review and probably benchmark </li>
<li>We could wait on (1) and stick with our current deployment strategy</li>
</ul>
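<p>On the failure-case concern, one way to guarantee index.html always goes out is to put a hard cutoff on the Solr lookup and fall back to the plain page. Below is a sketch of that idea in Python (not the Node app); the cutoff value, the <code>fetch_jsonld</code> callable, and the function names are all hypothetical.</p>

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool so a slow lookup never blocks the request handler.
_POOL = ThreadPoolExecutor(max_workers=4)


def render_index(index_html, fetch_jsonld, pid, cutoff_seconds=0.2):
    """Return index.html with JSON-LD when the lookup finishes in time, else unchanged.

    fetch_jsonld is a hypothetical callable that queries Solr for pid and
    returns a ready-made script element as a string (or an empty string).
    """
    future = _POOL.submit(fetch_jsonld, pid)
    try:
        snippet = future.result(timeout=cutoff_seconds)
    except Exception:
        # Timeout or lookup error: serve the plain page rather than fail.
        return index_html
    if not snippet:
        return index_html
    return index_html.replace("</head>", snippet + "</head>", 1)
```

<p>The cutoff itself is the thing we'd need to agree on; 0.2 seconds here is only a placeholder.</p>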
<a name="Other-notes"></a>
<h2 >Other notes<a href="#Other-notes" class="wiki-anchor">¶</a></h2>
<p>Unrelated to the Google Dataset Search issue but related to Google's crawling for Google Search, we've also identified:</p>
<ul>
<li>That the Metacat View Service is often unreasonably slow: <a href="https://github.com/NCEAS/metacat/issues/1234">https://github.com/NCEAS/metacat/issues/1234</a> and are planning to figure out why</li>
<li>That we can and should make use of sitemaps to help Google crawl our pages: <a href="https://github.com/NCEAS/metacat/issues/1263">https://github.com/NCEAS/metacat/issues/1263</a></li>
</ul>
Infrastructure - Task #8499 (New): Improve rendering of http://www.isotc211.org/2005/gmd-pangaea ... | https://redmine.dataone.org/issues/8499 | 2018-03-14T00:49:31Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>I offered to file this a while back but it just sat in my inbox.</p>
<p>Initial support for rendering went in with <a href="https://redmine.dataone.org/issues/8219">https://redmine.dataone.org/issues/8219</a> but we noted in Slack in #ci that the rendering could be a lot better. As an example of how much better, Pangaea's own landing pages are quite nice: <a href="https://doi.pangaea.de/10.1594/PANGAEA.511392">https://doi.pangaea.de/10.1594/PANGAEA.511392</a></p>
<p>Upon checking the first Pangaea dataset in the index, <a href="https://search.dataone.org/#view/af113afed60c98df50052feaf6cb7894">https://search.dataone.org/#view/af113afed60c98df50052feaf6cb7894</a>, I see that the view service isn't actually working at all for this document. So I guess this could be two tasks:</p>
<ol>
<li>Enable the Metacat View Service for Pangaea docs</li>
<li>Improve the XSLT</li>
</ol>
<p>(2) could just involve extending the existing isotc211 XSLT if we think that other isotc211 creators want those fixes, but if we want to make a nice view that's Pangaea-specific, we may want to consider a separate stylesheet suite.</p>
Infrastructure - Story #7859 (New): Add formatID for the STL 3d model file format | https://redmine.dataone.org/issues/7859 | 2016-08-04T19:02:58Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>The STL file format is a domain standard file format for storing 3d models and is the most common format I've used to manage 3d models for 3d printing. Given that 3d printing is seeing increased usage in the sciences, I would say this is a good candidate for inclusion in the controlled list of format ids.</p>
<p>Type: DATA<br>
Id: STL<br>
Name: StereoLithography File Format<br>
Media type: application/sla (unofficial)<br>
Extension: .stl</p>
<p>There is an ASCII form and a Binary form of this format. They don't seem to be distinguished according to any standard. What do we do in this case?</p>
<p>References: <br>
- <a href="https://en.wikipedia.org/wiki/STL_(file_format)">https://en.wikipedia.org/wiki/STL_(file_format)</a><br>
- <a href="https://reference.wolfram.com/language/ref/format/STL.html">https://reference.wolfram.com/language/ref/format/STL.html</a></p>
DataONE API - Bug #7684 (New): Call to MNStorage.update() via REST API returns java.lang.StackOve... | https://redmine.dataone.org/issues/7684 | 2016-03-21T23:07:39Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>I was trying to update an object via the REST API with cURL and forgot to enter the correct URL. The cURL command I used and the response were:</p>
<p>$ curl -X PUT -H "Authorization: Bearer $TOKEN" -F "pid=resourceMap_doi:10.5065/D6G44NFV" -F "object=@object.xml" -F "sysmeta=@sysmeta.xml" -F "newPid=resourceMap_doi:10.5065/D6G44NFV_v3" $URL<br>
<?xml version="1.0" encoding="UTF-8"?><br>
java.lang.StackOverflowError<br>
</p>
<p>Where $URL was '<a href="https://arcticdata.io/metacat/d1/mn/v2/object">https://arcticdata.io/metacat/d1/mn/v2/object</a>' instead of '<a href="https://arcticdata.io/metacat/d1/mn/v2/object/resourceMap_doi:10.5065/D6G44NFV">https://arcticdata.io/metacat/d1/mn/v2/object/resourceMap_doi:10.5065/D6G44NFV</a>'</p>
<p>I expected to receive some sort of warning/error that I had forgotten to specify the URL properly for this call but instead saw a StackOverflowError.</p>
Infrastructure - Story #7668 (New): Determine how indexing of data packages should work | https://redmine.dataone.org/issues/7668 | 2016-03-02T00:16:25Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>I've discovered (with Lauren's help) a strange requirement for how the resource maps for nested data packages have to be written. In order to get nested data packages correctly indexed in Solr so that the 'resourceMap' field of the resource map being nested is set to the parent resource map's PID, you have to create the appropriate set of <code>cito:documents</code> statements in addition to the expected <code>ore:aggregates</code> statements.</p>
<p>I expected the following to be sufficient (pardon the highly abstracted RDF, examples are linked below):</p>
<p>parent_resource_map#aggregation ore:aggregates child_resource_map<br>
parent_resource_map#aggregation ore:aggregates metadata_object</p>
<p>but I also had to add a <code>cito:documents</code> statement between the <em>parent resource map's metadata object</em> and the resource maps being nested:</p>
<p>parent_resource_map#aggregation ore:aggregates child_resource_map<br>
parent_resource_map#aggregation ore:aggregates metadata_object</p>
<p>parent_metadata_object cito:documents child_resource_map</p>
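<p>Spelled out as a sketch, the set of statements I ended up needing can be generated like this. The identifiers are the same abstract placeholders used above, the statements are plain tuples rather than real RDF, and <code>nested_package_statements</code> is a name invented for this example.</p>

```python
def nested_package_statements(parent_map, parent_metadata, child_maps):
    """List the (subject, predicate, object) statements for a nested package."""
    aggregation = f"{parent_map}#aggregation"
    statements = [(aggregation, "ore:aggregates", parent_metadata)]
    for child in child_maps:
        # The statement I expected to be sufficient on its own.
        statements.append((aggregation, "ore:aggregates", child))
        # The extra statement the indexer turned out to need.
        statements.append((parent_metadata, "cito:documents", child))
    return statements


for statement in nested_package_statements(
    "parent_resource_map", "parent_metadata_object", ["child_resource_map"]
):
    print(" ".join(statement))
```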
<p>The documentation does not suggest this and I found it confusing. A real life example of what I expected to work is here: <a href="https://gist.github.com/amoeba/c7a6ba269c5a1f78db1d">https://gist.github.com/amoeba/c7a6ba269c5a1f78db1d</a><br>
What I actually had to insert is here: <a href="https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/resourceMap_urn:uuid:ab17b047-a341-4d06-b433-92eed90dacec">https://dev.nceas.ucsb.edu/knb/d1/mn/v2/object/resourceMap_urn:uuid:ab17b047-a341-4d06-b433-92eed90dacec</a></p>
<p>Are the <code>cito:documents</code> statement(s) really required and is this the intended behavior? I've made this issue in the hopes we can talk about it.</p>
<p>I suggest updating the API docs with whatever we decide, and hopefully that update will include example RDF for a nested data package.</p>
DataONE API - Bug #7578 (New): Fix 404 link to d1_instance_generator folder in documentation | https://redmine.dataone.org/issues/7578 | 2016-01-08T22:01:20Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>In the MN API documentation for MNStorage.create (<a href="https://jenkins-ucsb-1.dataone.org/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html//apis/MN_APIs.html#MNStorage.create">https://jenkins-ucsb-1.dataone.org/job/API%20Documentation%20-%20trunk/ws/api-documentation/build/html//apis/MN_APIs.html#MNStorage.create</a>), I found that the following paragraph contains a broken link to d1_instance_generator:</p>
<blockquote>
<p>"The system metadata included with the create call must contain values for the elements required to be set by clients (see System Metadata). The system metadata document can be crafted by hand or preferably with a tool such as generate_sysmeta.py which is available in the d1_instance_generator Python package. See documentation included with that package for more information on its operation."</p>
</blockquote>
<p>The link to d1_instance_generator was to the SVN folder <a href="https://repository.dataone.org/software/cicore/trunk/d1_instance_generator">https://repository.dataone.org/software/cicore/trunk/d1_instance_generator</a> which is currently a 404. I think the folder moved to /d1_test_utilities_python/src/d1_test/instance_generator.</p>
Infrastructure - Task #7466 (In Progress): Some objects not accessible on the CN via REST API | https://redmine.dataone.org/issues/7466 | 2015-11-04T18:41:38Z | Bryce Mecum (mecum@nceas.ucsb.edu)
<p>While doing other work, I noticed that a good number (not sure how many) of objects listed on the CN's Solr index are not accessible via the REST API get() and resolve() methods. Instead of returning the object, they return a NotFound error. </p>
<p>To reproduce,</p>
<ol>
<li>Visit <a href="https://cn.dataone.org/cn/v1/query/solr/?fl=identifier,title,authoritativeMN,datasource&q=formatType:METADATA+AND+-obsoletedBy:*&rows=100&start=0">https://cn.dataone.org/cn/v1/query/solr/?fl=identifier,title,authoritativeMN,datasource&q=formatType:METADATA+AND+-obsoletedBy:*&rows=100&start=0</a></li>
<li>Pick a PID from the query result, e.g.
<ul>
<li>knb-lter-cap.148.9</li>
<li>CLOEBDMETADATA.10242013.1</li>
</ul>
</li>
<li>Attempt to resolve() or get() the object via the REST API like: <a href="https://cn.dataone.org/cn/v1/object/CLOEBDMETADATA.10242013.1">https://cn.dataone.org/cn/v1/object/CLOEBDMETADATA.10242013.1</a></li>
<li>Receive a NotFound error instead of the object.</li>
</ol>
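<p>To get a count of affected objects, the steps above could be scripted along these lines. This is only a sketch: <code>get_status</code> is a hypothetical callable that returns the HTTP status code for a URL (left out here so the example runs offline), and I'm assuming NotFound comes back as an HTTP 404.</p>

```python
from urllib.parse import quote

CN_BASE = "https://cn.dataone.org/cn/v1"


def object_url(pid, base=CN_BASE):
    """Build the CN get() URL for a PID, percent-encoding the identifier."""
    return f"{base}/object/{quote(pid, safe='')}"


def find_unavailable(pids, get_status):
    """Return the PIDs whose get() URL answers with HTTP 404.

    get_status is a hypothetical callable: URL in, status code out.
    """
    return [pid for pid in pids if get_status(object_url(pid)) == 404]


print(object_url("CLOEBDMETADATA.10242013.1"))
```

<p>Feeding it the identifiers from the Solr query in step 1 would give the list of PIDs that are indexed but not retrievable from the CN.</p>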
<p>Notes:</p>
<p>In IRC, Skye noticed that the objects can be retrieved via their respective MN so it appears this issue may be a Metacat replication issue.</p>