DataONE Tasks: Issues https://redmine.dataone.org/ 2017-05-21T17:06:46Z
Redmine Infrastructure - Task #8098 (Closed): Token-based authentication fails with LE CN certs https://redmine.dataone.org/issues/8098 2017-05-21T17:06:46Z Chris Jones cjones@nceas.ucsb.edu
<p>When trying to call <code>MN.create()</code> on my local Metacat setup (which points to the production CN environment for ORCID authentication), I'm getting an <code>InvalidToken</code> error:<br>
<br>
<?xml version="1.0" encoding="UTF-8"?><br>
Session is required to WRITE to the Node.<br>
</p>
<p>This is odd because I recently logged in via ORCID, so it looked like a token verification issue. In the Metacat log, I see:</p>
<p>metacat 20170521-10:35:07: [WARN]: Could not use public key to verify provided token: eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJodHRwOlwvXC9vcmNpZC5vcmdcLzAwMDAtMDAwMi04MTIxLTIzNDEiLCJmdWxsTmFtZSI6IkNocmlzdG9waGVyIEpvbmVzIiwiaXNzdWVkQXQiOiIyMDE3LTA1LTIxVDE1OjU3OjA2LjEyOCswMDowMCIsImNvbnN1bWVyS2V5IjoidGhlY29uc3VtZXJrZXkiLCJleHAiOjE0OTU0NDcwMjYsInVzZXJJZCI6Imh0dHA6XC9cL29yY2lkLm9yZ1wvMDAwMC0wMDAyLTgxMjEtMjM0MSIsInR0bCI6NjQ4MDAsImlhdCI6MTQ5NTM4MjIyNn0.dynDbRKqIuI1bXzPYlHfW7aFcrl2J7O8ZWqxS_2DHBotx4AqX_hbxuRrlQ_9s-V1mRJupyxkYxW3EWkLcoMUQNTuyMLGpV53GPoGdBjkTEd407GU-yxv_G3cmmSovXSLj6AAjeKJ8KHBt4y6JtgqR2isf5YGoM18CwM-IZV3nJVPBMZpNMPhYSWJeaeD2u02duKCpcy7L-XD_OCLJdzHjtjyFqqbHvqGyZIPqc9Kp_JTuTmlYaAZe9JiLcjHnyaOeHMGCEkmOekiRA_wh6DtnBLKyCczBjNg0kirxMk27abjAxt-ckhKfrCT6dnXbd1lCLNnxVYiJj5wztNOGH492T3nyaSQGROnSQd6cxB3pPAiwW7AOR34MPNJlNv_r-3WbwThDeOOtrMSvfZtYGv6Mn_i0-d1yjccRDzZeXdRS0P91GYfdK2lfog1lhiPuec3gD4V4plNJR3wKSSMhgjikH6igCB5I7C5n9Ye5vSeyWW9ApwLogfbEUc3xKgiCgj1jtED4L7E3WgUvtWxsyqMMtaEAJGvRHlGPPShD3xHPsm6ltCVrU1arLXneuGa0R7M-GgzMk0z5HdRE2bD2agu5WuN-w5-w9W6jwrzgI4wM7v8KiJYxeM332nx4f2BF6ArFJ2K-DxlpgmdK6bkPTtL7H-uj5digXvBoHFYZAJF49c</p>
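<p>A quick way to rule out an expired or malformed token is to decode its first two dot-separated segments, which are base64url-encoded JSON. The sketch below is only an inspection aid and does not verify the signature; the helper names and the demo token are made up for illustration and are not part of Metacat or the DataONE libraries.</p>

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw):
    """Encode bytes as unpadded base64url, the form used inside a JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def inspect_token(token):
    """Return (header, claims) WITHOUT verifying the signature."""
    header_b64, claims_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(claims_b64))


# Made-up token for illustration; the signature segment is a placeholder.
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256"}).encode()),
    b64url_encode(json.dumps({"sub": "http://orcid.org/0000-0000-0000-0000",
                              "exp": 2000000000}).encode()),
    "c2ln",
])
header, claims = inspect_token(demo_token)
print(header["alg"], claims["exp"])
```

<p>If the <code>exp</code> claim is still in the future and the algorithm is the expected one, the failure points at the key used for verification rather than at the token itself.</p>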
<p>After grabbing the public certificate from the production CN:</p>
<p>-----BEGIN CERTIFICATE-----<br>
MIIFQzCCBCugAwIBAgISAxPSoq7BM7aFc1VzgyTJkz3wMA0GCSqGSIb3DQEBCwUA<br>
MEoxCzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MSMwIQYDVQQD<br>
ExpMZXQncyBFbmNyeXB0IEF1dGhvcml0eSBYMzAeFw0xNzA1MTcxMjI5MDBaFw0x<br>
NzA4MTUxMjI5MDBaMBkxFzAVBgNVBAMTDmNuLmRhdGFvbmUub3JnMIIBIjANBgkq<br>
hkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAtp++UWPu0Zm4gIs01F+LE94i4eExI+UX<br>
82DIB3Xn93FW4IgDTsjEfXCB3AHggdx6GnExbDzu/iXn+K3LiW6QaeasG47XOeup<br>
JjpmJqDROAJvLy1GpgrFeNxEe5F6xljPcAxUH/W/NkoHAem7wMatRNA53f6JkMVd<br>
sKXAYPOdKUOqhQ9QRMqEFIPImt+SHfvxUkQyL4g+1taQ5XYDu5zwF5+k77ZRre+o<br>
RVR9gHdbdlvLLQYP9eGJdi+nmFFTrEuXIklB8SQi6yvck0p6nR2sjmxFlnaLTe7Z<br>
iaVWaA1vvwvwgG27Q2iMcnAG+JXQDe7Jd1YIuXUW7vVYyGl4ONbp3QIDAQABo4IC<br>
UjCCAk4wDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEF<br>
BQcDAjAMBgNVHRMBAf8EAjAAMB0GA1UdDgQWBBSyQkmQUHHO3EkItWseuA3L6vg8<br>
1DAfBgNVHSMEGDAWgBSoSmpjBH3duubRObemRWXv86jsoTBwBggrBgEFBQcBAQRk<br>
MGIwLwYIKwYBBQUHMAGGI2h0dHA6Ly9vY3NwLmludC14My5sZXRzZW5jcnlwdC5v<br>
cmcvMC8GCCsGAQUFBzAChiNodHRwOi8vY2VydC5pbnQteDMubGV0c2VuY3J5cHQu<br>
b3JnLzBcBgNVHREEVTBTghRjbi1vcmMtMS5kYXRhb25lLm9yZ4IVY24tdWNzYi0x<br>
LmRhdGFvbmUub3JnghRjbi11bm0tMS5kYXRhb25lLm9yZ4IOY24uZGF0YW9uZS5v<br>
cmcwgf4GA1UdIASB9jCB8zAIBgZngQwBAgEwgeYGCysGAQQBgt8TAQEBMIHWMCYG<br>
CCsGAQUFBwIBFhpodHRwOi8vY3BzLmxldHNlbmNyeXB0Lm9yZzCBqwYIKwYBBQUH<br>
AgIwgZ4MgZtUaGlzIENlcnRpZmljYXRlIG1heSBvbmx5IGJlIHJlbGllZCB1cG9u<br>
IGJ5IFJlbHlpbmcgUGFydGllcyBhbmQgb25seSBpbiBhY2NvcmRhbmNlIHdpdGgg<br>
dGhlIENlcnRpZmljYXRlIFBvbGljeSBmb3VuZCBhdCBodHRwczovL2xldHNlbmNy<br>
eXB0Lm9yZy9yZXBvc2l0b3J5LzANBgkqhkiG9w0BAQsFAAOCAQEAJo/aaCo0NweP<br>
prHz+9Ko39xZ/Y6kum0ZOSw6BFM8zgkOOd1R0rbc53j09yKDi3V+MKd5rXfISNsp<br>
LKBVe/R8HH/rglYUhMTBBizGsEdyPE4n5I3ml4RyOVmC1SpDPUzH0CAeSLkzBpBV<br>
WVIfEwl641GtT0hBcwVjMlDYywrvSHv4mifVLd/2ZTSYillrhQzQySKb9g7jbEld<br>
LHY1WoIU0E5XgQJq3b6Vhb5dXVkHsDfwPHNpJA5fVCVYoKazo+xSNBP757ta/ix4<br>
e9CbRsQQ0TgEsuUAOa9lh9+O8uAL5zkZ4kwZCLypxbkZ8/YYOCMGMtGz4632J7VF<br>
Ozukfk41bw==<br>
-----END CERTIFICATE-----</p>
<p>and trying to verify the token with this certificate, verification fails.</p>
<p>However, it verifies correctly with the old CN certificate:<br>
<br>
-----BEGIN CERTIFICATE-----<br>
MIIFrTCCBJWgAwIBAgICbkowDQYJKoZIhvcNAQELBQAwRzELMAkGA1UEBhMCVVMx<br>
FjAUBgNVBAoTDUdlb1RydXN0IEluYy4xIDAeBgNVBAMTF1JhcGlkU1NMIFNIQTI1<br>
NiBDQSAtIEczMB4XDTE0MTEwMzEyNTMyNFoXDTE3MDUyMDIxNDU0OVowgZExEzAR<br>
BgNVBAsTCkdUMzkwMjU2MTcxMTAvBgNVBAsTKFNlZSB3d3cucmFwaWRzc2wuY29t<br>
L3Jlc291cmNlcy9jcHMgKGMpMTIxLzAtBgNVBAsTJkRvbWFpbiBDb250cm9sIFZh<br>
bGlkYXRlZCAtIFJhcGlkU1NMKFIpMRYwFAYDVQQDDA0qLmRhdGFvbmUub3JnMIIC<br>
IjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAzZaa/tslwA/CJ6Wqfzl72TrF<br>
/8IurHHrfzmme/B2dSUt0+zDfdfXWe7p6pZ4yJp95Kk34cf0EFWgFJ5Nc1gyXJUh<br>
Ht6IVweDDFrExeNPsNbI5DLFdUJ5ZfNhWrqu2C4kdeRfHqxOvI0w6XEfdZ4yI3QC<br>
zfx5EtsoFEXpqK5Xe3r5KEnXVsPq6azerVqvq2UqhPa0EYJA8/CVJiQ0CRQl+w9x<br>
Mh6GBvHUXqCHBPlRPIY7QomI+3Cx8gYgcLCCEcHVgzU05zQQRwdtIqjENq6CubH9<br>
UTMiKS81CFJbAVrKetDRI3bNGIcEEpjV1XC28OOWXNc9fXXAK3fvVFVl2tuzYFn0<br>
ROmRrtiz4+jXC7mp7/fTb5ekTeenKyoVA5UicbIHM1PPQeTwcHUH7CxybJVheGAo<br>
7wwzqrxin3LMMyn56QBXqB81qL+iMJ+ZBHXxiS5V6g4W1ag3VOtDvyRtN1QGB6J2<br>
enOTBOHNwr9bHuJcVPx1dYd6YjZD3LQbyJZyVtYHalnlCXGjLCxs9B2uL4MBllb5<br>
N++ouBiujO5ww6Ht+MgOq/gbahx9WlJCs5xXLy8Hf+FfjUBZXDdkvLwa36FWktZa<br>
ibbqqeBBq9IaW0gUNNmhYs3SB8J7JICVflUIp7e7wy7cXBJHpkATZKAuHVnqJ8ZT<br>
83YekoQFyxpcqB2fmRkCAwEAAaOCAVYwggFSMB8GA1UdIwQYMBaAFMOc8/zTRgg0<br>
u85Gf6B8W/PiCMtZMFcGCCsGAQUFBwEBBEswSTAfBggrBgEFBQcwAYYTaHR0cDov<br>
L2d2LnN5bWNkLmNvbTAmBggrBgEFBQcwAoYaaHR0cDovL2d2LnN5bWNiLmNvbS9n<br>
di5jcnQwDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEF<br>
BQcDAjAlBgNVHREEHjAcgg0qLmRhdGFvbmUub3JnggtkYXRhb25lLm9yZzArBgNV<br>
HR8EJDAiMCCgHqAchhpodHRwOi8vZ3Yuc3ltY2IuY29tL2d2LmNybDAMBgNVHRMB<br>
Af8EAjAAMEUGA1UdIAQ+MDwwOgYKYIZIAYb4RQEHNjAsMCoGCCsGAQUFBwIBFh5o<br>
dHRwczovL3d3dy5yYXBpZHNzbC5jb20vbGVnYWwwDQYJKoZIhvcNAQELBQADggEB<br>
ABcvSyNwX1jHZ7HRX5Lzcua0Q4//wc5KCBvPgPrbr3bGSi3+t+Rc4ZagIUxFWSd1<br>
uZ+guQ4lywhQXGOXh7dH1SPljPOwZ9VPdhJMPW/woaQ0ndakLvW0OBIgyyqIcJ57<br>
8e6DKzZ0jd97xmXYAa7iMhCxL2lpXzDQMH5k8XhENHcjMXfVitkqmIS2Wfi1rEMK<br>
phszml9yRABtx+X0z/4/xmNZ2PrNApqmqVD2DnY1MgJNHga/KmPX/6VZ+NEszudP<br>
rvrD5hQvAjkJA+5kgqX31w98ggfXg4oxQo8AhKrHWnhI52SoWT1BOwSGDRpgRW/n<br>
1AdVxT9TIoHXbhf6+c8fWOU=<br>
-----END CERTIFICATE-----</p>
<p>So, effectively, the <code>d1_cn_portal</code> component is still using the old RapidSSL certificate to sign tokens, but (I think) MNs that have recently been restarted grab the most recent CN certificate for verification purposes, get the new LE certificate, and so can't verify incoming tokens signed by the CN. My guess is that this is going to be problematic for other MNs that go through a reboot or restart and rely on the CN signing tokens. Looking at the <code>portal.properties</code> file on the CN, I see that it is indeed still pointing to the old certificate and key:<br>
<br>
cn.server.publiccert.filename=/etc/ssl/certs/_.dataone.org.crt<br>
cn.server.privatekey.filename=/etc/ssl/private/dataone_org.key</p>
<p>So, in the short term, we need to plan to re-configure <code>portal.properties</code> on the production CNs to use the new Let's Encrypt certificates for token signing:<br>
<br>
cn.server.publiccert.filename=/etc/letsencrypt/live/cn.dataone.org/fullchain.pem<br>
cn.server.privatekey.filename=/etc/letsencrypt/live/cn.dataone.org/privkey.pem</p>
<p>However, the <code>fullchain.pem</code> includes the intermediate CA certs as well, and I don't know if <code>CertificateManager.loadCertificateFromFile()</code> handles multiple certificates in a file (i.e. does it use the first found, last found, etc.?). We need to determine this before making the properties change, but also before other production MNs get rebooted and begin to fail authentication for clients.</p>
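<p>Until the loader's behavior is confirmed, one way to sidestep the question is to hand it a file that holds only the server certificate. The sketch below splits a PEM bundle into its certificate blocks and returns the first one, on the assumption that the server certificate comes first in a full chain, as it does in Certbot's output. It says nothing about what <code>CertificateManager.loadCertificateFromFile()</code> actually does; the function names and the demo bundle are hypothetical.</p>

```python
import re

# Matches one PEM certificate block, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(text):
    """Return every certificate block in a PEM bundle, in file order."""
    return PEM_BLOCK.findall(text)


def leaf_certificate(text):
    """Return the first block, assumed here to be the server certificate."""
    blocks = split_pem_bundle(text)
    if not blocks:
        raise ValueError("no certificate found in PEM text")
    return blocks[0]


# Placeholder bodies stand in for real base64-encoded certificate data.
demo_bundle = (
    "-----BEGIN CERTIFICATE-----\nLEAF\n-----END CERTIFICATE-----\n"
    "-----BEGIN CERTIFICATE-----\nINTERMEDIATE\n-----END CERTIFICATE-----\n"
)
print(len(split_pem_bundle(demo_bundle)))
```

<p>Certbot also writes a leaf-only <code>cert.pem</code> next to <code>fullchain.pem</code>, which may be the simpler choice if the loader turns out to expect a single certificate.</p>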
<p>Once tested, for the long term, we need to update the portal properties in the buildout to make the changes permanent. We may also need to add some logic for ensuring the <code>/etc/letsencrypt</code> files have the correct permissions as Dave pointed out.</p>
Search UI - Task #7490 (Closed): Search UI filter list does not render correctly under IE 11 https://redmine.dataone.org/issues/7490 2015-11-16T17:53:07Z Chris Jones cjones@nceas.ucsb.edu
<p>Using IE 11 on Windows 8.1, the filter list is rendering incorrectly. See the attached image. The search boxes are also not scaling to the width of the containing element, and are super short.</p>
CN REST - Bug #7391 (Closed): Synchronization and replica updates fail after MN.updateSystemMetad... https://redmine.dataone.org/issues/7391 2015-09-28T20:56:55Z Chris Jones cjones@nceas.ucsb.edu
<p>In the mixed V1/V2 DEV2 environment, a call to MN.updateSystemMetadata() successfully calls CN.synchronize(), but the synchronization ultimately fails, and so the CN and the replica member nodes fail to receive the updated system metadata. An example object is pid <code>http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</code>, which was updated to include a seriesId field. We expect that the CN will then get the new system metadata, and that the V2 MNs with replicas will also get the update.</p>
<p>The error message on cn-dev-ucsb-2 in /var/log/dataone/synchronization/cn-synchronization.log was (unfortunately) a bit terse:<br>
<br>
[ INFO] 2015-09-28 12:21:10,541 (V2TransferObjectTask:validateSeriesId:544) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a> SeriesId doesn't exist for any object on the CN...<br>
[DEBUG] 2015-09-28 12:21:10,541 (V2TransferObjectTask:alreadyExists:578) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a> entering alreadyExists...<br>
[ INFO] 2015-09-28 12:21:10,543 (V2TransferObjectTask:alreadyExists:595) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a> Pid Exists. Must be a systemMetadata update.<br>
[DEBUG] 2015-09-28 12:21:10,543 (V2TransferObjectTask:processUpdates:870) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a> entering processUpdates...<br>
[ INFO] 2015-09-28 12:21:10,543 (V2TransferObjectTask:processUpdates:877) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a> Processing as an Update<br>
[ INFO] 2015-09-28 12:21:10,543 (V2TransferObjectTask:processUpdates:878) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a> Getting sysMeta from HazelCast map<br>
[ INFO] 2015-09-28 12:21:10,544 (SystemMetadataValidator:schemaValidateSystemMetadata:71) Entering schemaValidateSysMeta method...<br>
[ INFO] 2015-09-28 12:21:10,544 (SystemMetadataValidator:validateEssentialProperties:186) The submitted checksum matches existing one<br>
[ERROR] 2015-09-28 12:21:10,546 (V2TransferObjectTask:processTask:410) Task-urn:node:mnDevUCSB2-<a href="http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream">http://dx.doi.org/10.5061/dryad.37mm8/2/bitstream</a><br>
null</p>
<p>The general Exception is thrown on line 410 of V2TransferObjectTask, with a "null" message. Given the messages above, it looks like we can narrow it down to the call to processUpdates() where things go south, although there are no more messages coming from that method. Needs a bit of logging to track this down.</p>
ONE Mercury - Bug #6870 (Rejected): Fix handling of identifiers with url-escaped characters https://redmine.dataone.org/issues/6870 2015-02-28T00:11:20Z Chris Jones cjones@nceas.ucsb.edu
<p>We've been getting new content from Dryad, and the ONEMercury handling of the identifiers looks to be broken. For example, doing a search for * and Dryad as the Member Node:<br>
<br>
<a href="https://cn.dataone.org/onemercury/send/query?term1=*&term1attribute=fullText&term2.1=&term2.1attribute=fullText&op2.1=AND&term2.2attribute=fullText&term2.2=&op2.2=AND&term2.3attribute=fullText&term2.3=&op2.3=and&term3=%2C%2C%2C&term3attribute=overlaps&op4=during&term4=&term4attribute=beginDate&term5=&term5attribute=endDate&term6attribute=datasource&term8=either&pageSize=10&queryString=+Entire+Document+%3A+*++and+true+coordinates+%28N%2CW%2CS%2CE%29+%3D+%28%2C%2C%2C%29+and+++and++from+sources%3A+urn%3Anode%3ADRYAD&instance=pilotcatalog&filterForDataHidden=&term6=urn%3Anode%3ADRYAD">https://cn.dataone.org/onemercury/send/query?term1=*&term1attribute=fullText&term2.1=&term2.1attribute=fullText&op2.1=AND&term2.2attribute=fullText&term2.2=&op2.2=AND&term2.3attribute=fullText&term2.3=&op2.3=and&term3=%2C%2C%2C&term3attribute=overlaps&op4=during&term4=&term4attribute=beginDate&term5=&term5attribute=endDate&term6attribute=datasource&term8=either&pageSize=10&queryString=+Entire+Document+%3A+*++and+true+coordinates+%28N%2CW%2CS%2CE%29+%3D+%28%2C%2C%2C%29+and+++and++from+sources%3A+urn%3Anode%3ADRYAD&instance=pilotcatalog&filterForDataHidden=&term6=urn%3Anode%3ADRYAD</a></p>
<p>Almost all of the identifiers are rendered incorrectly. Some look truncated, others contain XML markup, etc.:</p>
<blockquote>
<p>ttp://dx.doi.org/10.5061/dryad.6gr7t/2?ver=2014-02-21T12:54:19.782-05:00<br>
x.doi.org/10.5061/dryad.121d03jc/11?ver=2012-08-16T10:38:20.266-04:00<br>
00oi.org/10.5061/dryad.121d03jc/8?ver=2012-08-16T10:36:22.333-04:00<br>
080/9?ver=2013-01/10.5061/dryad.sd080/9?ver=2013-01-30T12:24:48.723-05:00 </p>
</blockquote>
<p>This results in broken links to metadata content, such as:<br>
<br>
<a href="https://cn.dataone.org/onemercury/send/xsltText2?pid=%3Ettp://dx.doi.org/10.5061/dryad.6gr7t/2?ver=2014-02-21T12:54:19.782-05:00&fileURL=https://cn.dataone.org/cn/v1/resolve/http%3A%2F%2Fdx.doi.org%2F10.5061%2Fdryad.6gr7t%2F2%3Fver%3D2014-02-21T12%3A54%3A19.782-05%3A00&full_datasource=Dryad%20Digital%20Repository&full_queryString=%20*%20AND%20has%20direct%20data%20AND%20%28%20datasource%20:%28%20urn:node:DRYAD%20%20%29%20%29%20&ds_id=">https://cn.dataone.org/onemercury/send/xsltText2?pid=%3Ettp://dx.doi.org/10.5061/dryad.6gr7t/2?ver=2014-02-21T12:54:19.782-05:00&fileURL=https://cn.dataone.org/cn/v1/resolve/http%3A%2F%2Fdx.doi.org%2F10.5061%2Fdryad.6gr7t%2F2%3Fver%3D2014-02-21T12%3A54%3A19.782-05%3A00&full_datasource=Dryad%20Digital%20Repository&full_queryString=%20*%20AND%20has%20direct%20data%20AND%20%28%20datasource%20:%28%20urn:node:DRYAD%20%20%29%20%29%20&ds_id=</a><br>
<br>
When using the actual pid, the content is present:<br>
<br>
<a href="https://cn.dataone.org/onemercury/send/xsltText2?pid=http://dx.doi.org/10.5061/dryad.6gr7t/2?ver=2014-02-21T12:54:19.782-05:00&fileURL=https://cn.dataone.org/cn/v1/resolve/http%3A%2F%2Fdx.doi.org%2F10.5061%2Fdryad.6gr7t%2F2%3Fver%3D2014-02-21T12%3A54%3A19.782-05%3A00&full_datasource=Dryad%20Digital%20Repository&full_queryString=%20*%20AND%20has%20direct%20data%20AND%20%28%20datasource%20:%28%20urn:node:DRYAD%20%20%29%20%29%20&ds_id=#top">https://cn.dataone.org/onemercury/send/xsltText2?pid=http://dx.doi.org/10.5061/dryad.6gr7t/2?ver=2014-02-21T12:54:19.782-05:00&fileURL=https://cn.dataone.org/cn/v1/resolve/http%3A%2F%2Fdx.doi.org%2F10.5061%2Fdryad.6gr7t%2F2%3Fver%3D2014-02-21T12%3A54%3A19.782-05%3A00&full_datasource=Dryad%20Digital%20Repository&full_queryString=%20*%20AND%20has%20direct%20data%20AND%20%28%20datasource%20:%28%20urn:node:DRYAD%20%20%29%20%29%20&ds_id=#top</a></p>
<p>We need to track down where the identifiers are getting mangled.</p>
Infrastructure - Task #4231 (Closed): Generate a Master System Metadata Table from all CNs https://redmine.dataone.org/issues/4231 2014-01-22T17:15:00Z Chris Jones cjones@nceas.ucsb.edu
<p>Source all pids from the CNs, using CN{1-3}DAO.listPids(). This may either be stored in memory or persisted (impl detail). For each pid, call each CN to get the system metadata using CN{1-3}DAO.getSystemMetadata() and insert the results into the HarvestSystemMetadataTable using HarvestCacheDAO.saveSystemMetadata(). A given CN may return a null SystemMetadata object (from a Not Found exception), or a SystemMetadata object. After receiving responses from all 3 CNs, submit a MergeJob (Callable) to the MergeExecutorService. </p>
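<p>The workflow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the d1_tidy_sysmeta code: the in-memory DAO class, the dict used as the harvest table, and the placeholder merge job are all hypothetical stand-ins for the CN DAOs, the HarvestSystemMetadataTable, and the MergeJob named in the ticket.</p>

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryCnDao:
    """Hypothetical stand-in for one CN's DAO, backed by a dict."""

    def __init__(self, records):
        self._records = records

    def list_pids(self):
        return set(self._records)

    def get_system_metadata(self, pid):
        # None models the Not Found case described above.
        return self._records.get(pid)


def merge_job(pid, copies):
    """Placeholder merge: report which CNs hold a copy of the pid."""
    return pid, [cn for cn, sysmeta in copies.items() if sysmeta is not None]


def harvest(cn_daos, harvest_table, executor):
    """Store every CN's copy per pid, then queue one merge job per pid."""
    all_pids = set().union(*(dao.list_pids() for dao in cn_daos.values()))
    futures = []
    for pid in sorted(all_pids):
        copies = {cn: dao.get_system_metadata(pid) for cn, dao in cn_daos.items()}
        harvest_table[pid] = copies
        futures.append(executor.submit(merge_job, pid, copies))
    return [future.result() for future in futures]


daos = {
    "cn1": InMemoryCnDao({"pid.a": {"serialVersion": 2}, "pid.b": {"serialVersion": 1}}),
    "cn2": InMemoryCnDao({"pid.a": {"serialVersion": 1}}),
    "cn3": InMemoryCnDao({"pid.a": {"serialVersion": 2}, "pid.b": {"serialVersion": 1}}),
}
table = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    results = harvest(daos, table, pool)
print(results)
```

<p>Taking the union of the pid lists first means a pid that is missing from one CN still gets a row in the table, with a null entry for that CN.</p>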
Infrastructure - Story #4230 (Closed): CN System Metadata needs tidying https://redmine.dataone.org/issues/4230 2014-01-22T17:12:38Z Chris Jones cjones@nceas.ucsb.edu
<p>System metadata across the CNs is out of sync (see <a class="issue tracker-4 status-6 priority-6 priority-high2 closed parent" title="Story: CN Consistency Check and CN Recovery (Rejected)" href="https://redmine.dataone.org/issues/3736">#3736</a>), and we've decided to take a simpler approach for a short-term fix. This ticket documents the work being done in the d1_tidy_sysmeta tool that will be used to get the system metadata content sync'd across the CNs. The high level design document is <a href="https://docs.google.com/a/nceas.ucsb.edu/document/d/1hUhWX5XOKPZfqZUEirkoiY2vEGryLhqD6B0Aaj_WgyE/edit">here</a>, and the etherpad notes are <a href="http://epad.dataone.org/cn-audit-systemmetadata-design">here</a>.</p>
Infrastructure - Task #4210 (Testing): Metacat does not set serialVersion correctly in CNodeServi... https://redmine.dataone.org/issues/4210 2013-12-20T15:22:50Z Chris Jones cjones@nceas.ucsb.edu
<p>For DATA and METADATA, CNodeService.archive() and D1NodeService.archive(), respectively, don't increment the serialVersion field. Check this for delete() as well. D1NodeService delegates to DocumentImpl to call the HZ put() method, so the fix needs to be there, and in CNodeService.</p>
Infrastructure - Task #4185 (Rejected): Develop a CN audit algorithm to recover inconsistent CN s... https://redmine.dataone.org/issues/4185 2013-11-22T17:15:56Z Chris Jones cjones@nceas.ucsb.edu
<p>Based on the categories of inconsistencies found across the CNs, we need a workflow that allows us to independently recover each CN's system metadata backing store, along with the science metadata and resource map content, to a state that is consistent across all 3 CNs. This will involve a direct DAO layer that doesn't rely on the hzsystemMetadata map for making updates, since they must be altered independently.</p>
<p>Robert, please flesh this task out with the work you've already been doing.</p>
Infrastructure - Task #4184 (Closed): Evaluate CN to CN access policy inconsistencies https://redmine.dataone.org/issues/4184 2013-11-22T17:11:13Z Chris Jones cjones@nceas.ucsb.edu
<p>Due to network partitioning in the production Hazelcast cluster, there is now inconsistent system metadata content across the CNs. Evaluate the extent of the access policy differences. From the etherpad:<br>
<br>
For AccessPolicy diff classification:<br>
* SELECT guid, principal_name, permission, perm_type from xml_access ORDER BY guid, principal_name, permission, perm_type;<br>
* output to file<br>
* diff 3 files - evaluate visually using Meld or FileMerge, classify the types of problems, ensure that the CN Audit code being written addresses all classes</p>
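<p>As a complement to the visual diff, the three exported files could also be classified programmatically. The sketch below assumes each file has already been parsed into a set of (guid, principal_name, permission, perm_type) tuples; the function names and sample rows are hypothetical.</p>

```python
from collections import Counter


def classify_access_rows(rows_by_cn):
    """Map each access row to the sorted tuple of CNs that contain it."""
    presence = {}
    for cn, rows in rows_by_cn.items():
        for row in rows:
            presence.setdefault(row, set()).add(cn)
    return {row: tuple(sorted(cns)) for row, cns in presence.items()}


def summarize(classified, all_cns):
    """Count rows per presence pattern, skipping rows found on every CN."""
    everywhere = tuple(sorted(all_cns))
    return Counter(p for p in classified.values() if p != everywhere)


# Made-up rows: (guid, principal_name, permission, perm_type)
rows_by_cn = {
    "cn1": {("pid.a", "public", "read", "allow"), ("pid.b", "uid=alice", "write", "allow")},
    "cn2": {("pid.a", "public", "read", "allow")},
    "cn3": {("pid.a", "public", "read", "allow"), ("pid.b", "uid=alice", "write", "allow")},
}
classified = classify_access_rows(rows_by_cn)
summary = summarize(classified, rows_by_cn)
print(summary)
```

<p>Each key in the summary is one class of inconsistency (for example, "present on cn1 and cn3 only"), which gives a checklist of cases the CN audit code has to handle.</p>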
Infrastructure - Task #4183 (Closed): Evaluate CN to CN system metadata attribute inconsistencies https://redmine.dataone.org/issues/4183 2013-11-22T17:08:49Z Chris Jones cjones@nceas.ucsb.edu
<p>Due to network partitioning in the production Hazelcast cluster, there is now inconsistent system metadata content across the CNs. Evaluate the extent of the main system metadata attribute differences. From the etherpad:<br>
<br>
Diff criteria - determine differences<br>
1.) systemmetadata table <br>
Any differences in following column values:<br>
* rights holder<br>
* replication allowed<br>
* number replicas<br>
* obsoletes<br>
* obsoletedBy<br>
* archived<br>
* dateSysMetadataModified<br>
RESOLUTION for systemmetadata table<br>
* Highest serial version<br>
* Rights Holder on the highest serial version, followed by mod date (if same serial)<br>
* replication allowed if set, merged to yes.<br>
* number replicas - merge to highest value.<br>
* archived if true anywhere, merge to true<br>
* obsoletes if set anywhere, merge value<br>
* obsoletedBy if set anywhere, merge value<br>
* dateSysMetadataModified - keep latest</p>
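<p>The resolution rules above can be expressed as a small merge function. This is a minimal sketch, assuming each CN's copy is a dict with the hypothetical keys shown, that modification dates are ISO 8601 strings in the same timezone (so they sort lexicographically), and that when obsoletes or obsoletedBy values conflict, the copy with the highest serial version wins. None of this is taken from the actual audit code.</p>

```python
def merge_system_metadata(copies):
    """Merge one pid's copies (None = missing on that CN) per the rules above."""
    present = [c for c in copies if c is not None]
    if not present:
        return None
    # Highest serial version wins; the modification date breaks ties.
    winner = max(present, key=lambda c: (c["serialVersion"], c["dateSysMetadataModified"]))

    def first_set(field):
        # Prefer the winner's value, then fall back to any CN where it is set.
        for copy in [winner] + present:
            if copy.get(field):
                return copy[field]
        return None

    return {
        "serialVersion": winner["serialVersion"],
        "rightsHolder": winner["rightsHolder"],
        "replicationAllowed": any(c.get("replicationAllowed") for c in present),
        "numberReplicas": max(c.get("numberReplicas", 0) for c in present),
        "archived": any(c.get("archived") for c in present),
        "obsoletes": first_set("obsoletes"),
        "obsoletedBy": first_set("obsoletedBy"),
        "dateSysMetadataModified": max(c["dateSysMetadataModified"] for c in present),
    }


# Made-up copies of one pid from three CNs; the third CN has no record.
cn1 = {"serialVersion": 3, "rightsHolder": "uid=alice", "replicationAllowed": False,
       "numberReplicas": 2, "archived": False, "obsoletes": None, "obsoletedBy": None,
       "dateSysMetadataModified": "2013-11-01T00:00:00Z"}
cn2 = {"serialVersion": 3, "rightsHolder": "uid=bob", "replicationAllowed": True,
       "numberReplicas": 3, "archived": False, "obsoletes": "pid.old", "obsoletedBy": None,
       "dateSysMetadataModified": "2013-11-02T00:00:00Z"}
merged = merge_system_metadata([cn1, cn2, None])
print(merged["rightsHolder"], merged["numberReplicas"])
```

<p>Here both copies share serial version 3, so the later modification date decides the rights holder, while the boolean and numeric fields are merged across all copies.</p>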
Member Nodes - Task #4152 (Closed): Fix SEAD indexing of science metadata https://redmine.dataone.org/issues/4152 2013-10-31T21:33:57Z Chris Jones cjones@nceas.ucsb.edu
<p>The SEAD MN sync'd content correctly to the 3 CNs, but due to a CN to CN misconfiguration issue, the science metadata documents were not replicated to each CN, and therefore the indexing process has only indexed 6 of 12 documents. The ids and their autogen ids are:<br>
<br>
sead-Bode-Collin-2099c7b0-fb88-4236-a1e3-bbe269d7a2cb | autogen.2013102416000658346<br>
sead-Bode-Collin-32a51798-22d1-48e9-8b13-05e5243b54e4 | autogen.2013102416000450351<br>
sead-Bode-Collin-5c66ff91-7925-4e45-ae06-9cce86ef59f3 | autogen.2013102216000426337<br>
sead-Bode-Collin-c9138cf9-0eee-4511-830c-3d3f8b8ca785 | autogen.2013102416000348942<br>
sead-Kim-Wonsuck-dcd4602a-09c4-4e82-9f1f-93deea34fe99 | autogen.2013102216000440543<br>
sead-Marr-Jeff-D.-88169ca9-d8e9-475d-8b2f-c6a98f7c5513 | autogen.2013102416000507243<br>
sead-Marr-Jeff-D.-e5139c88-b097-4311-9411-cb3884e600e8 | autogen.2013102416000344241<br>
sead-Martin-John-f1dbc3df-c27c-4647-b05a-4b1f05c99a24 | autogen.2013102416000565544<br>
sead-Morin-Paul-5eb4779c-53bc-43ae-8952-2486c2a2bc00 | autogen.2013102416000772945<br>
sead-Nguyen-Charles-fba9a7cf-c8dd-4632-b71f-408c183844fb | autogen.2013102416000440941<br>
sead-Singh-Arvind-2ff15a1b-f9be-43b8-a408-c1dc647d12ea | autogen.2013102416000636044<br>
sead-Strong-Nikki-572e7ebc-b392-4103-af2e-36ba8209d9f2 | autogen.2013102416000484552<br>
sead-Tal-Michal-0a60d1a1-5758-4b96-8eaf-b83e69c976e6 | autogen.2013102416000352243<br>
sead-Willcock-Peter-0928b6f0-66f8-4b30-8a54-a5a0576181d6 | autogen.2013102216000425042<br>
sead-bode-0bf5a0a7-831d-4e11-bf5b-27f9c6022c85 | autogen.2013102416000444550</p>
<p>There are 15 in this list, and the following 3 can be administratively deleted after the indexing issue is resolved:<br>
<br>
sead-Kim-Wonsuck-dcd4602a-09c4-4e82-9f1f-93deea34fe99<br>
sead-Willcock-Peter-0928b6f0-66f8-4b30-8a54-a5a0576181d6<br>
sead-Bode-Collin-5c66ff91-7925-4e45-ae06-9cce86ef59f3</p>
<p>Robert will be fixing the CN-CN replication issue, with Jing's help where needed, and Skye will be helping to ensure the last 6 documents get indexed.</p>
Infrastructure - Task #3890 (Closed): ns2.afraid.org is not serving dataone.org properly https://redmine.dataone.org/issues/3890 2013-08-07T14:15:39Z Chris Jones cjones@nceas.ucsb.edu
<p>We've been having trouble with the ns2.afraid.org server pulling updates to the dataone.org domain after changing the zone files. </p>
<p>Nick, will you verify the changes I made to /etc/bind/named.conf.local (described below), and if everything looks okay, re-assign this to Dave so he can look at the afraid.org configuration (unless you can check that too)?</p>
<p>We noticed that the ns2.afraid.org IP address had changed (from 174.37.196.55 to 208.43.71.243), so after ns2.afraid.org kept failing to pick up updated DNS entries, I changed /etc/bind/named.conf.local to replace 174.37.196.55 with 208.43.71.243 in the xferhosts acl. The problem persists, though. For instance, for the recent addition of ansible.dataone.org:<br>
<br>
$ dig @ns2.afraid.org ansible.dataone.org</p>
<p>; <<>> DiG 9.7.6-P1 <<>> @ns2.afraid.org ansible.dataone.org<br>
; (1 server found)<br>
;; global options: +cmd<br>
;; Got answer:<br>
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 6301<br>
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0<br>
;; WARNING: recursion requested but not available</p>
<p>;; QUESTION SECTION:<br>
;ansible.dataone.org. IN A</p>
<p>;; Query time: 59 msec<br>
;; SERVER: 208.43.71.243#53(208.43.71.243)<br>
;; WHEN: Wed Aug 7 07:50:11 2013<br>
;; MSG SIZE rcvd: 37</p>
<p>ns2.afraid.org continues to give a SERVFAIL status, whereas a call to 8.8.8.8 gives a NOERROR status and returns the CNAME record for ansible.dataone.org.</p>
<p>Here are the changes made to files in /etc/bind:</p>
<p>$ sudo git diff HEAD^ HEAD<br>
diff --git a/bind/db.dataone.org b/bind/db.dataone.org<br>
index 1b1b4a4..8c3aa34 100644<br>
--- a/bind/db.dataone.org<br>
+++ b/bind/db.dataone.org<br>
@@ -5,7 +5,7 @@<br>
;<br>
$TTL 86400 ; changed from default 86400<br>
dataone.org. IN SOA ns1.nceas.ucsb.edu. root.ns1.nceas.ucsb.edu. (<br>
- 2013080200 ; serial number<br>
+ 2013080600 ; serial number<br>
360 ; 1 min; default refresh 1 hour (3600) (frequency secondary DNS is updated)<br>
900 ; 1 min; default retry 15 min<br>
3600000 ; expire 1000 hours<br>
@@ -65,6 +65,7 @@ releases 1D IN A 129.24.0.11<br>
ns 1D IN CNAME releases<br>
ldap 1H IN CNAME ldap.ecoinformatics.org.<br>
test123 IN A 128.111.220.124<br>
+ansible 1H IN CNAME ansible.dataone.utk.edu.<br>
;<br>
;test subdomain<br>
;<br>
diff --git a/bind/named.conf.local b/bind/named.conf.local<br>
index 946c730..1f54498 100644<br>
--- a/bind/named.conf.local<br>
+++ b/bind/named.conf.local<br>
@@ -20,7 +20,7 @@ acl xferhosts {<br>
128.111.1.1; <br>
128.111.220.16;<br>
128.111.220.18;<br>
- 174.37.196.55;<br>
+ 208.43.71.243;<br>
localhost;<br>
}; </p>
Infrastructure - Story #3889 (Closed): Document the log aggregation component in the operations d... https://redmine.dataone.org/issues/3889 2013-08-06T16:47:41Z Chris Jones cjones@nceas.ucsb.edu
<p>Story <a class="issue tracker-4 status-5 priority-4 priority-default closed parent" title="Story: Implement Log Aggregation (Closed)" href="https://redmine.dataone.org/issues/2093">#2093</a> provides the tickets for the d1_log_aggregation component of the CN, but doesn't provide the overview documentation of the code. Write up a section of the Operations Docs that describes the installation and overall workflow of the component, including the harvest process, the indexing process, and the access control implementation. Describe which classes provide which function, and how the log aggregation is kicked off (scheduler, etc). This will help other developers understand the code. This implementation description can be tied to the <a href="http://mule1.dataone.org/ArchitectureDocs-current/design/LogAggregator.html">Log Aggregation architecture design</a> description as needed.</p>
Infrastructure - Task #3886 (Closed): Filter the d1-cn-log index based on a public role https://redmine.dataone.org/issues/3886 2013-08-06T15:36:30Z Chris Jones cjones@nceas.ucsb.edu
<p>The d1-cn-log Solr index currently requires authenticated access, since portions of the log entries are sensitive information. For the d1_dashboard, we need access to the index for various levels of display, both authenticated and public.</p>
<p>pid, ipAddress, userAgent, subject, event, dateLogged and nodeId are the fields exposed through the d1 log api call. Of these, ipAddress, userAgent, and subject are sensitive fields. These fields should only be accessible by 1) a CN subject, 2) the owning MN subject, or 3) the rights owner subject or an equivalent identity. </p>
<p>For the first version of the d1_dashboard application, filter Solr queries to provide public access to only the summary information returned by Solr. This requires that queries by the public user:<br>
1) are accepted<br>
2) have the rows parameter set to 0 prior to executing the query, regardless of the submitted value<br>
3) have the ipAddress, userAgent, and subject fields redacted from the facet.field parameter prior to executing the query, if they include facets</p>
<p>This will provide general data on total MN CRUD events per pid.</p>
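<p>The public-query rules in this ticket can be sketched as a parameter filter applied before the query reaches Solr. This is an illustration only: the function name and the dict-of-lists parameter representation are assumptions, and other parameters that could expose field values (for example facet.query) are not covered here.</p>

```python
SENSITIVE_FIELDS = {"ipAddress", "userAgent", "subject"}


def filter_public_query(params):
    """Return a copy of Solr params (name -> list of values) safe for public users."""
    filtered = {name: list(values) for name, values in params.items()}
    # Never return documents, whatever row count the caller asked for.
    filtered["rows"] = ["0"]
    # Drop sensitive fields from faceting; remove the parameter if none remain.
    if "facet.field" in filtered:
        allowed = [f for f in filtered["facet.field"] if f not in SENSITIVE_FIELDS]
        if allowed:
            filtered["facet.field"] = allowed
        else:
            del filtered["facet.field"]
    return filtered


# Made-up request from an unauthenticated dashboard user.
requested = {
    "q": ["event:read"],
    "rows": ["500"],
    "facet": ["true"],
    "facet.field": ["nodeId", "ipAddress", "subject"],
}
safe = filter_public_query(requested)
print(safe["rows"], safe["facet.field"])
```

<p>Working on a copy keeps the original request intact for logging, and forcing <code>rows</code> to 0 means only counts and facet summaries come back.</p>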
Infrastructure - Story #3729 (Closed): Member Nodes should be the authoritative source of System ... https://redmine.dataone.org/issues/3729 2013-04-26T15:38:13Z Chris Jones cjones@nceas.ucsb.edu
<p>After gaining some experience with CN-based authority for system metadata, we've realized that there are many use cases that require system metadata to be managed by the MNs authoritatively, and by the CNs secondarily as a cached version. The main use case involves access control. When an ITK client creates an object through MN.create(), control of the system metadata is transferred to the CN once synchronization happens. After that point, the ITK client (and scientist) has to make CN.setAccessPolicy() calls to make any changes. If the MN is set to sync once a week, this is problematic.</p>
<p>Ultimately, the CN stores and manages system metadata in order to track replicas of an object, and to perform auditing on those replicas. This change would really just require that the CN remains the authority of the ReplicaList, whereas the MN would become the authority of all other fields. By doing so, ITK clients will be able to interact with their MN without delay, and the CNs will work off of cached versions of the system metadata (in Hazelcast and persisted in Metacat).</p>
<p>To effect these changes, specific sub tasks include:</p>
<p>1) Design of the system - sequence diagram and use case changes<br>
2) Update architecture and guide docs to express that the MN is the authoritative source for system metadata<br>
3) Ensure MN API calls reflect that the MN is responsible for incrementing serial version<br>
3.1) Serial version will be used to track all fields except the ReplicaList<br>
3.2) Consider adding an optional serialVersion attribute to a ReplicaList, and use it to track versions of the list on the CN<br>
4) Include a push notification of system metadata change (CN.systemMetadataChanged()) <br>
5) Change the CN stack to accommodate the MN-based authority <br>
5.1) Replication <br>
5.2) Synchronization<br>
5.3) DAO layer for replication<br>
6) Change all MN stacks to implement the new features (Metacat, GMN, Mercury, EDAC, Dryad, Merritt)<br>
7) Add MN.updateSystemMetadata() interfaces (d1_common_{java|python}, d1_libclient_{java|python})<br>
8) Refactor CN sysmeta methods to delegate to MN.</p>
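<p>To make the proposed split of authority concrete, here is a minimal sketch of how a CN might apply a system metadata change reported by an MN: every field comes from the MN's copy except the replica list, which the CN keeps from its own cache, and the update is ignored unless the MN's serial version is newer. The function name, dict keys, and sample values are hypothetical; this illustrates the idea in the story, not a design that was implemented.</p>

```python
def apply_mn_update(cn_cached, mn_sysmeta):
    """Return the CN's new cached copy after an MN reports a change."""
    if cn_cached is not None and mn_sysmeta["serialVersion"] <= cn_cached["serialVersion"]:
        # Stale or duplicate notification: keep the cached copy.
        return cn_cached
    updated = dict(mn_sysmeta)
    # The CN remains the authority for the replica list.
    updated["replicaList"] = list(cn_cached["replicaList"]) if cn_cached else []
    return updated


# Made-up records for one pid.
cached = {"serialVersion": 4, "rightsHolder": "uid=alice",
          "replicaList": ["urn:node:mnA", "urn:node:mnB"]}
from_mn = {"serialVersion": 5, "rightsHolder": "uid=bob", "replicaList": []}
result = apply_mn_update(cached, from_mn)
print(result["rightsHolder"], result["replicaList"])
```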