Task #8098
Token-based authentication fails with LE CN certs
100%
Description
When trying to call @MN.create()@ on my local Metacat setup (which points to the production CN environment for ORCID authentication), I'm getting an @InvalidToken@ error:
<?xml version="1.0" encoding="UTF-8"?>
Session is required to WRITE to the Node.
This is odd because I recently logged in via ORCID, so it looked like a token verification issue. In the Metacat log, I see:
metacat 20170521-10:35:07: [WARN]: Could not use public key to verify provided token: eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJodHRwOlwvXC9vcmNpZC5vcmdcLzAwMDAtMDAwMi04MTIxLTIzNDEiLCJmdWxsTmFtZSI6IkNocmlzdG9waGVyIEpvbmVzIiwiaXNzdWVkQXQiOiIyMDE3LTA1LTIxVDE1OjU3OjA2LjEyOCswMDowMCIsImNvbnN1bWVyS2V5IjoidGhlY29uc3VtZXJrZXkiLCJleHAiOjE0OTU0NDcwMjYsInVzZXJJZCI6Imh0dHA6XC9cL29yY2lkLm9yZ1wvMDAwMC0wMDAyLTgxMjEtMjM0MSIsInR0bCI6NjQ4MDAsImlhdCI6MTQ5NTM4MjIyNn0.dynDbRKqIuI1bXzPYlHfW7aFcrl2J7O8ZWqxS_2DHBotx4AqX_hbxuRrlQ_9s-V1mRJupyxkYxW3EWkLcoMUQNTuyMLGpV53GPoGdBjkTEd407GU-yxv_G3cmmSovXSLj6AAjeKJ8KHBt4y6JtgqR2isf5YGoM18CwM-IZV3nJVPBMZpNMPhYSWJeaeD2u02duKCpcy7L-XD_OCLJdzHjtjyFqqbHvqGyZIPqc9Kp_JTuTmlYaAZe9JiLcjHnyaOeHMGCEkmOekiRA_wh6DtnBLKyCczBjNg0kirxMk27abjAxt-ckhKfrCT6dnXbd1lCLNnxVYiJj5wztNOGH492T3nyaSQGROnSQd6cxB3pPAiwW7AOR34MPNJlNv_r-3WbwThDeOOtrMSvfZtYGv6Mn_i0-d1yjccRDzZeXdRS0P91GYfdK2lfog1lhiPuec3gD4V4plNJR3wKSSMhgjikH6igCB5I7C5n9Ye5vSeyWW9ApwLogfbEUc3xKgiCgj1jtED4L7E3WgUvtWxsyqMMtaEAJGvRHlGPPShD3xHPsm6ltCVrU1arLXneuGa0R7M-GgzMk0z5HdRE2bD2agu5WuN-w5-w9W6jwrzgI4wM7v8KiJYxeM332nx4f2BF6ArFJ2K-DxlpgmdK6bkPTtL7H-uj5digXvBoHFYZAJF49c
After grabbing the public certificate from the production CN:
-----BEGIN CERTIFICATE-----
MIIFQzCCBCugAwIBAgISAxPSoq7BM7aFc1VzgyTJkz3wMA0GCSqGSIb3DQEBCwUA
MEoxCzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MSMwIQYDVQQD
ExpMZXQncyBFbmNyeXB0IEF1dGhvcml0eSBYMzAeFw0xNzA1MTcxMjI5MDBaFw0x
NzA4MTUxMjI5MDBaMBkxFzAVBgNVBAMTDmNuLmRhdGFvbmUub3JnMIIBIjANBgkq
hkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAtp++UWPu0Zm4gIs01F+LE94i4eExI+UX
82DIB3Xn93FW4IgDTsjEfXCB3AHggdx6GnExbDzu/iXn+K3LiW6QaeasG47XOeup
JjpmJqDROAJvLy1GpgrFeNxEe5F6xljPcAxUH/W/NkoHAem7wMatRNA53f6JkMVd
sKXAYPOdKUOqhQ9QRMqEFIPImt+SHfvxUkQyL4g+1taQ5XYDu5zwF5+k77ZRre+o
RVR9gHdbdlvLLQYP9eGJdi+nmFFTrEuXIklB8SQi6yvck0p6nR2sjmxFlnaLTe7Z
iaVWaA1vvwvwgG27Q2iMcnAG+JXQDe7Jd1YIuXUW7vVYyGl4ONbp3QIDAQABo4IC
UjCCAk4wDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEF
BQcDAjAMBgNVHRMBAf8EAjAAMB0GA1UdDgQWBBSyQkmQUHHO3EkItWseuA3L6vg8
1DAfBgNVHSMEGDAWgBSoSmpjBH3duubRObemRWXv86jsoTBwBggrBgEFBQcBAQRk
MGIwLwYIKwYBBQUHMAGGI2h0dHA6Ly9vY3NwLmludC14My5sZXRzZW5jcnlwdC5v
cmcvMC8GCCsGAQUFBzAChiNodHRwOi8vY2VydC5pbnQteDMubGV0c2VuY3J5cHQu
b3JnLzBcBgNVHREEVTBTghRjbi1vcmMtMS5kYXRhb25lLm9yZ4IVY24tdWNzYi0x
LmRhdGFvbmUub3JnghRjbi11bm0tMS5kYXRhb25lLm9yZ4IOY24uZGF0YW9uZS5v
cmcwgf4GA1UdIASB9jCB8zAIBgZngQwBAgEwgeYGCysGAQQBgt8TAQEBMIHWMCYG
CCsGAQUFBwIBFhpodHRwOi8vY3BzLmxldHNlbmNyeXB0Lm9yZzCBqwYIKwYBBQUH
AgIwgZ4MgZtUaGlzIENlcnRpZmljYXRlIG1heSBvbmx5IGJlIHJlbGllZCB1cG9u
IGJ5IFJlbHlpbmcgUGFydGllcyBhbmQgb25seSBpbiBhY2NvcmRhbmNlIHdpdGgg
dGhlIENlcnRpZmljYXRlIFBvbGljeSBmb3VuZCBhdCBodHRwczovL2xldHNlbmNy
eXB0Lm9yZy9yZXBvc2l0b3J5LzANBgkqhkiG9w0BAQsFAAOCAQEAJo/aaCo0NweP
prHz+9Ko39xZ/Y6kum0ZOSw6BFM8zgkOOd1R0rbc53j09yKDi3V+MKd5rXfISNsp
LKBVe/R8HH/rglYUhMTBBizGsEdyPE4n5I3ml4RyOVmC1SpDPUzH0CAeSLkzBpBV
WVIfEwl641GtT0hBcwVjMlDYywrvSHv4mifVLd/2ZTSYillrhQzQySKb9g7jbEld
LHY1WoIU0E5XgQJq3b6Vhb5dXVkHsDfwPHNpJA5fVCVYoKazo+xSNBP757ta/ix4
e9CbRsQQ0TgEsuUAOa9lh9+O8uAL5zkZ4kwZCLypxbkZ8/YYOCMGMtGz4632J7VF
Ozukfk41bw==
-----END CERTIFICATE-----
and trying to verify the token with this certificate, it fails.
However, it verifies correctly with the old CN certificate:
-----BEGIN CERTIFICATE-----
MIIFrTCCBJWgAwIBAgICbkowDQYJKoZIhvcNAQELBQAwRzELMAkGA1UEBhMCVVMx
FjAUBgNVBAoTDUdlb1RydXN0IEluYy4xIDAeBgNVBAMTF1JhcGlkU1NMIFNIQTI1
NiBDQSAtIEczMB4XDTE0MTEwMzEyNTMyNFoXDTE3MDUyMDIxNDU0OVowgZExEzAR
BgNVBAsTCkdUMzkwMjU2MTcxMTAvBgNVBAsTKFNlZSB3d3cucmFwaWRzc2wuY29t
L3Jlc291cmNlcy9jcHMgKGMpMTIxLzAtBgNVBAsTJkRvbWFpbiBDb250cm9sIFZh
bGlkYXRlZCAtIFJhcGlkU1NMKFIpMRYwFAYDVQQDDA0qLmRhdGFvbmUub3JnMIIC
IjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAzZaa/tslwA/CJ6Wqfzl72TrF
/8IurHHrfzmme/B2dSUt0+zDfdfXWe7p6pZ4yJp95Kk34cf0EFWgFJ5Nc1gyXJUh
Ht6IVweDDFrExeNPsNbI5DLFdUJ5ZfNhWrqu2C4kdeRfHqxOvI0w6XEfdZ4yI3QC
zfx5EtsoFEXpqK5Xe3r5KEnXVsPq6azerVqvq2UqhPa0EYJA8/CVJiQ0CRQl+w9x
Mh6GBvHUXqCHBPlRPIY7QomI+3Cx8gYgcLCCEcHVgzU05zQQRwdtIqjENq6CubH9
UTMiKS81CFJbAVrKetDRI3bNGIcEEpjV1XC28OOWXNc9fXXAK3fvVFVl2tuzYFn0
ROmRrtiz4+jXC7mp7/fTb5ekTeenKyoVA5UicbIHM1PPQeTwcHUH7CxybJVheGAo
7wwzqrxin3LMMyn56QBXqB81qL+iMJ+ZBHXxiS5V6g4W1ag3VOtDvyRtN1QGB6J2
enOTBOHNwr9bHuJcVPx1dYd6YjZD3LQbyJZyVtYHalnlCXGjLCxs9B2uL4MBllb5
N++ouBiujO5ww6Ht+MgOq/gbahx9WlJCs5xXLy8Hf+FfjUBZXDdkvLwa36FWktZa
ibbqqeBBq9IaW0gUNNmhYs3SB8J7JICVflUIp7e7wy7cXBJHpkATZKAuHVnqJ8ZT
83YekoQFyxpcqB2fmRkCAwEAAaOCAVYwggFSMB8GA1UdIwQYMBaAFMOc8/zTRgg0
u85Gf6B8W/PiCMtZMFcGCCsGAQUFBwEBBEswSTAfBggrBgEFBQcwAYYTaHR0cDov
L2d2LnN5bWNkLmNvbTAmBggrBgEFBQcwAoYaaHR0cDovL2d2LnN5bWNiLmNvbS9n
di5jcnQwDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEF
BQcDAjAlBgNVHREEHjAcgg0qLmRhdGFvbmUub3JnggtkYXRhb25lLm9yZzArBgNV
HR8EJDAiMCCgHqAchhpodHRwOi8vZ3Yuc3ltY2IuY29tL2d2LmNybDAMBgNVHRMB
Af8EAjAAMEUGA1UdIAQ+MDwwOgYKYIZIAYb4RQEHNjAsMCoGCCsGAQUFBwIBFh5o
dHRwczovL3d3dy5yYXBpZHNzbC5jb20vbGVnYWwwDQYJKoZIhvcNAQELBQADggEB
ABcvSyNwX1jHZ7HRX5Lzcua0Q4//wc5KCBvPgPrbr3bGSi3+t+Rc4ZagIUxFWSd1
uZ+guQ4lywhQXGOXh7dH1SPljPOwZ9VPdhJMPW/woaQ0ndakLvW0OBIgyyqIcJ57
8e6DKzZ0jd97xmXYAa7iMhCxL2lpXzDQMH5k8XhENHcjMXfVitkqmIS2Wfi1rEMK
phszml9yRABtx+X0z/4/xmNZ2PrNApqmqVD2DnY1MgJNHga/KmPX/6VZ+NEszudP
rvrD5hQvAjkJA+5kgqX31w98ggfXg4oxQo8AhKrHWnhI52SoWT1BOwSGDRpgRW/n
1AdVxT9TIoHXbhf6+c8fWOU=
-----END CERTIFICATE-----
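The comparison above can be reproduced outside Metacat with a small standalone check. This is only a sketch using the JDK's own classes, not the code path that Metacat or @CertificateManager@ actually uses; the class name @TokenCheck@ and its methods are made up for illustration. It verifies only the RS256 signature of the token and ignores expiry and claims.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;
import java.security.cert.CertificateFactory;
import java.util.Base64;

// Hypothetical helper, not part of Metacat or d1_cn_portal.
final class TokenCheck {

    /** Reads the first certificate in a PEM or DER file and returns its public key. */
    static PublicKey publicKeyFrom(String certPath) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(certPath))) {
            return CertificateFactory.getInstance("X.509")
                    .generateCertificate(in)
                    .getPublicKey();
        }
    }

    /** Returns true if the RS256 signature of a compact JWT verifies against the key. */
    static boolean verifies(String jwt, PublicKey key) throws Exception {
        String[] parts = jwt.split("\\.");
        if (parts.length != 3) {
            return false; // not header.payload.signature
        }
        // The signature covers the ASCII bytes of "header.payload".
        byte[] signedInput = (parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
        byte[] signature = Base64.getUrlDecoder().decode(parts[2]);

        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(signedInput);
        try {
            return verifier.verify(signature);
        } catch (SignatureException e) {
            // Some JDKs throw rather than return false when the signature
            // length does not match the key size (e.g. 4096-bit vs 2048-bit).
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a certificate file, args[1]: the token string
        System.out.println(verifies(args[1], publicKeyFrom(args[0])));
    }
}
```

Running it once with each certificate and the same token shows which key the token was signed with.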
So, effectively, the @d1_cn_portal@ component is still using the old RapidSSL certificate to sign tokens, but (I think) MNs that have recently been restarted grab the most recent CN certificate for verification purposes. They get the new LE certificate, and so can't verify incoming tokens signed by the CN. My guess is that this is going to be problematic for other MNs that go through a reboot or restart and rely on the CN signing tokens. Looking at the @portal.properties@ file on the CN, I see that it is indeed still pointing to the old certificate and key:
cn.server.publiccert.filename=/etc/ssl/certs/_.dataone.org.crt
cn.server.privatekey.filename=/etc/ssl/private/dataone_org.key
So, in the short term, we need to plan to re-configure @portal.properties@ on the production CNs to use the new Let's Encrypt certificates for token signing:
cn.server.publiccert.filename=/etc/letsencrypt/live/cn.dataone.org/fullchain.pem
cn.server.privatekey.filename=/etc/letsencrypt/live/cn.dataone.org/privkey.pem
However, the @fullchain.pem@ includes the intermediate CA certs as well, and I don't know if @CertificateManager.loadCertificateFromFile()@ handles multiple certificates in a file (i.e. does it use the first found, last found, etc?). We need to determine this before making the properties change, but also before other production MNs get rebooted and begin to fail authentication for clients.
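One way to see what a multi-certificate file contains, independent of @CertificateManager@, is to let the JDK parse every certificate in it and print the subjects in the order returned. This is a sketch under assumptions: @PemInspector@ is a made-up class name, and the output says nothing about which certificate @CertificateManager.loadCertificateFromFile()@ would pick; that still has to be checked in its source or by testing.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting a PEM bundle such as fullchain.pem.
final class PemInspector {

    /** Lists the subject DN of every certificate in the stream, in the order the factory returns them. */
    static List<String> subjects(InputStream in) throws Exception {
        List<String> result = new ArrayList<>();
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        for (Certificate c : factory.generateCertificates(in)) {
            result.add(((X509Certificate) c).getSubjectX500Principal().getName());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the PEM file to inspect
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            List<String> found = subjects(in);
            for (int i = 0; i < found.size(); i++) {
                System.out.println(i + ": " + found.get(i));
            }
        }
    }
}
```

If the server certificate is listed first, a loader that reads only the first certificate would pick up the right one, while a loader that keeps the last one would end up with an intermediate CA certificate.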
Once tested, for the long term, we need to update the portal properties in the buildout to make the changes permanent. We may also need to add some logic for ensuring the @/etc/letsencrypt@ files have the correct permissions as Dave pointed out.
History
#1 Updated by Dave Vieglais over 7 years ago
The server certificate, not including the intermediate, is available at:
/etc/letsencrypt/live/cn.dataone.org/cert.pem
That can certainly be used for signing the token. It is not clear, however, whether @CertificateManager@ relies on access to the intermediate certificates to do the signing.
It will also be necessary to alter the permissions on
/etc/letsencrypt/live
to allow the tomcat process to read the cert. Current perms for letsencrypt are read by root only. Perms on the old *.dataone.org cert are:
$ sudo ls -la /etc/ssl/private
total 28
drwx--x--- 3 root ssl-cert 4096 May 19 2015 .
drwxr-xr-x 4 root root 4096 Jan 31 22:56 ..
-r--r----- 1 tomcat7 ssl-cert 3244 Nov 5 2014 dataone_org.key
-r-------- 1 root root 1675 Apr 16 2014 dataone_org.key.20150519.old
-rw-r-x--- 1 tomcat7 ssl-cert 3272 Apr 26 17:03 dataone_org.key.pk8
drwxr-xr-x 2 root root 4096 Apr 16 2014 old
-rw-r----- 1 root ssl-cert 1704 Jul 16 2014 ssl-cert-snakeoil.key
#2 Updated by Chris Jones over 7 years ago
Excellent. Thanks Dave.
#3 Updated by Chris Jones over 7 years ago
- Description updated (diff)
#4 Updated by Chris Jones over 7 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 30
Dave and I did some troubleshooting on this, and it turns out that we can use the certs located in @/etc/letsencrypt/live@ as long as the permissions in @/etc/letsencrypt/archive@ are open to reading by the @ssl-cert@ group:
sudo chgrp -R ssl-cert /etc/letsencrypt/archive
sudo chmod g+rx /etc/letsencrypt/archive
So, we've changed @portal.properties@ to now have:
cn.server.publiccert.filename=/etc/letsencrypt/live/cn.dataone.org/cert.pem
cn.server.privatekey.filename=/etc/letsencrypt/live/cn.dataone.org/privkey.pem
After restarting the CN daemons and Tomcat, JWT signing using the Let's Encrypt certificate works in stage now.
This now needs to be done in DEV, DEV2, SANDBOX, SANDBOX2, and PRODUCTION.
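Before restarting Tomcat in each environment, a quick readability check of the configured files can confirm that the permission change took effect. The following is a minimal sketch, assuming it is run as the same user as the Tomcat process (for example via @sudo -u tomcat7@); @ReadableCheck@ is a made-up name. @Files.isReadable@ follows symbolic links, so if the files under @/etc/letsencrypt/live@ are links into @/etc/letsencrypt/archive@, the check fails when either directory blocks access.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical preflight check, not part of the CN buildout.
final class ReadableCheck {

    /** Returns the paths that the current process cannot read (missing files included). */
    static List<String> unreadable(List<String> paths) {
        List<String> bad = new ArrayList<>();
        for (String p : paths) {
            if (!Files.isReadable(Paths.get(p))) {
                bad.add(p);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Pass the certificate and key paths from portal.properties as arguments.
        List<String> bad = unreadable(Arrays.asList(args));
        for (String p : bad) {
            System.out.println("cannot read: " + p);
        }
        System.exit(bad.isEmpty() ? 0 : 1);
    }
}
```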
#5 Updated by Chris Jones over 7 years ago
- Target version set to CCI-2.3.5
- Assignee set to Jing Tao
Because of Member Nodes failing to authenticate in the PRODUCTION environment (GOA in particular), Jing and I changed the production @portal.properties@ configuration to point to the LetsEncrypt cert and key in @/etc/letsencrypt/live/cn.dataone.org@. Note that we had to change the directory permissions for both the @/etc/letsencrypt/live@ and @/etc/letsencrypt/archive@ directories to be readable/executable by the @ssl-cert@ group.
Multiple MNs are successfully authenticating and verifying tokens again (GOA, KNB, ADC, ...) with the LetsEncrypt signing cert.
This now needs to be changed in the @cn-portal@ buildout, so I'm assigning this to Jing to finish that part of it. Thanks Jing!
#6 Updated by Jing Tao over 7 years ago
- Category changed from d1_portal_servlet to dataone-cn-portal
- Milestone set to None
- Target version changed from CCI-2.3.5 to CCI-2.4.0
- Project changed from CN REST to Infrastructure
#7 Updated by Chris Jones over 4 years ago
- % Done changed from 30 to 100
- Status changed from In Progress to Closed
This has been addressed.