Project

General

Profile

Bug #8471

MNDeployment #7895: Pangaea

Apache crashed

Added by Dave Vieglais almost 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Target version:
-
Start date:
2018-03-04
Due date:
% Done:

100%

Story Points:
Sprint:

Description

Apache crashed on the PANGAEA server:

[Sun Mar 04 00:13:56.478891 2018] [ssl:info] [pid 3660:tid 140612707731200] [client 128.111.54.80:40314] AH01964: Connection to child 149 established (server pangaea-orc-1.dataone.org:443)
[Sun Mar 04 00:13:57.618294 2018] [mpm_event:notice] [pid 28169:tid 140613059504000] AH00493: SIGUSR1 received.  Doing graceful restart
[Sun Mar 04 00:13:57.621157 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Shutdown requested 'gmn2'.
[Sun Mar 04 00:13:57.621495 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Stopping process 'gmn2'.
[Sun Mar 04 00:13:57.621521 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Destroying interpreters.
[Sun Mar 04 00:13:57.621532 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Cleanup interpreter ''.
[Sun Mar 04 00:13:57.621854 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Shutdown requested 'gmn2'.
[Sun Mar 04 00:13:57.622220 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Stopping process 'gmn2'.
[Sun Mar 04 00:13:57.622248 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Destroying interpreters.
[Sun Mar 04 00:13:57.622258 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Cleanup interpreter ''.
[Sun Mar 04 00:13:57.623423 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Terminating Python.
[Sun Mar 04 00:13:57.623937 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Terminating Python.
[Sun Mar 04 00:13:57.705453 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Python has shutdown.
[Sun Mar 04 00:13:57.705515 2018] [wsgi:info] [pid 3575:tid 140613059504000] mod_wsgi (pid=3575): Exiting process 'gmn2'.
[Sun Mar 04 00:13:57.710406 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Python has shutdown.
[Sun Mar 04 00:13:57.710463 2018] [wsgi:info] [pid 3573:tid 140613059504000] mod_wsgi (pid=3573): Exiting process 'gmn2'.
AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
[Sun Mar 04 00:13:57.831053 2018] [ssl:warn] [pid 28169:tid 140613059504000] AH01906: 090c0b8652c5ba9e5c8a5098614047db.7cb193658e0aba88c976aa0e89bcac6d.acme.invalid:443:0 server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Sun Mar 04 00:13:57.831230 2018] [ssl:info] [pid 28169:tid 140613059504000] AH01914: Configuring server pangaea-orc-1.dataone.org:443 for SSL protocol
[Sun Mar 04 00:13:57.833221 2018] [ssl:info] [pid 28169:tid 140613059504000] AH02568: Certificate and private key pangaea-orc-1.dataone.org:443:0 configured from /etc/letsencrypt/live/pangaea-orc-1.dataone.org/fullchain.pem and /etc/letsencrypt/live/pangaea-orc-1.dataone.org/privkey.pem
[Sun Mar 04 00:13:57.833547 2018] [wsgi:warn] [pid 28169:tid 140613059504000] mod_wsgi: Compiled for Python/2.7.11.
[Sun Mar 04 00:13:57.833564 2018] [wsgi:warn] [pid 28169:tid 140613059504000] mod_wsgi: Runtime using Python/2.7.12.
[Sun Mar 04 00:13:57.834984 2018] [mpm_event:notice] [pid 28169:tid 140613059504000] AH00489: Apache/2.4.18 (Ubuntu) OpenSSL/1.0.2g mod_wsgi/4.3.0 Python/2.7.12 configured -- resuming normal operations
[Sun Mar 04 00:13:57.835168 2018] [core:notice] [pid 28169:tid 140613059504000] AH00094: Command line: '/usr/sbin/apache2'
[Sun Mar 04 00:13:58.414522 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Attach interpreter ''.
[Sun Mar 04 00:13:58.414525 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Attach interpreter ''.
[Sun Mar 04 00:13:58.415090 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Adding '/var/local/dataone/gmn_venv/lib/python2.7/site-packages' to path.
[Sun Mar 04 00:13:58.415126 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Adding '/var/local/dataone/gmn_venv/lib/python2.7/site-packages' to path.
[Sun Mar 04 00:13:58.882728 2018] [ssl:info] [pid 24759:tid 140612958349056] [client 128.111.54.80:40366] AH01964: Connection to child 0 established (server pangaea-orc-1.dataone.org:443)
[Sun Mar 04 00:13:59.065818 2018] [wsgi:info] [pid 24703:tid 140612949956352] [remote 128.111.54.80:22791] mod_wsgi (pid=24703, process='gmn2', application=''): Loading WSGI script '/var/local/dataone/gmn_venv/lib/python2.7/site-packages/d1_gmn/wsgi.py'.
[Sun Mar 04 00:13:59.854531 2018] [ssl:info] [pid 24761:tid 140612958349056] [client 128.111.54.80:40452] AH01964: Connection to child 192 established (server pangaea-orc-1.dataone.org:443)
[Sun Mar 04 00:14:00.657223 2018] [ssl:info] [pid 24759:tid 140612949956352] [client 128.111.54.80:40493] AH01964: Connection to child 1 established (server pangaea-orc-1.dataone.org:443)
[Sun Mar 04 00:14:00.732838 2018] [wsgi:info] [pid 24702:tid 140612949956352] [remote 128.111.54.80:27892] mod_wsgi (pid=24702, process='gmn2', application=''): Loading WSGI script '/var/local/dataone/gmn_venv/lib/python2.7/site-packages/d1_gmn/wsgi.py'.
[Sun Mar 04 00:14:00.856297 2018] [ssl:info] [pid 24761:tid 140612949956352] [client 64.78.149.164:52418] AH01964: Connection to child 193 established (server pangaea-orc-1.dataone.org:443)
[Sun Mar 04 00:14:01.886857 2018] [ssl:info] [pid 24759:tid 140612867192576] [client 128.111.54.80:40531] AH01964: Connection to child 2 established (server pangaea-orc-1.dataone.org:443)
[Sun Mar 04 00:14:04.012126 2018] [mpm_event:notice] [pid 28169:tid 140613059504000] AH00493: SIGUSR1 received.  Doing graceful restart
[Sun Mar 04 00:14:04.013209 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Shutdown requested 'gmn2'.
[Sun Mar 04 00:14:04.013222 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Shutdown requested 'gmn2'.
[Sun Mar 04 00:14:04.013372 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Stopping process 'gmn2'.
[Sun Mar 04 00:14:04.013383 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Destroying interpreters.
[Sun Mar 04 00:14:04.013389 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Cleanup interpreter ''.
[Sun Mar 04 00:14:04.013996 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Terminating Python.
[Sun Mar 04 00:14:04.079136 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Python has shutdown.
[Sun Mar 04 00:14:04.079175 2018] [wsgi:info] [pid 24702:tid 140613059504000] mod_wsgi (pid=24702): Exiting process 'gmn2'.
[Sun Mar 04 00:14:04.958162 2018] [core:notice] [pid 28169] AH00060: seg fault or similar nasty error detected in the parent process
[Sun Mar 04 00:14:05.788711 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Stopping process 'gmn2'.
[Sun Mar 04 00:14:05.788774 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Destroying interpreters.
[Sun Mar 04 00:14:05.788786 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Cleanup interpreter ''.
[Sun Mar 04 00:14:05.789285 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Terminating Python.
[Sun Mar 04 00:14:05.861399 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Python has shutdown.
[Sun Mar 04 00:14:05.861497 2018] [wsgi:info] [pid 24703:tid 140613059504000] mod_wsgi (pid=24703): Exiting process 'gmn2'.

Issuing a restart and the MN seemed to come up ok.

History

#1 Updated by Dave Vieglais almost 5 years ago

Access log around the same time:

128.111.54.80 - - [04/Mar/2018:00:13:33 +0000] "GET /mn/v2/checksum/37131896bb46ab7f7cf07c815fb74cf0?checksumAlgorithm=MD5 HTTP/1.1" 200 624 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:33 +0000] "GET /mn/v2/object/37131896bb46ab7f7cf07c815fb74cf0 HTTP/1.1" 200 32883 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:34 +0000] "GET /mn/v2/meta/d4ad3b36cb2112074bb4d6bdf69793a0 HTTP/1.1" 200 966 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:35 +0000] "GET /mn/v2/meta/b380ebdbaa2d2e735dadd448b584a64c HTTP/1.1" 200 965 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:36 +0000] "GET /mn/v2/meta/36f4d70417e3428f5c88384f50720fab HTTP/1.1" 200 967 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:37 +0000] "GET /mn/v2/meta/2d44cc53b3b8d2100893b44557a32182 HTTP/1.1" 200 966 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:37 +0000] "GET /mn/v2/checksum/d4ad3b36cb2112074bb4d6bdf69793a0?checksumAlgorithm=MD5 HTTP/1.1" 200 625 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:38 +0000] "GET /mn/v2/checksum/b380ebdbaa2d2e735dadd448b584a64c?checksumAlgorithm=MD5 HTTP/1.1" 200 624 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:38 +0000] "GET /mn/v2/object/d4ad3b36cb2112074bb4d6bdf69793a0 HTTP/1.1" 200 33669 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:38 +0000] "GET /mn/v2/object/b380ebdbaa2d2e735dadd448b584a64c HTTP/1.1" 200 26665 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:38 +0000] "GET /mn/v2/meta/c85861002b8d4f02ed5305bc0f0ddadf HTTP/1.1" 200 965 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:40 +0000] "GET /mn/v2/checksum/2d44cc53b3b8d2100893b44557a32182?checksumAlgorithm=MD5 HTTP/1.1" 200 625 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:40 +0000] "GET /mn/v2/checksum/36f4d70417e3428f5c88384f50720fab?checksumAlgorithm=MD5 HTTP/1.1" 200 624 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:40 +0000] "GET /mn/v2/object/2d44cc53b3b8d2100893b44557a32182 HTTP/1.1" 200 27916 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:40 +0000] "GET /mn/v2/object/36f4d70417e3428f5c88384f50720fab HTTP/1.1" 200 26487 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:44 +0000] "GET /mn/v2/meta/6d1c52ca5642b5dfba5182cf9ee6ba00 HTTP/1.1" 200 964 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:45 +0000] "GET /mn/v2/meta/bb2683dd53726d206c10377f3e20c02c HTTP/1.1" 200 966 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:47 +0000] "GET /mn/v2/meta/78ca49f285f1f0a10ecf513c9de45480 HTTP/1.1" 200 965 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:49 +0000] "GET /mn/v2/checksum/bb2683dd53726d206c10377f3e20c02c?checksumAlgorithm=MD5 HTTP/1.1" 200 624 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:49 +0000] "GET /mn/v2/meta/b47d75519797d715e05aa6cb7114d64d HTTP/1.1" 200 966 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:49 +0000] "GET /mn/v2/object/bb2683dd53726d206c10377f3e20c02c HTTP/1.1" 200 36220 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:50 +0000] "GET /mn/v2/meta/b72b10c0d1a4b757fa90e4fc80793bbe HTTP/1.1" 200 967 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:51 +0000] "GET /mn/v2/checksum/78ca49f285f1f0a10ecf513c9de45480?checksumAlgorithm=MD5 HTTP/1.1" 200 623 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:51 +0000] "GET /mn/v2/object/78ca49f285f1f0a10ecf513c9de45480 HTTP/1.1" 200 32878 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:53 +0000] "GET /mn/v2/checksum/b47d75519797d715e05aa6cb7114d64d?checksumAlgorithm=MD5 HTTP/1.1" 200 623 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:53 +0000] "GET /mn/v2/checksum/b72b10c0d1a4b757fa90e4fc80793bbe?checksumAlgorithm=MD5 HTTP/1.1" 200 624 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:53 +0000] "GET /mn/v2/object/b47d75519797d715e05aa6cb7114d64d HTTP/1.1" 200 18210 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:53 +0000] "GET /mn/v2/object/b72b10c0d1a4b757fa90e4fc80793bbe HTTP/1.1" 200 17167 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:53 +0000] "GET /mn/v2/meta/009cd84bf50e30fe84e55925a9c19d13 HTTP/1.1" 200 964 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:56 +0000] "GET /mn/v2/meta/2034c72bb6526d8ec0db8957bad5dc46 HTTP/1.1" 200 967 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:14:00 +0000] "GET /mn/v2/meta/0acaeaad51a0c1d31f61fd6e03164bc7 HTTP/1.1" 200 964 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:59 +0000] "GET /mn/v2/meta/08a1a84f22fca705cc97bf9ed54224b7 HTTP/1.1" 200 928 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:14:01 +0000] "GET /mn/v2/meta/06ab858a632af3cd3a19e86b1e8f73e5 HTTP/1.1" 200 928 "-" "Apache-HttpClient/4.3.6 (java 1.5)"
128.111.54.80 - - [04/Mar/2018:00:13:59 +0000] "GET /mn/v2/checksum/009cd84bf50e30fe84e55925a9c19d13?checksumAlgorithm=MD5 HTTP/1.1" 200 3974 "-" "Apache-HttpClient/4.3.6 (java 1.5)"

#2 Updated by Dave Vieglais almost 5 years ago

Restarted MN and appears to be functioning as expected.

#3 Updated by Dave Vieglais almost 5 years ago

Appears to be related to a burst in LetsEncrypt activity at the same time. Perhaps related to the certificate renewal process. Logs not presented here because they contain sensitive information.

#4 Updated by Amy Forrester over 4 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

We'll take a closer look if it happens again.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 14.8 MB)