Story #3490
Review official Documentation for production release workflow
100%
Description
And make/implement suggestions for improvements.
Item 1. We have a script that will perform these calls in sequence. It is currently
in /usr/local/bin/startDaemons.pl. It is not installed by debian scripts.
There are more scripts in the /usr/local/bin directory that should be included
in debian build out, but they rely on process/indexing packages to have been installed.
So, maybe another debian package will be needed to add in helper administrative
scripts that are only installed after all other cn packages are installed.
dataone-cn-yet-another-debian-pkg
(after talking with Chris at Standup 1/16/2013, we decided to place these scripts in
dataone-cn-os-core, and make their execution dependent upon the existence of
the installed init.d scripts)
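
A minimal sketch of the "dependent upon the existence of the installed init.d scripts" idea, written in Python rather than the Perl of startDaemons.pl. The daemon names below are hypothetical placeholders, since this ticket does not list the actual services, and the real script may order or invoke them differently.

```python
import os
import subprocess

# Hypothetical daemon names; the ticket does not list the real ones.
DAEMONS = ["example-processing", "example-indexing"]


def installed_daemons(names, init_dir="/etc/init.d"):
    """Return, in order, the names whose init.d script exists and is executable."""
    found = []
    for name in names:
        path = os.path.join(init_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            found.append(name)
    return found


def start_daemons(names, init_dir="/etc/init.d", run=subprocess.call):
    """Start each installed daemon in sequence and skip the ones not installed.

    `run` is injectable so the sequencing can be exercised without root.
    Returns the names whose start command exited with status 0.
    """
    started = []
    for name in installed_daemons(names, init_dir):
        if run([os.path.join(init_dir, name), "start"]) == 0:
            started.append(name)
    return started
```

Because missing init.d scripts are skipped instead of treated as errors, a helper like this could ship in dataone-cn-os-core before the processing/indexing packages are installed.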
Item 3. In its entirety, I think this section could be handled more easily by an administrative script.
adjust_ports --disable ip-address1, ip-address2, etc
that would enable/re-enable all the ufw ports as needed. (and maybe sql updates
to xml_replication table too? Standup 1/16/2013 agreed that it could be performed with sql script)
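
As a rough illustration of what an adjust_ports helper might do, the sketch below only builds the ufw command lines and leaves execution to the caller. The port list is a hypothetical placeholder (the ticket does not say which ports are involved), and treating "disable" as deleting a previously added allow rule is an assumption about how the existing rules were created.

```python
import ipaddress

# Hypothetical port list; the ticket does not name the ports to toggle.
CLUSTER_PORTS = [389, 5701]


def ufw_commands(action, addresses, ports=CLUSTER_PORTS):
    """Build ufw command lines that open or close ports for the given peer IPs.

    action is "enable" or "disable". Commands are returned, not executed,
    so they can be reviewed or logged before being run as root.
    """
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    prefix = ["ufw"] if action == "enable" else ["ufw", "delete"]
    commands = []
    for raw in addresses:
        # ip_address() raises ValueError on anything that is not a valid IP.
        ip = str(ipaddress.ip_address(raw.strip()))
        for port in ports:
            commands.append(prefix + ["allow", "from", ip, "to", "any", "port", str(port)])
    return commands
```

Each returned list can be passed to subprocess.call once the generated rules have been checked against the actual firewall configuration.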
Item 3.3 We do not have a read-only mode yet. We have a CN disable toggle that will
allow the cn to respond to any request with a ServiceFailure message. So, this
task is still on the to-do list.
Additionally, a read-only filter on the cn-rest-service will not affect
metacat's replication. We expose the metacat replication endpoint directly
from apache. Could we add in a toggle button into the replication form, or
maybe just add it into a script that will change the replicate/replicate_data
fields instead of having to delete and then add back the rows into the
xml_replication table? (see above)
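
The "change the fields instead of deleting and re-adding rows" idea can be sketched as a single UPDATE. This example uses an in-memory SQLite table as a stand-in for the metacat database; the column names replicate and replicate_data are taken from the wording above, while the server column and the sample hostname are assumptions, so the real schema should be checked first.

```python
import sqlite3


def set_replication(conn, server, enabled):
    """Flip both replication flags for one peer; return the number of rows changed."""
    flag = 1 if enabled else 0
    cur = conn.execute(
        "UPDATE xml_replication SET replicate = ?, replicate_data = ? WHERE server = ?",
        (flag, flag, server),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Exercise set_replication against a throwaway table with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE xml_replication (server TEXT, replicate INTEGER, replicate_data INTEGER)"
    )
    # Hypothetical peer entry, for illustration only.
    conn.execute("INSERT INTO xml_replication VALUES ('cn-peer.example.org', 1, 1)")
    changed = set_replication(conn, "cn-peer.example.org", False)
    row = conn.execute("SELECT replicate, replicate_data FROM xml_replication").fetchone()
    conn.close()
    return changed, row
```

Calling set_replication again with enabled=True restores the flags, which keeps the row and any other columns it carries intact.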
Item 8.2 We need to do this as a precaution. But it should be performed after
apt-get upgrade procedure. So, switch 8.3 with 8.2. I have sometimes found that
LDAP replication will stop working after a software upgrade. I don't know why.
It is not as big a deal as it used to be when we'd have to restart tomcat
if ldap died.
However, what we should really have is an LDAP replication monitor application.
Basically, create some unique object in LDAP not used in any way by D1, and
then manipulate it from each of the CNs in sequence, ensuring the other two are
able to note the change before success. Probably will have to use hazelcast
somehow to do this.
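
The monitor described above could be sketched as follows. This version runs from a single place and does not use hazelcast for coordination, which is a simplification. The write and read callables are hypothetical hooks that would wrap an LDAP modify and an LDAP search on the dedicated test object; no real LDAP API is assumed here.

```python
import time
import uuid


def check_replication(nodes, write, read, timeout=30.0, interval=1.0, sleep=time.sleep):
    """Write a fresh marker on each node in turn and wait for the peers to see it.

    write(node, value) and read(node) are caller-supplied (hypothetical) hooks.
    Returns a dict mapping each writer node to the list of peers that had not
    picked up the marker when the timeout expired (empty list means success).
    """
    failures = {}
    for writer in nodes:
        marker = uuid.uuid4().hex
        write(writer, marker)
        pending = [n for n in nodes if n != writer]
        deadline = time.monotonic() + timeout
        while pending:
            # Keep only the peers that still do not show the new marker.
            pending = [n for n in pending if read(n) != marker]
            if not pending or time.monotonic() >= deadline:
                break
            sleep(interval)
        failures[writer] = pending
    return failures
```

A non-empty list for any writer indicates that replication from that CN to the listed peers stalled, which is the condition worth alerting on after an upgrade.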
Item 8.6 This is where CN1 goes down leaving CN2 as the sole CN running in production.
At this stage, I would open the ports on CN2 & CN3 to each other and bring up
CN3 (reconfiguring tomcat/restarting tomcat & the daemons, etc).
That way we have the cluster up and running. Then it would be a matter of
upgrading CN1, configuring metacat, making certain all services on CN1 were shut down,
ensuring all the ports on all three machines were up, and then restarting all services
on CN1.
History
#1 Updated by Robert Waltz almost 12 years ago
- Description updated (diff)
#2 Updated by Robert Waltz almost 12 years ago
- Status changed from New to Closed