
votequorum: Add cmap key to reset wait_for_all

In a two_node cluster (and in larger clusters, though it is more likely
with just two), a node can be booted after downtime or failure while
the other node is unavailable for some reason. In that case the booted
node is not allowed to proceed, because wait_for_all is enforced.

This patch provides a cmap key to clear this flag in the desperate
situation where that becomes necessary. It should only be used with
extreme caution and will be wrapped up in pcs, which should also check
that fencing has been run.

Signed-off-by: Christine Caulfield <ccaulfie@redhat.com>
Reviewed-by: Jan Friesse <jfriesse@redhat.com>
Christine Caulfield, 11 years ago
commit cbf753405b
2 files changed, 24 additions and 0 deletions
  1. exec/votequorum.c (+19, -0)
  2. man/cmap_keys.8 (+5, -0)

+ 19 - 0
exec/votequorum.c

@@ -150,6 +150,7 @@ static int votequorum_exec_send_quorum_notification(void *conn, uint64_t context
 
 #define VOTEQUORUM_RECONFIG_PARAM_EXPECTED_VOTES 1
 #define VOTEQUORUM_RECONFIG_PARAM_NODE_VOTES     2
+#define VOTEQUORUM_RECONFIG_PARAM_CANCEL_WFA     3
 
 static int votequorum_exec_send_reconfigure(uint8_t param, unsigned int nodeid, uint32_t value);
 
@@ -1487,6 +1488,7 @@ static void votequorum_refresh_config(
 {
 	int old_votes, old_expected_votes;
 	uint8_t reloading;
+	uint8_t cancel_wfa;
 
 	ENTER();
 
@@ -1498,6 +1500,15 @@ static void votequorum_refresh_config(
 		return ;
 	}
 
+	icmap_get_uint8("quorum.cancel_wait_for_all", &cancel_wfa);
+	if (strcmp(key_name, "quorum.cancel_wait_for_all") == 0 &&
+	    cancel_wfa >= 1) {
+	        icmap_set_uint8("quorum.cancel_wait_for_all", 0);
+		votequorum_exec_send_reconfigure(VOTEQUORUM_RECONFIG_PARAM_CANCEL_WFA,
+						 us->node_id, 0);
+		return;
+	}
+
 	old_votes = us->votes;
 	old_expected_votes = us->expected_votes;
 
@@ -2070,6 +2081,14 @@ static void message_handler_req_exec_votequorum_reconfigure (
 		recalculate_quorum(1, 0);  /* Allow decrease */
 		break;
 
+	case VOTEQUORUM_RECONFIG_PARAM_CANCEL_WFA:
+	        update_wait_for_all_status(0);
+		log_printf(LOGSYS_LEVEL_INFO, "wait_for_all_status reset by user on node %d.",
+			   req_exec_quorum_reconfigure->nodeid);
+		recalculate_quorum(0, 0);
+
+	        break;
+
 	}
 
 	LEAVE();

+ 5 - 0
man/cmap_keys.8

@@ -269,6 +269,11 @@ uidgid.*
 Informations about users/groups which are allowed to do IPC connection to
 corosync.
 
+.TP
+quorum.cancel_wait_for_all
+Tells votequorum to cancel waiting for all nodes at cluster startup. Can be used
+to unblock quorum if nodes are known to be down. For pcs use only.
+
 .TP
 config.reload_in_progress
 This value will be set to 1 (or created) when corosync.conf reload is started,
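
The trigger-and-reset behaviour that this patch gives quorum.cancel_wait_for_all can be sketched as a small standalone model. This is an illustration only, not corosync code: struct wfa_model, MODEL_KEY, model_key_changed() and model_deliver_cancel_wfa() are hypothetical names, and the cluster-wide reconfigure message is collapsed into a direct function call. The model assumes the callback receives the name of the key that changed, as votequorum_refresh_config() does in the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the votequorum state touched by the patch. */
struct wfa_model {
	uint8_t cancel_key;       /* models the quorum.cancel_wait_for_all key */
	int     wait_for_all;     /* 1 while wait_for_all is still enforced */
	int     reconfigure_sent; /* count of CANCEL_WFA messages "sent" */
};

#define MODEL_KEY "quorum.cancel_wait_for_all"

/*
 * Models the reconfigure message handler: on delivery, clear the
 * wait_for_all status. The real handler also recalculates quorum.
 */
void model_deliver_cancel_wfa(struct wfa_model *m, unsigned int nodeid)
{
	assert(m != NULL);
	m->wait_for_all = 0;
	printf("wait_for_all_status reset by user on node %u.\n", nodeid);
}

/*
 * Models the config-change callback: react only when the changed key
 * is the cancel key and its value is >= 1. Returns 1 if handled.
 */
int model_key_changed(struct wfa_model *m, const char *key_name,
		      unsigned int nodeid)
{
	assert(m != NULL && key_name != NULL);

	if (strcmp(key_name, MODEL_KEY) == 0 && m->cancel_key >= 1) {
		m->cancel_key = 0;      /* one-shot: reset the trigger */
		m->reconfigure_sent++;  /* in corosync this is a cluster message */
		model_deliver_cancel_wfa(m, nodeid);
		return 1;
	}
	return 0;
}
```

In this model, a change to any other key leaves wait_for_all untouched. A write of 1 to the cancel key clears the flag once and resets the key to 0, so a second notification for the same key does nothing.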