totemsrp: Fix leave message regression

A leave message in totem is just a join message in which the leaving
member is excluded from the member list and included in the fail list.
It also carries a special nodeid in the header.nodeid and
system_from.nodeid fields.
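
The layout described above can be sketched as a small standalone model.
This is not the corosync code: struct join_msg, build_leave_msg and
MAX_NODES are hypothetical names, the value given to LEAVE_DUMMY_NODEID
is a placeholder assumption, and the fail list is assumed to hold only
the leaving node. It shows the layout as it stood before this commit,
which goes on to put the real nodeid in system_from.nodeid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 16
/* Assumption: placeholder value, the real constant is in exec/totemsrp.c. */
#define LEAVE_DUMMY_NODEID 0u

/* Hypothetical, simplified stand-in for the join message. */
struct join_msg {
	uint32_t header_nodeid;      /* models header.nodeid */
	uint32_t system_from_nodeid; /* models system_from.nodeid */
	uint32_t proc_list[MAX_NODES];
	size_t proc_entries;
	uint32_t failed_list[MAX_NODES];
	size_t failed_entries;
};

/*
 * Build a leave message for node `self`: the member list omits the
 * leaving node and the fail list contains it.
 */
static void build_leave_msg(struct join_msg *msg, uint32_t self,
	const uint32_t *members, size_t n_members)
{
	size_t i;

	assert(n_members <= MAX_NODES);

	msg->proc_entries = 0;
	msg->failed_entries = 0;

	for (i = 0; i < n_members; i++) {
		if (members[i] != self) {
			msg->proc_list[msg->proc_entries++] = members[i];
		}
	}
	msg->failed_list[msg->failed_entries++] = self;

	/* Both fields carry the special nodeid in the pre-fix layout. */
	msg->header_nodeid = LEAVE_DUMMY_NODEID;
	msg->system_from_nodeid = LEAVE_DUMMY_NODEID;
}
```

In this model, a receiver can recognise a leave message only by the
special nodeid; nothing in the message identifies the sender.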

Before the "totem: Use nodeid ONLY in srp_addr" fix, most functions
used the system_from addresses rather than the nodeid. The nodeid was
used in only one specific case, in the memb_consensus_set function.

After that patch, addresses are gone and only the nodeid is used. As a
result, the leaving node's nodeid is not added to the local fail list
(my_failed_list), so the node cannot reach consensus until the token
timeout expires and starts a new gather process.

The solution is to send the leaving node's valid nodeid in
system_from.nodeid and to handle the memb_consensus_set special case in
memb_join_process.
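
The receiver-side guard can be illustrated with a minimal standalone
model. It is a sketch under stated assumptions, not the code in
exec/totemsrp.c: struct instance, consensus_set and on_matching_join
are hypothetical names, LEAVE_DUMMY_NODEID has a placeholder value, and
appending an unknown node to the consensus list is an assumed
simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 16
/* Assumption: placeholder value, the real constant is in exec/totemsrp.c. */
#define LEAVE_DUMMY_NODEID 0u

struct consensus_entry {
	uint32_t nodeid;
	int set;
};

/* Hypothetical, simplified stand-in for the totemsrp instance. */
struct instance {
	struct consensus_entry consensus_list[MAX_NODES];
	size_t consensus_entries;
};

/* Mark `nodeid` as agreeing, appending it if it is not listed yet. */
static void consensus_set(struct instance *inst, uint32_t nodeid)
{
	size_t i;

	for (i = 0; i < inst->consensus_entries; i++) {
		if (inst->consensus_list[i].nodeid == nodeid) {
			inst->consensus_list[i].set = 1;
			return;
		}
	}
	assert(inst->consensus_entries < MAX_NODES);
	inst->consensus_list[inst->consensus_entries].nodeid = nodeid;
	inst->consensus_list[inst->consensus_entries].set = 1;
	inst->consensus_entries++;
}

/*
 * Models the branch taken when the lists in the received join message
 * match the local ones. A leave message is recognised by the header
 * nodeid, so the sender is never recorded as agreeing.
 */
static void on_matching_join(struct instance *inst,
	uint32_t header_nodeid, uint32_t system_from_nodeid)
{
	if (header_nodeid != LEAVE_DUMMY_NODEID) {
		consensus_set(inst, system_from_nodeid);
	}
}
```

Because the check uses the header nodeid, system_from.nodeid is free to
carry the leaving node's real nodeid for the rest of the processing.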

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Reviewed-by: Christine Caulfield <ccaulfie@redhat.com>
Jan Friesse, 7 years ago
Parent
Commit e45bbcc92a
1 changed file with 3 additions and 5 deletions
+ 3 - 5
exec/totemsrp.c

@@ -1181,9 +1181,6 @@ static void memb_consensus_set (
 	int found = 0;
 	int i;
 
-	if (addr->nodeid == LEAVE_DUMMY_NODEID)
-	        return;
-
 	for (i = 0; i < instance->consensus_list_entries; i++) {
 		if (srp_addr_equal(addr, &instance->consensus_list[i].addr)) {
 			found = 1;
@@ -3336,7 +3333,6 @@ static void memb_leave_message_send (struct totemsrp_instance *instance)
 	memb_join->proc_list_entries = active_memb_entries;
 	memb_join->failed_list_entries = instance->my_failed_list_entries;
 	srp_addr_copy (&memb_join->system_from, &instance->my_id);
-	memb_join->system_from.nodeid = LEAVE_DUMMY_NODEID;
 
 	// TODO: CC Maybe use the actual join send routine.
 	/*
@@ -4401,7 +4397,9 @@ static void memb_join_process (
 		instance->my_failed_list,
 		instance->my_failed_list_entries)) {
 
-		memb_consensus_set (instance, &memb_join->system_from);
+		if (memb_join->header.nodeid != LEAVE_DUMMY_NODEID) {
+			memb_consensus_set (instance, &memb_join->system_from);
+		}
 
 		if (memb_consensus_agreed (instance) && instance->failed_to_recv == 1) {
 				instance->failed_to_recv = 0;