
qnetd: Fix dpd timer

With the default config, the dpd timer ran every 10 seconds. It waited
2 * client_timeout to clear the message received flag and then another
2 * client_timeout without a message received, so a client could be
marked as dead only after more than 40 seconds. That made qdevice stop
sending the votequorum heartbeat for too long, so corosync lost the
votes from qdevice.

This patch is a simpler solution: it just changes the default dpd timer
interval to 1 second and the timeout to 1.2 * client_timeout.
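
The timing described above can be sketched with a small simulation. This
is only an illustration, not code from the patch: the helper name
worst_case_detection_ms is hypothetical, the 8000 ms heartbeat interval
is an assumed example value, and the reset of the counter and flag after
a check with a received message is an assumption about the part of the
callback not shown in the diff.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of qnetd): simulate the dpd timer ticks
 * and return the time in ms until a silent client is disconnected.
 * Worst case assumed here: the client's last message arrived right
 * after a check, so the message received flag starts out set.
 */
static uint32_t
worst_case_detection_ms(uint32_t dpd_interval, uint32_t heartbeat_interval,
    double factor)
{
	uint32_t elapsed = 0;
	uint32_t time_since_last_check = 0;
	int msg_received = 1;

	for (;;) {
		/* One timer tick, mirroring the += in qnetd_dpd_timer_cb */
		elapsed += dpd_interval;
		time_since_last_check += dpd_interval;

		if (time_since_last_check > heartbeat_interval * factor) {
			if (!msg_received) {
				/* Client would be disconnected here */
				return (elapsed);
			}

			/* Assumed behavior: start a new check window */
			time_since_last_check = 0;
			msg_received = 0;
		}
	}
}
```

Under these assumptions, worst_case_detection_ms(10000, 8000, 2.0)
returns 40000 for the old defaults and
worst_case_detection_ms(1000, 8000, 1.2) returns 20000 for the new ones.
The old 10 second tick rounds each 16 second window up to 20 seconds,
while the 1 second tick keeps each window close to the configured
threshold.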

Signed-off-by: Jan Friesse <jfriesse@redhat.com>
Jan Friesse 5 years ago
parent
commit
11a861c93f
3 changed files with 6 additions and 6 deletions
  1. man/corosync-qnetd.8 (+3 -3)
  2. qdevices/qnet-config.h (+2 -2)
  3. qdevices/qnetd-dpd-timer.c (+1 -1)

+ 3 - 3
man/corosync-qnetd.8

@@ -1,5 +1,5 @@
 .\"/*
-.\" * Copyright (C) 2016-2019 Red Hat, Inc.
+.\" * Copyright (C) 2016-2020 Red Hat, Inc.
 .\" *
 .\" * All rights reserved.
 .\" *
@@ -31,7 +31,7 @@
 .\" * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 .\" * THE POSSIBILITY OF SUCH DAMAGE.
 .\" */
-.TH COROSYNC-QNETD 8 2019-08-12
+.TH COROSYNC-QNETD 8 2020-09-15
 .SH NAME
 corosync-qnetd \- QNet daemon
 .SH SYNOPSIS
@@ -217,7 +217,7 @@ Maximum heartbeat timeout accepted by server in ms. (120000)
 Dead peer detection enabled. (on)
 .TP
 .B dpd_interval
-How often the DPD algorithm detects dead peers in ms. (10000)
+How often the DPD algorithm detects dead peers in ms. (1000)
 .TP
 .B lock_file
 Lock file location. (/var/run/corosync-qnetd/corosync-qnetd.pid)

+ 2 - 2
qdevices/qnet-config.h

@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2015-2016 Red Hat, Inc.
+ * Copyright (c) 2015-2020 Red Hat, Inc.
  *
  * All rights reserved.
  *
@@ -71,7 +71,7 @@ extern "C" {
 #define QNETD_MIN_HEARTBEAT_INTERVAL			1
 
 #define QNETD_DEFAULT_DPD_ENABLED			1
-#define QNETD_DEFAULT_DPD_INTERVAL			(10*1000)
+#define QNETD_DEFAULT_DPD_INTERVAL			(1*1000)
 #define QNETD_MIN_DPD_INTERVAL				1
 
 #define QNETD_DEFAULT_LOCK_FILE				LOCALSTATEDIR"/run/corosync-qnetd/corosync-qnetd.pid"

+ 1 - 1
qdevices/qnetd-dpd-timer.c

@@ -50,7 +50,7 @@ qnetd_dpd_timer_cb(void *data1, void *data2)
 
 		client->dpd_time_since_last_check += instance->advanced_settings->dpd_interval;
 
-		if (client->dpd_time_since_last_check > (client->heartbeat_interval * 2)) {
+		if (client->dpd_time_since_last_check > client->heartbeat_interval * 1.2) {
 			if (!client->dpd_msg_received_since_last_check) {
 				log(LOG_WARNING, "Client %s doesn't sent any message during "
 				    "%"PRIu32"ms. Disconnecting",