- .\"/*
- .\" * Copyright (C) 2016-2018 Red Hat, Inc.
- .\" *
- .\" * All rights reserved.
- .\" *
- .\" * Author: Jan Friesse <jfriesse@redhat.com>
- .\" *
- .\" * This software licensed under BSD license, the text of which follows:
- .\" *
- .\" * Redistribution and use in source and binary forms, with or without
- .\" * modification, are permitted provided that the following conditions are met:
- .\" *
- .\" * - Redistributions of source code must retain the above copyright notice,
- .\" * this list of conditions and the following disclaimer.
- .\" * - Redistributions in binary form must reproduce the above copyright notice,
- .\" * this list of conditions and the following disclaimer in the documentation
- .\" * and/or other materials provided with the distribution.
- .\" * - Neither the name of Red Hat, Inc. nor the names of its
- .\" * contributors may be used to endorse or promote products derived from this
- .\" * software without specific prior written permission.
- .\" *
- .\" * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
- .\" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- .\" * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- .\" * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
- .\" * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
- .\" * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
- .\" * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
- .\" * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
- .\" * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
- .\" * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
- .\" * THE POSSIBILITY OF SUCH DAMAGE.
- .\" */
- .TH COROSYNC-QDEVICE 8 2018-08-09
- .SH NAME
- corosync-qdevice \- QDevice daemon
- .SH SYNOPSIS
- .B "corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]"
- .SH DESCRIPTION
- .B corosync-qdevice
- is a daemon running on each node of a cluster. It provides a configured
- number of votes to the
- quorum subsystem based on a third-party arbitrator's decision. Its primary use
- is to allow a cluster to sustain more node failures than standard quorum rules allow.
- It is recommended for clusters with an even number of nodes and highly recommended
- for 2-node clusters.
- .SH OPTIONS
- .TP
- .B -d
- Forcefully turn on debug information without the need to change corosync.conf.
- .TP
- .B -f
- Do not daemonize, run in the foreground.
- .TP
- .B -h
- Show short help text.
- .TP
- .B -S
- Set advanced settings described in their own section below. This option
- should generally not be used because most of the settings are
- not safe to change.
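- .PP
- For illustration only, a typical debugging invocation combines the
- .B -f
- and
- .B -d
- options described above, so the daemon stays in the foreground with debug information turned on:
- .nf
- # corosync-qdevice -f -d
- .fi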
- .SH CONFIGURATION
- .B corosync-qdevice
- reads its configuration from the corosync.conf file.
- The main configuration is within the
- .B quorum.device
- sub-key. Each model also has its own configuration within a
- similarly named sub-key.
- .TP
- .B model
- Specifies the model to be used. This parameter is required.
- .B corosync-qdevice
- is modular and is able to support multiple different models. The model basically
- defines what type of arbitrator is used. Currently only
- .I net
- is supported.
- .TP
- .B timeout
- Specifies how often
- .B corosync-qdevice
- should call the votequorum_poll function. It is also used by the
- .I net
- model to adjust
- its heartbeat timeout. It is recommended that you don't change this value.
- Default is
- .IR 10000 .
- .TP
- .B sync_timeout
- Specifies how often
- .B corosync-qdevice
- should call the votequorum_poll function during a sync phase. It is recommended that you don't change this value.
- Default is
- .IR 30000 .
- .TP
- .B votes
- The number of votes provided to the cluster by qdevice. Default is (number_of_nodes - 1) or generally
- sum(votes_per_node) - 1.
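- .PP
- As a worked example of the default
- .B votes
- value, assume a hypothetical four-node cluster in which every node has one vote:
- .nf
- sum(votes_per_node) - 1 = (1 + 1 + 1 + 1) - 1 = 3
- .fi
- With those 3 qdevice votes there are 7 expected votes in total. Assuming the usual majority rule, quorum is then 4, so a single surviving node together with the qdevice votes would still be quorate.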
- .PP
- The
- .B quorum.device.heuristics
- subkey holds the configuration of the heuristics. Heuristics are a set of commands executed locally on
- startup, cluster membership change, successful connection to
- .B corosync-qnetd
- and optionally also at regular times. Commands are executed in parallel.
- When all commands finish successfully
- (their exit code is zero) on time,
- heuristics have passed, otherwise they have failed. The heuristics result is sent to
- .B corosync-qnetd
- where it is used in calculations to determine which partition should be quorate.
- .TP
- .B timeout
- Specifies the maximum time in milliseconds that
- .B corosync-qdevice
- waits for the heuristics commands to finish. If a command doesn't finish before the timeout, it is
- killed and the heuristics fail. This timeout is used for heuristics executed at regular times.
- Default value is half of the
- .BR quorum.device.timeout ", so"
- .IR 5000 .
- .TP
- .B sync_timeout
- Similar to quorum.device.heuristics.timeout but used during membership changes. Default
- value is half of the
- .BR quorum.device.sync_timeout ", so"
- .IR 15000 .
- .TP
- .B interval
- Specifies the interval between two regular heuristics executions. Default value is
- 3 *
- .BR quorum.device.timeout ", so"
- .IR 30000 .
- .TP
- .B mode
- Can be one of
- .IR on ", " sync " or " off
- and specifies the mode of operation of heuristics. Default is
- .IR off ,
- which means heuristics are disabled. When
- .I sync
- is set, heuristics are executed only during startup, membership change and when the connection
- to
- .B corosync-qnetd
- is established. If heuristics should also run at regular times, this option
- should be set to
- .IR on .
- .TP
- .B exec_NAME
- Defines an executable to run.
- .I NAME
- can be an arbitrary valid cmap key name string and has no special meaning.
- The value of this variable must contain a command to execute. The value is parsed (split)
- into arguments similarly to how the Bourne shell would do it. Quoting is possible by
- using backslashes and double quotes.
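- .PP
- The following fragment is a sketch of a
- .B quorum.device.heuristics
- configuration that runs heuristics at regular times and spells out the default timing values given above. The script path and its argument are hypothetical and only illustrate quoting with double quotes:
- .nf
- heuristics {
-     mode: on
-     timeout: 5000
-     sync_timeout: 15000
-     interval: 30000
-     exec_check_storage: /usr/local/bin/check-storage "argument with spaces"
- }
- .fi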
- .PP
- The
- .B quorum.device.net
- subkey holds the configuration for
- .B model
- .IR net .
- .TP
- .B tls
- Can be one of
- .IR on ", " off " or " required
- and specifies whether TLS should be used.
- .I on
- means a connection with TLS is attempted first, but if the server doesn't advertise TLS support
- then non-TLS will be used.
- .I off
- means TLS is not required and is not even attempted. This mode is the
- only one which doesn't need a properly initialized NSS database.
- .I required
- means TLS is required and if the server doesn't support TLS, qdevice will
- exit with an error message. Default is
- .IR on .
- .TP
- .B host
- Specifies the IP address or host name of the qnetd server to be used. This parameter
- is required.
- .TP
- .B port
- Specifies the TCP port of the qnetd server. Default is
- .IR 5403 .
- .TP
- .B algorithm
- Decision algorithm. Can be one of
- .I ffsplit
- or
- .I lms
- (there are also
- .I test
- and
- .IR 2nodelms ,
- both of which are mainly for developers and shouldn't be used for production clusters).
- For a description of what each algorithm means and how the algorithms differ, see their
- individual sections.
- Default value is
- .IR ffsplit .
- .TP
- .B tie_breaker
- Can be one of
- .IR lowest ", " highest
- or valid_node_id (a number). It's used as a fallback if qdevice has to decide between two or more
- equal partitions.
- .I lowest
- means the partition with the lowest node id is chosen.
- .I highest
- means the partition with the highest node id is chosen. valid_node_id means that the partition
- containing the node with the given node id is chosen.
- Default is
- .IR lowest .
- .TP
- .B connect_timeout
- Timeout used when
- .B corosync-qdevice
- is trying to connect to the
- .B corosync-qnetd
- host. Default is 0.8 *
- .BR quorum.sync_timeout .
- .TP
- .B force_ip_version
- Can be one of
- .IR 0 ", " 4 " or " 6
- and forces the software to use the given IP version.
- .I 0
- (default value) means IPv6 is preferred and IPv4 should be used as a fallback.
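- .PP
- A sketch of a
- .B quorum.device.net
- section using only the keys described above follows. The host name is a placeholder and the values are illustrative rather than recommended:
- .nf
- net {
-     tls: required
-     host: qnetd.example.org
-     port: 5403
-     algorithm: lms
-     tie_breaker: highest
-     force_ip_version: 4
- }
- .fi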
- .PP
- Logging configuration is within the
- .B logging
- directive.
- .B corosync-qdevice
- parses and supports most of the options with the exception of
- .BR to_logfile ", " logfile
- and
- .BR logfile_priority .
- The
- .B logger_subsys
- sub-directive can also be used if
- .B subsys
- is set to
- .IR QDEVICE .
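- .PP
- For example, debug output for qdevice alone could be enabled with a fragment like the one below. It assumes the
- .B debug
- option documented in
- .BR corosync.conf (5):
- .nf
- logging {
-     logger_subsys {
-         subsys: QDEVICE
-         debug: on
-     }
- }
- .fi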
- .PP
- For
- .B corosync-qdevice
- to work correctly, the
- .B nodelist
- directive has to be used and properly configured. Also the
- .I net
- model requires that the
- .B totem.cluster_name
- option is set.
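- .PP
- A minimal sketch of these two prerequisites for a two-node cluster follows. The cluster name, addresses and node ids are placeholders, and the
- .B ring0_addr
- and
- .B nodeid
- keys are described in
- .BR corosync.conf (5):
- .nf
- totem {
-     cluster_name: examplecluster
- }
- nodelist {
-     node {
-         ring0_addr: 192.0.2.1
-         nodeid: 1
-     }
-     node {
-         ring0_addr: 192.0.2.2
-         nodeid: 2
-     }
- }
- .fi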
- .SH MODEL NET TLS CONFIGURATION
- For
- .B model
- .I net
- to work using TLS, it's necessary to create the NSS database, import the Qnetd
- CA certificate, and get/distribute a valid client certificate.
- If pcs is used (recommended), the following steps are not needed because pcs does them automatically.
- .B corosync-qdevice-net-certutil
- is the tool to perform the required actions semi-automatically. Consult its help output
- and its man page. For a first-time configuration it may make sense to start with the
- .B -Q
- option.
- If TLS is not required, just edit the corosync.conf file and set
- .B quorum.device.net.tls
- to
- .IR off .
- Depending on the configuration of NSS (stored in the nss.config file, usually in the
- /etc/crypto-policies/back-ends/ directory), disabled ciphers or too short keys
- may be rejected. The proper solution is to regenerate the NSS databases for both the
- .B corosync-qnetd
- and
- .B corosync-qdevice
- daemons. As a quick workaround it's also possible to set the environment variable
- .I NSS_IGNORE_SYSTEM_POLICY=1
- before running the
- .B corosync-qdevice
- daemon.
- When NSS is updated it may also be necessary to upgrade the database to the new format. There is no
- consensus on the recommended way, but the following command seems to work (if the qdevice
- sysconfdir is set to /etc):
- .nf
- # certutil -N -d /etc/corosync/qdevice/net/nssdb -f /etc/corosync/qdevice/net/nssdb/pwdfile.txt
- .fi
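- .PP
- As a sketch, TLS can be turned off with the following corosync.conf fragment (other required keys, such as
- .BR host ,
- are omitted here):
- .nf
- quorum {
-     device {
-         model: net
-         net {
-             tls: off
-         }
-     }
- }
- .fi
- The policy workaround mentioned above could look like this when the daemon is started by hand in the foreground:
- .nf
- # NSS_IGNORE_SYSTEM_POLICY=1 corosync-qdevice -f
- .fi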
- .SH MODEL NET ALGORITHMS
- Algorithms change how
- .B corosync-qnetd
- provides votes to a given node/partition. Currently there are two algorithms supported.
- .TP
- .B ffsplit
- This one makes sense only for clusters with an even number of nodes. It provides exactly one
- vote to the partition with the highest number of active nodes. If there are two partitions
- of exactly the same size,
- it provides its vote to the partition with the higher score. The score is computed
- as (number_of_connected_nodes +
- number_of_connected_nodes_with_passed_heuristics - number_of_connected_nodes_with_failed_heuristics).
- If the scores are equal, the vote is provided to the partition with the most clients connected to the qnetd
- server. If this number is also equal, then the tie_breaker is used. It is able to transition
- its vote if the currently active partition becomes partitioned and a non-active partition
- still has at least 50% of the active nodes. Because of this, a vote is not provided
- if the qnetd connection is not active.
- To use this algorithm it's required to set the number of votes per node to 1 (default)
- and the qdevice number of votes also has to be 1. This is achieved by setting the
- .B quorum.device.votes
- key in the corosync.conf file to 1.
- .TP
- .B lms
- Last-man-standing. If the node is the only one left in the cluster that can see the
- qnetd server, then a vote is returned.
- If more than one node can see the qnetd server but some nodes can't
- see each other, then the cluster is divided up into 'partitions' based on
- their ring_id and this algorithm returns a vote to the partition with the highest
- heuristics score (computed the same way as for the
- .B ffsplit
- algorithm), or if there is more than one partition with equal scores,
- the largest active partition or,
- if there is more than one equal partition, the partition that contains the tie_breaker
- node (lowest, highest, etc). For LMS to work, the number
- of qdevice votes has to be set to default (so just delete the
- .B quorum.device.votes
- key from corosync.conf).
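- .PP
- As a worked example of the score used by both algorithms, assume a hypothetical four-node cluster split into two partitions of two connected nodes each. In partition A both nodes passed heuristics. In partition B one node passed and one failed:
- .nf
- score(A) = 2 + 2 - 0 = 4
- score(B) = 2 + 1 - 1 = 2
- .fi
- Partition A has the higher score and receives the vote, so neither the number of connected clients nor the tie_breaker has to be consulted.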
- .SH ADVANCED SETTINGS
- Set by using the
- .B -S
- option. The default value is shown in parentheses. Options
- beginning with the
- .B net_
- prefix are specific to
- .B model
- .IR net .
- .TP
- .B lock_file
- Lock file location. (/var/run/corosync-qdevice/corosync-qdevice.pid)
- .TP
- .B local_socket_file
- Internal IPC socket file location. (/var/run/corosync-qdevice/corosync-qdevice.sock)
- .TP
- .B local_socket_backlog
- Parameter passed to listen syscall. (10)
- .TP
- .B max_cs_try_again
- How many times to retry the call to a corosync function which has returned CS_ERR_TRY_AGAIN. (10)
- .TP
- .B votequorum_device_name
- Name used for qdevice registration. (Qdevice)
- .TP
- .B ipc_max_clients
- Maximum allowed simultaneous IPC clients. (10)
- .TP
- .B ipc_max_receive_size
- Maximum size of a message received by IPC client. (4096)
- .TP
- .B ipc_max_send_size
- Maximum size of a message allowed to be sent to an IPC client. (65536)
- .TP
- .B master_wins
- Force enable/disable master wins. (default is model)
- .TP
- .B heuristics_ipc_max_send_buffers
- Maximum number of heuristics worker send buffers. (128)
- .TP
- .B heuristics_ipc_max_send_receive_size
- Maximum size of a message allowed to be sent to, or received from, a heuristics worker. (4096)
- .TP
- .B heuristics_min_timeout
- Minimum heuristics timeout accepted by client in ms. (1000)
- .TP
- .B heuristics_max_timeout
- Maximum heuristics timeout accepted by client in ms. (120000)
- .TP
- .B heuristics_min_interval
- Minimum heuristics interval accepted by client in ms. (1000)
- .TP
- .B heuristics_max_interval
- Maximum heuristics interval accepted by client in ms. (3600000)
- .TP
- .B heuristics_max_execs
- Maximum number of exec_ commands. (32)
- .TP
- .B heuristics_use_execvp
- Use execvp instead of execv for executing commands. (off)
- .TP
- .B heuristics_max_processes
- Maximum number of processes running at one time. (160)
- .TP
- .B heuristics_kill_list_interval
- Interval in ms at which process status is gathered and, if needed, a signal is sent
- to processes which didn't finish on time. (5000)
- .TP
- .B net_nss_db_dir
- NSS database directory. (/etc/corosync/qdevice/net/nssdb)
- .TP
- .B net_initial_msg_receive_size
- Initial (used during connection parameters negotiation)
- maximum size of the receive buffer for message (maximum
- allowed message size received from qnetd). (32768)
- .TP
- .B net_initial_msg_send_size
- Initial (used during connection parameter negotiation)
- maximum size of one send buffer (message) to be sent to server. (32768)
- .TP
- .B net_min_msg_send_size
- Minimum required size of one send buffer (message) to be sent to server. (32768)
- .TP
- .B net_max_msg_receive_size
- Maximum allowed size of receive buffer for a message sent by server. (16777216)
- .TP
- .B net_max_send_buffers
- Maximum number of send buffers. (10)
- .TP
- .B net_nss_qnetd_cn
- Common name (CN) of qnetd server certificate. (Qnetd Server)
- .TP
- .B net_nss_client_cert_nickname
- NSS nickname of qdevice client certificate. (Cluster Cert)
- .TP
- .B net_heartbeat_interval_min
- Minimum heartbeat timeout accepted by client in ms. (1000)
- .TP
- .B net_heartbeat_interval_max
- Maximum heartbeat timeout accepted by client in ms. (120000)
- .TP
- .B net_min_connect_timeout
- Minimum connection timeout accepted by client in ms. (1000)
- .TP
- .B net_max_connect_timeout
- Maximum connection timeout accepted by client in ms. (120000)
- .TP
- .B net_test_algorithm_enabled
- Enable test algorithm. (if built with --enable-debug on, otherwise off)
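- .PP
- For illustration, several advanced settings can be combined in one
- .B -S
- argument, following the syntax shown in the SYNOPSIS. The values below are arbitrary and not a recommendation:
- .nf
- # corosync-qdevice -f -S heuristics_max_execs=16,net_max_send_buffers=20
- .fi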
- .SH EXAMPLE
- Define qdevice with the
- .I net
- model connecting to qnetd running on the qnetd.example.org host, using the
- .I ffsplit
- algorithm.
- Heuristics are set to
- .I sync
- mode and execute two commands.
- .nf
- quorum {
-     provider: corosync_votequorum
-     device {
-         votes: 1
-         model: net
-         net {
-             tls: on
-             host: qnetd.example.org
-             algorithm: ffsplit
-         }
-         heuristics {
-             mode: sync
-             exec_ping: /bin/ping -q -c 1 "www.example.org"
-             exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
-         }
-     }
- }
- .fi
- .SH SEE ALSO
- .BR corosync-qdevice-tool (8),
- .BR corosync-qdevice-net-certutil (8),
- .BR corosync-qnetd (8),
- .BR corosync.conf (5)
- .SH AUTHOR
- Jan Friesse
- .PP
|