No description

Angus Salkeld 2cf37d4063 Set the size of the blackbox to the size on flatiron 14 years ago
build-aux b2400314b2 add release script and git based versioning 15 years ago
conf 5a2185683a AUGEAS: fix "tags" log field 14 years ago
cts ec771a9a9a CTS: remove dead code in sam_test_agent 14 years ago
exec 2cf37d4063 Set the size of the blackbox to the size on flatiron 14 years ago
include 1b63c3cf57 LOG: update the log defines 14 years ago
init 6fa114ac8d Add systemd unit files for corosync and corosync-notifyd 14 years ago
lcr bb42020f9a Use qb_hdb instead of mutex based hdb code 14 years ago
lib b793135834 Remove default from cpg_model_initialize - atm there is only one model 14 years ago
man 21186a0f70 MAN: remove unused man pages 14 years ago
pkgconfig 91ff2292d0 libqb: Add libqb dependency in the rpm & pc file 14 years ago
services 08f07be323 A CPG client can sometimes lockup if the local node is in the downlist 14 years ago
test 3ade35ca01 TEST: make cpgbench go to 1M 14 years ago
tools f0d80e3e46 Remove -lcoroipcc from tools/Makefile.am notifyd 14 years ago
.gitignore 9909a20859 Add Doxyfile to .gitignore 15 years ago
AUTHORS 0769c9444f Update to AUTHORS file. 16 years ago
Doxyfile.in b6ba64c1eb docs: auto-generate the version 15 years ago
INSTALL 2a568d6e79 Add dbus and snmp notifier 15 years ago
LICENSE 71c89ed653 Add license information to LICENSE file about build process files 15 years ago
Makefile.am 036ea63107 add wait-for-license to cov-analyze 14 years ago
README.recovery 904a10ed38 remove all trailing blanks 17 years ago
SECURITY d1c1e78fd0 remove trailing blanks 16 years ago
TODO 2cf37d4063 Set the size of the blackbox to the size on flatiron 14 years ago
autobuild.sh 1db961d6ad autobuild: improve messages 14 years ago
autogen.sh 967be5a38a Sanitize output of autogen.sh. 16 years ago
configure.ac cdf5e95ab4 Make realtime scheduling optional not the default. 14 years ago
corosync.spec.in bf3a0ad542 Remove references to README.devmap 14 years ago
loc f5c3bcad46 Modify property of loc script to be executable. 17 years ago

README.recovery

SYNCHRONIZATION ALGORITHM:
-------------------------
The synchronization algorithm is used by every service in corosync to
synchronize the state of the system.

The synchronization algorithm has four events. These events are in fact
functions that are registered in the service handler data structure. They
are called by the synchronization system whenever the network partitions or
merges.

init:
Within the init event a service handler should record temporary state variables
used by the process event.

process:
The process event is responsible for executing synchronization. This event
returns a state indicating whether it has completed or not. This allows
synchronization to be interrupted when the message queue buffer is full and
resumed later. The process event will be called again by the synchronization
service if requested to do so by the value returned from process.

abort:
The abort event occurs when a processor failure occurs during synchronization.

activate:
The activate event occurs when process has reported that no more processing
is necessary for any node in the cluster and all messages originated by
process have completed.
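
The four events can be pictured as callbacks that a driver invokes in a fixed
order. The Python sketch below is only a model of that calling pattern, not
the corosync API: the names SyncHandler and run_sync, the boolean return
value of process, and the failure_at_call parameter are all hypothetical.

```python
class SyncHandler:
    """Hypothetical model of a service's four synchronization callbacks."""

    def __init__(self, items, queue_capacity):
        self.items = items                    # state this service must synchronize
        self.queue_capacity = queue_capacity  # messages accepted per process call
        self.sent = []                        # messages handed to the modeled queue
        self.committed = None                 # state made live by activate
        self.cursor = 0                       # temporary iteration state

    def init(self):
        # Record the temporary state used by the process event.
        self.cursor = 0
        self.sent = []

    def process(self):
        # Queue as much as fits; return True if process must be called again.
        end = min(self.cursor + self.queue_capacity, len(self.items))
        self.sent.extend(self.items[self.cursor:end])
        self.cursor = end
        return self.cursor < len(self.items)

    def abort(self):
        # A processor failed during synchronization: drop temporary state.
        self.sent = []
        self.cursor = 0

    def activate(self):
        # No more processing is needed and all messages completed: commit.
        self.committed = list(self.sent)


def run_sync(handler, failure_at_call=None):
    """Drive one synchronization round; return the number of loop iterations."""
    handler.init()
    calls = 0
    while True:
        calls += 1
        if failure_at_call == calls:
            handler.abort()      # simulated processor failure
            return calls
        if not handler.process():
            break
    handler.activate()
    return calls
```

With five items and a queue capacity of two, run_sync calls process three
times before activate; a simulated failure leaves committed untouched.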

CHECKPOINT SYNCHRONIZATION ALGORITHM:
------------------------------------
The purpose of the checkpoint synchronization algorithm is to synchronize
checkpoints after a partition or merge of two or more partitions. The
secondary purpose of the algorithm is to determine the cluster-wide reference
count for every checkpoint.

Every cluster contains a group of checkpoints. Each checkpoint has a
checkpoint name and checkpoint number. The number is used to uniquely reference
an unlinked but still open checkpoint in the cluster.

Every checkpoint contains a reference count which is used to determine when
that checkpoint may be released. The algorithm rebuilds the reference count
information each time a partition or merge occurs.

local variables
    my_sync_state may have the values SYNC_CHECKPOINT, SYNC_REFCOUNT
    my_current_iteration_state contains any data used to iterate the checkpoints
        and sections.
checkpoint data
    refcount_set contains a reference count for every node, consisting of
        the number of opened connections to the checkpoint and the node
        identifier
    refcount contains a summation of every reference count in the refcount_set
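
As an illustration of the checkpoint data above, the following Python sketch
models one checkpoint record. The class name Checkpoint, its field layout,
and the key property are hypothetical; only refcount_set and refcount come
from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Checkpoint:
    """Hypothetical in-memory record for one checkpoint."""
    name: str
    number: int
    # node identifier -> number of opened connections on that node
    refcount_set: Dict[int, int] = field(default_factory=dict)

    @property
    def key(self) -> Tuple[str, int]:
        # The number distinguishes checkpoints that share a name, for
        # example an unlinked but still open one and its replacement.
        return (self.name, self.number)

    @property
    def refcount(self) -> int:
        # Summation of every reference count in the refcount_set.
        return sum(self.refcount_set.values())
```

For example, a checkpoint opened twice on node 1 and once on node 2 has
refcount_set {1: 2, 2: 1} and a refcount of 3.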

pseudocode executed by a processor when the synchronization service calls
the init event
    call sync_checkpoints_enter

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_CHECKPOINT state
    if lowest processor identifier of old ring in new ring
        transmit checkpoints or sections starting from my_current_iteration_state
    if all checkpoints and sections could be queued
        call sync_refcounts_enter
    else
        record my_current_iteration_state

    require process to continue

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_REFCOUNT state
    if lowest processor identifier of old ring in new ring
        transmit checkpoint reference counts
    if all checkpoint reference counts could be queued
        require process to not continue
    else
        record my_current_iteration_state for checkpoint reference counts

sync_checkpoints_enter:
    my_sync_state = SYNC_CHECKPOINT
    my_current_iteration_state set to start of checkpoint list

sync_refcounts_enter:
    my_sync_state = SYNC_REFCOUNT
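
The two process states and their enter functions can be sketched as a small
state machine. This Python model rests on several assumptions that are not
in the pseudocode: the class name CheckpointSync, a queue that accepts a
fixed number of messages per call (queue_slots), resetting the iteration
state when entering SYNC_REFCOUNT, and treating a processor that transmits
nothing as having queued everything.

```python
SYNC_CHECKPOINT = "SYNC_CHECKPOINT"
SYNC_REFCOUNT = "SYNC_REFCOUNT"


class CheckpointSync:
    """Hypothetical model of the two-state process event."""

    def __init__(self, checkpoints, refcounts, is_lowest, queue_slots):
        self.checkpoints = checkpoints  # messages sent in SYNC_CHECKPOINT
        self.refcounts = refcounts      # messages sent in SYNC_REFCOUNT
        self.is_lowest = is_lowest      # lowest id of old ring in new ring?
        self.queue_slots = queue_slots  # assumed per-call queue capacity
        self.transmitted = []
        self.sync_checkpoints_enter()   # what the init event does

    def sync_checkpoints_enter(self):
        self.my_sync_state = SYNC_CHECKPOINT
        self.my_current_iteration_state = 0  # start of checkpoint list

    def sync_refcounts_enter(self):
        self.my_sync_state = SYNC_REFCOUNT
        self.my_current_iteration_state = 0  # assumption: restart iteration

    def _transmit(self, messages):
        # Queue messages from the saved position; True if all were queued.
        start = self.my_current_iteration_state
        end = min(start + self.queue_slots, len(messages))
        self.transmitted.extend(messages[start:end])
        self.my_current_iteration_state = end  # record where to resume
        return end == len(messages)

    def process(self):
        """Return True if process must be called again."""
        if self.my_sync_state == SYNC_CHECKPOINT:
            done = self._transmit(self.checkpoints) if self.is_lowest else True
            if done:
                self.sync_refcounts_enter()
            return True  # SYNC_CHECKPOINT always requires process to continue
        done = self._transmit(self.refcounts) if self.is_lowest else True
        return not done
```

With three checkpoint messages, two reference count messages, and two queue
slots, process is called three times: twice in SYNC_CHECKPOINT and once in
SYNC_REFCOUNT.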

on event receipt of foreign ring id message
    ignore message

pseudocode executed on event receipt of checkpoint update
    if checkpoint exists in temporary storage
        ignore message
    else
        create checkpoint
        reset checkpoint refcount array

pseudocode executed on event receipt of checkpoint section update
    if checkpoint section exists in temporary storage
        ignore message
    else
        create checkpoint section
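
Both receipt handlers above ignore duplicates, so applying the same update
twice leaves temporary storage unchanged. A minimal Python sketch, assuming
temporary storage is a dictionary keyed by (name, number) and that a
checkpoint update is always delivered before its section updates; the
function names and the storage layout are hypothetical.

```python
def on_checkpoint_update(temp, name, number):
    """Create the checkpoint in temporary storage unless it already exists."""
    key = (name, number)
    if key in temp:
        return False  # already present: ignore message
    # Create the checkpoint with an empty (reset) reference count set.
    temp[key] = {"sections": {}, "refcount_set": {}}
    return True


def on_section_update(temp, name, number, section_id, data):
    """Create the section unless it already exists in temporary storage."""
    # Assumes the owning checkpoint was created by an earlier update.
    sections = temp[(name, number)]["sections"]
    if section_id in sections:
        return False  # already present: ignore message
    sections[section_id] = data
    return True
```

Each function returns whether the message changed temporary storage, which
makes the duplicate case easy to observe.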

pseudocode executed on event receipt of reference count update
    update temporary checkpoint data storage reference count set by adding
        any reference counts in the temporary message set to those from the
        event
    update that checkpoint's reference count
    set the global checkpoint id to the current checkpoint id + 1 if it
        would increase the global checkpoint id
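
One possible reading of the reference count update, sketched in Python. It
assumes that "adding" means a per-node sum of the stored counts and the
counts carried by the message, and that the global checkpoint id is a plain
integer; the function name and argument layout are hypothetical.

```python
def on_refcount_update(checkpoint, message_refcounts, checkpoint_id,
                       global_ckpt_id):
    """Merge reference counts from a message and return the new global id."""
    refcount_set = checkpoint["refcount_set"]
    for node_id, count in message_refcounts.items():
        # Assumption: counts for the same node are summed.
        refcount_set[node_id] = refcount_set.get(node_id, 0) + count
    # Update that checkpoint's reference count from the merged set.
    checkpoint["refcount"] = sum(refcount_set.values())
    # Raise the global id only if checkpoint_id + 1 would increase it.
    return max(global_ckpt_id, checkpoint_id + 1)
```

The max call keeps the global checkpoint id from ever decreasing, which
matches the condition "if it would increase the global checkpoint id".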

pseudocode called when the synchronization service calls the activate event:
    for all checkpoints
        free all previously committed checkpoints and sections
        convert temporary checkpoints and sections to regular checkpoints
            and sections
    copy my_saved_ring_id to my_old_ring_id

pseudocode called when the synchronization service calls the abort event:
    free all temporary checkpoints and temporary sections
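
Together, activate and abort behave like commit and rollback over temporary
storage. A minimal Python sketch, assuming committed and temporary data are
held in two dictionaries; the class name CheckpointStore and its layout are
hypothetical, while my_saved_ring_id and my_old_ring_id come from the
pseudocode.

```python
class CheckpointStore:
    """Hypothetical holder of committed and temporary checkpoint data."""

    def __init__(self):
        self.committed = {}           # regular checkpoints and sections
        self.temporary = {}           # built up during synchronization
        self.my_saved_ring_id = None
        self.my_old_ring_id = None

    def activate(self):
        # Dropping the old dictionary frees previously committed data;
        # the temporary data becomes the regular data.
        self.committed = self.temporary
        self.temporary = {}
        self.my_old_ring_id = self.my_saved_ring_id

    def abort(self):
        # Free all temporary checkpoints and sections; committed data
        # and ring ids are left as they were.
        self.temporary = {}
```

After abort the store still serves the data committed by the last
successful synchronization, so a failed round can simply be retried.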