|
@@ -1,280 +0,0 @@
|
|
|
-Application Interface Specification Quckstart Guide
|
|
|
|
|
----------------------------------------------------
|
|
|
|
|
-
|
|
|
|
|
-***
|
|
|
|
|
-All cryptographic software in this package is subject to the following legal
|
|
|
|
|
-notice:
|
|
|
|
|
-This package includes publicly available encryption source code which,
|
|
|
|
|
-together with object code resulting from the compiling of publicly
|
|
|
|
|
-available source code, may be exported from the United States under License
|
|
|
|
|
-Exception TSU prsuant to 15 C.F.R Section 740.13(e).
|
|
|
|
|
-***
|
|
|
|
|
-
|
|
|
|
|
-This corosync package is broken into four parts. The exec directory contains
|
|
|
|
|
-all of the code responsible for serving the APIs. The lib directory contains
|
|
|
|
|
-APIs the to which the user may link. The test directory contains some simple
|
|
|
|
|
-test programs which exercise the APIs. The directory conf contains example
|
|
|
|
|
-configuration files which can be copied directly onto the target system.
|
|
|
|
|
-
|
|
|
|
|
-The API implements SA Forum APIs for Cluster Membership (CLM), Availabilty
|
|
|
|
|
-Management Framework (AMF), Checkpointing (CKPT), and Eventing (EVT).
|
|
|
|
|
-
|
|
|
|
|
-The API also contains an extended virtual synchrony API which can be used
|
|
|
|
|
-in distributed applications.
|
|
|
|
|
-
|
|
|
|
|
-Configuring the corosync executive:
|
|
|
|
|
----------------------------------
|
|
|
|
|
-The corosync executive will automatically determine cluster membership by
|
|
|
|
|
-communicating on a specified multicast address and port.
|
|
|
|
|
-
|
|
|
|
|
-The directory conf contains the file corosync.conf
|
|
|
|
|
-
|
|
|
|
|
-totem {
|
|
|
|
|
- bindnetaddr: 192.168.1.0
|
|
|
|
|
- mcastaddr: 226.94.1.1
|
|
|
|
|
- mcastport: 5405
|
|
|
|
|
-}
|
|
|
|
|
-
|
|
|
|
|
-logging {
|
|
|
|
|
- logoutput: file
|
|
|
|
|
- logoutput: stderr
|
|
|
|
|
- logoutput: syslog
|
|
|
|
|
- logfile: /tmp/ais
|
|
|
|
|
- debug: on
|
|
|
|
|
- timestamp: on
|
|
|
|
|
-}
|
|
|
|
|
-
|
|
|
|
|
-timeout {
|
|
|
|
|
- token: 200
|
|
|
|
|
- token_retransmit: 50
|
|
|
|
|
- hold: 30
|
|
|
|
|
- retransmits_before_loss: 4
|
|
|
|
|
- join: 100
|
|
|
|
|
- consensus: 200
|
|
|
|
|
- merge: 200
|
|
|
|
|
- downcheck: 1000
|
|
|
|
|
- fail_recv_const: 250
|
|
|
|
|
-}
|
|
|
|
|
-
|
|
|
|
|
-The totem section contains three values. All three values must be set
|
|
|
|
|
-or the corosync executive wll exit with an error.
|
|
|
|
|
-
|
|
|
|
|
-bindnetaddr specifies the address which the corosync Executive should bind to.
|
|
|
|
|
-This address should always end in zero. If the local interface taffic
|
|
|
|
|
-should routed over is 192.168.5.92, set bindnetaddr to 192.168.5.0.
|
|
|
|
|
-
|
|
|
|
|
-mcastaddr is a multicast address. The default should work but you may have
|
|
|
|
|
-a different network configuration. Avoid 224.x.x.x because this is a "config"
|
|
|
|
|
-multicast address.
|
|
|
|
|
-
|
|
|
|
|
-mcastport specifies the UDP port number. It is possible to use the same
|
|
|
|
|
-multicast address on a network with the corosync services configured for different
|
|
|
|
|
-UDP ports.
|
|
|
|
|
-
|
|
|
|
|
-The logging section contains values. These values do not have to be set in which
|
|
|
|
|
-case the system defaults to logging to syslog and stderr with timestamping and debug.
|
|
|
|
|
-
|
|
|
|
|
-It is possible to select 3 destinations for logs: files, stderr, and syslog. One or
|
|
|
|
|
-more may be selected at the same time. If file is selected as a destination, the file
|
|
|
|
|
-name must be specified via the logfile option or the corosync executive will exit.
|
|
|
|
|
-
|
|
|
|
|
-The debug option prints out internal debugging information during runtime which may
|
|
|
|
|
-be helpful for developers.
|
|
|
|
|
-
|
|
|
|
|
-The timestamp option prints the date and time on each log message.
|
|
|
|
|
-
|
|
|
|
|
-The timeout section contains seven values. This section is not normally used, but
|
|
|
|
|
-rather used to override the program defaults for the purposes of fine tuning for
|
|
|
|
|
-a given networking/processor combination or for debugging purposes. Be careful to
|
|
|
|
|
-use the same timeout values on each of the nodes in the cluster or unpredictable
|
|
|
|
|
-results may occur.
|
|
|
|
|
-
|
|
|
|
|
-All timeout values except fail_recv_const are miliseconds. fail_recv_const is
|
|
|
|
|
-a message count. Until the man page is done you'll have to check the code and the
|
|
|
|
|
-totem spec for the function and usage of the timeouts.
|
|
|
|
|
-
|
|
|
|
|
-
|
|
|
|
|
-The directory conf contains the file amf.conf which specifies the failover
|
|
|
|
|
-groups, service units, components, and policies to be used by the AMF. The
|
|
|
|
|
-configuration file matches the testamf1-6 programs in the test directory and
|
|
|
|
|
-can be copied directly.
|
|
|
|
|
-
|
|
|
|
|
-These two files should be placed in the /etc/ais directory.
|
|
|
|
|
-
|
|
|
|
|
-A few notes about the config files:
|
|
|
|
|
-1) Do not use DOS style termination. This breaks the parser.
|
|
|
|
|
-2) Do not have blank lines in the amf.conf file. This breaks the amf configuration
|
|
|
|
|
- file parser. We are working on fixing these bugs, but for the moment, it is
|
|
|
|
|
- easy to simply avoid them.
|
|
|
|
|
-
|
|
|
|
|
-Building corosync
|
|
|
|
|
-----------------
|
|
|
|
|
-corosync requires GCC, LD, and a Linux 2.4/2.6 kernel. corosync has been tested on
|
|
|
|
|
-Debian Sarge(i386), Redhat 9(i386), Fedora Core 2(i386), Fedora Core
|
|
|
|
|
-4(i386,x86_64) and MontaVista Carrier Grade Edition 3.1(i386, x86_64,
|
|
|
|
|
-classic ppc, ppc970, xscale).
|
|
|
|
|
-
|
|
|
|
|
-Compile corosync by running make in the root directory. Make can also be run
|
|
|
|
|
-in the individual directories. Nothing is installed by make. If install
|
|
|
|
|
-is desired, the files must be copied manually.
|
|
|
|
|
-
|
|
|
|
|
-Configure Host
|
|
|
|
|
---------------
|
|
|
|
|
-For security reasons, the corosync only allows a process that had the EGID/GID
|
|
|
|
|
-of "ais" to connect to it. To make development easier, it is recommended to
|
|
|
|
|
-create an "ais" user with the "ais" group.
|
|
|
|
|
-
|
|
|
|
|
-[root@slickdeal root]# adduser ais -g ais
|
|
|
|
|
-
|
|
|
|
|
-Set the ais user's password:
|
|
|
|
|
-
|
|
|
|
|
-[root@slickdeal root]# passwd ais
|
|
|
|
|
-Changing password for user ais.
|
|
|
|
|
-New password:
|
|
|
|
|
-Retype new password:
|
|
|
|
|
-passwd: all authentication tokens updated successfully.
|
|
|
|
|
-
|
|
|
|
|
-Generate a private key
|
|
|
|
|
-----------------------
|
|
|
|
|
-corosync uses cryptographic techniques to ensure authenticity and privacy of
|
|
|
|
|
-messages. A private key must be generated and shared by all processors for
|
|
|
|
|
-correct operation.
|
|
|
|
|
-
|
|
|
|
|
-First generate the key on one of the nodes:
|
|
|
|
|
-
|
|
|
|
|
-unix# exec/keygen
|
|
|
|
|
-Opencorosync Authentication key generator.
|
|
|
|
|
-Gathering 1024 bits for key from /dev/random.
|
|
|
|
|
-Writing corosync key to /etc/ais/authkey.
|
|
|
|
|
-
|
|
|
|
|
-
|
|
|
|
|
-After this is complete, a private key will be in the file /etc/ais/authkey.
|
|
|
|
|
-This private key must be copied to every processor that will be a member of
|
|
|
|
|
-the cluster. If the private key isn't the same for every node, those nodes
|
|
|
|
|
-with nonmatching private keys will not be able to join the same configuration.
|
|
|
|
|
-
|
|
|
|
|
-Copy the key to some transportable storage or use ssh to transmit the key
|
|
|
|
|
-from node to node. Then install the key with the command:
|
|
|
|
|
-
|
|
|
|
|
-unix# install -D --group=0 --owner=0 --mode=0400 /path_to_authkey/authkey /etc/ais/authkey
|
|
|
|
|
-
|
|
|
|
|
-If the message invalid digest appears, the keys are not the same on each node.
|
|
|
|
|
-
|
|
|
|
|
-Run the corosync executive
|
|
|
|
|
--------------------------
|
|
|
|
|
-Get one or more nodes and run the corosync executive on each node. A list of
|
|
|
|
|
-node IPs should be logged when the nodes join a configuration. Run the
|
|
|
|
|
-aisexec program after following the previous directions.
|
|
|
|
|
-
|
|
|
|
|
-A final note on permissions:
|
|
|
|
|
-It is not absolutely required that corosync executive runs as root. If
|
|
|
|
|
-it runs as root, it schedules at the highest round robin realtime
|
|
|
|
|
-priority and locks all of it's pages into ram in case a swap would cause a
|
|
|
|
|
-delay in the real-time nature of the protocol. The warning "not
|
|
|
|
|
-able to lock pages" is simply a warning and can be ignored if you choose
|
|
|
|
|
-to run as a non root user.
|
|
|
|
|
-
|
|
|
|
|
-The ais user/group is required because applications are authenticated
|
|
|
|
|
-against the ais user and group. If an application(/library) is not root
|
|
|
|
|
-or ais, then the application cannot connect to the ais executive.
|
|
|
|
|
-
|
|
|
|
|
-please read SECURITY to understand the threat model assumed by corosync
|
|
|
|
|
-and the techniques corosync use to overcome these threats.
|
|
|
|
|
-
|
|
|
|
|
-Before running any of the test programs
|
|
|
|
|
----------------------------------------
|
|
|
|
|
-The corosync executive will ensure security by only allowing the ais group (or
|
|
|
|
|
-uid root) to connect to the service. Switch to the ais group before
|
|
|
|
|
-running any applications linked to the ais apis, or the applications will
|
|
|
|
|
-not be authenticated and won't be able to access services.
|
|
|
|
|
-
|
|
|
|
|
-[sdake@slickdeal sdake]$ su ais
|
|
|
|
|
-Password:
|
|
|
|
|
-[ais@slickdeal sdake]$ id
|
|
|
|
|
-uid=501(ais) gid=502(ais) groups=502(ais)
|
|
|
|
|
-
|
|
|
|
|
-Try out the corosync CLM functionality
|
|
|
|
|
--------------------------------------
|
|
|
|
|
-After aisexec is running
|
|
|
|
|
-
|
|
|
|
|
-su to ais user
|
|
|
|
|
-
|
|
|
|
|
-Run test/testclm on one node. Then kill and add nodes. This will cause
|
|
|
|
|
-callbacks to be called in the testclm application which will print out
|
|
|
|
|
-the node state changes. The testclm program will not print any output
|
|
|
|
|
-after it is started and has printed the current configuration until nodes
|
|
|
|
|
-are added to or deleted from the configuration by starting and stopping
|
|
|
|
|
-aisexec on other nodes.
|
|
|
|
|
-
|
|
|
|
|
-Killing aisexec on the node the testclm is connected will cause the
|
|
|
|
|
-API to return error codes indicating the system has failed.
|
|
|
|
|
-
|
|
|
|
|
-Try out the corosync AMF functionality
|
|
|
|
|
--------------------------------------
|
|
|
|
|
-After aisexec is running
|
|
|
|
|
-
|
|
|
|
|
-su to ais user
|
|
|
|
|
-
|
|
|
|
|
-The test/testamf{1-6} implement three seperate service units (SU). SU #1
|
|
|
|
|
-consists of testamf1, testamf2. SU #2 consists of testamf3, testamf4.
|
|
|
|
|
-SU #3 consists of testamf5, testamf6. The active and backup directives
|
|
|
|
|
-in amf.conf define how many SU's become active and how many
|
|
|
|
|
-become standby in the service group (SG).
|
|
|
|
|
-
|
|
|
|
|
-To test the corosync AMF, run testamf3 and testamf4 on one node. Both
|
|
|
|
|
-components become in service and active. Then run testamf1. Nothing
|
|
|
|
|
-appears to happen, because testamf1 is not placed in service (and made
|
|
|
|
|
-standby) until testamf2 is registered. Running testamf2 will show
|
|
|
|
|
-a variety of state changes. testamf1 will match these state changes.
|
|
|
|
|
-testamf2 is special because is reports an error, and later cancels
|
|
|
|
|
-the error, causing the entire SU to go out of service, then back in
|
|
|
|
|
-service. This behavior is expected by the corosync specification and the
|
|
|
|
|
-code in testamf2.c can be read for a clearer understanding of what
|
|
|
|
|
-is happening.
|
|
|
|
|
-
|
|
|
|
|
-Pressing ctrl-z to background the task (which causes the healthcheck to
|
|
|
|
|
-timeout) on a component will cause the remaining component to go
|
|
|
|
|
-out of service. If ctrl-z is pressed on the active SU, the standby
|
|
|
|
|
-SU will become active. CTRL-C on these tests behaves the same way.
|
|
|
|
|
-A crash behaves the same way.
|
|
|
|
|
-
|
|
|
|
|
-Try out the corosync CKPT functionality
|
|
|
|
|
---------------------------------------
|
|
|
|
|
-su to ais user
|
|
|
|
|
-
|
|
|
|
|
-run testckpt. This will execute various checkpoint API operations.
|
|
|
|
|
-
|
|
|
|
|
-run ckptbench. This will execute non-threaded write benchmarks.
|
|
|
|
|
-
|
|
|
|
|
-run ckptbenchth. This will execute threaded write benchmarks.
|
|
|
|
|
-
|
|
|
|
|
-The benchmark configuration (how many threads to run, how many writes
|
|
|
|
|
-per benchmark run, and data write size are specified in the ckptbench.c
|
|
|
|
|
-and ckptbenchth.c programs.
|
|
|
|
|
-
|
|
|
|
|
-Two node clusters should approach 8.5 MB/sec on 100 mbit networks for
|
|
|
|
|
-larger checkpoint sizes with encryption and authentication. If you are not
|
|
|
|
|
-seeing these results, please report to the mailing list.
|
|
|
|
|
-
|
|
|
|
|
-Try out the corosync EVT functionality
|
|
|
|
|
--------------------------------------
|
|
|
|
|
-su to ais user
|
|
|
|
|
-
|
|
|
|
|
-run testevt. This will execute various eventing API operations.
|
|
|
|
|
-
|
|
|
|
|
-Try out the corosync EVS functionality
|
|
|
|
|
--------------------------------------
|
|
|
|
|
-su to ais user
|
|
|
|
|
-run testevs. This will generate multicast messages and self deliver them
|
|
|
|
|
-
|
|
|
|
|
-run evsbench. This will display the benchmark performance of the evs service.
|
|
|
|
|
-
|
|
|
|
|
-Write your own applications
|
|
|
|
|
----------------------------
|
|
|
|
|
-Without real applications, finding the hard bugs will be difficult. Please
|
|
|
|
|
-port or write apps and let us know of the progress!
|
|
|
|
|
-
|
|
|
|
|
-Contribute!
|
|
|
|
|
------------
|
|
|
|
|
-Code, examples, documentation, bug reports, testing are all appreciated.
|
|
|
|
|
-Read the TODO or the ask on the mailing lists for ways to contribute.
|
|
|