To configure and manage cluster resources, either use the crm shell (crmsh) command line utility or HA Web Konsole (Hawk), a Web-based user interface.
This chapter introduces crm, the command line tool. It gives an
overview of the tool, explains how to use templates, and focuses on
configuring and managing cluster resources: creating basic and advanced
types of resources (groups and clones), configuring constraints,
specifying failover nodes and failback nodes, configuring resource
monitoring, starting, cleaning up or removing resources, and migrating
resources manually.
Sufficient privileges are necessary to manage a cluster. The
crm command and its subcommands have to be run either
as root user or as the CRM owner user (typically the user
hacluster).
However, the user option allows you to run
crm and its subcommands as a regular (unprivileged)
user and to change its ID using sudo whenever
necessary. For example, with the following command crm
will use hacluster as the
privileged user ID:
root # crm options user hacluster
Note that you need to set up /etc/sudoers so that
sudo does not ask for a password.
The crm command has several subcommands which manage resources, CIBs, nodes, resource agents, and others. It
offers a thorough help system with embedded examples. All examples follow
a naming convention described in Appendix B.
To make all the code and examples more readable, this chapter uses the following notation to distinguish the shell prompt from the interactive crm prompt:
Shell prompt for user root:
root #
Interactive crmsh prompt (displayed in green, if terminal supports colors):
crm(live)#
Help can be accessed in several ways:
To output the usage of crm and its command line
options:
root # crm --help
To give a list of all available commands:
root # crm help
To access other help sections, not just the command reference:
root # crm help topics
To view the extensive help text of the configure
subcommand:
root # crm configure help
To print the syntax, its usage, and examples of a subcommand of
configure:
root # crm configure help group
This is also possible:
root # crm help configure group
Almost all output of the help subcommand (do not mix
it up with the --help option) opens a text viewer. This
text viewer allows you to scroll up or down and read the help text more
comfortably. To leave the text viewer, press the Q key.
The crmsh supports full tab completion in Bash directly, not
only for the interactive shell. For example, typing
crm help config→| will
complete the word just like in the interactive shell.
The crm command itself can be used in the following
ways:
Directly.
Concatenate all subcommands to crm, press
Enter and you see the output immediately. For
example, enter crm help ra to get
information about the ra subcommand (resource
agents).
As crm Shell Script.
Use crm and its subcommands in a script. This can be
done in two ways:
root # crm -f script.cli
root # crm < script.cli
The script can contain any command from crm. For
example:
# A small script file for crm
status
node list
Any line starting with the hash symbol (#) is a comment and is ignored. If a line is too long, insert a backslash (\) at the end and continue in the next line. It is recommended to indent lines that belong to a certain subcommand to improve readability.
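The comment and line-continuation rules above can be illustrated with a small Python sketch. It is not crmsh's own parser, only a minimal model of the two rules just described; the function name logical_lines and the sample script contents are assumptions made for illustration.

```python
def logical_lines(script_text):
    """Return the commands in a crm-style script, one string per command.

    Illustrative model only, not crmsh's parser:
    - lines starting with '#' are comments and are skipped
    - a trailing backslash continues the command on the next line
    """
    commands = []
    pending = ""  # accumulates a command that spans several lines
    for raw in script_text.splitlines():
        line = raw.strip()
        if not pending and (not line or line.startswith("#")):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            # drop the backslash and keep collecting
            pending += line[:-1].rstrip() + " "
            continue
        commands.append(pending + line)
        pending = ""
    if pending:
        commands.append(pending.rstrip())
    return commands


# Hypothetical script contents, modelled on the examples in this chapter
example = """\
# A small script file for crm
status
configure primitive myIP ocf:heartbeat:IPaddr \\
    params ip=127.0.0.99
"""

for command in logical_lines(example):
    print(command)
```

The indented continuation line is joined to the command it belongs to, which is why indenting such lines makes a script easier to read without changing its meaning.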
Interactive as Internal Shell.
Type crm to enter the internal shell. The prompt
changes to crm(live). With help
you can get an overview of the available subcommands. As the internal
shell has different levels of subcommands, you can
“enter” one by typing this subcommand and pressing
Enter.
For example, if you type resource you enter the
resource management level. Your prompt changes to
crm(live)resource#. If you want to leave the
internal shell, use the commands quit,
bye, or exit. If you need to go
one level back, use back, up,
end, or cd.
You can also enter a level directly by typing crm and
the respective subcommand(s) without any options and pressing
Enter.
The internal shell also supports tab completion for subcommands and
resources. Type the beginning of a command, press
→| and crm completes the
respective object.
In addition to the previously explained methods, crmsh also supports
synchronous command execution. Use the -w option to
activate it. If you have started crm without
-w, you can enable it later by setting the user preference
wait to yes (options
wait yes). If this option is enabled, crm
waits until the transition is finished. Whenever a transition is
started, dots are printed to indicate progress. Synchronous command
execution is only applicable for commands like resource
start.
The crm tool has management capability (the
subcommands resource and node) and
can be used for configuration (cib,
configure).
The following subsections give you an overview of some important
aspects of the crm tool.
As you have to deal with resource agents in your cluster configuration
all the time, the crm tool contains the
ra command to get information about resource agents
and to manage them (for additional information, see also
Section 4.2.2, “Supported Resource Agent Classes”):
root # crm ra
crm(live)ra#
The command classes gives you a list of all classes
and providers:
crm(live)ra# classes
lsb
ocf / heartbeat linbit lvm2 ocfs2 pacemaker
service
stonith
systemd
To get an overview of all available resource agents for a class (and
provider) use the list command:
crm(live)ra# list ocf
AoEtarget           AudibleAlarm        CTDB                ClusterMon
Delay               Dummy               EvmsSCC             Evmsd
Filesystem          HealthCPU           HealthSMART         ICP
IPaddr              IPaddr2             IPsrcaddr           IPv6addr
LVM                 LinuxSCSI           MailTo              ManageRAID
ManageVE            Pure-FTPd           Raid1               Route
SAPDatabase         SAPInstance         SendArp             ServeRAID
...
An overview of a resource agent can be viewed with
info:
crm(live)ra# info ocf:linbit:drbd
This resource agent manages a DRBD* resource
as a master/slave resource. DRBD is a shared-nothing replicated storage
device. (ocf:linbit:drbd)

Master/Slave OCF Resource Agent for DRBD

Parameters (* denotes required, [] the default):

drbd_resource* (string): drbd resource name
    The name of the drbd resource from the drbd.conf file.

drbdconf (string, [/etc/drbd.conf]): Path to drbd.conf
    Full path to the drbd.conf file.

Operations' defaults (advisory minimum):

    start            timeout=240
    promote          timeout=90
    demote           timeout=90
    notify           timeout=90
    stop             timeout=100
    monitor_Slave_0  interval=20 timeout=20 start-delay=1m
    monitor_Master_0 interval=10 timeout=20 start-delay=1m
Leave the viewer by pressing Q.
crm Directly
In the former example we used the internal shell of the
crm command. However, you do not necessarily have to
use it. You get the same results, if you add the respective subcommands
to crm. For example, you can list all the OCF
resource agents by entering crm ra list
ocf in your shell.
Configuration templates are ready-made cluster configurations for crmsh. Do not confuse them with the resource templates (as described in Section 6.4.2, “Creating Resource Templates”). Those are templates for the cluster and not for the crm shell.
Configuration templates require minimum effort to be tailored to the particular user's needs. Whenever a template creates a configuration, warning messages give hints which can be edited later for further customization.
The following procedure shows how to create a simple yet functional Apache configuration:
Log in as root and start the crm interactive shell:
root # crm configure
Create a new configuration from a configuration template:
Switch to the template subcommand:
crm(live)configure# template
List the available configuration templates:
crm(live)configure template# list templates
gfs2-base   filesystem  virtual-ip  apache   clvm     ocfs2    gfs2
Decide which configuration template you need. As we need an Apache
configuration, we choose the apache template
and name it g-intranet:
crm(live)configure template# new g-intranet apache
INFO: pulling in template apache
INFO: pulling in template virtual-ip
Define your parameters:
List the just created configuration:
crm(live)configure template# list
g-intranet
Display the minimum of required changes which have to be filled out by you:
crm(live)configure template# show
ERROR: 23: required parameter ip not set
ERROR: 61: required parameter id not set
ERROR: 65: required parameter configfile not set
Invoke your preferred text editor and fill out all lines that have been displayed as errors in Step 3.b:
crm(live)configure template# edit
Show the configuration and check whether it is valid (bold text depends on the configuration you have entered in Step 3.c):
crm(live)configure template# show
primitive virtual-ip ocf:heartbeat:IPaddr \
    params ip="192.168.1.101"
primitive apache ocf:heartbeat:apache \
    params configfile="/etc/apache2/httpd.conf"
monitor apache 120s:60s
group g-intranet \
    apache virtual-ip
Apply the configuration:
crm(live)configure template# apply
crm(live)configure# cd ..
crm(live)configure# show
Submit your changes to the CIB:
crm(live)configure# commit
It is possible to simplify the commands even more, if you know the details. The above procedure can be summarized with the following command on the shell:
root # crm configure template \
    new g-intranet apache params \
    configfile="/etc/apache2/httpd.conf" ip="192.168.1.101"
If you are inside your internal crm shell, use the
following command:
crm(live)configure template# new intranet apache params \
    configfile="/etc/apache2/httpd.conf" ip="192.168.1.101"
However, the previous command only creates its configuration from the configuration template. It neither applies nor commits it to the CIB.
A shadow configuration is used to test different configuration scenarios. If you have created several shadow configurations, you can test them one by one to see the effects of your changes.
The usual process looks like this:
Log in as root and start the crm interactive shell:
root # crm configure
Create a new shadow configuration:
crm(live)configure# cib new myNewConfig
INFO: myNewConfig shadow CIB created
If you omit the name of the shadow CIB, a temporary name @tmp@ is created.
If you want to copy the current live configuration into your shadow configuration, use the following command, otherwise skip this step:
crm(myNewConfig)# cib reset myNewConfig
The previous command makes it easier to modify any existing resources later.
Make your changes as usual. After you have created the shadow configuration, all changes go there. To save all your changes, use the following command:
crm(myNewConfig)# commit
If you need the live cluster configuration again, switch back with the following command:
crm(myNewConfig)configure# cib use live
crm(live)#
Before loading your configuration changes back into the cluster, it is
recommended to review your changes with ptest. The
ptest command can show a diagram of actions that will be
induced by committing the changes. You need the
graphviz package to display the diagrams. The
following example is a transcript, adding a monitor operation:
root # crm configure
crm(live)configure# show fence-bob
primitive fence-bob stonith:apcsmart \
    params hostlist="bob"
crm(live)configure# monitor fence-bob 120m:60s
crm(live)configure# show changed
primitive fence-bob stonith:apcsmart \
    params hostlist="bob" \
    op monitor interval="120m" timeout="60s"
crm(live)configure# ptest
crm(live)configure# commit
To output a cluster diagram as shown in
Figure 5.2, “Hawk—Cluster Diagram”, use the command
crm configure graph. It displays
the current configuration on its current window, therefore requiring
X11.
If you prefer Scalable Vector Graphics (SVG), use the following command:
root # crm configure graph dot config.svg svg
Corosync is the underlying messaging layer for most HA clusters.
The corosync subcommand provides commands for
editing and managing the Corosync configuration.
For example, to list the status of the cluster, use
status:
root # crm corosync status
Printing ring status.
Local node ID 175704363
RING ID 0
        id      = 10.121.9.43
        status  = ring 0 active with no faults
Quorum information
------------------
Date:             Thu May 8 16:41:56 2014
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          175704363
Ring ID:          4032
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
 175704363          1 alice.example.com (local)
 175704619          1 bob.example.com
The diff command is particularly helpful: it compares
the Corosync configuration on all nodes (if not stated otherwise)
and prints the differences between them:
root # crm corosync diff
--- bob
+++ alice
@@ -46,2 +46,2 @@
-       expected_votes: 2
-       two_node: 1
+       expected_votes: 1
+       two_node: 0
For more details, see http://crmsh.nongnu.org/crm.8.html#cmdhelp_corosync.
Global cluster options control how the cluster behaves when confronted with certain situations. The predefined values can be kept in most cases. However, to make key functions of your cluster work correctly, you need to adjust the following parameters after basic cluster setup:
Log in as root and start the crm tool:
root # crm configure
Use the following commands to set the options for two-node clusters only:
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# property stonith-enabled=true
A cluster without STONITH is not supported.
Show your changes:
crm(live)configure# show
property $id="cib-bootstrap-options" \
    dc-version="1.1.1-530add2a3721a0ecccb24660a97dbfdaa3e68f51" \
    cluster-infrastructure="corosync" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    stonith-enabled="true"
Commit your changes and exit:
crm(live)configure# commit
crm(live)configure# exit
As a cluster administrator, you need to create cluster resources for every resource or application you run on servers in your cluster. Cluster resources can include Web sites, e-mail servers, databases, file systems, virtual machines, and any other server-based applications or services you want to make available to users at all times.
For an overview of resource types you can create, refer to Section 4.2.3, “Types of Resources”.
There are three types of RAs (Resource Agents) available with the cluster (for background information, see Section 4.2.2, “Supported Resource Agent Classes”). To add a new resource to the cluster, proceed as follows:
Log in as root and start the crm tool:
root # crm configure
Configure a primitive IP address:
crm(live)configure# primitive myIP ocf:heartbeat:IPaddr \
    params ip=127.0.0.99 op monitor interval=60s
The previous command configures a “primitive” with the
name myIP. You need to choose a class (here
ocf), provider (heartbeat), and
type (IPaddr). Furthermore, this primitive expects
other parameters like the IP address. Change the address to match your
setup.
Display and review the changes you have made:
crm(live)configure# show
Commit your changes to take effect:
crm(live)configure# commit
If you want to create several resources with similar configurations, a
resource template simplifies the task. See also
Section 4.4.3, “Resource Templates and Constraints” for
some basic background information. Do not confuse them with the
“normal” templates from
Section 6.1.4, “Using Configuration Templates”. Use the
rsc_template command to get familiar with the syntax:
root # crm configure rsc_template
usage: rsc_template <name> [<class>:[<provider>:]]<type>
        [params <param>=<value> [<param>=<value>...]]
        [meta <attribute>=<value> [<attribute>=<value>...]]
        [utilization <attribute>=<value> [<attribute>=<value>...]]
        [operations id_spec
            [op op_type [<attribute>=<value>...] ...]]
For example, the following command creates a new resource template with
the name BigVM derived from the
ocf:heartbeat:Xen resource and some default values
and operations:
crm(live)configure# rsc_template BigVM ocf:heartbeat:Xen \
    params allow_mem_management="true" \
    op monitor timeout=60s interval=15s \
    op stop timeout=10m \
    op start timeout=10m
Once you have defined the new resource template, you can use it in primitives
or reference it in order, colocation, or rsc_ticket constraints. To
reference the resource template, use the @ sign:
crm(live)configure# primitive MyVM1 @BigVM \
    params xmfile="/etc/xen/shared-vm/MyVM1" name="MyVM1"
The new primitive MyVM1 inherits everything from the BigVM resource template. For example, the equivalent of the two commands above would be:
crm(live)configure# primitive MyVM1 ocf:heartbeat:Xen \
    params xmfile="/etc/xen/shared-vm/MyVM1" name="MyVM1" \
    params allow_mem_management="true" \
    op monitor timeout=60s interval=15s \
    op stop timeout=10m \
    op start timeout=10m
If you want to overwrite some options or operations, add them to your (primitive) definition. For instance, the following new primitive MyVM2 doubles the timeout for monitor operations but leaves others untouched:
crm(live)configure# primitive MyVM2 @BigVM \
    params xmfile="/etc/xen/shared-vm/MyVM2" name="MyVM2" \
    op monitor timeout=120s interval=30s
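The inheritance behavior described above can be pictured as a dictionary merge. The following Python sketch is only a mental model, not how the cluster stack implements templates. It assumes that parameters are merged by name and that an operation defined on the primitive replaces the template's operation of the same name; the function derive_primitive and the data layout are hypothetical.

```python
def derive_primitive(template, params=None, ops=None):
    """Model a primitive derived from a resource template.

    Assumption for illustration: values given on the primitive win over
    the template's values; everything else is inherited unchanged.
    """
    merged_params = {**template["params"], **(params or {})}
    merged_ops = {name: dict(attrs) for name, attrs in template["ops"].items()}
    for name, attrs in (ops or {}).items():
        merged_ops[name] = dict(attrs)  # primitive's op replaces the template's
    return {"type": template["type"], "params": merged_params, "ops": merged_ops}


# The BigVM template from the example above, written as plain data
big_vm = {
    "type": "ocf:heartbeat:Xen",
    "params": {"allow_mem_management": "true"},
    "ops": {
        "monitor": {"timeout": "60s", "interval": "15s"},
        "stop": {"timeout": "10m"},
        "start": {"timeout": "10m"},
    },
}

my_vm2 = derive_primitive(
    big_vm,
    params={"xmfile": "/etc/xen/shared-vm/MyVM2", "name": "MyVM2"},
    ops={"monitor": {"timeout": "120s", "interval": "30s"}},
)
print(my_vm2["ops"])
```

In this model MyVM2 ends up with its own monitor operation while the stop and start timeouts still come from BigVM, which mirrors the description of MyVM2 above.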
A resource template may be referenced in constraints to stand for all primitives which are derived from that template. This helps to produce a more concise and clear cluster configuration. Resource template references are allowed in all constraints except location constraints. Colocation constraints may not contain more than one template reference.
From the crm perspective, a STONITH device is just
another resource. To create a STONITH resource, proceed as follows:
Log in as root and start the crm interactive shell:
root # crm configure
Get a list of all STONITH types with the following command:
crm(live)# ra list stonith
apcmaster                  apcmastersnmp              apcsmart
baytech                    bladehpi                   cyclades
drac3                      external/drac5             external/dracmc-telnet
external/hetzner           external/hmchttp           external/ibmrsa
external/ibmrsa-telnet     external/ipmi              external/ippower9258
external/kdumpcheck        external/libvirt           external/nut
external/rackpdu           external/riloe             external/sbd
external/vcenter           external/vmware            external/xen0
external/xen0-ha           fence_legacy               ibmhmc
ipmilan                    meatware                   nw_rpc100s
rcd_serial                 rps10                      suicide
wti_mpc                    wti_nps
Choose a STONITH type from the above list and view the list of possible options. Use the following command:
crm(live)# ra info stonith:external/ipmi
IPMI STONITH external device (stonith:external/ipmi)

ipmitool based power management. Apparently, the power off
method of ipmitool is intercepted by ACPI which then makes
a regular shutdown. If case of a split brain on a two-node
it may happen that no node survives. For two-node clusters
use only the reset method.

Parameters (* denotes required, [] the default):

hostname (string): Hostname
    The name of the host to be managed by this STONITH device.
...
Create the STONITH resource with the stonith
class, the type you have chosen in
Step 3,
and the respective parameters if needed, for example:
crm(live)# configure
crm(live)configure# primitive my-stonith stonith:external/ipmi \
    params hostname="alice" ipaddr="192.168.1.221" \
    userid="admin" passwd="secret" \
    op monitor interval=60m timeout=120s
Having all the resources configured is only one part of the job. Even if the cluster knows all needed resources, it might still not be able to handle them correctly. For example, the file system must not be mounted on the slave node of DRBD (in fact, this would fail with DRBD). Define constraints to make this kind of information available to the cluster.
For more information about constraints, see Section 4.4, “Resource Constraints”.
The location command defines on which nodes
a resource may be run, may not be run or is preferred to be run.
This type of constraint may be added multiple times for each resource.
All location constraints are evaluated for a given
resource. A simple example that expresses a preference to run the
resource fs1 on the node with the name
alice with a score of 100 would be the following:
crm(live)configure# location loc-fs1 fs1 100: alice
Another example is a location with pingd:
crm(live)configure# primitive pingd pingd \
    params name=pingd dampen=5s multiplier=100 host_list="r1 r2"
crm(live)configure# location loc-node_pref internal_www \
    rule 50: #uname eq alice \
    rule pingd: defined pingd
Another use case for location constraints is grouping primitives
as a resource set. This can be useful if several
resources depend on, for example, a ping attribute for network
connectivity.
In former times, the -inf/ping rules
had to be duplicated several times in the configuration,
making it unnecessarily complex.
The following example creates a resource set
loc-alice, referencing the virtual IP addresses
vip1 and vip2:
crm(live)configure# primitive vip1 ocf:heartbeat:IPaddr2 params ip=192.168.1.5
crm(live)configure# primitive vip2 ocf:heartbeat:IPaddr2 params ip=192.168.1.6
crm(live)configure# location loc-alice { vip1 vip2 } inf: alice
In some cases it is much more efficient and convenient to use
resource patterns for your location command.
A resource pattern is a regular expression between two slashes.
For example, the above virtual IP addresses can be all matched with
the following:
crm(live)configure# location loc-alice /vip.*/ inf: alice
The colocation command is used to define what
resources should run on the same or on different hosts.
A score of either +inf or -inf defines resources that must always or must never run on the same node. It is also possible to use non-infinite scores. In that case the colocation is called advisory, and the cluster may decide not to follow it in favor of not stopping other resources if there is a conflict.
For example, to run the resources with the IDs
filesystem_resource and nfs_group
always on the same host, use the following constraint:
crm(live)configure# colocation nfs_on_filesystem inf: nfs_group filesystem_resource
For a master slave configuration, it is necessary to know if the current node is a master in addition to running the resource locally.
Sometimes it is useful to be able to place a group of resources on the same node (defining a colocation constraint), but without having hard dependencies between the resources.
Use the command weak-bond if you want to
place resources on the same node, but without any action if one of them
fails.
root # crm configure assist weak-bond RES1 RES2
The implementation of weak-bond creates a
dummy resource and a colocation constraint with the given resources
automatically.
The order command defines a sequence of actions.
Sometimes it is necessary to provide an order of resource actions or operations. For example, you cannot mount a file system before the device is available to a system. Ordering constraints can be used to start or stop a service right before or after a different resource meets a special condition, such as being started, stopped, or promoted to master.
Use the following command in the crm
shell to configure an ordering constraint:
crm(live)configure# order nfs_after_filesystem mandatory: filesystem_resource nfs_group
The example used for this section would not work without additional constraints. It is essential that all resources run on the same machine as the master of the DRBD resource. The DRBD resource must be master before any other resource starts. Trying to mount the DRBD device when it is not the master simply fails. The following constraints must be fulfilled:
The file system must always be on the same node as the master of the DRBD resource.
crm(live)configure# colocation filesystem_on_master inf: \
    filesystem_resource drbd_resource:Master
The NFS server as well as the IP address must be on the same node as the file system.
crm(live)configure# colocation nfs_with_fs inf: \
    nfs_group filesystem_resource
The NFS server as well as the IP address start after the file system is mounted:
crm(live)configure# order nfs_second mandatory: \
    filesystem_resource:start nfs_group
The file system must be mounted on a node after the DRBD resource is promoted to master on this node.
crm(live)configure# order drbd_first inf: \
    drbd_resource:promote filesystem_resource:start
To determine a resource failover, use the meta attribute migration-threshold. In case failcount exceeds migration-threshold on all nodes, the resource will remain stopped. For example:
crm(live)configure# location rsc1-alice rsc1 100: alice
Normally, rsc1 prefers to run on alice. If it fails there, migration-threshold is checked and compared to the failcount. If failcount >= migration-threshold then it is migrated to the node with the next best preference.
Depending on the start-failure-is-fatal option, start failures
set the failcount to inf. Stop failures cause
fencing. If there is no STONITH defined, the resource will not migrate
at all.
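The interplay of location preference, failcount, and migration-threshold can be sketched as follows. This is a simplified Python model of the rule stated above (a node is ruled out once failcount >= migration-threshold), not the cluster's actual scheduling code. The function pick_node and the node scores are assumptions for illustration.

```python
import math


def pick_node(location_scores, failcounts, migration_threshold):
    """Return the preferred node that may still run the resource, or None.

    Simplified model: a node is excluded as soon as the resource's
    failcount on that node has reached the migration threshold.
    """
    eligible = {
        node: score
        for node, score in location_scores.items()
        if failcounts.get(node, 0) < migration_threshold
    }
    if not eligible:
        return None  # the resource remains stopped
    return max(eligible, key=eligible.get)


# rsc1 prefers alice with a score of 100; bob's score of 0 is an assumption
scores = {"alice": 100, "bob": 0}

print(pick_node(scores, {}, 3))                           # no failures yet
print(pick_node(scores, {"alice": 3}, 3))                 # threshold reached on alice
print(pick_node(scores, {"alice": math.inf, "bob": 3}, 3))  # no node left
```

The last call models a fatal start failure on alice (failcount set to inf) combined with bob having reached the threshold, in which case no node is left and the resource stays stopped.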
For an overview, refer to Section 4.4.4, “Failover Nodes”.
A resource might fail back to its original node when that node is back online and in the cluster. If you want to prevent a resource from failing back to the node it was running on prior to failover, or if you want to specify a different node for the resource to fail back to, you must change its resource stickiness value. You can either specify resource stickiness when you are creating a resource, or afterwards.
For an overview, refer to Section 4.4.5, “Failback Nodes”.
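The effect of resource stickiness on failback can be illustrated with a small Python sketch. It assumes a simplified model in which the stickiness value is added to the score of the node the resource is currently running on, and the resource only moves if another node then scores strictly higher. The function node_after_recovery and the scores are hypothetical and do not reproduce the scheduler's full calculation.

```python
def node_after_recovery(current_node, location_scores, stickiness):
    """Decide where a resource runs once all nodes are online again.

    Simplified model: stickiness is a bonus for the current node; the
    resource moves only if some other node still has a higher score.
    """
    effective = dict(location_scores)
    effective[current_node] = effective.get(current_node, 0) + stickiness
    best = max(effective, key=effective.get)
    if effective[best] > effective[current_node]:
        return best
    return current_node


# Assumed setup: the resource prefers alice (100) but failed over to bob (0)
scores = {"alice": 100, "bob": 0}

print(node_after_recovery("bob", scores, stickiness=0))    # fails back
print(node_after_recovery("bob", scores, stickiness=200))  # stays put
```

With a stickiness of 0 the resource fails back to alice. With a stickiness higher than alice's location score it stays on bob.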
Some resources may have specific capacity requirements such as minimum amount of memory. Otherwise, they may fail to start completely or run with degraded performance.
To take this into account, the High Availability Extension allows you to specify the following parameters:
The capacity a certain node provides.
The capacity a certain resource requires.
An overall strategy for placement of resources.
For detailed background information about the parameters and a configuration example, refer to Section 4.4.6, “Placing Resources Based on Their Load Impact”.
To configure the resource's requirements and the capacity a node
provides, use utilization attributes.
You can name the utilization attributes according to your preferences
and define as many name/value pairs as your configuration needs.
In certain cases, some agents update the utilization themselves,
for example, the VirtualDomain agent.
In the following example, we assume that you already have a basic configuration of cluster nodes and resources and now additionally want to configure the capacities a certain node provides and the capacity a certain resource requires.
Log in as root and start the crm interactive shell:
root # crm configure
To specify the capacity a node provides, use the following command and replace the placeholder NODE_1 with the name of your node:
crm(live)configure# node NODE_1 utilization memory=16384 cpu=8
With these values, NODE_1 would be assumed to provide 16GB of memory and 8 CPU cores to resources.
To specify the capacity a resource requires, use:
crm(live)configure# primitive xen1 ocf:heartbeat:Xen ... \
    utilization memory=4096 cpu=4
This would make the resource consume 4096 of those memory units from NODE_1, and 4 of the CPU units.
Configure the placement strategy with the property
command:
crm(live)configure# property ...
Four values are available for the placement strategy:
property placement-strategy=default
Utilization values are not taken into account at all, by default. Resources are allocated according to location scoring. If scores are equal, resources are evenly distributed across nodes.
property placement-strategy=utilization
Utilization values are taken into account when deciding whether a node is eligible to serve a resource: a node is considered eligible only if it has sufficient free capacity to satisfy the resource's requirements. However, load-balancing is still done based on the number of resources allocated to a node.
property placement-strategy=minimal
Utilization values are taken into account when deciding whether a node is eligible to serve a resource; an attempt is made to concentrate the resources on as few nodes as possible, thereby enabling possible power savings on the remaining nodes.
property placement-strategy=balanced
Utilization values are taken into account when deciding whether a node is eligible to serve a resource; an attempt is made to spread the resources evenly, optimizing resource performance.
The placement strategies are best-effort, and do not yet utilize complex heuristic solvers to always reach an optimum allocation result. Ensure that resource priorities are properly set so that your most important resources are scheduled first.
Commit your changes before leaving crmsh:
crm(live)configure# commit
The following example demonstrates a three node cluster of equal nodes, with 4 virtual machines:
crm(live)configure# node alice utilization memory="4000"
crm(live)configure# node bob utilization memory="4000"
crm(live)configure# node charly utilization memory="4000"
crm(live)configure# primitive xenA ocf:heartbeat:Xen \
    utilization memory="3500" meta priority="10"
crm(live)configure# primitive xenB ocf:heartbeat:Xen \
    utilization memory="2000" meta priority="1"
crm(live)configure# primitive xenC ocf:heartbeat:Xen \
    utilization memory="2000" meta priority="1"
crm(live)configure# primitive xenD ocf:heartbeat:Xen \
    utilization memory="1000" meta priority="5"
crm(live)configure# property placement-strategy="minimal"
With all three nodes up, xenA will be placed onto a node first, followed by xenD. xenB and xenC would either be allocated together or one of them with xenD.
If one node failed, too little total memory would be available to host them all. xenA would be ensured to be allocated, as would xenD; however, only one of xenB or xenC could still be placed, and since their priority is equal, the result is not defined yet. To resolve this ambiguity as well, you would need to set a higher priority for either one.
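The reasoning in this example can be replayed with a small greedy simulation in Python. It is a rough sketch under stated assumptions, not the cluster's real allocation algorithm: resources are handled in order of descending priority, a node is eligible only if it has enough free memory, and the "minimal" strategy is approximated by choosing the eligible node with the least free memory. The function place and its tie-breaking (by name) are hypothetical.

```python
def place(nodes, resources):
    """Greedy placement by priority, approximating placement-strategy=minimal.

    nodes: {node name: memory capacity}
    resources: list of dicts with "name", "memory" and "priority"
    Returns {resource name: node name, or None if it cannot be placed}.
    """
    free = dict(nodes)
    placement = {}
    # highest priority first; equal priorities keep their listed order
    for res in sorted(resources, key=lambda r: -r["priority"]):
        eligible = [n for n in sorted(free) if free[n] >= res["memory"]]
        if not eligible:
            placement[res["name"]] = None
            continue
        # concentrate load: prefer the eligible node with the least free memory
        target = min(eligible, key=lambda n: free[n])
        free[target] -= res["memory"]
        placement[res["name"]] = target
    return placement


resources = [
    {"name": "xenA", "memory": 3500, "priority": 10},
    {"name": "xenB", "memory": 2000, "priority": 1},
    {"name": "xenC", "memory": 2000, "priority": 1},
    {"name": "xenD", "memory": 1000, "priority": 5},
]

three_nodes = place({"alice": 4000, "bob": 4000, "charly": 4000}, resources)
two_nodes = place({"alice": 4000, "bob": 4000}, resources)
print(three_nodes)
print(two_nodes)
```

With three nodes, all four virtual machines are placed and one of xenB or xenC shares a node with xenD. With two nodes, xenA and xenD are placed and only one of xenB or xenC fits. Which of the two is chosen here is an artifact of the sketch's tie-breaking; as noted above, the real result is not defined for equal priorities.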
To monitor a resource, there are two possibilities: either define a
monitor operation with the op keyword or use the
monitor command. The following example configures an
Apache resource and monitors it every 60 seconds with the
op keyword:
crm(live)configure# primitive apache apache \
    params ... \
    op monitor interval=60s timeout=30s
The same can be done with:
crm(live)configure# primitive apache apache \
    params ...
crm(live)configure# monitor apache 60s:30s
For an overview, refer to Section 4.3, “Resource Monitoring”.
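The relationship between the two notations can be made explicit with a short Python sketch that expands the arguments of the monitor command into the equivalent op keyword form. It only models the INTERVAL:TIMEOUT shorthand shown in this chapter; treating the timeout part as optional is an assumption, and the function expand_monitor is hypothetical, not part of crmsh.

```python
def expand_monitor(arguments):
    """Expand 'RESOURCE INTERVAL[:TIMEOUT]' into an 'op monitor ...' clause.

    Illustrative helper only; it does not validate the time values.
    """
    resource, timing = arguments.split()
    interval, _, timeout = timing.partition(":")
    clause = f"op monitor interval={interval}"
    if timeout:
        clause += f" timeout={timeout}"
    return resource, clause


print(expand_monitor("apache 60s:30s"))
print(expand_monitor("fence-bob 120m:60s"))
```

The second call corresponds to the fence-bob transcript earlier in this chapter, where monitor fence-bob 120m:60s resulted in an operation with interval 120m and timeout 60s.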
One of the most common elements of a cluster is a set of resources that need to be located together, start sequentially, and stop in the reverse order. To simplify this configuration we support the concept of groups. The following example creates two primitives (an IP address and an e-mail resource):
Run the crm command as system administrator. The
prompt changes to crm(live).
Configure the primitives:
crm(live)# configure
crm(live)configure# primitive Public-IP ocf:heartbeat:IPaddr \
    params ip=1.2.3.4 id=p.public-ip
crm(live)configure# primitive Email lsb:exim \
    params id=p.lsb-exim
Group the primitives with their relevant identifiers in the correct order:
crm(live)configure# group g-shortcut Public-IP Email
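The ordering behavior of a group (members start in the order listed and stop in the reverse order) can be written down in a few lines of Python. This is only an illustration of the rule described above; the function group_actions is hypothetical.

```python
def group_actions(members):
    """Return (start order, stop order) for the members of a group."""
    start_order = list(members)
    stop_order = list(reversed(members))
    return start_order, stop_order


start_order, stop_order = group_actions(["Public-IP", "Email"])
print("start:", start_order)
print("stop: ", stop_order)
```

For the group g-shortcut this means that Public-IP is started before Email, and Email is stopped before Public-IP.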
To change the order of a group member, use the
modgroup command from the
configure subcommand. Use the following commands to
move the primitive Email before
Public-IP. (This is just to demonstrate the feature):
crm(live)configure# modgroup g-shortcut add p.lsb-exim before p.public-ip
In case you want to remove a resource from a group (for example,
Email), use this command:
crm(live)configure# modgroup g-shortcut remove p.lsb-exim
For an overview, refer to Section 4.2.5.1, “Groups”.
Clones were initially conceived as a convenient way to start N instances of an IP resource and have them distributed throughout the cluster for load balancing. They have turned out to be quite useful for a number of other purposes, including integrating with DLM, the fencing subsystem and OCFS2. You can clone any resource, provided the resource agent supports it.
Learn more about cloned resources in Section 4.2.5.2, “Clones”.
To create an anonymous clone resource, first create a primitive
resource and then refer to it with the clone
command. Do the following:
Log in as root and start the crm interactive shell:
root # crm configure
Configure the primitive, for example:
crm(live)configure# primitive Apache lsb:apache
Clone the primitive:
crm(live)configure# clone cl-apache Apache
To create an stateful clone resource, first create a primitive resource and then the multi-state resource. The multi-state resource must support at least promote and demote operations.
Log in as root and start the crm interactive shell:
root # crm configure
Configure the primitive. Change the intervals if needed:
crm(live)configure# primitive my-rsc ocf:myCorp:myAppl \
   op monitor interval=60 \
   op monitor interval=61 role=Master
Create the multi-state resource:
crm(live)configure# ms ms-rsc my-rsc
Apart from the possibility to configure your cluster resources, the
crm tool also allows you to manage existing resources.
The following subsections give you an overview.
To start a new cluster resource you need the respective identifier. Proceed as follows:
Log in as root and start the crm interactive shell:
root # crm
Switch to the resource level:
crm(live)# resource
Start the resource with start and press the
→| key to show all known resources:
crm(live)resource# start ID
A resource will be automatically restarted if it fails, but each failure
raises the resource's failcount. If a
migration-threshold has been set for that resource,
the node will no longer be allowed to run the resource as soon as the
number of failures has reached the migration threshold.
Open a shell and log in as user root.
Get a list of all your resources:
root # crm resource list
...
Resource Group: dlm-clvm:1
   dlm:1 (ocf::pacemaker:controld) Started
   clvm:1 (ocf::lvm2:clvmd) Started
   cmirrord:1 (ocf::lvm2:cmirrord) Started
Clean up the resource:
root # crm resource cleanup dlm-clvm
For example, to clean up only the DLM resource from the
dlm-clvm resource group, replace
dlm-clvm with dlm.
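Before cleaning up, it can help to check how often a resource has failed on a given node. The following is a minimal sketch that uses the failcount subcommand of the resource level. The node name alice, the CRM variable, and the function name show_failcount_and_cleanup are assumptions made for illustration; by default the commands are only printed.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Assumption: set CRM=crm and run as root to act on a live cluster.
CRM="${CRM:-echo crm}"

show_failcount_and_cleanup() {
    rsc="$1"    # resource ID, for example dlm
    node="$2"   # node to query, for example alice (hypothetical here)
    # Show the current failcount of the resource on that node
    $CRM resource failcount "$rsc" show "$node"
    # Clean up the resource so the cluster re-evaluates its state
    $CRM resource cleanup "$rsc"
}

show_failcount_and_cleanup dlm alice
```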
Proceed as follows to remove a cluster resource:
Log in as root and start the crm interactive shell:
root # crm
Run the following command to get a list of your resources:
crm(live)# resource status
For example, the output can look like this (where myIP is the relevant identifier of your resource):
myIP (ocf::IPaddr:heartbeat) ...
Delete the resource with the relevant identifier (which implies a
commit too):
crm(live)# configure delete YOUR_ID
Commit the changes:
crm(live)# configure commit
Although resources are configured to automatically fail over (or migrate) to other nodes of the cluster in the event of a hardware or software failure, you can also manually move a resource to another node in the cluster using either Hawk or the command line.
Use the migrate command for this task. For example,
to migrate the resource ipaddress1 to a cluster node
named bob, use these
commands:
root # crm resource
crm(live)resource# migrate ipaddress1 bob
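A manual migration works by adding a location constraint for the resource, and that constraint stays in place until it is removed. The following is a minimal sketch of moving a resource and later releasing it again with unmigrate. The CRM variable and the function name move_and_release are hypothetical helpers introduced here; by default the commands are only printed.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Assumption: set CRM=crm and run as root to act on a live cluster.
CRM="${CRM:-echo crm}"

move_and_release() {
    rsc="$1"    # resource ID, for example ipaddress1
    node="$2"   # target node, for example bob
    # Move the resource to the target node (adds a location constraint)
    $CRM resource migrate "$rsc" "$node"
    # Later: remove that constraint so the cluster may place the resource freely
    $CRM resource unmigrate "$rsc"
}

move_and_release ipaddress1 bob
```

In practice the two commands are usually run some time apart, once the reason for the manual move no longer applies.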
Tags are a way to refer to multiple resources at once, without
creating any colocation or ordering relationship between them. This
can be useful for grouping conceptually related resources. For example,
if you have a number of resources related to a database, create a tag
called tag-db and add all resources related to the
database to this tag:
root # crm configure tag tag-db: db1 db2 db3
This allows you to start them all with a single command:
root # crm resource start tag-db
Similarly, you can stop them all too:
root # crm resource stop tag-db
Every now and then, you will need to perform testing or maintenance tasks on individual cluster components or the whole cluster—be it changing the cluster configuration, updating software packages for individual nodes, or upgrading the cluster to a higher product version.
For these tasks, High Availability Extension provides maintenance options on
several levels:
In case you want to put the whole cluster in maintenance mode, use the following command:
root # crm configure property maintenance-mode=true
If your cluster consists of more than 3 nodes, you can
easily set one node to maintenance mode, while the other nodes
continue their normal operation. For example,
to put the node alice into maintenance
mode, use the configure command:
root # crm configure edit node alice attributes maintenance="true"
The node alice becomes “unmanaged”, and
no further resources will be allocated to nodes in maintenance mode.
If you need to set a specific resource into maintenance mode,
use the meta command. For example, to put the
resource ipaddress into maintenance mode, enter:
root # crm resource meta ipaddress set maintenance true
If you need to execute any testing or maintenance tasks while services are running under cluster control, make sure to follow this outline:
Before you start, set the individual resource, node or the whole cluster to maintenance mode. This helps to avoid unwanted side effects like resources not starting in an orderly fashion, the risk of unsynchronized CIBs across the cluster nodes or data loss.
Execute your maintenance task or tests.
After you have finished, remove the maintenance mode to start normal cluster operation.
For more details on what happens to the resources and the cluster while being in maintenance mode, see Section 4.7, “Maintenance Mode”.
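The outline above can be sketched as a small wrapper for the cluster-wide case. This is an illustration under stated assumptions, not a definitive procedure. The CRM variable and the function name with_cluster_maintenance are hypothetical helpers introduced here; by default the crm commands are only printed, while the wrapped task itself is executed.

```shell
#!/bin/sh
# Dry run by default: crm commands are printed instead of executed.
# Assumption: set CRM=crm and run as root to act on a live cluster.
CRM="${CRM:-echo crm}"

with_cluster_maintenance() {
    # Step 1: put the whole cluster into maintenance mode
    $CRM configure property maintenance-mode=true || return 1
    # Step 2: run the maintenance task passed as arguments
    "$@"
    rc=$?
    # Step 3: leave maintenance mode again, even if the task failed
    $CRM configure property maintenance-mode=false
    # Report the exit status of the task to the caller
    return "$rc"
}

with_cluster_maintenance echo "maintenance task runs here"
```

Returning the exit status of the task makes it possible to detect a failed maintenance step, while the cluster still leaves maintenance mode in either case.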
The “health” status of a cluster or node can be displayed with so-called scripts. Scripts can perform many different tasks and are not limited to health checks. However, this subsection focuses on how to get the health status.
To get all the details about the health command,
use describe:
root # crm script describe health
It shows a description and a list of all parameters and their
default values. To execute a script, use run:
root # crm script run health verbose=true
If you prefer to run only one step from the suite, the
describe command shows a list of all available steps
in the Steps category.
For example, the following command executes the first step of the
health command. The output is stored in the
health.json file for further investigation:
root # crm script run health \
   step='Collect cluster information' \
   statefile='health.json'
For additional information regarding scripts, see http://crmsh.github.io/scripts/.
In case your cluster configuration contains sensitive information, such as passwords, it should be stored in local files. That way, these parameters will never be logged or leaked in support reports.
Before using secret, run the
show command to get an overview of all your
resources:
root # crm configure show
primitive mydb ocf:heartbeat:mysql \
   params replication_user=admin ...
If you want to set a password for the above mydb
resource, use the following commands:
root # crm resource secret mydb set passwd linux
INFO: syncing /var/lib/heartbeat/lrm/secrets/mydb/passwd to [your node list]
You can get the saved password back with:
root # crm resource secret mydb show passwd
linux
Note that the parameters need to be synchronized between nodes; the
crm resource secret command will take care of that. We
highly recommend using only this command to manage secret parameters.
Investigating the cluster history is a complex task. To simplify this
task, crmsh contains the history command with
its subcommands. It is assumed SSH is configured correctly.
Each cluster moves states, migrates resources, or starts important
processes. All these actions can be retrieved by subcommands of
history. Alternatively, use Hawk as explained in
Procedure 5.27, “Viewing Transitions with the History Explorer”.
By default, all history commands look at the events of
the last hour. To change this time frame, use the
limit subcommand. The syntax is:
root # crm history
crm(live)history# limit FROM_TIME [TO_TIME]
Some valid examples include:
limit 4:00pm, limit 16:00
Both commands mean the same: today at 4pm.
limit 2012/01/12 6pm
January 12th, 2012 at 6pm.
limit "Sun 5 20:46"
Sunday the 5th of the current month in the current year, at 8:46pm.
Find more examples and how to create time frames at http://labix.org/python-dateutil.
The info subcommand shows all the parameters which are
covered by the crm_report:
crm(live)history# info
Source: live
Period: 2012-01-12 14:10:56 - end
Nodes: alice
Groups:
Resources:
To limit crm_report to certain parameters, view the
available options with the subcommand help.
To narrow down the level of detail, use the subcommand
detail with a level:
crm(live)history# detail 2
The higher the number, the more detailed your report will be. Default is
0 (zero).
After you have set the above parameters, use log to show
the log messages.
To display the last transition, use the following command:
crm(live)history# transition -1
INFO: fetching new logs, please wait ...
This command fetches the logs and runs dotty (from the
graphviz package) to show the
transition graph. The shell opens the log file which you can browse with
the ↓ and ↑ cursor
keys.
If you do not want to open the transition graph, use the
nograph option:
crm(live)history# transition -1 nograph
The crm man page.
Visit the upstream project documentation at http://crmsh.github.io/documentation.
See Highly Available NFS Storage with DRBD and Pacemaker for an exhaustive example.