MayaNAS provides high-availability operation using the Linux Heartbeat mechanism in one of two failover types:
- Active/Passive: One resource group is specified with a cluster id number and is owned by the active primary node. The resources move to the peer node only on failure or manual action.
- Active/Active: Two resource groups are specified with different cluster id numbers, and each node owns one resource group. Only on failure or manual action does the surviving node own both resource groups.
MayaNAS supports high availability with both of the following storage setups:
- Shared storage, such as dual-path disk arrays, NVMe-oF arrays, and cloud elastic block storage.
- Shared-nothing storage, which requires some form of synchronous mirroring:
  - P2P NVMe-oF setup with zpool mirror or Raid Group mirror-1. Provided with the MayaScale setup.
  - DRBD synchronous replication. See Replication Management.
Planning for cluster setup
- Install licenses that enable the following optional features:
  - Failover option
  - Snapshot option
  - Replication option
- Select a unique number between 1 and 255 for the cluster id, and use that value for every resource that is to be managed as a cluster resource. It is recommended to defer storage pool and volume creation until failover configuration is finished.
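The 1 to 255 range can be checked before the value is typed into any dialog. The following is a minimal sketch only; the function name `valid_cluster_id` is hypothetical and not part of MayaNAS.

```shell
# Hypothetical helper: accept only whole numbers from 1 to 255.
valid_cluster_id() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    # The length check keeps very long digit strings away from the integer tests.
    [ "${#1}" -le 3 ] && [ "$1" -ge 1 ] && [ "$1" -le 255 ]
}

valid_cluster_id 7   && echo "7 is usable as a cluster id"
valid_cluster_id 300 || echo "300 is out of range"
```

The check only covers the numeric range. Uniqueness across clusters still has to be confirmed by hand.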
For Google Cloud, attach persistent disks to the primary node so that each device name is the same as its disk resource name. The HA scripts rely on this match to fence and forcibly take over disk resources.
gcloud compute instances attach-disk mayanas-ha1 --disk=data-1 --device-name data-1
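When more than one disk is attached, the same name-matching rule applies to each of them. As a sketch, the helper below prints one attach-disk command per disk so the commands can be reviewed before they are run. The function name `attach_disk_cmds` and the second disk name `data-2` are placeholders, not values from a real deployment.

```shell
# Print one attach-disk command per disk, reusing the disk name as the
# device name. Nothing is executed; the output is meant for review.
attach_disk_cmds() {
    instance="$1"
    shift
    for disk in "$@"; do
        printf 'gcloud compute instances attach-disk %s --disk=%s --device-name=%s\n' \
            "$instance" "$disk" "$disk"
    done
}

# data-2 is a hypothetical second disk.
attach_disk_cmds mayanas-ha1 data-1 data-2
```

Once the printed commands look right, they can be piped to `sh` to run them.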
- Node names must match the uname -n output on each node and must resolve to the correct IP address, through either /etc/hosts entries or DNS. Make sure a node name does not resolve to the localhost address 127.0.0.1.
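A quick way to verify name resolution on a node is sketched below. It assumes a Linux system where `getent` is available; the function names are hypothetical. Any address in 127.0.0.0/8 is treated as loopback, because some distributions map the host name to 127.0.1.1 in /etc/hosts.

```shell
# Succeed when the address is an IPv4 loopback address or the IPv6 loopback.
is_loopback_addr() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Resolve a node name through the system resolver and report whether
# the first address returned is usable for the cluster.
check_node_name() {
    name="$1"
    addr="$(getent hosts "$name" 2>/dev/null | awk 'NR==1 {print $1}')"
    if [ -z "$addr" ]; then
        echo "$name: does not resolve"
        return 1
    fi
    if is_loopback_addr "$addr"; then
        echo "$name: resolves to loopback address $addr"
        return 1
    fi
    echo "$name: resolves to $addr"
}

check_node_name "$(uname -n)" || echo "fix /etc/hosts or DNS before configuring failover"
```

Run the same check for the peer's node name on each node, since both names have to resolve on both sides.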
Decide on the virtual IP address. This is the floating IP address that follows the currently active server, as determined by the cluster heartbeat mechanism.
For Google Cloud, define a secondary range that you can use for the virtual IP address:
gcloud compute networks subnets update default --add-secondary-ranges range1=10.9.0.0/24
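Assuming the virtual IP address is meant to be picked from the secondary range defined above, the membership test can be sketched in plain shell arithmetic. The function names are hypothetical, and the address 10.9.0.15 is only an example. Octets are not validated, so this is a sanity check rather than a parser.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs="$IFS"
    IFS=.
    set -- $1          # split the address into its four octets
    IFS="$old_ifs"
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when address $1 lies inside CIDR range $2, for example 10.9.0.0/24.
ip_in_cidr() {
    net="${2%/*}"
    bits="${2#*/}"
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

ip_in_cidr 10.9.0.15 10.9.0.0/24 && echo "10.9.0.15 is inside range1"
```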
- Optionally, choose a ping node IP address (usually the gateway IP address).
- Creating volume groups or ZFS pools requires that no previous magic header is present on the disks. If needed, erase any previous signatures on the disks with the wipefs command.
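Because erasing signatures is destructive, a cautious wrapper can help. The sketch below uses a hypothetical function `wipe_signatures` and a made-up `CONFIRM` variable; `/dev/sdX` is a placeholder device. Running `wipefs /dev/sdX` with no options only lists the signatures it finds, while `wipefs --all` erases them.

```shell
# Erase all signatures on a device only when CONFIRM=yes is set.
# Otherwise just print the command that would be run.
wipe_signatures() {
    dev="$1"
    if [ "${CONFIRM:-no}" != "yes" ]; then
        echo "would run: wipefs --all $dev"
        return 0
    fi
    wipefs --all "$dev"
}

# /dev/sdX is a placeholder; without CONFIRM=yes nothing is erased.
wipe_signatures /dev/sdX
```

To erase for real, call it as `CONFIRM=yes wipe_signatures /dev/sdX` after checking the device name carefully.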
Setup using Web GUI
Click Failover Management in the sidebar menu, then click New on the Failover Status tab. This opens the Setup Failover dialog, as shown below.
- Select one of the Create Failover types. The example above shows the setup for Active/Passive, which requires only one Resource ID and one Virtual IP address.
- Enter the Primary node name. It must match the uname -n output on that server. An IP address is not accepted.
- Enter the Secondary node name and press the Tab key. This should populate Heartbeat Link1 with the network interfaces discovered on the secondary.
- Enter the unique Resource ID number used to tag the cluster resources. In MayaNAS, cluster resources are VGs, zpools, LUN mappings, NFS shares, and replication volumes.
- Enter the Virtual IP address that will be floated by the HA resource scripts.
- Define Heartbeat Link1 by selecting the network interface and then entering the IP address of the peer.
- Similarly, on the secondary side, define Heartbeat Link1 by selecting the network interface and then entering the IP address of the peer.
- If you have additional network interfaces, you can strengthen the heartbeat mechanism by filling in the Heartbeat Link2 fields.
- Enter the IP address of a ping node that is always available. For an on-prem setup this is usually the gateway router address.
- Click Finish to save the cluster configuration. This completes the cluster setup on both the Primary and the Secondary node.
- On successful completion, Failover Status shows stopped, as shown below.
- Without starting the cluster services, you may proceed with storage configuration and finish tasks such as creating storage pools, initial volumes, and mappings. Be sure to use the cluster id as the resource ID number for all of those operations.
- After that, start the cluster services by clicking Start.
This step has to be performed on the Primary and on the Secondary independently.
- You can check the status of the cluster services by clicking Node Status.
The status below shows that the heartbeat links are good and that both nodes are up and running, with one node owning the resources.
- Congratulations! This completes the initial cluster setup.
- From the primary node, click Standby and watch the peer take ownership of the resources.
- Or, from the secondary node, click Takeover to take ownership of the resources from the primary node.
- To stop the cluster services completely, click Stop.