You are viewing docs for legacy standalone Swarm. These topics describe standalone Docker Swarm. If you use Docker 1.12 or higher, Swarm mode is integrated with Docker Engine. Most users should use integrated Swarm mode; a good place to start is Getting started with swarm mode and Swarm mode CLI commands. Standalone Docker Swarm is not integrated into the Docker Engine API and CLI commands.
In Docker Swarm, the Swarm manager is responsible for the entire cluster and manages the resources of multiple Docker hosts at scale. If the Swarm manager dies, you must create a new one and deal with an interruption of service.
The High Availability feature allows a Docker Swarm to gracefully handle the failover of a manager instance. Using this feature, you can create a single primary manager instance and multiple replica instances.
A primary manager is the main point of contact with the Docker Swarm cluster. You can also create and talk to replica instances that act as backups. Requests issued on a replica are automatically proxied to the primary manager. If the primary manager fails, a replica takes over the lead. In this way, you always keep a point of contact with the cluster.
This section explains how to set up Docker Swarm using multiple managers.
You need either a Consul, etcd, or ZooKeeper cluster. This procedure is written assuming a Consul server running on address 192.168.42.10:8500. All hosts will have a Docker Engine configured to listen on port 2375. The managers will be configured to operate on port 4000. The sample Swarm configuration has three machines:
manager-1 on 192.168.42.200
manager-2 on 192.168.42.201
manager-3 on 192.168.42.202
You use the swarm manage command with the --replication and --advertise flags to create a primary manager.
user@manager-1 $ swarm manage -H :4000 <tls-config-flags> --replication --advertise 192.168.42.200:4000 consul://192.168.42.10:8500/nodes
INFO[0000] Listening for HTTP addr=:4000 proto=tcp
INFO[0000] Cluster leadership acquired
INFO[0000] New leader elected: 192.168.42.200:4000
[...]
The --replication flag tells Swarm that the manager is part of a multi-manager configuration and that it competes with other manager instances for the primary role. The primary manager has the authority to manage the cluster, replicate logs, and replicate events happening inside the cluster.
The --advertise option specifies the address of this manager. Swarm advertises this address to the cluster when the node is elected as the primary. As you can see in the command's output, the address you provided is now the address of the elected primary manager.
Now that you have a primary manager, you can create replicas.
user@manager-2 $ swarm manage -H :4000 <tls-config-flags> --replication --advertise 192.168.42.201:4000 consul://192.168.42.10:8500/nodes
INFO[0000] Listening for HTTP addr=:4000 proto=tcp
INFO[0000] Cluster leadership lost
INFO[0000] New leader elected: 192.168.42.200:4000
[...]
This command creates a replica manager on 192.168.42.201:4000, which looks to 192.168.42.200:4000 as the primary manager.
Create an additional, third manager instance:
user@manager-3 $ swarm manage -H :4000 <tls-config-flags> --replication --advertise 192.168.42.202:4000 consul://192.168.42.10:8500/nodes
INFO[0000] Listening for HTTP addr=:4000 proto=tcp
INFO[0000] Cluster leadership lost
INFO[0000] New leader elected: 192.168.42.200:4000
[...]
Once you have established your primary manager and the replicas, create Swarm agents as you normally would.
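The agent commands are not shown in this procedure. As a rough sketch, assuming the three agent addresses that appear in the sample docker info output (192.168.42.100 to 192.168.42.102), the Engine port 2375, and standalone Swarm's swarm join command with its --advertise flag, the following prints the join command to run on each agent host. The join_command helper and the DISCOVERY variable are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Discovery backend shared with the managers (same Consul path as above).
DISCOVERY="consul://192.168.42.10:8500/nodes"

# Hypothetical helper: print the join command for one agent.
# $1 is the agent's IP address; 2375 is the Engine port assumed in this procedure.
join_command() {
    echo "swarm join --advertise $1:2375 $DISCOVERY"
}

# Dry run: print one command per agent instead of executing anything.
for ip in 192.168.42.100 192.168.42.101 192.168.42.102; do
    join_command "$ip"
done
```

Each printed line is meant to be run on the corresponding agent host. Nothing is executed by the script itself.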
Typing docker info should give you output similar to the following:
user@my-machine $ export DOCKER_HOST=192.168.42.200:4000 # Points to manager-1
user@my-machine $ docker info
Containers: 0
Images: 25
Storage Driver:
Role: Primary <--------- manager-1 is the Primary manager
Primary: 192.168.42.200
Strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 3
swarm-agent-0: 192.168.42.100:2375
└ Containers: 0
└ Reserved CPUs: 0 / 1
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.13.0-49-generic, operatingsystem=Ubuntu 14.04.2 LTS, storagedriver=aufs
swarm-agent-1: 192.168.42.101:2375
└ Containers: 0
└ Reserved CPUs: 0 / 1
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.13.0-49-generic, operatingsystem=Ubuntu 14.04.2 LTS, storagedriver=aufs
swarm-agent-2: 192.168.42.102:2375
└ Containers: 0
└ Reserved CPUs: 0 / 1
└ Reserved Memory: 0 B / 2.053 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.13.0-49-generic, operatingsystem=Ubuntu 14.04.2 LTS, storagedriver=aufs
Execution Driver:
Kernel Version:
Operating System:
CPUs: 3
Total Memory: 6.158 GiB
Name:
ID:
Http Proxy:
Https Proxy:
No Proxy:
This information shows that manager-1 is the current primary and supplies the address to use to contact this primary.
To test the failover mechanism, shut down the designated primary manager.
Issue a Ctrl-C or kill the current primary manager (manager-1) to shut it down.
After a short time, the other instances detect the failure and a replica takes the lead to become the primary manager.
For example, look at manager-2's logs:
user@manager-2 $ swarm manage -H :4000 <tls-config-flags> --replication --advertise 192.168.42.201:4000 consul://192.168.42.10:8500/nodes
INFO[0000] Listening for HTTP addr=:4000 proto=tcp
INFO[0000] Cluster leadership lost
INFO[0000] New leader elected: 192.168.42.200:4000
INFO[0038] New leader elected: 192.168.42.201:4000
INFO[0038] Cluster leadership acquired <--- We have been elected as the new Primary Manager
[...]
Because the primary manager, manager-1, failed right after it was elected, the replica with the address 192.168.42.201:4000, manager-2, recognized the failure and attempted to take over the lead. Because manager-2 was fast enough, it won the election. As a result, manager-2 became the primary manager of the cluster.
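If a manager's output is saved to a file, the most recent election result can be pulled out of it. The sketch below assumes only that each election produces a line containing "New leader elected: " followed by address:port, as in the logs above. The current_leader function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read manager log lines on stdin and print the
# address from the last "New leader elected:" message, if any.
current_leader() {
    sed -n 's/.*New leader elected: \([0-9.]*:[0-9]*\).*/\1/p' | tail -n 1
}
```

For example, if manager-2's output were captured in a file named manager-2.log (a hypothetical name), running current_leader < manager-2.log would print the address of the most recently elected primary.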
If we take a look at manager-3, we should see these logs:
user@manager-3 $ swarm manage -H :4000 <tls-config-flags> --replication --advertise 192.168.42.202:4000 consul://192.168.42.10:8500/nodes
INFO[0000] Listening for HTTP addr=:4000 proto=tcp
INFO[0000] Cluster leadership lost
INFO[0000] New leader elected: 192.168.42.200:4000
INFO[0036] New leader elected: 192.168.42.201:4000 <--- manager-3 sees manager-2 as the new Primary Manager
[...]
At this point, we need to export the new DOCKER_HOST value.
To switch the DOCKER_HOST to use manager-2 as the primary, do the following:
user@my-machine $ export DOCKER_HOST=192.168.42.201:4000 # Points to manager-2
user@my-machine $ docker info
Containers: 0
Images: 25
Storage Driver:
Role: Primary <--------- manager-2 is the Primary manager
Primary: 192.168.42.201
Strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 3
You can use the docker command on any Docker Swarm primary manager or any replica.
If you like, you can use custom mechanisms to always point DOCKER_HOST to the current primary manager. Then, you never lose contact with your Docker Swarm in the event of a failover.
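One possible custom mechanism, shown here only as a minimal sketch, is to ask each manager in turn for docker info and pick the first one whose Role line reports primary. It assumes the three manager addresses from the sample configuration and the Role field shown in the sample output above. The find_primary function and MANAGERS variable are hypothetical names, and any TLS options your managers require are omitted.

```shell
#!/bin/sh
# Manager addresses from the sample configuration in this procedure.
MANAGERS="192.168.42.200:4000 192.168.42.201:4000 192.168.42.202:4000"

# Hypothetical helper: print the address of the first manager whose
# `docker info` output contains a "Role:" line reporting the primary role.
# Unreachable managers produce no output and are skipped.
find_primary() {
    for addr in $MANAGERS; do
        role=$(docker -H "tcp://$addr" info 2>/dev/null |
            awk '$1 == "Role:" { print tolower($2); exit }')
        if [ "$role" = "primary" ]; then
            echo "$addr"
            return 0
        fi
    done
    return 1
}
```

With this function loaded, export DOCKER_HOST=$(find_primary) would point the client at whichever manager currently holds the primary role. Rerun it after a failover to pick up the new primary.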