Common Docker Swarm operations


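1. Overview
This document covers common Docker Swarm operations.
It uses a local test system as the example. The machines are listed below; 172.16.1.13 (mini03) acts as the Docker Swarm manager.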

Machines in the local test environment:

Hostname  Simulated public IP  Private IP   Modules to deploy
mini01    10.0.0.11            172.16.1.11  tomcat, hadoop-datanode, hbase-regionserver
mini02    10.0.0.12            172.16.1.12  tomcat, hadoop-datanode, hbase-regionserver
mini03    10.0.0.13            172.16.1.13  spark, zookeeper, hadoop-namenode, hbase-master, visualizer (for viewing the Docker Swarm status)


2. Initializing Docker Swarm
As planned, run the following on 172.16.1.13:
 

[root@mini03 ~]# docker swarm init  # sufficient when the machine has only one IP address
Error response from daemon: could not choose an IP address to advertise since this system has multiple addresses on different interfaces (172.16.1.13 on eth0 and 10.0.0.13 on eth1) - specify one with --advertise-addr
[root@mini03 ~]# docker swarm init --advertise-addr 172.16.1.13  # with several IP addresses one must be specified, usually the private one
Swarm initialized: current node (yo5f7qb28gf6g38ve4xhcis17) is now a manager.

To add a worker to this swarm, run the following command:
    # Run this on the other machines to join them to this swarm
    docker swarm join --token SWMTKN-1-4929ovxh6agko49u0yokrzustjf6yzt30iv1zvwqn8d3pndm92-0kuha3sa80u2u27yca6kzdbnb 172.16.1.13:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
To print the worker join command again later:
[root@mini03 ~]# docker swarm join-token worker  
To add a worker to this swarm, run the following command:
    # Run this on the other machines to join them to this swarm
    docker swarm join --token SWMTKN-1-4929ovxh6agko49u0yokrzustjf6yzt30iv1zvwqn8d3pndm92-0kuha3sa80u2u27yca6kzdbnb 172.16.1.13:2377
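
The join command can also be assembled without copying it from the screen. The following is a minimal sketch, assuming the -q (quiet) flag of docker swarm join-token, which prints only the token. The function name join_command and the DOCKER override variable are illustrative and not part of Docker.

```shell
# DOCKER defaults to the real CLI; pointing it at a stub makes it possible to
# try the function without a running daemon (an assumption of this sketch).
DOCKER="${DOCKER:-docker}"

# Print the complete join command for a role ("worker" or "manager").
#   $1: role
#   $2: manager address, i.e. the value given to --advertise-addr
join_command() {
    local role="$1" addr="$2" token
    # -q prints only the token instead of the full instructions.
    token="$("$DOCKER" swarm join-token -q "$role")" || return 1
    printf 'docker swarm join --token %s %s:2377\n' "$token" "$addr"
}
```

On mini03, calling join_command worker 172.16.1.13 prints the line to run on mini01 and mini02.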

3. Initializing the network

Create an overlay network for the swarm so that the system components use this dedicated network.

[root@mini03 ~]# docker network create -d overlay --attachable zhang 
vu07em5fvpuojih6wgckdkdzj
[root@mini03 docker-swarm]# docker network ls  # list the networks
NETWORK ID          NAME                DRIVER              SCOPE
fa8a244c6bd5        bridge              bridge              local
51c95dea1e5c        docker_gwbridge     bridge              local
7a7e31f4bce8        host                host                local
5hgg372xwxbl        ingress             overlay             swarm
lmt3pjswf7l0        zhang               overlay             swarm
5ea08e9a282f        none                null                local
[root@mini03 ~]# docker network inspect zhang  # show the network details
[
    {
        "Name": "zhang",
        "Id": "xiykborz8hn2td40ykhi20dck",
        "Created": "0001-01-01T00:00:00Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": []
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": null
    }
]
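
Running docker network create again for a name that already exists fails, so setup scripts often check first. The sketch below is one way to do that, relying on docker network inspect returning a non-zero exit status for an unknown network. The function name ensure_overlay_network and the DOCKER override variable are illustrative assumptions.

```shell
# DOCKER defaults to the real CLI and can be replaced by a stub for a dry run.
DOCKER="${DOCKER:-docker}"

# Create an attachable overlay network only if it does not exist yet.
#   $1: network name, for example "zhang"
ensure_overlay_network() {
    local name="$1"
    # "docker network inspect" exits non-zero when the network is unknown.
    if "$DOCKER" network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        "$DOCKER" network create -d overlay --attachable "$name"
    fi
}
```

Calling ensure_overlay_network zhang on mini03 is then safe to repeat.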
Deleting a network (use with caution)

Remove the zhang network from Docker:

[root@mini03 docker-swarm]# docker network rm zhang  
zhang
[root@mini03 docker-swarm]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
fa8a244c6bd5        bridge              bridge              local
51c95dea1e5c        docker_gwbridge     bridge              local
7a7e31f4bce8        host                host                local
5hgg372xwxbl        ingress             overlay             swarm
5ea08e9a282f        none                null                local

4. Joining and leaving the swarm

Run the following command on mini01 and mini02:

 docker swarm join --token SWMTKN-1-4929ovxh6agko49u0yokrzustjf6yzt30iv1zvwqn8d3pndm92-0kuha3sa80u2u27yca6kzdbnb 172.16.1.13:2377 

4.1. Listing the nodes in the swarm

[root@mini03 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
2pfwllgxpajx5aitlvcih9vsq     mini01              Ready               Active                                  17.09.0-ce
zho14u85itt5l2i6cpg8fcd6t     mini02              Ready               Active                                  17.09.0-ce
yo5f7qb28gf6g38ve4xhcis17 *   mini03              Ready               Active              Leader              17.09.0-ce
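
For scripted checks, the STATUS column can be filtered instead of read by eye. The sketch below assumes that docker node ls supports --format with the .Hostname and .Status fields (older Docker releases may not have this option). The function name not_ready_nodes is illustrative.

```shell
# Read "<hostname> <status>" pairs on stdin and print the hostnames whose
# status is not "Ready". The exit status is 1 if any such node was found.
# Intended input: docker node ls --format '{{.Hostname}} {{.Status}}'
not_ready_nodes() {
    awk '$2 != "Ready" { print $1; bad = 1 } END { exit bad + 0 }'
}
```

On the manager this would be used as docker node ls --format '{{.Hostname}} {{.Status}}' | not_ready_nodes, which prints nothing when every node is Ready.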

4.2. Removing a node from the swarm

# Run on the swarm manager mini03
# 2pfwllgxpajx5aitlvcih9vsq is the node ID of mini01, as shown by docker node ls
[root@mini03 ~]# docker node rm --force 2pfwllgxpajx5aitlvcih9vsq  # --force is required if the Docker service on mini01 has not been stopped
2pfwllgxpajx5aitlvcih9vsq
[root@mini03 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
zho14u85itt5l2i6cpg8fcd6t     mini02              Ready               Active                                  17.09.0-ce
yo5f7qb28gf6g38ve4xhcis17 *   mini03              Ready               Active              Leader              17.09.0-ce
##########################################
# Run on mini01 as well, so that mini01 leaves the swarm completely
[root@mini01 ~]# docker swarm leave
Node left the swarm.
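
The sequence above removes the node with --force while it is still active. A gentler alternative, sketched here on the assumption that running tasks should be moved away first, is to drain the node, run docker swarm leave on it, and then remove the entry without --force. The function names and the DOCKER override variable are illustrative.

```shell
# DOCKER defaults to the real CLI and can be replaced by a stub for a dry run.
DOCKER="${DOCKER:-docker}"

# Step 1, on the manager: stop scheduling onto the node and shut down its tasks
# so that replicated services are rescheduled elsewhere.
drain_node() {
    "$DOCKER" node update --availability drain "$1"
}

# Step 2 happens on the node itself: docker swarm leave

# Step 3, on the manager: delete the entry once the node is reported as Down.
remove_node() {
    "$DOCKER" node rm "$1"
}
```

For mini01 this means drain_node mini01 on mini03, docker swarm leave on mini01, then remove_node mini01 on mini03.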

4.3. Removing the manager from the swarm

First remove all other nodes, then force the manager to leave:

[root@mini03 ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
yo5f7qb28gf6g38ve4xhcis17 *   mini03              Ready               Active              Leader              17.09.0-ce
[root@mini03 ~]# docker swarm leave --force  # a manager needs --force to leave the swarm
Node left the swarm. 
[root@mini03 ~]# docker node ls
Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.

4.4. Listing the services in the swarm

[root@mini03 ~]# docker service ls  # sample output only, not real data
ID            NAME                  MODE        REPLICAS  IMAGE                                            PORTS
lq7zkkal6ujt  hadoop_datanode       global      2/2       bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8   
ph2fu37k886b  hadoop_namenode       replicated  1/1       bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8  *:50070->50070/tcp
ca47u5i2ubes  hbase-master          replicated  1/1       bde2020/hbase-master:1.0.0-hbase1.2.6            *:16010->16010/tcp
mkks4oa2ppcn  hbase-regionserver-1  replicated  1/1       bde2020/hbase-regionserver:1.0.0-hbase1.2.6      
j4mhizg4j67p  hbase-regionserver-2  replicated  1/1       bde2020/hbase-regionserver:1.0.0-hbase1.2.6      
yndrkc2bcpra  hbase_zoo1            replicated  1/1       zookeeper:3.4.10                                 *:2181->2181/tcp
r5ycrvo0zout  spark_spark           replicated  1/1       zhang/spark:latest                               *:4040->4040/tcp,*:7777->7777/tcp,*:8081->8081/tcp,*:18080->8080/tcp
f2v091nz24rg  tomcat_tomcat         global      2/2       zhang/tomcat:latest                              *:6543->6543/tcp,*:9999->9999/tcp,*:18081->8081/tcp
clfpryaerq2l  visualizer            replicated  1/1       dockersamples/visualizer:latest                  *:8080->8080/tcp
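
The REPLICAS column shows running versus desired tasks, so a service that is not fully up can be picked out mechanically. The sketch below assumes docker service ls --format with the .Name and .Replicas fields. The function name degraded_services is illustrative.

```shell
# Read "<name> <running>/<desired>" pairs on stdin and print the services
# whose running task count differs from the desired count.
# Intended input: docker service ls --format '{{.Name}} {{.Replicas}}'
degraded_services() {
    awk '{
        split($2, r, "/")
        if (r[1] + 0 != r[2] + 0) print $1, $2
    }'
}
```

With the sample listing above, docker service ls --format '{{.Name}} {{.Replicas}}' | degraded_services would print nothing, because every service has all of its replicas running.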

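5. Managing swarm labels

5.1. Adding labels

Based on the host and component deployment plan at the beginning, the labels are assigned as follows. Run these commands on the swarm manager mini03.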

# Labels for mini01
docker node update --label-add tomcat=true mini01
docker node update --label-add datanode=true mini01
docker node update --label-add hbase-regionserver-1=true mini01

# Labels for mini02
docker node update --label-add tomcat=true mini02
docker node update --label-add datanode=true mini02
docker node update --label-add hbase-regionserver-2=true mini02

# Labels for mini03
docker node update --label-add spark=true mini03
docker node update --label-add zookeeper=true mini03
docker node update --label-add namenode=true mini03
docker node update --label-add hbase-master=true mini03
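
When the plan grows, the same commands can be generated from a short list of node and label pairs. This is a minimal sketch; the function name apply_labels, the input format and the DOCKER override variable are assumptions made for illustration.

```shell
# DOCKER defaults to the real CLI and can be replaced by a stub for a dry run.
DOCKER="${DOCKER:-docker}"

# Read "<node> <label>" pairs on stdin and set each label to "true" on that node.
apply_labels() {
    while read -r node label; do
        # Skip blank lines and comment lines.
        case "$node" in ''|'#'*) continue ;; esac
        # </dev/null keeps the command from consuming the loop's input.
        "$DOCKER" node update --label-add "$label=true" "$node" </dev/null
    done
}
```

Feeding it lines such as "mini01 tomcat" and "mini01 datanode" reproduces the commands above, and running it with DOCKER=echo prints the arguments instead of executing them.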

5.2. Removing labels

Run on the swarm manager mini03, for example:

docker node update --label-rm zookeeper mini03

5.3. Viewing the current labels

Run on the swarm manager mini03:

[root@mini03 ~]# docker node ls -q | xargs docker node inspect -f '{{.ID}}[{{.Description.Hostname}}]:{{.Spec.Labels}}'
6f7dwt47y6qvgs3yc6l00nmjd[mini01]:map[tomcat:true datanode:true hbase-regionserver-1:true]
5q2nmm2xaexhkn20z8f8ezglr[mini02]:map[tomcat:true datanode:true hbase-regionserver-2:true]
ncppwjknhcwbegmliafut0718[mini03]:map[hbase-master:true namenode:true spark:true zookeeper:true]
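
To test a single label rather than print the whole map, the Go template function index can look up one key. The sketch below uses index because keys containing a hyphen, such as hbase-master, cannot be reached with dot notation in a template. The function name node_has_label and the DOCKER override variable are illustrative.

```shell
# DOCKER defaults to the real CLI and can be replaced by a stub for a dry run.
DOCKER="${DOCKER:-docker}"

# Succeed if the node carries the given label with the value "true".
#   $1: node name or ID
#   $2: label key
node_has_label() {
    local value
    value="$("$DOCKER" node inspect -f "{{index .Spec.Labels \"$2\"}}" "$1")" || return 1
    [ "$value" = "true" ]
}
```

For example, node_has_label mini03 hbase-master && echo ok prints ok only when the label is set to true.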

6. Viewing logs

While the containers are starting, inspect the tasks of the stack and their error messages, for example:

docker stack ps hadoop
docker stack ps hadoop --format "{{.Name}}: {{.Error}}"
docker stack ps hadoop --format "{{.Name}}: {{.Error}}" --no-trunc
docker stack ps hadoop --no-trunc
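
With many tasks, most lines of the "{{.Name}}: {{.Error}}" output have an empty error. The following sketch keeps only the lines that carry an error message; the function name tasks_with_errors is illustrative.

```shell
# Read "<task name>: <error>" lines on stdin and keep only the tasks that
# report a non-empty error.
# Intended input: docker stack ps hadoop --no-trunc --format "{{.Name}}: {{.Error}}"
tasks_with_errors() {
    awk -F': ' '$2 != ""'
}
```

It would be used as docker stack ps hadoop --no-trunc --format "{{.Name}}: {{.Error}}" | tasks_with_errors.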
