1. Environment
Three VMware virtual machines and the services planned for each:
ceph05 192.168.10.30 mon, mgr, osd, mds
ceph06 192.168.10.31 mon, mgr, osd
ceph07 192.168.10.32 mon, mgr, osd
Node details:
[root@ceph05 ~]# hostnamectl
Static hostname: ceph05
Icon name: computer-vm
Chassis: vm
Machine ID: e8e9768445e2448fb3ff5e5b7138039c
Boot ID: 47e7dd7c158e4becadb4f6987bfe263b
Virtualization: vmware
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-957.el7.x86_64
Architecture: x86-64
2. Preparation
The following must be done on all three nodes (a combined command sketch follows the error output below):
- Disable firewalld and SELinux
- Configure /etc/hosts
- Install docker-ce, then start and enable the docker service
- Install python36
- Configure chrony, with the .30 node (ceph05) as the server and the other two as clients. If neither chrony nor ntp is configured, adding a host to the cluster fails with an error like:
[root@ceph05 ~]# ceph orch host add ceph06
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Error ENOENT: New host ceph06 (ceph06) failed check: ['INFO:cephadm:podman|docker (/bin/docker) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', "WARNING:cephadm:No time sync service is running; checked for ['chrony.service', 'chronyd.service', 'systemd-timesyncd.service', 'ntpd.service', 'ntp.service']", 'INFO:cephadm:Hostname "ceph06" matches what is expected.', 'ERROR: No time synchronization is active']
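A minimal sketch of these preparation steps on CentOS 7 (assumes the docker-ce yum repo and EPEL, which provides python36, are already configured):
# run on every node
$ systemctl disable --now firewalld
$ setenforce 0                      # and set SELINUX=disabled in /etc/selinux/config
$ cat >> /etc/hosts <<EOF
192.168.10.30 ceph05
192.168.10.31 ceph06
192.168.10.32 ceph07
EOF
$ yum install -y docker-ce python36 chrony
$ systemctl enable --now docker chronyd
# chrony: on ceph05 add "allow 192.168.10.0/24" to /etc/chrony.conf;
# on ceph06/ceph07 use "server 192.168.10.30 iburst" instead, then restart chronyd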
3. Configuration
cephadm works by first bootstrapping a Ceph cluster on a single host, then expanding the cluster to include any additional hosts, and finally deploying the required services. Here the .30 node (ceph05) is used as the bootstrap node.
Install cephadm
1. Get the cephadm script. It is written in Python and can be downloaded from https://raw.githubusercontent.com/ceph/ceph/octopus/src/cephadm/cephadm.
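For example, following the Ceph documentation, the script can be fetched and made executable like this:
$ curl -o cephadm https://raw.githubusercontent.com/ceph/ceph/octopus/src/cephadm/cephadm
$ chmod +x cephadm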
2. Add the Ceph repository and install the cephadm package:
[root@ceph05 ~]# ./cephadm add-repo --release octopus
[root@ceph05 ~]# ./cephadm install
[root@ceph05 ~]# rpm -qa|grep ceph
cephadm-15.2.4-0.el7.x86_64
Bootstrap a new cluster
Create the required directory and bootstrap a cluster:
[root@ceph05 ~]# mkdir -p /etc/ceph
[root@ceph05 ~]# cephadm bootstrap --mon-ip 192.168.10.30
INFO:cephadm:Verifying podman|docker is present...
···
URL: https://ceph05:8443/
User: admin
Password: ynh08y9mz1
···
INFO:cephadm:Bootstrap complete.
This command does several things:
1. Creates mon and mgr daemons for the new cluster on the .30 node (ceph05).
2. Generates a new SSH key for the Ceph cluster and adds the public key to /root/.ssh/authorized_keys.
3. Writes a minimal configuration file to /etc/ceph/ceph.conf:
[root@ceph05 ~]# cat /etc/ceph/ceph.conf
# minimal ceph.conf for ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811
[global]
fsid = ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811
mon_host = [v2:192.168.10.30:3300/0,v1:192.168.10.30:6789/0]
4. Writes the admin keyring to /etc/ceph/ceph.client.admin.keyring.
5. Writes a copy of the public half of the SSH key generated in step 2 to /etc/ceph/ceph.pub.
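These artifacts can be checked quickly (a sketch):
$ ls -l /etc/ceph            # should contain ceph.conf, ceph.client.admin.keyring and ceph.pub
$ cat /etc/ceph/ceph.pub     # the public key that will later be pushed to the other nodes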
Access the cluster
Because no Ceph packages are installed on the node itself, the ceph command cannot be run directly as before.
The cephadm shell command launches a bash shell in a container that has all of the Ceph packages installed and connects it to the cluster just created. By default it uses the configuration and keyring under /etc/ceph/. For convenience, create an alias:
[root@ceph05 ~]# alias ceph='cephadm shell -- ceph'
[root@ceph05 ~]# ceph -s
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
cluster:
id: 8e5c4a46-bb82-11ea-9b65-000c29fb7811
health: HEALTH_WARN
Reduced data availability: 1 pg inactive
OSD count 0 < osd_pool_default_size 3
services:
mon: 1 daemons, quorum ceph05 (age 15h)
mgr: ceph05.nxiihf(active, since 14h)
osd: 0 osds: 0 up, 0 in
data:
pools: 1 pools, 1 pgs
objects: 0 objects, 0 B
usage: 0 B used, 0 B / 0 B avail
pgs: 100.000% pgs unknown
1 unknown
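The shell can also be used interactively instead of through the alias; a sketch:
$ cephadm shell   # opens a bash shell inside a ceph container with the cluster config and keyring available
# inside that shell the full ceph CLI works, e.g.:
$ ceph -s
$ exit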
Alternatively, install the ceph-common package on the node and then access the cluster directly with the ceph command:
$ cephadm install ceph-common
Check the containers that were started:
[root@ceph05 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6def9562a206 ceph/ceph-grafana:latest "/bin/sh -c 'grafana…" 31 seconds ago Up 30 seconds ceph-ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811-grafana.ceph05
aa6c2c7360a1 ceph/ceph:v15 "/usr/bin/ceph-crash…" 2 minutes ago Up 2 minutes ceph-ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811-crash.ceph05
f4f65eaf1c2b prom/alertmanager:v0.20.0 "/bin/alertmanager -…" 2 minutes ago Up 2 minutes ceph-ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811-alertmanager.ceph05
518a1c8d8025 ceph/ceph:v15 "/usr/bin/ceph-mgr -…" 4 minutes ago Up 4 minutes ceph-ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811-mgr.ceph05.opycfi
5b7e67ce5c45 ceph/ceph:v15 "/usr/bin/ceph-mon -…" 4 minutes ago Up 4 minutes ceph-ffa6fa3c-bb6b-11ea-9d9a-000c29fb7811-mon.ceph05
Add cluster nodes
Install the cluster's SSH public key on ceph06 and ceph07 so that cephadm can manage both nodes over passwordless SSH:
[root@ceph05 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph06
[root@ceph05 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph07
Then add the hosts:
[root@ceph05 ~]# ceph orch host add ceph06
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Added host 'ceph06'
[root@ceph05 ~]# ceph orch host add ceph07
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Added host 'ceph07'
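The registered hosts can be verified with the orchestrator:
$ ceph orch host ls   # should list ceph05, ceph06 and ceph07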
A Ceph cluster should normally run 3-5 mons. Once cephadm knows which IP subnet the mons should use (by default, the subnet of the first mon), it automatically deploys additional mons on new hosts as they are added to the cluster, up to five mons by default (the target count can be changed with ceph orch apply mon <number>).
Because cephadm spreads that number of mons automatically across whichever hosts have been added, to place the mons on a specific set of hosts instead:
$ ceph orch apply mon "<host1,host2,host3,...>" # the host list must include the bootstrap node, here ceph05 (.30)
If the mons should use a specific IP subnet, set it before adding more hosts to the cluster:
$ ceph config set mon public_network xxx.xxx.xxx.xxx/xx
Each mon can also be given its own IP explicitly:
1. First disable automatic mon deployment:
$ ceph orch apply mon --unmanaged
2. Then add the mon daemons one by one:
$ ceph orch daemon add mon hostname:xxx.xxx.xxx.xxx/xx
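Applied to this cluster, pinning three mons to the three hosts might look like the following sketch (the 192.168.10.0/24 subnet is taken from the node IPs above):
$ ceph config set mon public_network 192.168.10.0/24
$ ceph orch apply mon "ceph05,ceph06,ceph07"   # includes the bootstrap node ceph05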
Deploy OSDs
Add OSDs with the orch orchestrator:
[root@ceph05 ~]# ceph orch daemon add osd ceph05:/dev/sdb
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Created osd(s) 0 on host 'ceph05'
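The same can be repeated for the other two nodes; the /dev/sdb device name on ceph06 and ceph07 is an assumption here, so check what cephadm sees first (a sketch):
$ ceph orch device ls                        # list the devices cephadm considers usable on each host
$ ceph orch daemon add osd ceph06:/dev/sdb   # assumes ceph06 has an empty /dev/sdb
$ ceph orch daemon add osd ceph07:/dev/sdb   # assumes ceph07 has an empty /dev/sdb
# or let cephadm consume every available, unused device automatically:
# ceph orch apply osd --all-available-devices
Next, deploy an MDS daemon for the CephFS that will be created in the following step: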
[root@ceph05 ~]# ceph orch apply mds cephfs --placement="1 ceph05"
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Scheduled mds.cephfs update..
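The mds.cephfs service can be checked, and its placement widened later if desired; a sketch (the three-host placement is just an example):
$ ceph orch ls   # the mds.cephfs service should appear in the list
$ ceph orch apply mds cephfs --placement="3 ceph05 ceph06 ceph07"   # example: one MDS per host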
Configure CephFS
This step is much the same as before: create the pools, then create the filesystem:
[root@ceph05 ~]# ceph osd pool create metadata 16 16
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
pool 'metadata' created
[root@ceph05 ~]# ceph osd pool create data 16 16
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
pool 'data' created
[root@ceph05 ~]# ceph fs new cephfs metadata data
INFO:cephadm:Inferring fsid 8e5c4a46-bb82-11ea-9b65-000c29fb7811
INFO:cephadm:Inferring config /var/lib/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/mon.ceph05/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
new fs with metadata pool 2 and data pool 3
Mount with the kernel client:
[root@ceph05 ceph]# ceph auth get-or-create client.guest mds 'allow' osd 'allow *' mon 'allow *' > ceph.client.guest.keyring
[root@ceph05 ceph]# mount -t ceph 192.168.10.30:/ /root/mnt/
[root@ceph05 ceph]#
[root@ceph05 ceph]# df -h
Filesystem Size Used Avail Use% Mounted on
···
192.168.10.30:/ 18G 0 18G 0% /root/mnt
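Since a client.guest key was created above, the mount can also be done explicitly as that user; a sketch (replace the placeholder with the key printed by the first command):
$ ceph auth get-key client.guest   # print only the secret key for client.guest
$ mount -t ceph 192.168.10.30:6789:/ /root/mnt -o name=guest,secret=<key-from-previous-command>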
Final cluster status:
[root@ceph05 ~]# ceph -s
cluster:
id: 8e5c4a46-bb82-11ea-9b65-000c29fb7811
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph05,ceph06,ceph07 (age 7m)
mgr: ceph07.czczlh(active, since 65m), standbys: ceph05.nxiihf
mds: cephfs:1 {0=cephfs.ceph05.wxezvv=up:active}
osd: 3 osds: 3 up (since 64m), 3 in (since 2h)
task status:
scrub status:
mds.cephfs.ceph05.wxezvv: idle
data:
pools: 3 pools, 33 pgs
objects: 22 objects, 2.7 KiB
usage: 3.0 GiB used, 57 GiB / 60 GiB avail
pgs: 33 active+clean
All Ceph daemons run as containers:
[root@ceph06 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0f1133e6b35d ceph/ceph:v15 "/usr/bin/ceph-osd -…" 2 hours ago Up 2 hours ceph-8e5c4a46-bb82-11ea-9b65-000c29fb7811-osd.1
084d7c5a7720 prom/node-exporter:v0.18.1 "/bin/node_exporter …" 2 hours ago Up 2 hours ceph-8e5c4a46-bb82-11ea-9b65-000c29fb7811-node-exporter.ceph06
f28ec2e9782f ceph/ceph:v15 "/usr/bin/ceph-mon -…" 2 hours ago Up 2 hours ceph-8e5c4a46-bb82-11ea-9b65-000c29fb7811-mon.ceph06
0b373e81bb54 ceph/ceph:v15 "/usr/bin/ceph-crash…" 2 hours ago Up 2 hours ceph-8e5c4a46-bb82-11ea-9b65-000c29fb7811-crash.ceph06
Related logs:
[root@ceph06 ~]# ll /var/log/ceph/
total 0
drwxrwx--- 2 167 167 51 Jul 2 09:20 8e5c4a46-bb82-11ea-9b65-000c29fb7811
[root@ceph06 ~]# ll /var/log/ceph/8e5c4a46-bb82-11ea-9b65-000c29fb7811/
total 416
-rw-r--r-- 1 167 167 106424 Jul 2 09:20 ceph-osd.1.log
-rw-r--r-- 1 root root 315891 Jul 2 11:33 ceph-volume.log
Mounts:
[root@ceph06 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
···
overlay 17G 3.3G 14G 20% /var/lib/docker/overlay2/f09b8b105e4966ca4f4179d72df652625a616f1f7a8287368c8bdb8c5baeca58/merged
overlay 17G 3.3G 14G 20% /var/lib/docker/overlay2/1165656fb80ee479f897e97424d152422418d889acc31da816d4b5fd02744177/merged
overlay 17G 3.3G 14G 20% /var/lib/docker/overlay2/5d2500dcc5a784732d6f2671666d19f3513ce1548f7850f39c1a209864e48627/merged
overlay 17G 3.3G 14G 20% /var/lib/docker/overlay2/143b664b67fbebc92d4522c91233d0905341e791881af1ffff18c8977af21f1e/merged
4. Main differences
- The management model differs considerably from earlier deployment tools.
- All Ceph daemons run in containers.
- The nodes do not need Ceph packages or their dependencies installed, so there is little disturbance to the host system.
- Overall cluster management feels somewhat like Kubernetes (k8s).
- Day-to-day operational commands also differ in many ways; for example, an OSD daemon is restarted through the orchestrator rather than directly via systemctl (see the sketch after this list).
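A sketch of common day-2 commands under cephadm (exact subcommands may vary slightly between Octopus point releases):
$ ceph orch ls                     # list services managed by the orchestrator
$ ceph orch ps                     # list daemons and the hosts they run on
$ ceph orch daemon restart osd.1   # restart a single daemon by name, instead of using systemctl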
5. Issues
To be updated later.