Basic status information for physical machines (CPU, memory, disk, network) can be monitored via ceilometer + SNMP. The setup below is for CentOS 7 + OpenStack Kilo (host 100 is the monitored physical machine and runs the snmpd server; host 101 runs the ceilometer-agent-central service).
Configure the SNMP service
First, configure the snmpd service on the monitored physical machine. CentOS ships it by default, but it is disabled; edit the configuration file and then start the service. The snmpd configuration changes are:
```diff
[root@100]# diff -u snmpd-orig.conf snmpd-ok.conf
--- snmpd-orig.conf	2016-02-21 11:32:52.989323231 +0800
+++ snmpd-ok.conf	2016-02-20 14:41:10.005137819 +0800
@@ -59,7 +59,7 @@
 # Finally, grant the group read-only access to the systemview view.

 #       group          context sec.model sec.level prefix read       write  notif
-access  notConfigGroup ""      any       noauth    exact  systemview none   none
+access  notConfigGroup ""      any       noauth    exact  all        none   none

 # -----------------------------------------------------------------------------
@@ -82,7 +82,7 @@
 #...
 ##           incl/excl subtree  mask
-#view all    included  .1       80
+view all    included  .1       80

 ## -or just the mib2 tree-
```
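In plain terms, the diff defines a view named `all` that covers the entire OID tree (`.1`) and switches the read view of `notConfigGroup` from the default `systemview` to `all`, so the UCD-SNMP objects that ceilometer polls become readable. The two effective lines in `/etc/snmp/snmpd.conf` after the edit are:

```
view    all             included   .1    80
access  notConfigGroup  ""  any  noauth  exact  all  none  none
```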
Configure the firewall
Open UDP port 161 in the monitored machine's firewall:
```shell
iptables -I INPUT -p UDP --dport 161 -j ACCEPT
```
It is recommended to write the rule into the iptables configuration file so it persists across server reboots:
```shell
vi /etc/sysconfig/iptables
# insert the line:
-A RH-Firewall-1-INPUT -p udp -m udp --dport 161 -j ACCEPT
# save the file, then restart the service:
service iptables restart
```
Alternatively, disable the iptables firewall service entirely (not recommended).
Test the SNMP service
Start the snmpd service, and enable it at boot so it is not left stopped after a server restart:
```shell
systemctl restart snmpd.service
systemctl enable snmpd.service
```
Verify that SNMP is working, first on the snmpd server itself:
```shell
[root@100 ~]# snmpwalk -v 2c -c public 127.0.0.1 .1.3.6.1.4.1.2021.4.5.0
UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 65563852 kB   # the machine's total memory is returned correctly
```
Then run the same test command from the node running ceilometer-agent-central (the SNMP client), using the monitored machine's IP instead of 127.0.0.1. If both succeed, continue to the next step.
Configure ceilometer
Modify ceilometer's pipeline configuration:
```yaml
[root@101 ceilometer]# head /etc/ceilometer/pipeline.yaml -n20
---
sources:
    # add the following entry under the sources section:
    - name: hardware_source
      interval: 600
      meters:
          - "hardware.*"
      resources:
          - snmp://192.168.0.100   # IP of the monitored machine's snmpd service; add one line per machine to monitor several at once
          - snmp://192.168.0.101
      sinks:
          - meter_sink
```
Restart all ceilometer services, most importantly ceilometer-agent-central and ceilometer-collector.
Check the ceilometer-agent-central service log; around 600 s after the restart, output like the following indicates that physical-machine monitoring is configured correctly:
```
2016-02-21 11:30:03.107 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.cpu.load.5min in the context of hardware_source
2016-02-21 11:30:03.200 8363 INFO oslo_messaging._drivers.impl_rabbit [-] Connecting to AMQP server on 192.168.0.49:5672
2016-02-21 11:30:03.213 8363 INFO oslo_messaging._drivers.impl_rabbit [-] Connected to AMQP server on 192.168.0.49:5672
2016-02-21 11:30:03.223 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.network.outgoing.errors in the context of hardware_source
2016-02-21 11:30:03.494 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.cpu.load.1min in the context of hardware_source
2016-02-21 11:30:03.534 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.network.incoming.bytes in the context of hardware_source
2016-02-21 11:30:03.651 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.network.ip.outgoing.datagrams in the context of hardware_source
2016-02-21 11:30:03.692 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.memory.swap.total in the context of hardware_source
2016-02-21 11:30:03.731 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.system_stats.io.incoming.blocks in the context of hardware_source
2016-02-21 11:30:03.772 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.network.ip.incoming.datagrams in the context of hardware_source
2016-02-21 11:30:03.811 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.memory.swap.avail in the context of hardware_source
2016-02-21 11:30:03.853 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.memory.used in the context of hardware_source
2016-02-21 11:30:03.894 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.system_stats.cpu.idle in the context of hardware_source
2016-02-21 11:30:03.933 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.disk.size.used in the context of hardware_source
2016-02-21 11:30:04.047 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.memory.total in the context of hardware_source
2016-02-21 11:30:04.066 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.cpu.load.15min in the context of hardware_source
2016-02-21 11:30:04.105 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.system_stats.io.outgoing.blocks in the context of hardware_source
2016-02-21 11:30:04.147 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.disk.size.total in the context of hardware_source
2016-02-21 11:30:04.252 8363 INFO ceilometer.agent.base [-] Polling pollster hardware.network.outgoing.bytes in the context of hardware_source
```
The supported meters (the hardware.* entries) can be read off the log above.
Retrieving the monitoring data
First, list the meter names of all hardware meters:
```shell
[root@101 ~(keystone_admin)]# ceilometer meter-list | grep hardware   # note: requires keystone admin credentials
| hardware.cpu.load.15min                  | gauge      | process   | 192.168.0.100            | None | None |
| hardware.cpu.load.1min                   | gauge      | process   | 192.168.0.100            | None | None |
| hardware.cpu.load.5min                   | gauge      | process   | 192.168.0.100            | None | None |
| hardware.memory.swap.avail               | gauge      | B         | 192.168.0.100            | None | None |
| hardware.memory.swap.total               | gauge      | B         | 192.168.0.100            | None | None |
| hardware.memory.total                    | gauge      | B         | 192.168.0.100            | None | None |
| hardware.memory.used                     | gauge      | B         | 192.168.0.100            | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.em1        | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.em2        | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.em3        | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.em4        | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.lo         | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.p5p1       | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.p5p2       | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.virbr0     | None | None |
| hardware.network.incoming.bytes          | cumulative | B         | 192.168.0.100.virbr0-nic | None | None |
| hardware.network.ip.incoming.datagrams   | cumulative | datagrams | 192.168.0.100            | None | None |
| hardware.network.ip.outgoing.datagrams   | cumulative | datagrams | 192.168.0.100            | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.em1        | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.em2        | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.em3        | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.em4        | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.lo         | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.p5p1       | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.p5p2       | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.virbr0     | None | None |
| hardware.network.outgoing.bytes          | cumulative | B         | 192.168.0.100.virbr0-nic | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.em1        | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.em2        | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.em3        | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.em4        | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.lo         | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.p5p1       | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.p5p2       | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.virbr0     | None | None |
| hardware.network.outgoing.errors         | cumulative | packet    | 192.168.0.100.virbr0-nic | None | None |
| hardware.system_stats.cpu.idle           | gauge      | %         | 192.168.0.100            | None | None |
| hardware.system_stats.io.incoming.blocks | cumulative | blocks    | 192.168.0.100            | None | None |
| hardware.system_stats.io.outgoing.blocks | cumulative | blocks    | 192.168.0.100            | None | None |
```
Next, list the sampled data points with the sample-list command:
```shell
[root@101 ~(keystone_admin)]# ceilometer sample-list -m hardware.memory.used -l10
+---------------+----------------------+-------+-----------+------+---------------------+
| Resource ID   | Name                 | Type  | Volume    | Unit | Timestamp           |
+---------------+----------------------+-------+-----------+------+---------------------+
| 192.168.0.100 | hardware.memory.used | gauge | 1416012.0 | B    | 2016-02-21T03:56:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1416320.0 | B    | 2016-02-21T03:55:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1416140.0 | B    | 2016-02-21T03:54:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1416184.0 | B    | 2016-02-21T03:53:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1415748.0 | B    | 2016-02-21T03:52:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1415964.0 | B    | 2016-02-21T03:51:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1416460.0 | B    | 2016-02-21T03:50:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1416040.0 | B    | 2016-02-21T03:49:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1415916.0 | B    | 2016-02-21T03:48:03 |
| 192.168.0.100 | hardware.memory.used | gauge | 1416336.0 | B    | 2016-02-21T03:47:03 |
+---------------+----------------------+-------+-----------+------+---------------------+
```
Alternatively, fetch aggregated data through the statistics interface:
```shell
[root@controller ~(keystone_admin)]# ceilometer statistics -m hardware.memory.used -p 600 -a avg
+--------+---------------------+---------------------+---------------+----------+---------------------+---------------------+
| Period | Period Start        | Period End          | Avg           | Duration | Duration Start      | Duration End        |
+--------+---------------------+---------------------+---------------+----------+---------------------+---------------------+
| 600    | 2016-02-21T02:46:52 | 2016-02-21T02:56:52 | 1415749.33333 | 19.0     | 2016-02-21T02:56:23 | 2016-02-21T02:56:42 |
| 600    | 2016-02-21T02:56:52 | 2016-02-21T03:06:52 | 1415845.14286 | 527.0    | 2016-02-21T02:56:52 | 2016-02-21T03:05:39 |
| 600    | 2016-02-21T03:06:52 | 2016-02-21T03:16:52 | 1416227.11111 | 503.0    | 2016-02-21T03:07:43 | 2016-02-21T03:16:06 |
| 600    | 2016-02-21T03:16:52 | 2016-02-21T03:26:52 | 1415844.44444 | 480.0    | 2016-02-21T03:17:58 | 2016-02-21T03:25:58 |
| 600    | 2016-02-21T03:26:52 | 2016-02-21T03:36:52 | 1415800.8     | 545.0    | 2016-02-21T03:26:58 | 2016-02-21T03:36:03 |
| 600    | 2016-02-21T03:36:52 | 2016-02-21T03:46:52 | 1418112.8     | 540.0    | 2016-02-21T03:37:03 | 2016-02-21T03:46:03 |
| 600    | 2016-02-21T03:46:52 | 2016-02-21T03:56:52 | 1416112.0     | 540.0    | 2016-02-21T03:47:03 | 2016-02-21T03:56:03 |
| 600    | 2016-02-21T03:56:52 | 2016-02-21T04:06:52 | 1415978.0     | 60.0     | 2016-02-21T03:57:03 | 2016-02-21T03:58:03 |
+--------+---------------------+---------------------+---------------+----------+---------------------+---------------------+
```
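The statistics call above buckets samples into fixed 600 s periods and averages each bucket. The same reduction can be reproduced offline, which helps when sanity-checking the Avg column; a sketch with illustrative (timestamp in seconds, value) tuples of my own:

```python
from collections import defaultdict

def period_avg(samples, period=600):
    """Group (timestamp_s, value) samples into fixed periods and average each bucket."""
    buckets = defaultdict(list)
    for ts, val in samples:
        buckets[ts // period * period].append(val)  # bucket key = period start
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

samples = [(10, 1415748.0), (70, 1415964.0), (650, 1416012.0)]  # hypothetical gauge samples
print(period_avg(samples))  # → {0: 1415856.0, 600: 1416012.0}
```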